@@ -5454,64 +5454,69 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
5454
5454
limit=limit, downcast=downcast)
5455
5455
5456
5456
_shared_docs['replace'] = ("""
5457
- Replace values given in 'to_replace' with 'value'.
5457
+ Replace values given in `to_replace` with `value`.
5458
+
5459
+ Values of the %(klass)s are replaced with other values dynamically.
5460
+ This differs from updating with ``.loc`` or ``.iloc``, which require
5461
+ you to specify a location to update with some value.
5458
5462
5459
5463
Parameters
5460
5464
----------
5461
- to_replace : str, regex, list, dict, Series, numeric, or None
5465
+ to_replace : str, regex, list, dict, Series, int, float, or None
5466
+ How to find the values that will be replaced.
5462
5467
5463
5468
* numeric, str or regex:
5464
5469
5465
- - numeric: numeric values equal to ``to_replace`` will be
5466
- replaced with ``value``
5467
- - str: string exactly matching ``to_replace`` will be replaced
5468
- with ``value``
5469
- - regex: regexs matching ``to_replace`` will be replaced with
5470
- ``value``
5470
+ - numeric: numeric values equal to `to_replace` will be
5471
+ replaced with `value`
5472
+ - str: string exactly matching `to_replace` will be replaced
5473
+ with `value`
5474
+ - regex: regexes matching `to_replace` will be replaced with
5475
+ `value`
5471
5476
5472
5477
* list of str, regex, or numeric:
5473
5478
5474
- - First, if ``to_replace`` and ``value`` are both lists, they
5479
+ - First, if `to_replace` and `value` are both lists, they
5475
5480
**must** be the same length.
5476
5481
- Second, if ``regex=True`` then all of the strings in **both**
5477
5482
lists will be interpreted as regexs otherwise they will match
5478
- directly. This doesn't matter much for ``value`` since there
5483
+ directly. This doesn't matter much for `value` since there
5479
5484
are only a few possible substitution regexes you can use.
5480
5485
- str, regex and numeric rules apply as above.
5481
5486
5482
5487
* dict:
5483
5488
5484
5489
- Dicts can be used to specify different replacement values
5485
5490
for different existing values. For example,
5486
- {'a': 'b', 'y': 'z'} replaces the value 'a' with 'b' and
5487
- 'y' with 'z'. To use a dict in this way the ``value``
5488
- parameter should be ``None``.
5491
+ ``{'a': 'b', 'y': 'z'}`` replaces the value 'a' with 'b' and
5492
+ 'y' with 'z'. To use a dict in this way the `value`
5493
+ parameter should be `None`.
5489
5494
- For a DataFrame a dict can specify that different values
5490
5495
should be replaced in different columns. For example,
5491
- {'a': 1, 'b': 'z'} looks for the value 1 in column 'a' and
5492
- the value 'z' in column 'b' and replaces these values with
5493
- whatever is specified in ``value``. The ``value`` parameter
5496
+ ``{'a': 1, 'b': 'z'}`` looks for the value 1 in column 'a'
5497
+ and the value 'z' in column 'b' and replaces these values
5498
+ with whatever is specified in `value`. The `value` parameter
5494
5499
should not be ``None`` in this case. You can treat this as a
5495
5500
special case of passing two lists except that you are
5496
5501
specifying the column to search in.
5497
5502
- For a DataFrame nested dictionaries, e.g.,
5498
- {'a': {'b': np.nan}}, are read as follows: look in column 'a'
5499
- for the value 'b' and replace it with NaN. The ``value``
5503
+ ``{'a': {'b': np.nan}}``, are read as follows: look in column
5504
+ 'a' for the value 'b' and replace it with NaN. The `value`
5500
5505
parameter should be ``None`` to use a nested dict in this
5501
5506
way. You can nest regular expressions as well. Note that
5502
5507
column names (the top-level dictionary keys in a nested
5503
5508
dictionary) **cannot** be regular expressions.
5504
5509
5505
5510
* None:
5506
5511
5507
- - This means that the ``regex`` argument must be a string,
5508
- compiled regular expression, or list, dict, ndarray or Series
5509
- of such elements. If ``value`` is also ``None`` then this
5510
- **must** be a nested dictionary or ``Series``.
5512
+ - This means that the `regex` argument must be a string,
5513
+ compiled regular expression, or list, dict, ndarray or
5514
+ Series of such elements. If `value` is also ``None`` then
5515
+ this **must** be a nested dictionary or Series.
5511
5516
5512
5517
See the examples section for examples of each of these.
5513
5518
value : scalar, dict, list, str, regex, default None
5514
- Value to replace any values matching ``to_replace`` with.
5519
+ Value to replace any values matching `to_replace` with.
5515
5520
For a DataFrame a dict of values can be used to specify which
5516
5521
value to use for each column (columns not in the dict will not be
5517
5522
filled). Regular expressions, strings and lists or dicts of such
@@ -5521,45 +5526,50 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
5521
5526
other views on this object (e.g. a column from a DataFrame).
5522
5527
Returns the caller if this is True.
5523
5528
limit : int, default None
5524
- Maximum size gap to forward or backward fill
5525
- regex : bool or same types as ``to_replace``, default False
5526
- Whether to interpret ``to_replace`` and/or ``value`` as regular
5527
- expressions. If this is ``True`` then ``to_replace`` *must* be a
5529
+ Maximum size gap to forward or backward fill.
5530
+ regex : bool or same types as `to_replace`, default False
5531
+ Whether to interpret `to_replace` and/or `value` as regular
5532
+ expressions. If this is ``True`` then `to_replace` *must* be a
5528
5533
string. Alternatively, this could be a regular expression or a
5529
5534
list, dict, or array of regular expressions in which case
5530
- ``to_replace`` must be ``None``.
5531
- method : string, optional, {'pad', 'ffill', 'bfill'}
5532
- The method to use when for replacement, when ``to_replace`` is a
5533
- scalar, list or tuple and ``value`` is None.
5535
+ `to_replace` must be ``None``.
5536
+ method : {'pad', 'ffill', 'bfill', `None`}
5537
+ The method to use for replacement, when `to_replace` is a
5538
+ scalar, list or tuple and `value` is ``None``.
5534
5539
5535
- .. versionchanged:: 0.23.0
5536
- Added to DataFrame
5540
+ .. versionchanged:: 0.23.0
5541
+ Added to DataFrame.
5542
+ axis : None
5543
+ .. deprecated:: 0.13.0
5544
+ Has no effect and will be removed.
5537
5545
5538
5546
See Also
5539
5547
--------
5540
- %(klass)s.fillna : Fill NA/NaN values
5548
+ %(klass)s.fillna : Fill NA values
5541
5549
%(klass)s.where : Replace values based on boolean condition
5550
+ Series.str.replace : Simple string replacement.
5542
5551
5543
5552
Returns
5544
5553
-------
5545
- filled : %(klass)s
5554
+ %(klass)s
5555
+ Object after replacement.
5546
5556
5547
5557
Raises
5548
5558
------
5549
5559
AssertionError
5550
- * If ``regex`` is not a ``bool`` and ``to_replace`` is not
5560
+ * If `regex` is not a ``bool`` and `to_replace` is not
5551
5561
``None``.
5552
5562
TypeError
5553
- * If ``to_replace`` is a ``dict`` and ``value`` is not a ``list``,
5563
+ * If `to_replace` is a ``dict`` and `value` is not a ``list``,
5554
5564
``dict``, ``ndarray``, or ``Series``
5555
- * If ``to_replace`` is ``None`` and ``regex`` is not compilable
5565
+ * If `to_replace` is ``None`` and `regex` is not compilable
5556
5566
into a regular expression or is a list, dict, ndarray, or
5557
5567
Series.
5558
5568
* When replacing multiple ``bool`` or ``datetime64`` objects and
5559
- the arguments to ``to_replace`` does not match the type of the
5569
+ the arguments to `to_replace` does not match the type of the
5560
5570
value being replaced
5561
5571
ValueError
5562
- * If a ``list`` or an ``ndarray`` is passed to ``to_replace`` and
5572
+ * If a ``list`` or an ``ndarray`` is passed to `to_replace` and
5563
5573
`value` but they are not the same length.
5564
5574
5565
5575
Notes
@@ -5573,10 +5583,15 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
5573
5583
numbers *are* strings, then you can do this.
5574
5584
* This method has *a lot* of options. You are encouraged to experiment
5575
5585
and play with this method to gain intuition about how it works.
5586
+ * When a dict is used as the `to_replace` value, the key(s)
5587
+ in the dict act as the `to_replace` part and the value(s)
5588
+ in the dict act as the `value` parameter.
5576
5589
5577
5590
Examples
5578
5591
--------
5579
5592
5593
+ **Scalar `to_replace` and `value`**
5594
+
5580
5595
>>> s = pd.Series([0, 1, 2, 3, 4])
5581
5596
>>> s.replace(0, 5)
5582
5597
0 5
@@ -5585,6 +5600,7 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
5585
5600
3 3
5586
5601
4 4
5587
5602
dtype: int64
5603
+
5588
5604
>>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
5589
5605
... 'B': [5, 6, 7, 8, 9],
5590
5606
... 'C': ['a', 'b', 'c', 'd', 'e']})
@@ -5596,20 +5612,24 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
5596
5612
3 3 8 d
5597
5613
4 4 9 e
5598
5614
5615
+ **List-like `to_replace`**
5616
+
5599
5617
>>> df.replace([0, 1, 2, 3], 4)
5600
5618
A B C
5601
5619
0 4 5 a
5602
5620
1 4 6 b
5603
5621
2 4 7 c
5604
5622
3 4 8 d
5605
5623
4 4 9 e
5624
+
5606
5625
>>> df.replace([0, 1, 2, 3], [4, 3, 2, 1])
5607
5626
A B C
5608
5627
0 4 5 a
5609
5628
1 3 6 b
5610
5629
2 2 7 c
5611
5630
3 1 8 d
5612
5631
4 4 9 e
5632
+
5613
5633
>>> s.replace([1, 2], method='bfill')
5614
5634
0 0
5615
5635
1 3
@@ -5618,20 +5638,24 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
5618
5638
4 4
5619
5639
dtype: int64
5620
5640
5641
+ **dict-like `to_replace`**
5642
+
5621
5643
>>> df.replace({0: 10, 1: 100})
5622
5644
A B C
5623
5645
0 10 5 a
5624
5646
1 100 6 b
5625
5647
2 2 7 c
5626
5648
3 3 8 d
5627
5649
4 4 9 e
5650
+
5628
5651
>>> df.replace({'A': 0, 'B': 5}, 100)
5629
5652
A B C
5630
5653
0 100 100 a
5631
5654
1 1 6 b
5632
5655
2 2 7 c
5633
5656
3 3 8 d
5634
5657
4 4 9 e
5658
+
5635
5659
>>> df.replace({'A': {0: 100, 4: 400}})
5636
5660
A B C
5637
5661
0 100 5 a
@@ -5640,45 +5664,87 @@ def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
5640
5664
3 3 8 d
5641
5665
4 400 9 e
5642
5666
5667
+ **Regular expression `to_replace`**
5668
+
5643
5669
>>> df = pd.DataFrame({'A': ['bat', 'foo', 'bait'],
5644
5670
... 'B': ['abc', 'bar', 'xyz']})
5645
5671
>>> df.replace(to_replace=r'^ba.$', value='new', regex=True)
5646
5672
A B
5647
5673
0 new abc
5648
5674
1 foo new
5649
5675
2 bait xyz
5676
+
5650
5677
>>> df.replace({'A': r'^ba.$'}, {'A': 'new'}, regex=True)
5651
5678
A B
5652
5679
0 new abc
5653
5680
1 foo bar
5654
5681
2 bait xyz
5682
+
5655
5683
>>> df.replace(regex=r'^ba.$', value='new')
5656
5684
A B
5657
5685
0 new abc
5658
5686
1 foo new
5659
5687
2 bait xyz
5688
+
5660
5689
>>> df.replace(regex={r'^ba.$':'new', 'foo':'xyz'})
5661
5690
A B
5662
5691
0 new abc
5663
5692
1 xyz new
5664
5693
2 bait xyz
5694
+
5665
5695
>>> df.replace(regex=[r'^ba.$', 'foo'], value='new')
5666
5696
A B
5667
5697
0 new abc
5668
5698
1 new new
5669
5699
2 bait xyz
5670
5700
5671
5701
Note that when replacing multiple ``bool`` or ``datetime64`` objects,
5672
- the data types in the ``to_replace`` parameter must match the data
5702
+ the data types in the `to_replace` parameter must match the data
5673
5703
type of the value being replaced:
5674
5704
5675
5705
>>> df = pd.DataFrame({'A': [True, False, True],
5676
5706
... 'B': [False, True, False]})
5677
5707
>>> df.replace({'a string': 'new value', True: False}) # raises
5708
+ Traceback (most recent call last):
5709
+ ...
5678
5710
TypeError: Cannot compare types 'ndarray(dtype=bool)' and 'str'
5679
5711
5680
5712
This raises a ``TypeError`` because one of the ``dict`` keys is not of
5681
5713
the correct type for replacement.
5714
+
5715
+ Compare the behavior of ``s.replace({'a': None})`` and
5716
+ ``s.replace('a', None)`` to understand the peculiarities
5717
+ of the `to_replace` parameter:
5718
+
5719
+ >>> s = pd.Series([10, 'a', 'a', 'b', 'a'])
5720
+
5721
+ When a dict is used as the `to_replace` value, the value(s) in the
5722
+ dict act as the `value` parameter.
5723
+ ``s.replace({'a': None})`` is equivalent to
5724
+ ``s.replace(to_replace={'a': None}, value=None, method=None)``:
5725
+
5726
+ >>> s.replace({'a': None})
5727
+ 0 10
5728
+ 1 None
5729
+ 2 None
5730
+ 3 b
5731
+ 4 None
5732
+ dtype: object
5733
+
5734
+ When ``value=None`` and `to_replace` is a scalar, list or
5735
+ tuple, `replace` uses the method parameter (default 'pad') to do the
5736
+ replacement. This is why the 'a' values are replaced by 10
5737
+ in rows 1 and 2 and by 'b' in row 4 in this case.
5738
+ The command ``s.replace('a', None)`` is actually equivalent to
5739
+ ``s.replace(to_replace='a', value=None, method='pad')``:
5740
+
5741
+ >>> s.replace('a', None)
5742
+ 0 10
5743
+ 1 10
5744
+ 2 10
5745
+ 3 b
5746
+ 4 b
5747
+ dtype: object
5682
5748
""")
5683
5749
5684
5750
@Appender(_shared_docs['replace'] % _shared_doc_kwargs)
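
The dict and list forms described in the docstring above can be modelled in plain Python. This is only an illustrative sketch under stated assumptions, not how pandas implements `replace`: `split_mapping` and `replace_values` are hypothetical helper names, and the model covers exact matches on a flat list only (no regex, no `method`, no special handling of `None`). It shows the dict keys acting as `to_replace` and the dict values acting as `value`, and it reproduces column 'A' from two of the docstring examples.

```python
def split_mapping(mapping):
    """Split a dict into parallel (to_replace, value) lists.

    Hypothetical helper: models the note that dict keys play the role of
    `to_replace` and dict values play the role of `value`.
    """
    return list(mapping.keys()), list(mapping.values())


def replace_values(data, to_replace, value):
    """Replace each item of `data` that appears in `to_replace` with the
    item at the same position in `value`.

    Hypothetical helper, exact matches only. Each item is looked up once
    against its original value, so replacements are not chained.
    """
    if len(to_replace) != len(value):
        # Mirrors the documented rule that both lists must be the same length.
        raise ValueError("to_replace and value must be the same length")
    lookup = dict(zip(to_replace, value))
    return [lookup.get(item, item) for item in data]


# Column 'A' of the DataFrame used in the docstring examples.
column_a = [0, 1, 2, 3, 4]

# Dict form, modelled on df.replace({0: 10, 1: 100}).
keys, values = split_mapping({0: 10, 1: 100})
from_dict = replace_values(column_a, keys, values)

# Two-list form, modelled on df.replace([0, 1, 2, 3], [4, 3, 2, 1]).
from_lists = replace_values(column_a, [0, 1, 2, 3], [4, 3, 2, 1])

print(from_dict)
print(from_lists)
```

Because the lookup is made once per original item, replacements do not chain: the 3 produced from 1 is not turned into 1 afterwards, which matches the `4, 3, 2, 1, 4` column shown in the docstring example.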