QST: How to solve pandas (2.2.0) "FutureWarning: Downcasting behavior in `replace` is deprecated" on a Series? #57734

buhtz · 2024-03-05T14:17:44Z

Research

I have searched the [pandas] tag on StackOverflow for similar questions.
I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

https://stackoverflow.com/q/77995105/4865723

Question about pandas

Hello,

please accept my apologies for asking this way. My StackOverflow
question [1] was closed for, IMHO, no good reason. The linked duplicates
do not help me [2]. I also asked on the pydata mailing list [3] without getting a response.

The example code below gives me this warning using pandas 2.2.0:

FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version.
To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior,
set `pd.set_option('future.no_silent_downcasting', True)`
  s = s.replace(replace_dict)

I found several postings about this future warning. But my problem is I
don't understand why it happens and I also don't know how to solve it.

#!/usr/bin/python3
from pandas import Series
s = Series(['foo', 'bar'])
replace_dict = {'foo': 2, 'bar': 4}
s = s.replace(replace_dict)

I am aware of other questions and answers [2] but I don't know how to
apply them to my own code. The reason might be that I do not understand
the cause of the warning.

The linked answers use astype() before the replacement. But again: I
don't know how this could solve my problem.

Thanks in advance
Christian

[1] -- https://stackoverflow.com/q/77995105/4865723
[2] -- https://stackoverflow.com/q/77900971/4865723
[3] -- https://groups.google.com/g/pydata/c/yWbl4zKEqSE


phofl · 2024-03-06T21:55:47Z

Just do

pandas.set_option("future.no_silent_downcasting", True)

as suggested on the stack overflow question

The series will retain object dtype in pandas 3.0 instead of casting to int64

buhtz · 2024-03-07T07:53:25Z

pandas.set_option("future.no_silent_downcasting", True)

But doesn't this just deactivate the message without modifying the behavior?

To my understanding the behavior is the problem and needs to be solved. Or not?
My intention is to extinguish the fire, not to turn off the fire alarm and let the house burn down.

jerome-white · 2024-03-13T11:55:36Z

I'm having this problem as well. I have the feeling it's related to .replace changing the types of the values (as one Stack Overflow commenter implied). Altering the original example slightly:

s = Series(['foo', 'bar'])
replace_dict = {'foo': '1', 'bar': '2'} # replacements maintain original types
s = s.replace(replace_dict)

makes the warning go away.

I agree with @buhtz in that setting the "future" option isn't really getting at the root of understanding how to make this right. I think the hard part for most of us who have relied on .replace is that we never thought of it as doing any casting -- it was replacing. Now the semantics seem to have changed. It'd be great to reopen this issue to clarify the thinking, intention, and direction so that we can come up with appropriate work-arounds.

phofl · 2024-03-13T11:57:45Z

we never thought of it as doing any casting

This is exactly the thing we are trying to solve. replace was previously casting your dtypes and will stop doing so in pandas 3

buhtz · 2024-03-13T12:02:17Z

This is exactly the thing we are trying to solve. replace was previously casting your dtypes and will stop doing so in pandas 3

But it is unclear how to replace and cast, e.g. when I have the integers [0, 1] that stand for female and male.

df.gender = df.gender.astype(str)
df.gender = df.gender.replace({'0': 'male', '1': 'female'})

Is that the solution you have in mind? From a user's perspective it is a smelly workaround.

The other way around is nearly impossible because I cannot cast a str word to an integer.

print(df.gender)  # ['male', 'male', 'female']
df.gender = df.gender.astype(int)  # <-- ERROR
df.gender = df.gender.replace({'male': 0, 'female': 1})

What is wrong with casting in replace() ?

jerome-white · 2024-03-13T13:48:07Z

The other way around is nearly impossible because I cannot cast a str word to an integer.

One alternative (although I realise a non .replace supported "alternative" may not be what was actually desired) is to use categoricals with .assign:

import pandas as pd

df = pd.DataFrame(['male', 'male', 'female'], columns=['gender']) # from the original example
genders = pd.Categorical(df['gender'])
df = df.assign(gender=genders.codes)

If semantically similar data is spread across multiple columns, it gets a little more involved:

import random
import numpy as np
import pandas as pd

def create_data(columns):
    genders = ['male', 'male', 'female']
    for i in columns:
        yield (i, genders.copy())
        random.shuffle(genders)

# Create the dataframe
columns = [ f'gender_{x}' for x in range(3) ]
df = pd.DataFrame(dict(create_data(columns)))

# Incorporate all relevant data into the categorical
view = (df
        .filter(items=columns)
        .unstack())
categories = pd.Categorical(view)
values = np.hsplit(categories.codes, len(columns))
to_replace = dict(zip(columns, values))

df = df.assign(**to_replace)

which I think is what the Categorical documentation is trying to imply.
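
For the single-column case, note that `pd.Categorical` assigns codes in sorted category order unless the categories are listed explicitly. The sketch below passes the categories explicitly so that 'male' maps to 0 and 'female' to 1; the helper name `label_to_code` is only illustrative and not part of the comment above.

```python
import pandas as pd

df = pd.DataFrame(['male', 'male', 'female'], columns=['gender'])

# Listing the categories fixes the code of each label:
# position 0 -> 'male', position 1 -> 'female'.
genders = pd.Categorical(df['gender'], categories=['male', 'female'])
df = df.assign(gender=genders.codes)

# The label-to-code mapping can be read back from the categorical.
label_to_code = {label: code for code, label in enumerate(genders.categories)}
print(label_to_code)
```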


caballerofelipe · 2024-05-07T17:46:24Z

I got here, trying to understand what pd.set_option('future.no_silent_downcasting', True) does.

The message I get is from .fillna(), which is the same message for .ffill() and .bfill(). So I'm posting this here in case someone is looking for the same answer using the mentioned functions. This is the warning message I get:

FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version.
Call result.infer_objects(copy=False) instead.
To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`

I think the confusion arises from the way the message is phrased. It creates more questions than answers:

Do I need to do some downcasting?
Am I doing some downcasting somewhere without being aware of it?
When the message says "Call result.infer_objects(copy=False) instead.", is it telling me to call it before the function I'm trying to use, or after? Is it telling me not to use the function? (I guess not, since infer_objects should do something different than replace or one of the fill functions.)
By using pd.set_option('future.no_silent_downcasting', True), am I removing the downcasting or am I making the downcasting not silent? Maybe both?

From what I understand, pd.set_option('future.no_silent_downcasting', True) removes the downcasting the functions do and if it needs to do some downcasting an error would be raised, but I would need to be corrected here if I'm wrong.

caballerofelipe · 2024-05-07T23:52:06Z

So... I did some digging and I think I have a better grasp of what's going on with this FutureWarning. I wrote an article on Medium to explain what's happening. If you want to give it a read, here it is:

Deciphering the cryptic FutureWarning for .fillna in Pandas 2

Long story short, do:

with pd.option_context('future.no_silent_downcasting', True):
    # Do your thing with fillna, ffill, bfill, replace... and possibly use infer_objects if needed


jerome-white · 2024-05-15T06:58:23Z

I feel like this thread is starting to become a resource. In that spirit:

I just experienced another case where .replace would have been amazing, but I now need an alternative: a column of strings that are meant to be floats, where the only "offending" values are empty strings (meant to be NaN's). Consider:

records = [
    {'a': ''},
    {'a': 12.3},
]
df = pd.DataFrame.from_records(records)

I would have first reached for .replace. Now I consider .fillna, but that doesn't work either. Using .assign with pd.to_numeric does the trick:

In [1]: df.dtypes
Out[1]: 
a    object
dtype: object

In [2]: x = df.assign(a=lambda x: pd.to_numeric(x['a']))

In [3]: x
Out[3]: 
      a
0   NaN
1  12.3

In [4]: x.dtypes
Out[4]: 
a    float64
dtype: object
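
If the column may also contain other non-numeric strings, `pd.to_numeric` accepts `errors='coerce'`, which turns anything it cannot parse into NaN instead of raising. A small sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical column mixing an empty string, a numeric string and junk.
s = pd.Series(['', '12.3', 'n/a'])

# errors='coerce' converts unparseable entries to NaN instead of raising.
numbers = pd.to_numeric(s, errors='coerce')

print(numbers)
```

The trade-off is that genuinely malformed input is silently turned into missing values, so this is best used when that is the intended outcome.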

caballerofelipe · 2024-05-20T16:08:23Z

From your code:

x = df.assign(a=lambda x: pd.to_numeric(x['a']))

I would do it like this, it feels a little cleaner and easier to read:

df['a'] = pd.to_numeric(df['a'])

You said you wanted to use replace; if so, you can do this:

with pd.option_context('future.no_silent_downcasting', True):
    df2 = (df
           .replace('', float('nan')) # Replace empty strings with NaN
           .infer_objects()           # Allow pandas to try to "infer better dtypes"
           )

df2.dtypes

# a    float64
# dtype: object

A note about

Now I consider .fillna, but that doesn't work either.

That would not work because .fillna fills NA values, but '' (an empty string) is not NA (see Filling missing data).

Data-Salad · 2024-07-06T00:45:25Z

Explicitly do the conversion in two steps and the future warning will go away.

In the first step, do the replace with the numbers as strings to match the original dtype:
replace_dict = {'foo': '2', 'bar': '4'}

In the second step, convert the dtype to int:
s = s.replace(replace_dict).astype(int)

This will run without the warning even when you have not suppressed warnings
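
One way to check this claim is to record the warnings raised while running the two-step version. The sketch below uses the standard `warnings` module; the name `future_warnings` is only illustrative.

```python
import warnings

import pandas as pd

s = pd.Series(['foo', 'bar'])
replace_dict = {'foo': '2', 'bar': '4'}  # str -> str, so no dtype change in replace

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')  # make sure nothing is filtered out
    result = s.replace(replace_dict).astype(int)

# Keep only FutureWarning entries, the category used by the downcasting message.
future_warnings = [w for w in caught if issubclass(w.category, FutureWarning)]

print(result.tolist())
print(len(future_warnings))
```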

daviewales · 2024-08-08T02:37:52Z

I got this because I was trying to filter a dataframe using the output from Series.str.isnumeric().
My dataframe contained NA values, so the resulting mask contained NA values.
Normally I use fillna(False) to get rid of these.

What I would normally do:

df = pd.DataFrame({'A': ['1', '2', 'test', pd.NA]})
mask = df['A'].str.isnumeric().fillna(False)

What I need to do now:

df = pd.DataFrame({'A': ['1', '2', 'test', pd.NA]})
with pd.option_context('future.no_silent_downcasting', True):
    mask = df['A'].str.isnumeric().fillna(False)

The mask still seems to work without casting it to boolean.

See the official deprecation notice in the release notes.

Note that if you don't mind either way, the original code still works, and will silently downcast the dtype (with a warning) until Pandas 3.0, then will switch to preserve the dtype after Pandas 3.0.
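
A possible variant that was not proposed in the thread, and that does not rely on the option: convert the mask to the nullable `boolean` dtype before filling, then cast to a plain bool mask. This is only a sketch under the assumption that an explicit cast is acceptable here.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', 'test', pd.NA]})

# Convert the mask to the nullable 'boolean' dtype first, then fill the
# missing entries and cast to a plain NumPy bool mask.
mask = (df['A'].str.isnumeric()
        .astype('boolean')
        .fillna(False)
        .astype(bool))

numeric_rows = df[mask]
print(numeric_rows)
```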

albertbuchard · 2024-10-29T17:17:32Z

It would be great if we could stop breaking changes.

Arnaudno · 2024-11-01T13:47:49Z

Hello Folks,
Thanks for your helpful previous answers.
Especially

explicitly do the conversion in two steps and the future warning will go away.

In the first step, do the replace with the numbers as strings to match the original dtype
replace_dict = {'foo': '2', 'bar': '4'}

in the second step, convert the dtype to int
s = s.replace(replace_dict).astype(int)

This will run without the warning even when you have not suppressed warnings

This works well when going from string to int, but I struggle to go from string to bool:

df=pd.DataFrame({'a':['','X','','X','X']})

I want to replace '' with False & 'X' with True

Trying to go from string to bool directly

d_string = {'': 'False', 'X': 'True'}
s = df['a'].replace(d_string)
print(s)
0    False
1     True
2    False
3     True
4     True
Name: a, dtype: object

print(s.astype('bool')) # this doesn't work
0    True
1    True
2    True
3    True
4    True
Name: a, dtype: bool

Going from string to int to bool works, but isn't there a better solution? I must be missing something obvious, right?

d_int = {'':'0','X':'1'}
s = df['a'].replace(d_int)
print (s)
0    0
1    1
2    0
3    1
4    1
Name: a, dtype: object

print(s.astype('int'))
0    0
1    1
2    0
3    1
4    1
Name: a, dtype: int64

print(s.astype('int').astype('bool')) # --> this is the only solution I found using replace without triggering the downcasting warning
0    False
1     True
2    False
3     True
4     True
Name: a, dtype: bool

Data-Salad · 2024-11-01T18:06:49Z

Arnaudno,

You're doing more work than you need to. The boolean value of all strings except for empty strings is true (empty strings have a boolean value of false). So in your case, you don't need the replace at all. All you need is to convert the string to boolean and you will get the result you want.

df=pd.DataFrame({'a':['','X','','X','X']})
s = df['a'].astype('bool')
print(s)

and you will get

0    False
1     True
2    False
3     True
4     True
Name: a, dtype: bool
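
An alternative that does not depend on string truthiness, offered here only as a sketch and not taken from the thread, is to compare against the marker value directly. This states the intended mapping ('X' is True, everything else is False) in the code itself.

```python
import pandas as pd

df = pd.DataFrame({'a': ['', 'X', '', 'X', 'X']})

# Element-wise comparison: True where the value equals 'X', False otherwise.
s = df['a'].eq('X')

print(s)
```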

Arnaudno · 2024-11-02T12:49:53Z

Thank you very much @Data-Salad .

herzphi · 2024-12-21T10:07:19Z

Instead of .replace you can also use .map
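
A minimal sketch of the `.map` approach, applied to the gender example from earlier in the thread. The column name `gender_code` and the dictionary name `codes` are illustrative. One difference from `.replace` worth knowing: values missing from the dictionary become NaN instead of being left unchanged.

```python
import pandas as pd

df = pd.DataFrame({'gender': ['male', 'male', 'female']})
codes = {'male': 0, 'female': 1}

# map() builds a new Series from the dictionary lookups.
df['gender_code'] = df['gender'].map(codes)

# A value that is not a key in the dictionary maps to NaN.
unmapped = pd.Series(['male', 'other']).map(codes)

print(df)
print(unmapped)
```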

ghost · 2024-12-21T16:42:43Z

What is all this about? They say they offer support and help, but all they produce is documents and formulas, with no introduction or explanation. And then, in the end, the information that had just been completed fell apart and was changed once again. My three phones are now no more useful than something from the 1980s. My personal information can no longer be verified, not even my phone number; at the moment I cannot even prove that it is mine. They say they don't believe me and will not cooperate. So does anyone know what I have been through over the past month???

satk0 · 2024-12-27T14:21:13Z

To sum up, to resolve this warning, the replacement values should have the same type as the original values; the result can then be cast to e.g. int, like this:

replace_dict = {'foo': '2', 'bar': '4'}  # keys are str and so must be values
s = s.replace(replace_dict).astype(int)  # cast values to int

DrNickBailey · 2025-01-06T11:36:38Z

This solution is not perfect for those of us who have empty rows in the column they are replacing (I have a column of occasional text labels which I was converting to ints for plotting purposes).

In this case .astype(int) can't cope with empty or nan, so .astype(float) is required.
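
A short sketch of that situation, with hypothetical labels and a gap in the column. The final cast to the nullable `Int64` dtype is an optional extra that is not mentioned in the comment above; it keeps integer values while still allowing missing entries.

```python
import numpy as np
import pandas as pd

# Hypothetical labels with one missing entry.
s = pd.Series(['low', np.nan, 'high'])
as_strings = s.replace({'low': '1', 'high': '2'})  # str -> str replacement

# NaN is a valid float, so this cast succeeds.
as_float = as_strings.astype(float)

# Optional: nullable integers, with <NA> where the value was missing.
as_nullable_int = as_float.astype('Int64')

print(as_float)
print(as_nullable_int)
```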

SamLovesHoneyWater · 2025-01-07T01:23:17Z

In my case, I had to convert strings to either bools or nan depending on the value of the string.

Instead of suppressing the warning resulting from using replace(), I think it is generally better to use map() when you need to both change dtypes and deal with null values. Given:

print(df['has_attribute'])
Name
A        Yes
B         No
C    Unknown
Name: has_attribute, Length: 3, dtype: object

I do:

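# Map only this column; DataFrame.map would transform every column of df.
df['has_attribute'] = df['has_attribute'].map(lambda x: True if x == 'Yes' else (False if x == 'No' else np.nan))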
print(df['has_attribute'])

Which, without any warnings, results in:

Name
A     True
B    False
C      NaN
Name: has_attribute, Length: 3, dtype: object
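
The result above still has object dtype because it mixes booleans and NaN. If a proper boolean column with missing values is wanted, one option (an addition here, not part of the comment) is a dictionary-based `map` followed by a cast to the nullable `boolean` dtype. The sample Series mirrors the data shown above.

```python
import pandas as pd

s = pd.Series(['Yes', 'No', 'Unknown'],
              index=['A', 'B', 'C'], name='has_attribute')

# Keys missing from the dictionary ('Unknown') become NaN.
mapped = s.map({'Yes': True, 'No': False})

# The nullable boolean dtype stores True, False and <NA>.
nullable = mapped.astype('boolean')

print(nullable)
```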

buhtz added the Needs Triage and Usage Question labels Mar 5, 2024

phofl closed this as completed Mar 6, 2024

jerome-white added a commit to jerome-white/language-model-bda that referenced this issue Mar 14, 2024

Updates for DataFrame.replace deprecation

39d5831

See pandas-dev/pandas#57734

caballerofelipe mentioned this issue May 9, 2024

BUG: convert_dtypes() doesn't convert after a previous conversion was done #58543

Closed

3 tasks

druzsan mentioned this issue Aug 13, 2024

Update deprecated code druzsan/justetf-scraping#12

Closed

agmoore4 mentioned this issue Sep 11, 2024

FutureWarning when downcasting with pandas replace() function in AIT tutorial sandialabs/pvOps#99

Closed

WillAyd mentioned this issue Sep 24, 2024

ENH: Restore the functionality of .fillna #59831

Open

3 tasks

jorisvandenbossche added the Docs label and removed the Needs Triage label Nov 5, 2024

This was referenced Nov 20, 2024

Pandas 3.0 future warning on downcasting spam pcdshub/engineering_tools#224

Closed

grep_more_ioc fix pandas silent downcasting spam pcdshub/engineering_tools#225

Merged

n-poulsen mentioned this issue Mar 24, 2025

DeepLabCut: Fix pandas FutureWarning in evaluate_multianimal.py (inplace=True in .replace()) DeepLabCut/DeepLabCut#2927

Merged
