
DOC: Series.update throws a FutureWarning about df[col] = df[col].method but .update returns None and works inplace #59788

Open
1 task done
spawn-guy opened this issue Sep 13, 2024 · 9 comments
Labels
Copy / view semantics; Needs Discussion (Requires discussion from core team before further action); Warnings (Warnings that appear or should be added to pandas)

Comments

@spawn-guy

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/dev/reference/api/pandas.Series.update.html#pandas.Series.update

Documentation problem

df.update resembles how Python's dict.update works, but df.update doesn't support CoW (Copy-on-Write)

Suggested fix for documentation

remove FutureWarning for the df.update

or create a (for example) df.coalesce method that will, actually, return something. this shouldn't break existing code

@spawn-guy spawn-guy added Docs Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 13, 2024
@rhshadrach
Member

df.update doesn't support CoW

Thanks for the report. Can you provide a reproducible example of how CoW is not supported?

@rhshadrach rhshadrach added Copy / view semantics Needs Info Clarification about behavior needed to assess issue and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 14, 2024
@spawn-guy
Author

@rhshadrach here is some code and log

# select best source: heading
# HeadingTrue > HeadingMagnetic > HeadingAndDeclination (this is also magnetic) > TrackMadeGood
measurements_df["heading"] = measurements_df["gps_course_over_ground"]
# replace if other value is not nan
measurements_df["heading"].update(measurements_df["gps_heading"])

FutureWarning

_task.py:427: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.


  measurements_df["heading"].update(measurements_df["gps_heading"])

Inconsistency with the warning:

  • update is always "inplace=True"
  • there is no df[col] = df[col].update(value) that actually "returns" something
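The two points above can be checked on a standalone Series. This is a minimal sketch with made-up values (the names `target` and `other` are illustrative, not from the report): `Series.update` mutates the caller and returns `None`, so there is nothing to assign back.

```python
import numpy as np
import pandas as pd

target = pd.Series([np.nan, 2.0, np.nan])
other = pd.Series([1.0, np.nan, 3.0])

# update() overwrites `target` with the non-NaN values of `other`,
# aligned on the index, and returns nothing.
result = target.update(other)

print(result)           # None
print(target.tolist())  # [1.0, 2.0, 3.0]
```

Because the return value is `None`, the `df[col] = df[col].method(value)` form suggested by the warning text cannot apply to `update`.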

@rhshadrach
Member

Thanks @spawn-guy, however your example is not reproducible because you did not provide measurements_df. Can you provide a reproducible example?

@spawn-guy
Author

spawn-guy commented Nov 26, 2024

@rhshadrach it took me some time to pick this up, but here is a small test. at first i thought it might be related to the mask that i use, but the FutureWarning is thrown without it as well

import numpy as np
import pandas as pd

# test pandas warnings
df = pd.DataFrame(
    {
        "A": [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
        "B": [1, 1, 1, 1, 1, 1],
        "C": [np.nan, 5, 6, np.nan, np.nan, np.nan],
        "D": [0, 0, 2, 2, 0, 0],
    }
)

# with mask
# df = df[df["D"] > 0]

df["E"] = df["A"]
df["E"].update(df["B"])
# df["E"].update(df["C"])
print(df)

results in

cli_python_311_upgrade_test.py:209: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.


  df["E"].update(df["B"])

    A  B    C  D    E
0 NaN  1  NaN  0  1.0
1 NaN  1  5.0  0  1.0
2 NaN  1  6.0  2  1.0
3 NaN  1  NaN  2  1.0
4 NaN  1  NaN  0  1.0
5 NaN  1  NaN  0  1.0

the FutureWarning is thrown after df["E"].update(df["B"])

so, in current implementation, i don't see a way to fix this FutureWarning for the reasons mentioned above

@spawn-guy
Author

spawn-guy commented Nov 26, 2024

and if i do as the warning suggests - it will be a mistake

import numpy as np
import pandas as pd

# test pandas warnings
df = pd.DataFrame(
    {
        "A": [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
        "B": [1, 1, 1, 1, 1, 1],
        "C": [np.nan, 5, 6, np.nan, np.nan, np.nan],
        "D": [0, 0, 2, 2, 0, 0],
    }
)

# with mask
# df = df[df["D"] > 0]

df["E"] = df["A"]
df["E"].update(df["B"])
# df["E"].update(df["C"])
print(df)

df["E"] = df["A"]
df["E"] = df["E"].update(df["C"])
print(df)

output

cli_python_311_upgrade_test.py:209: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.


  df["E"].update(df["B"])
    A  B    C  D    E
0 NaN  1  NaN  0  1.0
1 NaN  1  5.0  0  1.0
2 NaN  1  6.0  2  1.0
3 NaN  1  NaN  2  1.0
4 NaN  1  NaN  0  1.0
5 NaN  1  NaN  0  1.0

    A  B    C  D     E
0 NaN  1  NaN  0  None
1 NaN  1  5.0  0  None
2 NaN  1  6.0  2  None
3 NaN  1  NaN  2  None
4 NaN  1  NaN  0  None
5 NaN  1  NaN  0  None
cli_python_311_upgrade_test.py:214: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.


  df["E"] = df["E"].update(df["C"])

notice the all-None column E

@rhshadrach
Member

rhshadrach commented Dec 2, 2024

Thanks for the example. To do this operation, you'd need to have something like:

ser = df["E"]
ser.update(df["C"])
df["E"] = ser
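A runnable sketch of that pattern, reusing the frame from the earlier example trimmed to the two columns involved. It only assumes pandas and NumPy; as I understand the suggestion, the point is that `update` is called on a named Series rather than on the temporary object produced by `df["E"]`, and the result is then written back explicitly.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "A": [np.nan] * 6,
        "C": [np.nan, 5, 6, np.nan, np.nan, np.nan],
    }
)

df["E"] = df["A"]

# Bind the column to a name first, so update() is not called on a
# temporary object produced by chained indexing.
ser = df["E"]
ser.update(df["C"])  # works in place on `ser`; returns None
df["E"] = ser        # write the result back explicitly

print(df["E"].tolist())  # [nan, 5.0, 6.0, nan, nan, nan]
```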

or create a (for example) df.coalesce method that will, actually, return something. this shouldn't break existing code

I think you're looking for combine_first
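For illustration, here is a small sketch of how `combine_first` could stand in for the `update` call. The data is made up; the column names `A`, `C`, and `E` mirror the example in this thread. Note the reversed roles: `s.update(other)` lets `other` win wherever it is not NaN, so the equivalent expression puts `other` on the left. This equivalence assumes both Series share the same index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "A": [1.0, 2.0, 3.0],
        "C": [np.nan, 5.0, np.nan],
    }
)

# Take C where it has a value, otherwise fall back to A.
# combine_first returns a new Series, so plain assignment works.
df["E"] = df["C"].combine_first(df["A"])

print(df["E"].tolist())  # [1.0, 5.0, 3.0]
```

Since the result is returned rather than applied in place, no chained in-place call is involved.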

@jorisvandenbossche - thoughts on changing the warning message here? It would likely need to go into 2.3, yea?

@rhshadrach rhshadrach added Needs Discussion Requires discussion from core team before further action Warnings Warnings that appear or should be added to pandas and removed Needs Info Clarification about behavior needed to assess issue Docs labels Dec 2, 2024
@jmarshall9120

jmarshall9120 commented Apr 22, 2025

@rhshadrach - I want to take a sec to throw my support behind OP on this one. I just got this warning. I don't use df.update much but seeing the warning I'm very confused about its future use. I was using the syntax:

df[col].update(df[col].transform(myTransformFunc))

I'm really not sure how the API being any more difficult than this would add anything. At the very least, add a return value from df.update so it's consistent with other methods, or remove the warning on this one. As it is, it's completely unclear to me how to make this warning go away.

EDIT
After working on this for a little bit, I don't believe there is a way to use df.update without triggering the warning. It's a bug.

@rhshadrach
Member

I'm really not sure how the API being any more difficult than this would add anything.

I believe it is not achievable with Copy-on-Write.

At the very least, add a return value from df.update so it's consistent with other methods, or remove the warning on this one.

I'd be positive on this, and it is somewhat related to #51466, but I'm not sure whether the case of update was explicitly considered there, since it doesn't have an inplace argument but just always acts inplace. I've asked the core devs about this in that PR.

After working on this for a little bit, I don't believe there is a way to use df.update without triggering the warning. It's a bug.

Please provide a reproducible example. I am not seeing any warning on 2.2.x using the first example from the docs.

@Dr-Irv
Contributor

Dr-Irv commented Apr 23, 2025

There's a long discussion about DataFrame.update() and inplace in issue #21858

There was a related PR #22286 that was eventually closed because at the time, there wasn't agreement on how to proceed.

Note: I came across this because of some code my staff was writing that put update() in a method chain, and it didn't work.
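To make the method-chain problem concrete: because `DataFrame.update` returns `None`, anything chained after it fails. One possible workaround, sketched below, is a small wrapper used with `DataFrame.pipe`. The helper name `updated` and the sample data are hypothetical and not part of pandas; this is one way to do it, not a recommendation from the maintainers.

```python
import numpy as np
import pandas as pd


def updated(frame: pd.DataFrame, other: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy of `frame` updated from `other`."""
    out = frame.copy()
    out.update(other)  # in place on the copy; returns None
    return out


df = pd.DataFrame({"a": [1.0, np.nan], "b": [3.0, 4.0]})
other = pd.DataFrame({"a": [np.nan, 20.0]})

# update() can now take part in a chain through pipe().
result = df.pipe(updated, other).assign(c=lambda d: d["a"] + d["b"])

print(result["a"].tolist())  # [1.0, 20.0]
print(result["c"].tolist())  # [4.0, 24.0]
```

The original `df` is left untouched because the helper works on a copy.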


4 participants