BUG: DataFrame.update not operating in-place for datetime64[ns, UTC] dtype #56228

Merged
merged 3 commits into from
Nov 29, 2023
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.2.0.rst
@@ -520,7 +520,7 @@ Indexing

Missing
^^^^^^^
-
- Bug in :meth:`DataFrame.update` wasn't updating in-place for tz-aware datetime64 dtypes (:issue:`56227`)
-

MultiIndex
14 changes: 6 additions & 8 deletions pandas/core/frame.py
@@ -8838,14 +8838,14 @@ def update(
in the original dataframe.

>>> df = pd.DataFrame({'A': [1, 2, 3],
- ... 'B': [400, 500, 600]})
+ ... 'B': [400., 500., 600.]})
Comment on lines 8840 to +8841
Member Author

Changing these to floats, because otherwise we'd get:

<ipython-input-3-3ae759fa2c19>:1: FutureWarning: Downcasting behavior in Series and DataFrame methods 'where', 'mask', and 'clip' is deprecated. In a future version this will not infer object dtypes or cast all-round floats to integers. Instead call result.infer_objects(copy=False) for object inference, or cast round floats explicitly. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`

The warning looks correct, because we're trying to update an int column (df['B']) with a float one (new_df['B']).
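
For reference, a minimal sketch of the float version of the docstring example (assuming pandas and NumPy are installed; the frame contents simply mirror the docstring). Since `B` is already float64, updating it from a column that is float because it contains NaN should not need any downcasting:

```python
import numpy as np
import pandas as pd

# 'B' starts out as float64, matching the dtype of the incoming column.
df = pd.DataFrame({"A": [1, 2, 3], "B": [400.0, 500.0, 600.0]})
new_df = pd.DataFrame({"B": [4, np.nan, 6]})

# update() modifies df and returns None. The NaN in new_df leaves the
# existing value in that row (500.0) untouched.
df.update(new_df)

print(df)
print(df.dtypes)
```

Only the non-NaN entries of `new_df` overwrite values in `df`, and column `A` is left alone because `new_df` has no such column.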

>>> new_df = pd.DataFrame({'B': [4, np.nan, 6]})
>>> df.update(new_df)
>>> df
-    A    B
- 0  1    4
- 1  2  500
- 2  3    6
+    A      B
+ 0  1    4.0
+ 1  2  500.0
+ 2  3    6.0
"""
if not PYPY and using_copy_on_write():
if sys.getrefcount(self) <= REF_COUNT:
@@ -8862,8 +8862,6 @@ def update(
stacklevel=2,
)

- from pandas.core.computation import expressions

# TODO: Support other joins
if join != "left": # pragma: no cover
raise NotImplementedError("Only left join is supported")
@@ -8897,7 +8895,7 @@ def update(
if mask.all():
continue

- self.loc[:, col] = expressions.where(mask, this, that)
Member

So I suppose numexpr generally will prevent inplace operations?

Member

Being in place here is generally not a good idea, since non-in-place methods are faster if you replace the whole column.

+ self.loc[:, col] = self[col].where(mask, that)

# ----------------------------------------------------------------------
# Data reshaping
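
As a rough sketch of what the replacement line in this hunk relies on: `Series.where` returns a Series that keeps the tz-aware dtype of the original, so the values assigned back to the column already match it. The names `s`, `other`, and `mask` below are illustrative stand-ins for `self[col]`, `that`, and `mask` inside `update`; this is not the actual internal code path.

```python
import numpy as np
import pandas as pd

# A tz-aware column, standing in for self[col].
s = pd.Series(pd.to_datetime(["2019-01-01", "2019-01-02"], utc=True))

# Replacement values, standing in for `that`.
other = s + pd.Timedelta(days=1)

# True means "keep the existing value", False means "take it from other".
mask = np.array([True, False])

result = s.where(mask, other)

print(result)
print(result.dtype)  # still a tz-aware datetime dtype
```

The first row keeps its original timestamp and the second takes the shifted one, with the UTC timezone preserved on the result.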
16 changes: 16 additions & 0 deletions pandas/tests/frame/methods/test_update.py
@@ -140,6 +140,22 @@ def test_update_datetime_tz(self):
expected = DataFrame([pd.Timestamp("2019", tz="UTC")])
tm.assert_frame_equal(result, expected)

def test_update_datetime_tz_in_place(self, using_copy_on_write, warn_copy_on_write):
# https://github.com/pandas-dev/pandas/issues/56227
result = DataFrame([pd.Timestamp("2019", tz="UTC")])
orig = result.copy()
view = result[:]
with tm.assert_produces_warning(
Member

Can you use assert_cow_warning here? That's optimised for these types of warnings and requires only warn_copy_on_write as input

FutureWarning if warn_copy_on_write else None, match="Setting a value"
):
result.update(result + pd.Timedelta(days=1))
expected = DataFrame([pd.Timestamp("2019-01-02", tz="UTC")])
tm.assert_frame_equal(result, expected)
if not using_copy_on_write:
tm.assert_frame_equal(view, expected)
else:
tm.assert_frame_equal(view, orig)

def test_update_with_different_dtype(self, using_copy_on_write):
# GH#3217
df = DataFrame({"a": [1, 3], "b": [np.nan, 2]})
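
The scenario exercised by `test_update_datetime_tz_in_place` can be tried outside the test suite with public pandas API only. This is a sketch, not the test itself: it skips the warning check, and because whether `view` follows the update depends on the pandas version and on Copy-on-Write, that part is only printed rather than asserted.

```python
import pandas as pd

result = pd.DataFrame([pd.Timestamp("2019", tz="UTC")])
orig = result.copy()
view = result[:]  # may or may not share data with `result`

# Shift every value by one day and write it back with update().
result.update(result + pd.Timedelta(days=1))

# The frame itself now holds the shifted timestamp.
print(result.iloc[0, 0])

# Report whether the earlier view observed the change.
print("view changed:", view.iloc[0, 0] != orig.iloc[0, 0])
```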