BUG: DataFrame.update not operating in-place for datetime64[ns, UTC] dtype #56228

Merged
mroeschke merged 3 commits into pandas-dev:main from update-actually-inplace
Nov 29, 2023

Conversation

MarcoGorelli
Member

@MarcoGorelli MarcoGorelli commented Nov 28, 2023

@MarcoGorelli MarcoGorelli force-pushed the update-actually-inplace branch from 626efcc to 8e4b65c Compare November 28, 2023 17:50
Comment on lines 8840 to +8841
>>> df = pd.DataFrame({'A': [1, 2, 3],
-... 'B': [400, 500, 600]})
+... 'B': [400., 500., 600.]})
Member Author

Changing these to floats, because otherwise we'd get

<ipython-input-3-3ae759fa2c19>:1: FutureWarning: Downcasting behavior in Series and DataFrame methods 'where', 'mask', and 'clip' is deprecated. In a future version this will not infer object dtypes or cast all-round floats to integers. Instead call result.infer_objects(copy=False) for object inference, or cast round floats explicitly. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`

which looks correct because we're trying to update an int column df['B'] with a float one new_df['B']
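
For illustration, a minimal sketch of the docstring example after this change, with float values on both sides so that no downcasting is involved. The contents of `new_df` are an assumption here, since the thread does not show them.

```python
import pandas as pd

# Column 'B' is float from the start, matching the updated docstring.
df = pd.DataFrame({"A": [1, 2, 3], "B": [400.0, 500.0, 600.0]})

# Hypothetical replacement values; NaN means "leave the existing value alone".
new_df = pd.DataFrame({"B": [4.0, float("nan"), 6.0]})

# update() modifies df and returns None.
df.update(new_df)

b_values = df["B"].tolist()
a_values = df["A"].tolist()
```

Only the non-NaN entries of `new_df['B']` overwrite `df['B']`; column `A` is untouched because `new_df` has no such column.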

@MarcoGorelli MarcoGorelli marked this pull request as ready for review November 28, 2023 17:52
@MarcoGorelli MarcoGorelli marked this pull request as draft November 28, 2023 17:52
@MarcoGorelli
Member Author

@phofl any idea why the copy-on-write warnings CI jobs are failing here?

@MarcoGorelli MarcoGorelli added the "inplace" (Relating to inplace parameter or equivalent) and "Copy / view semantics" labels Nov 28, 2023
@@ -8876,8 +8874,8 @@ def update(
other = other.reindex(self.index)

for col in self.columns.intersection(other.columns):
-this = self[col]._values
-that = other[col]._values
+this = self[col]
Member

This change is responsible for the failures. Is it necessary?

Member Author

nice one, thanks! 🙏

@MarcoGorelli MarcoGorelli marked this pull request as ready for review November 28, 2023 20:44
@MarcoGorelli MarcoGorelli changed the title from "inplace update" to "BUG: DataFrame.update not operating in-place for datetime64[ns, UTC] dtype" Nov 28, 2023
@mroeschke mroeschke added this to the 2.2 milestone Nov 29, 2023
@@ -8897,7 +8895,7 @@ def update(
if mask.all():
continue

self.loc[:, col] = expressions.where(mask, this, that)
Member

So I suppose numexpr generally will prevent inplace operations?

Member

Operating in place here is generally not a good idea, since non-in-place methods are faster if you replace the whole column.
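
For context on the `where(mask, this, that)` line discussed above, a rough NumPy sketch of the difference between building a new array and writing into the existing one. It uses plain `np.where` as a stand-in for `expressions.where`, and the mask definition is an assumption (keep the current value wherever the incoming one is missing); none of this is the pandas implementation itself.

```python
import numpy as np

this = np.array([400.0, 500.0, 600.0])  # current column values
that = np.array([4.0, np.nan, 6.0])     # incoming values; NaN means "no update"

# Assumed mask: True where the existing value should be kept.
mask = np.isnan(that)

# where(mask, this, that) allocates a new array, so the caller has to
# assign the result back to the column afterwards.
new = np.where(mask, this, that)
allocates_new = not np.shares_memory(new, this)

# An in-place alternative writes into the existing buffer instead.
np.copyto(this, that, where=~mask)
```

Both routes end with the same values; they differ only in whether the original buffer is reused, which is the trade-off raised in the two comments above.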

@mroeschke mroeschke merged commit 0ae4dfd into pandas-dev:main Nov 29, 2023
@mroeschke
Member

Thanks @MarcoGorelli

result = DataFrame([pd.Timestamp("2019", tz="UTC")])
orig = result.copy()
view = result[:]
with tm.assert_produces_warning(
Member

Can you use assert_cow_warning here? That's optimised for these types of warnings and requires only warn_copy_on_write as input
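
The test fragment above is cut off at the warning check. As a rough sketch of the scenario it sets up, the snippet below repeats the first three lines and then applies a hypothetical update (shifting the timestamp by one day, an assumption not shown in the thread). It deliberately avoids the pandas-internal warning helpers and only inspects values.

```python
import pandas as pd

result = pd.DataFrame([pd.Timestamp("2019", tz="UTC")])
orig = result.copy()  # independent copy; should never change
view = result[:]      # may share data with result, depending on Copy-on-Write

# Hypothetical update value: the same frame shifted by one day.
result.update(result + pd.Timedelta(days=1))

updated = result.iloc[0, 0]
untouched = orig.iloc[0, 0]
# Whether view.iloc[0, 0] follows result depends on the Copy-on-Write mode,
# so it is not checked here.
```

Whether `view` reflects the new value is exactly what the Copy-on-Write warning in the real test is about, which is why this sketch leaves it unasserted.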

Labels
Copy / view semantics; inplace (Relating to inplace parameter or equivalent)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: DataFrame.update not operating in-place for datetime64[ns, UTC] dtype
3 participants