-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: DataFrame.update not operating in-place for datetime64[ns, UTC] dtype #56228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: DataFrame.update not operating in-place for datetime64[ns, UTC] dtype #56228
Conversation
626efcc
to
8e4b65c
Compare
>>> df = pd.DataFrame({'A': [1, 2, 3], | ||
... 'B': [400, 500, 600]}) | ||
... 'B': [400., 500., 600.]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing these to floats, because otherwise we'd get
<ipython-input-3-3ae759fa2c19>:1: FutureWarning: Downcasting behavior in Series and DataFrame methods 'where', 'mask', and 'clip' is deprecated. In a future version this will not infer object dtypes or cast all-round floats to integers. Instead call result.infer_objects(copy=False) for object inference, or cast round floats explicitly. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
which looks correct because we're trying to update an int column df['B']
with a float one new_df['B']
@phofl any idea what the copy-on-write warnings CI jobs are failing here? |
pandas/core/frame.py
Outdated
@@ -8876,8 +8874,8 @@ def update( | |||
other = other.reindex(self.index) | |||
|
|||
for col in self.columns.intersection(other.columns): | |||
this = self[col]._values | |||
that = other[col]._values | |||
this = self[col] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is responsible, is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice one, thanks! 🙏
@@ -8897,7 +8895,7 @@ def update( | |||
if mask.all(): | |||
continue | |||
|
|||
self.loc[:, col] = expressions.where(mask, this, that) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I suppose numexpr generally will prevent inplace operations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being inplace here is generally not a good idea since non inplace methods are faster if you replace the whole column
Thanks @MarcoGorelli |
result = DataFrame([pd.Timestamp("2019", tz="UTC")]) | ||
orig = result.copy() | ||
view = result[:] | ||
with tm.assert_produces_warning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use assert_cow_warning
here? That's optimised for these types of warnings and requires only warn_copy_on_write
as input
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.