BUG: Fixing DataFrame.Update crashes when NaT present #49395
@@ -166,3 +166,37 @@ def test_update_modify_view(self, using_copy_on_write):
            tm.assert_frame_equal(result_view, df2_orig)
        else:
            tm.assert_frame_equal(result_view, expected)

    def test_update_dt_column_with_NaT_create_column(self):
MarcoGorelli marked this conversation as resolved.

        df = DataFrame(
            {
                "A": [1, None],
                "B": [
                    pd.NaT,
                    pd.to_datetime("2016-01-01"),
                ],
Review comment: can we keep this on a single line?
            }
        )
        df2 = DataFrame({"A": [2, 3]})

Review comment: let's remove all these newlines in the tests
        df.update(df2, overwrite=False)

        expected = DataFrame(
            {"A": [1.0, 3.0], "B": [pd.NaT, pd.to_datetime("2016-01-01")]}
        )

        tm.assert_frame_equal(df, expected)

    def test_update_dt_column_with_NaT_create_row(self):
Review comment: not really sure what this test adds, I'd suggest to either:
        df = DataFrame({"A": [1, None], "B": [pd.to_datetime("2017-1-1"), pd.NaT]})

        df2 = DataFrame({"A": [2], "B": [pd.to_datetime("2016-01-01")]})

        df.update(df2, overwrite=False)

        expected = DataFrame(
            {"A": [1, None], "B": [pd.to_datetime("2017-1-1"), pd.NaT]}
        )

        tm.assert_frame_equal(df, expected)
The original issue suggests that this may be a problem with the reindex_like operation being incompatible with something, so I don't think a patched-over check like this is appropriate.
Hi! Thanks for reviewing!
This was my assessment of it in the original issue: #16713 (comment)
Based on that, I think the issue in reindex_like is just that the columns it creates, which aren't in the pre-reindex DataFrame, aren't datetime-compatible by default. Is that the part I should be looking to change instead?
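To illustrate the point about reindex_like, here is a minimal sketch that reuses the two frames from the first test in the diff. It is not taken from the PR and only inspects dtypes. The exact dtype of the filled-in column may vary by pandas version, so the comments describe it loosely.

```python
import pandas as pd

# Frames copied from test_update_dt_column_with_NaT_create_column.
df = pd.DataFrame({"A": [1, None], "B": [pd.NaT, pd.to_datetime("2016-01-01")]})
df2 = pd.DataFrame({"A": [2, 3]})

# reindex_like aligns df2 to df's index and columns. df2 has no "B",
# so a placeholder "B" column is created and filled with missing values.
aligned = df2.reindex_like(df)

# Compare dtypes: df["B"] is datetime-typed, while the placeholder
# column in `aligned` is not inferred as datetime.
print(df.dtypes)
print(aligned.dtypes)
```

If the placeholder column is not datetime-typed, combining it element-wise with the original datetime column is one place where a type mismatch could surface.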
Ideally yes, or possibly when the masks are compared.
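One way to picture a mask-based approach is to restrict the update to columns that both frames share, so no placeholder column is ever created. The helper below, update_shared_columns, is hypothetical and written only for illustration. It is not pandas' implementation of DataFrame.update, and it is not the change made in this PR.

```python
import pandas as pd


def update_shared_columns(df, other, overwrite=True):
    """Hypothetical sketch: fill values from `other` into a copy of `df`,
    touching only the columns present in both frames."""
    result = df.copy()
    # Align rows only; columns missing from `other` are left alone.
    other = other.reindex(index=df.index)
    for col in df.columns.intersection(other.columns):
        this, that = result[col], other[col]
        # keep[i] is True where the existing value should be preserved.
        keep = that.isna() if overwrite else this.notna()
        result[col] = this.where(keep, that)
    return result


df = pd.DataFrame({"A": [1, None], "B": [pd.NaT, pd.to_datetime("2016-01-01")]})
df2 = pd.DataFrame({"A": [2, 3]})
out = update_shared_columns(df, df2, overwrite=False)
print(out)
```

Because "B" is never paired with a placeholder column here, its datetime dtype and its NaT entry are left as they were. Unlike DataFrame.update, this sketch returns a new frame instead of modifying df in place.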
Will look into this more, thanks for the guidance!