-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: handle nan values in DataFrame.update when overwrite=False (#15593) #16430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
doc/source/whatsnew/v0.20.2.txt
Outdated
@@ -37,6 +37,7 @@ Bug Fixes | |||
~~~~~~~~~ | |||
|
|||
- Bug in using ``pathlib.Path`` or ``py.path.local`` objects with io functions (:issue:`16291`) | |||
- Bug in using ``DataFrame.update`` when overwrite=False and the second value is nan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a ref to the issue ath the end here
(:issue:`15593`)
df1 = DataFrame({'A': [1, None, 3], 'B': date_range('2000', periods=3)}) | ||
df2 = DataFrame({'A': [None, 2, 3]}) | ||
df1.update(df2, overwrite=False) | ||
# it works! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you make an exected dataframe and compare?
expected = df1 = DataFrame({'A': [1.0, 2, 3], 'B': date_range('2000', periods=3)})
tm.assert_frame_equal(df1, expected)
yes, i just made the changes.
…On Mon, May 22, 2017 at 3:58 PM, Tom Augspurger ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In doc/source/whatsnew/v0.20.2.txt
<#16430 (comment)>:
> @@ -37,6 +37,7 @@ Bug Fixes
~~~~~~~~~
- Bug in using ``pathlib.Path`` or ``py.path.local`` objects with io functions (:issue:`16291`)
+- Bug in using ``DataFrame.update`` when overwrite=False and the second value is nan
Could you add a ref to the issue ath the end here
(:issue:`15593`)
------------------------------
In pandas/tests/frame/test_combine_concat.py
<#16430 (comment)>:
> @@ -763,3 +763,13 @@ def test_concat_datetime_datetime64_frame(self):
# it works!
pd.concat([df1, df2_obj])
+
+
+class TestDataFrameUpdate(TestData):
+
+ def test_update_nan(self):
+ # #15593 #15617
+ df1 = DataFrame({'A': [1, None, 3], 'B': date_range('2000', periods=3)})
+ df2 = DataFrame({'A': [None, 2, 3]})
+ df1.update(df2, overwrite=False)
+ # it works!
could you make an exected dataframe and compare?
expected = df1 = DataFrame({'A': [1.0, 2, 3], 'B': date_range('2000', periods=3)})
tm.assert_frame_equal(df1, expected)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#16430 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABSVsjx9erVVSzQBeyQlFJlbjjRXyqd3ks5r8ejbgaJpZM4Niywe>
.
|
df2 = DataFrame({'A': [None, 2, 3]}) | ||
df1.update(df2, overwrite=False) | ||
# it works! | ||
tm.assert_frame_equal(df1, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a second test with
df1 = DataFrame({'A': [1,None,3], 'B': date_range('2000', periods=3)})
df1.update(df2, overwrite=False)
just to make sure we cover the original issue? You can do it in this same test, and use the same expected
|
||
def test_update_nan(self): | ||
# #15593 #15617 | ||
expected = df1 = DataFrame({'A': [1.0, 2, 3], 'B': date_range('2000', periods=3)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do like this
df1 = DataFrame(....)
df2 = DataFrame(....)
expected = df1
result = df1.update(....)
tm.assert_frame_equal(result, expected)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think update is inplace? So expected = df1.copy()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no it returns a new object (but has inplace arg)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, you are right @TomAugspurger . Hmm that is pretty bogus it should have a kwarg for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc/source/whatsnew/v0.20.2.txt
Outdated
@@ -37,6 +37,7 @@ Bug Fixes | |||
~~~~~~~~~ | |||
|
|||
- Bug in using ``pathlib.Path`` or ``py.path.local`` objects with io functions (:issue:`16291`) | |||
- Bug in using ``DataFrame.update`` when overwrite=False and the second value is nan (:issue:`15593`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the 2nd issue as well.
Bug in DataFrame.update()
with overwrite=False
and NaN
values
@pcluo can you update with the requested changes? Thanks! |
@TomAugspurger yes, plz check. |
tm.assert_frame_equal(df1, expected) | ||
|
||
# test 2 | ||
expected = df1 = DataFrame({'A': [1, None, 3], 'B': date_range('2000', periods=3)}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I think this got lost in the conversation, but can you write this as:
df1 = DataFrame({'A': [1, None, 3], 'B': date_range('2000', periods=3)})
df2 = DataFrame({'A': [None, 2, 3]})
expected = DataFrame({'A': [1, 2, 3], 'B': date_range('2000', periods=3)})
The one on line 773 can be a bit shorter, just
df1 = DataFrame({'A': [1.0, 2, 3], 'B': date_range('2000', periods=3)})
expected = df1.copy()
basically, since .update
works inplace, the test will always be true no matter what happens. expected
and df1
are reference to the same object. It's like
In [2]: a = b = {'x': 1}
In [3]: a['y'] = 2
In [4]: a == b
Out[4]: True
Which "shouldn't" be true (but is, since a
and b
are both references to the same object). Let me know if that makes sense.
d8bde84
to
be7a230
Compare
@TomAugspurger just made another attempt at this. Sorry the last conversation got lost when i squashed commits. I changed [1, 2, 3] in test 2 to [1.0, 2, 3] otherwise the dtypes are not equal, float64 vs int64. |
can you rebase on master |
Yeah, that sounds right. If you rebase and force push, all the tests should pass.
FYI, typically you want to checkout a new branch for features / bug fixes. http://pandas.pydata.org/pandas-docs/stable/contributing.html#creating-a-branch Not a huge deal though. |
…as-dev#15593) add nan test for DataFrame.update update whatsnew v0.20.2
it seems all tests passed now. thx @TomAugspurger and will checkout a new branch next time. |
Looks like Travis isn't picking up your commits for some reason. I'll ask their support. |
thanks! |
FYI @pcluo TravisCI support thought that if you sign into travis with your github account then things should be good. |
…as-dev#15593) (pandas-dev#16430) (cherry picked from commit 85080aa)
BUG: handle nan values in DataFrame.update when overwrite=False (#15593)
add nan test for DataFrame.update
update whatsnew v0.20.2
closes handle nan values in DataFrame.update when overwrite=False #15593 #15617
git diff upstream/master --name-only -- '*.py' | flake8 --diff