-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Fix bug in loc setitem changing the dtype when condition is False #37672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 16 commits
9df9c89
6e0ec24
26b7dae
11d27b1
a527342
722cf9d
345b3c1
8c57c09
3b92d31
97203de
2f7bf13
eb3b204
9d59de4
b02e629
b27c825
8edc7d0
e57e407
6506df5
8cdef72
c14d7b2
01d7ffc
da94459
85120c9
bae602e
824c927
6871ae9
56f4f76
d334c7a
f12ef06
131d168
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -339,3 +339,17 @@ def test_setitem_boolean_mask(self, mask_type, float_frame): | |
expected = df.copy() | ||
expected.values[np.array(mask)] = np.nan | ||
tm.assert_frame_equal(result, expected) | ||
|
||
@pytest.mark.parametrize("func", [list, np.array, Series]) | ||
jbrockmendel marked this conversation as resolved.
Show resolved
Hide resolved
|
||
@pytest.mark.parametrize("value", [[], [False]]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it isnt clear to me that this is correct with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't an empty indexer do nothing to the underlying object? |
||
def test_setitem_loc_only_false_indexer_dtype_changed(self, func, value): | ||
# GH#37550 | ||
# Dtype is only changed when value to set is a Series and indexer is | ||
# empty/bool all False | ||
df = DataFrame({"a": ["a"], "b": [1], "c": [1]}) | ||
df.loc[func(value), ["b"]] = 10 - df["c"] | ||
expected = DataFrame({"a": ["a"], "b": [1], "c": [1]}) | ||
jreback marked this conversation as resolved.
Show resolved
Hide resolved
|
||
tm.assert_frame_equal(df, expected) | ||
|
||
df.loc[[False], ["b"]] = 9 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should |
||
tm.assert_frame_equal(df, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we need to retain the check for len(value) since we only want to catch boolean masking (though clearly we dont have tests with a mismatched len(value))
probably also need to check that pi is a listlike with matching length too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indexers.check_setitem_lengths may be useful here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would we want to do with empty indexers then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I thought this would be ok is, that empty indexers look here the same as all False boolean indexers. Value in is here empty too, so we have that mismatch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think
ser.loc[[]] = [1, 2, 3]
should raise kind of likeser.loc[[1]] = [1, 2, 3, 4]
wouldThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem here is:
In case of
[]
we have the following parameters:In case of
[False]
we get exactyl the same parametrization. We cant really decide here whether we got a boolean indexer or not.Additionally,
does not raise on master either. Was not introduced here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a tough one. Let's get another opinion. @jreback: what do you expect to happen with
ser.loc[[]] = [1, 2, 3]
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @jreback ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coming back to this, yeah i think if we don't have equal len indexer we should always raise, so @jbrockmendel example should raise