-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF: use check_setitem_lengths in DTA.__setitem__ #36339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REF: use check_setitem_lengths in DTA.__setitem__ #36339
Conversation
…eck-empty-setitem
…eck-empty-setitem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Only a comment regarding the todo
pandas/core/indexers.py
Outdated
"different length than the value" | ||
) | ||
else: | ||
# TODO: dont we still need lengths to match? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think yes?
But can you do that here, since the original code (I think?) did it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the existing code does not check for length-match in the len==0 case. ill take a look at adding that check and getting rid of this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually is checked right now, because of the len(key) != len(value)
before checking elif not len(key):
(but it's clearly not tested ..)
As example:
In [1]: arr = pd.date_range("2012", periods=3)._data
In [2]: arr
Out[2]:
<DatetimeArray>
['2012-01-01 00:00:00', '2012-01-02 00:00:00', '2012-01-03 00:00:00']
Length: 3, dtype: datetime64[ns]
In [3]: arr[[]] = [pd.Timestamp("2012")]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-d4b75e105bf5> in <module>
----> 1 arr[[]] = [pd.Timestamp("2012")]
~/scipy/pandas/pandas/core/arrays/datetimelike.py in __setitem__(self, key, value)
620 f"'{len(value)}'."
621 )
--> 622 raise ValueError(msg)
623 elif not len(key):
624 return
ValueError: shape mismatch: value array of length '0' does not match indexing result of length '1'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch; i was comparing against the existing check_setitem_lengths
pandas/core/indexers.py
Outdated
if len(value) != length_of_indexer(indexer, values): | ||
raise ValueError( | ||
"cannot set using a slice indexer with a " | ||
"different length than the value" | ||
) | ||
if len(value) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use not len(value) (and same as above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I find if len(value) == 0:
more readable / clearer about the intent ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its not idiomatic python (though i tend to agree), either way as long as we are consistent, and unfortunately this is not consistent with the rest of the code base
@jbrockmendel can you add a test for the setitem case with zero-length indexer and non-zero length value (as it is clearly untested ..) |
updated per suggestions |
Thanks! |
Turning this into a two-line check will make it easy to add to
PandasArray.__setitem__
andCategorical.__setitem__
, at which point we'll be able to move these methods up toNDArrayBackedExtensionArray.__setitem__