BUG: iloc raising for ea series #50171
See also my more general comment about using IntegerArrays in .iloc in #49521.
df = DataFrame([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])
result = df.iloc[Series([1], dtype="Int64"), Series([0, 1], dtype="Int64")]
expected = DataFrame([[5, 6]], index=[1])
tm.assert_frame_equal(result, expected)
I'm pretty sure this would fail if a pd.NA was in the indexers?
Yep it does, which is fine. Will add a test
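For reference, a minimal sketch of what such a check could look like outside the test suite. The exception type is deliberately left open here, since the thread below reports both IndexError and ValueError depending on the code path; the variable names are illustrative and not taken from the PR.

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])

# Row indexer with a missing value; pd.NA cannot be a valid position.
indexer = pd.Series([1, pd.NA], dtype="Int64")

try:
    df.iloc[indexer, [0, 1]]
except (ValueError, IndexError, TypeError) as err:
    # Assumption: the exact type and message may differ between
    # pandas versions, so only the fact that it raises is recorded.
    raised = type(err).__name__
else:
    raised = None

print(raised)
```

In the pandas test suite this would normally be written with pytest.raises and a match on the message.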
@@ -1481,7 +1481,10 @@ def _validate_key(self, key, axis: AxisInt):
    # so don't treat a tuple as a valid indexer
    raise IndexingError("Too many indexers")
elif is_list_like_indexer(key):
    arr = np.array(key)
    if isinstance(key, ABCSeries):
        arr = key._values
How about integer arrays with pd.NA?
That is validated later.
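A rough illustration of why the NA case can pass a dtype check at this point: the extension array behind an Int64 Series still reports a numeric dtype even when it holds pd.NA. This is only a sketch; it uses the public Series.array accessor as a stand-in for the private key._values in the diff, and the dtype that np.array(key) produces is printed rather than asserted because it can vary across pandas versions.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

key = pd.Series([0, 1, pd.NA], dtype="Int64")

# Public accessor used here in place of the private `key._values`.
values = key.array

# The masked array keeps its nullable integer dtype, NA included.
print(type(values).__name__, values.dtype)
print(is_numeric_dtype(values.dtype))

# Converting through NumPy may yield a different dtype, depending on
# the pandas version in use.
print(np.array(key).dtype)
```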
The errors are different for Series and IntegerArray:
>>> import pandas as pd
>>> df = pd.DataFrame([[0,1,2,3,4],[5,6,7,8,9]])
>>> iarr = pd.array([0,1,2, pd.NA], dtype = pd.Int64Dtype())
>>> df.iloc[:, iarr]
IndexError: .iloc requires numeric indexers, got [0 1 2 <NA>]
>>> df.iloc[:, pd.Series(iarr)]
ValueError: cannot convert to 'int64'-dtype NumPy array with missing values. Specify an appropriate 'na_value' for this dtype.
I think the ValueError should be raised in both cases?
I guess you tried the Series case on main? I am getting consistent errors on my PR.
In general, I would prefer a better-suited error that points to NA in iloc, but I will probably do that in a follow-up. We will have to check this later, because not all cases get here and this does not cover the Series.iloc case.
Sorry, wrong snippet. It's the non-NA array:
>>> import pandas as pd
>>> df = pd.DataFrame([[0,1,2,3,4],[5,6,7,8,9]])
>>> iarr = pd.array([0,1,2], dtype = pd.Int64Dtype())
>>> df.iloc[:, iarr]
IndexError: .iloc requires numeric indexers, got [0 1]
Ah, got you. Fixed that as well.
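A quick way to check the non-NA IntegerArray case, assuming a pandas build that includes this fix (older builds raise the IndexError shown above). The comparison against a plain list indexer is my own choice of expected value, not a test from the PR.

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])

# Nullable integer array without missing values, passed directly to iloc.
iarr = pd.array([0, 1, 2], dtype="Int64")

result = df.iloc[:, iarr]

# The same selection expressed with a plain Python list.
expected = df.iloc[:, [0, 1, 2]]

pd.testing.assert_frame_equal(result, expected)
```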
Thanks @phofl