ENH: support Ellipsis in loc/iloc #37750
I realize there is still an open issue for this, but it is quite old. What do you see as the main motivation to support this given we no longer have a Panel?
As @shoyer mentioned in #10956, this makes it easier to write code that is valid for both Series and DataFrame. In particular, I'm finding this would be useful for parametrizing indexing tests.
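A minimal sketch of the dimension-agnostic code this refers to, assuming a pandas version that includes this change. `take_last_axis` is a hypothetical helper name used only for illustration, not part of pandas.

```python
import pandas as pd


def take_last_axis(obj, positions):
    """Hypothetical helper: select positions along the last axis.

    With a leading Ellipsis the same expression is valid for a Series
    (last axis = the index) and a DataFrame (last axis = the columns).
    """
    return obj.iloc[..., positions]


ser = pd.Series([10, 20, 30], name="a")
df = ser.to_frame()

ser_result = take_last_axis(ser, [0])  # same as ser.iloc[[0]]
df_result = take_last_axis(df, [0])    # same as df.iloc[:, [0]]
```

Without the Ellipsis, a parametrized test would need to branch on the number of dimensions to decide whether a leading `:` is required.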
I agree this is good for general indexing compatibility and is unambiguous. @jbrockmendel can you make this raise for a MI, or is it unambiguous there? (I think it is more useful than the slice syntax, but that should be a dedicated PR.)
MI cases are appreciably more complicated and I'm planning to handle them separately.
I personally don't see myself using it, but the arguments of @shoyer on the issue certainly make sense (although they are from a time when we still had multidimensional dataframes (>2d), I think).
Now, if adding it, this will need some docs.
Also, I think we would need some more test cases?
Eg df.loc[..., ...] (in numpy this errors about "an index can only have a single ellipsis ('...')"), or series.loc[...], or df.loc[..., 1, 2], ...
In numpy, arr[..., 1] and arr[1] actually do something different on a 1D array (0d array vs numpy scalar, respectively; now we don't have this difference in pandas, so that's probably not an issue).
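For reference, the NumPy behaviour described above can be checked with a short sketch like the following (the variable names are illustrative only):

```python
import numpy as np

arr = np.arange(3)  # 1D array

with_ellipsis = arr[..., 1]  # 0-d ndarray
without_ellipsis = arr[1]    # NumPy scalar

# Using more than one Ellipsis in a single index is rejected by NumPy.
double_ellipsis_error = None
try:
    arr[..., ...]
except IndexError as err:
    double_ellipsis_error = str(err)
```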
MI cases are appreciably more complicated and I'm planning to handle them separately
Do you mean having an Ellipsis "within" the indexer for the MultiIndex's axis?
That's indeed certainly a different topic, but just having an Ellipsis when indexing a dataframe with a MultiIndex can be tested here, I think, as it is already covered by your changes (I assume). Eg the same tests as you have now, like df.loc[..., [1]], but on a DataFrame with a MI (I suppose this doesn't necessarily take a different code path, as it just expands to a null slice, but it is still good to explicitly have tests with different index types, I think).
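The "expands to a null slice" idea can be sketched in plain Python as below. This is an illustrative stand-in under stated assumptions, not the code in `pandas/core/indexing.py`: the function name `expand_ellipsis`, the constant `NULL_SLICE`, and the error messages are hypothetical.

```python
from typing import Any, Tuple

NULL_SLICE = slice(None)  # equivalent to a bare ":" in an indexer


def expand_ellipsis(key: Tuple[Any, ...], ndim: int) -> Tuple[Any, ...]:
    """Hypothetical helper: replace a single Ellipsis with null slices.

    The Ellipsis stands in for however many axes are not otherwise
    indexed, so the result has exactly ``ndim`` entries.
    """
    positions = [i for i, k in enumerate(key) if k is Ellipsis]
    if len(positions) > 1:
        raise IndexError("an index can only have a single ellipsis ('...')")
    if not positions:
        return key

    i = positions[0]
    n_fill = ndim - (len(key) - 1)  # axes the Ellipsis has to cover
    if n_fill < 0:
        raise IndexError("too many indexers")
    return key[:i] + (NULL_SLICE,) * n_fill + key[i + 1:]


frame_key = expand_ellipsis((..., [1]), ndim=2)   # like df.loc[:, [1]]
series_key = expand_ellipsis((..., [1]), ndim=1)  # like series.loc[[1]]
```

In this sketch the same key `(..., [1])` resolves to a column selection for a 2D object and to a row selection for a 1D object.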
pandas/core/indexing.py
        # TODO: other cases? only one test gets here, and that is covered
        # by _validate_key_length
        return tup

    def _has_valid_tuple(self, key: Tuple):
Can you rename this method if it now actually returns a modified key?
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.
I think this is ok, we should be able to accept ellipsis (even if not that useful). Add a whatsnew and rebase. Ping on green (I think @jorisvandenbossche had a comment as well).
@jbrockmendel can you rebase this?
Sure, but there are still a handful of comments I haven't addressed, so this isn't ready.
Updated to flesh out tests, I think comments are addressed.
@jbrockmendel if this is ready, can you add a release note and a section on the new capability?
whatsnew added + green
@jbrockmendel needs rebase
rebased + green
It would be a nice extension to support ... in both dimensions as we do for : today (follow-on), and maybe in multi-index, but I am not sure it adds a lot of value.
Actually, I don't think this is ultimately any different than :, right?
The main thing this allows is to write code that works for either Series or DataFrame.
Ahh, got it. Then I am +1 here. cc @phofl if any comments.
Can you create an issue for adding docs about this in the user guide / doc-string as appropriate?
This looks good.
Is df = pd.DataFrame({"a": [1, 2, 3]}).loc[(..., ), :] valid syntax? I ask because this raises a KeyError.
If you mean "should it be equivalent to
(on master, not PR branch)
thanks @jbrockmendel
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
closes #10956