Skip to content

API: allow step!=1 slice with IntervalIndex #31658

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 30 commits into from
Mar 8, 2020

Conversation

jbrockmendel
Copy link
Member

@jbrockmendel jbrockmendel commented Feb 4, 2020

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

See discussion #16386 (comment)

cc @jorisvandenbossche @jschendel

@jreback jreback added Indexing Related to indexing on series/frames, not to indexes themselves Interval Interval data type labels Feb 5, 2020
@jreback jreback added this to the 1.1 milestone Feb 5, 2020
@jreback
Copy link
Contributor

jreback commented Feb 5, 2020

I suppose should add a whatsnew bug fix entry.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a test that ensures that slicing with an interval as step still raises an error?

@@ -145,7 +145,7 @@ Strings

Interval
^^^^^^^^

- Bug in :class:`IntervalIndex` slicing that prevented slicing with ``step > 1`` (:issue:`31658`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this in the "other enhancements" section

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel what would you expect for this:

s = Series(np.arange(5), IntervalIndex.from_breaks(np.arange(2, 14, 2))) 
s[1.5:9.5:2]

I think that is less clear, which might be the reason it was disallowed initially.

"these indexers .* of <class 'pandas._libs.interval.Interval'>"
)
with pytest.raises(TypeError, match=msg):
s[0 : 4 : Interval(0, 1)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would test it where start and stop are also intervals.
And can you move it to the test_loc_with_slices a bit above that is testing slicing with Intervals ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start and stop are also intervals

out of scope

And can you move it to the test_loc_with_slices a bit above that is testing slicing with Intervals ?

sure

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start and stop are also intervals

out of scope

I think that is not out of scope, as that hits this path, and right now actually ensured there is a proper error message.

msg = (
"cannot do slice indexing on "
"<class 'pandas.core.indexes.interval.IntervalIndex'> with "
"these indexers .* of <class 'pandas._libs.interval.Interval'>"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current error message I get is more comprehensible I think (with s from the tests here):

In [41]: s[Interval(3, 4):Interval(4, 5):Interval(1, 2)]                                                                                                                                                           
...
ValueError: cannot support not-default step in a slice

In [42]: s[1:2:Interval(1, 2)]                                                                                                                                                           
...
ValueError: cannot support not-default step in a slice

@jbrockmendel
Copy link
Member Author

what would you expect for [...] s[1.5:9.5:2] [...] I think that is less clear, which might be the reason it was disallowed initially.

I agree that is less obvious, as that is not positional.

@jorisvandenbossche
Copy link
Member

I agree that is less obvious, as that is not positional.

So let's not allow that?

@jbrockmendel
Copy link
Member Author

So let's not allow that?

OK I see, I thought that the float-slice would have been disallowed somewhere else, but evidently not. So yes, this needs to be updated to restore that.

@jbrockmendel
Copy link
Member Author

@jorisvandenbossche double-checking: suppose we get integer sllce that is valid for either loc or iloc:

import pandas as pd
import numpy as np

ser = pd.Series(np.arange(5), pd.IntervalIndex.from_breaks(np.arange(2, 14, 2))) 
key = slice(3, 6)

ser[key]
ser.loc[key]
ser.iloc[key]

key2 = 3
ser[key2]
ser.loc[key2]
ser.iloc[key2]

For the slice we treat the ambiguous __getitem__ key as positional, while for the int we treat it as a label. That's on purpose, right?

@jbrockmendel
Copy link
Member Author

Closing to clear the queue (in particular #31786 should happen before this), will revisit

@jreback
Copy link
Contributor

jreback commented Feb 22, 2020

can you rebase. look ok to me. @jschendel

@jbrockmendel
Copy link
Member Author

rebased+green

@jreback
Copy link
Contributor

jreback commented Mar 3, 2020

can you rebase.

@jbrockmendel
Copy link
Member Author

rebased+green

Copy link
Member

@jschendel jschendel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

@@ -20,6 +21,34 @@
# Indexer Identification


def is_valid_positional_slice(slc: slice) -> bool:
"""
Check if a slice object can be interpretd as a positional indexer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: interpretd --> interpreted

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@jreback jreback merged commit 90bb1aa into pandas-dev:master Mar 8, 2020
@jreback
Copy link
Contributor

jreback commented Mar 8, 2020

thanks 2 followups:

  1. @jorisvandenbossche comment about having start/stop as intervals, pls create an issue
  2. does this need an example in the docs?

@jbrockmendel jbrockmendel deleted the convert_scalar_indexer branch March 9, 2020 18:05
SeeminSyed pushed a commit to CSCD01-team01/pandas that referenced this pull request Mar 22, 2020
* Allow step!=1 in slicing Series with IntervalIndex

* Whatsnew

* doc suggestion

* test for interval step

* remove commented-out

* reword whatsnew

* disallow label-based with step!=1

* black fixup

* update error message

* typo fixup
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Indexing Related to indexing on series/frames, not to indexes themselves Interval Interval data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants