-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: Implement new indexing behavior for intervals #27100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks really good, some comments.
if you have other issues you are closing add them as closes in the top of the PR |
Codecov Report
@@ Coverage Diff @@
## master #27100 +/- ##
===========================================
- Coverage 91.87% 41.9% -49.98%
===========================================
Files 180 180
Lines 50853 50861 +8
===========================================
- Hits 46719 21311 -25408
- Misses 4134 29550 +25416
Continue to review full report at Codecov.
|
[1, -1])]) | ||
def test_get_indexer_with_interval(self, query, expected): | ||
|
||
tuples = [(0, 2.5), (1, 3), (2, 4)] | ||
tuples = [(0, 2), (2, 4), (5, 7)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change here was to have a non-overlapping index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, since overlapping indexes are invalid for get_indexer
and need to use get_indexer_non_unique
instead. For intervals, overlapping is the analogue for uniqueness, as uniqueness isn't a strong enough condition since a query on a unique IntervalIndex
can still have multiple values returned if they come from the overlap of 2+ intervals.
It would also be good to add some documentation on indexing in the advanced.rst interval section (maybe copy some example from the whatsnew you added) |
@jorisvandenbossche : added a description of the new behavior to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updates look good!
@jreback @jorisvandenbossche : also removed the warning about indexing behavior being provisional |
@jreback @jorisvandenbossche : all existing comments have been addressed |
doc/source/user_guide/advanced.rst
Outdated
@@ -965,6 +964,32 @@ If you select a label *contained* within an interval, this will also select the | |||
df.loc[2.5] | |||
df.loc[[2.5, 3.5]] | |||
|
|||
Selecting using an ``Interval`` will only return exact matches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a versionchanged 0.25 here
Indexing an ``IntervalIndex`` with ``Interval`` objects | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
Indexing methods for :class:`IntervalIndex` have been modified to require exact matches only for :class:`Interval` queries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a link the docs you added above in indexing
|
||
elif lhs == -1: | ||
# check that target IntervalIndex is compatible |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe worth it to make a helper method for indexing with an II (as this is the same code as in get_indexer for unique)
also merge master |
merge master (as the other PR was merged) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 small changes that remove code done in the precursor and i think good to go
thanks @jschendel very nice patch! |
Glad to see this finally resolved. Looking forward to trying out the new behavior. |
@zfrenchee it would be very welcome to test and give feedback ! |
@jorisvandenbossche I noticed today
Both in 25.3 and master. Is there still room to contribute to intervalIndex? |
@alexlenail I think it is best to open a new issue for this (to ensure it doesn't get lost) |
git diff upstream/master -u -- "*.py" | flake8 --diff
I left the
test_interval.py
andtest_interval_new.py
files separate for now, in order to make it more obvious what is being changed. Will condense these files in a follow-up.