Features which Interval / IntervalIndex should probably have #19480

Open
alexlenail opened this issue Jan 31, 2018 · 6 comments
Labels
Enhancement Interval Interval data type

Comments

@alexlenail
Contributor

alexlenail commented Jan 31, 2018

Here's a list of possible features which should be added to the Interval and IntervalIndex types. Some of them might already exist, in which case, please excuse my mistake. Also note, I'm not asking for these, just listing some ideas I had which might be useful for others.

closest

>>> Interval(3, 4).closest(IntervalIndex.from_tuples([(1, 2), (5, 9)]), how="min|mean|max")
>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]).closest(IntervalIndex.from_tuples([(1, 4), (2, 5)]))

Big question: what if there are two or more identically distant intervals?
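One possible answer to the tie question is to return every interval at the minimum distance. The sketch below is only an illustration: it uses plain `(left, right)` tuples in place of `Interval` objects, measures the edge-to-edge gap (one assumed reading of `how`), and the names `gap` and `closest` are hypothetical, not pandas API.

```python
def gap(a, b):
    """Edge-to-edge distance between two (left, right) tuples; 0 if they touch or overlap."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def closest(target, candidates):
    """Return every candidate at the minimum gap from target, keeping all ties."""
    best = min(gap(target, c) for c in candidates)
    return [c for c in candidates if gap(target, c) == best]


# The example from this issue happens to be a tie: both candidates are 1 away.
print(closest((3, 4), [(1, 2), (5, 9)]))  # [(1, 2), (5, 9)]
```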

complement

>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]).complement(lower_bound=-np.inf, upper_bound=np.inf)
IntervalIndex.from_tuples([(-np.inf, 1), (2, 5), (9, np.inf)])
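A rough sketch of how the complement could be computed, again on plain `(left, right)` tuples. The function name and the `lower`/`upper` parameters are hypothetical, the intervals are assumed to lie within the bounds, and open/closed endpoints are ignored.

```python
import math


def complement(intervals, lower=-math.inf, upper=math.inf):
    """Return the gaps left uncovered by the given (left, right) tuples."""
    gaps = []
    cursor = lower  # right edge of everything covered so far
    for left, right in sorted(intervals):
        if left > cursor:
            gaps.append((cursor, left))
        cursor = max(cursor, right)
    if cursor < upper:
        gaps.append((cursor, upper))
    return gaps


print(complement([(1, 2), (5, 9)]))  # [(-inf, 1), (2, 5), (9, inf)]
```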

intersection

>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]).intersect(IntervalIndex.from_tuples([(3, 4), (6, 7)]))
IntervalIndex.from_tuples([(6, 7)])
>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]).intersect(Interval(6, 7))
???
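For the `???` case, one option would be to treat a single `Interval` as a one-element index. A minimal sketch under that assumption, using `(left, right)` tuples and a hypothetical `intersect` helper that ignores open/closed endpoints:

```python
def intersect(xs, ys):
    """Pairwise overlaps between two lists of (left, right) tuples."""
    overlaps = []
    for xl, xr in xs:
        for yl, yr in ys:
            left, right = max(xl, yl), min(xr, yr)
            if left < right:  # keep only overlaps with positive length
                overlaps.append((left, right))
    return sorted(overlaps)


print(intersect([(1, 2), (5, 9)], [(3, 4), (6, 7)]))  # [(6, 7)]
print(intersect([(1, 2), (5, 9)], [(6, 7)]))          # [(6, 7)]
```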

union

>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]).union(IntervalIndex.from_tuples([(3, 4), (6, 7)]))
IntervalIndex.from_tuples([(1, 2), (3, 4), (5, 9)])
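The merging behaviour in this example could be sketched as follows. This is a hypothetical helper on `(left, right)` tuples, not the existing `IntervalIndex.union`, and it assumes that intervals which merely touch should also be merged.

```python
def union(xs, ys):
    """Merge two lists of (left, right) tuples into non-overlapping intervals."""
    merged = []
    for left, right in sorted(list(xs) + list(ys)):
        if merged and left <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], right))
        else:
            merged.append((left, right))
    return merged


print(union([(1, 2), (5, 9)], [(3, 4), (6, 7)]))  # [(1, 2), (3, 4), (5, 9)]
```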

subtract|difference

>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]) - IntervalIndex.from_tuples([(3, 4), (6, 7)])
IntervalIndex.from_tuples([(1, 2), (5, 6), (7, 9)])
>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]) - Interval(6, 7)
IntervalIndex.from_tuples([(1, 2), (5, 6), (7, 9)])
>>> Interval(6, 7) - IntervalIndex.from_tuples([(1, 2), (5, 9)])
???
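A sketch of the subtraction on `(left, right)` tuples, with a hypothetical `difference` helper that ignores open/closed endpoints. Under this reading, the `???` case would come out empty, since `(6, 7)` is entirely covered by `(5, 9)`.

```python
def difference(xs, ys):
    """Remove every part of xs that is covered by some interval in ys."""
    result = []
    for left, right in xs:
        pieces = [(left, right)]
        for yl, yr in ys:
            remaining = []
            for pl, pr in pieces:
                if yr <= pl or yl >= pr:  # no overlap, keep the piece whole
                    remaining.append((pl, pr))
                    continue
                if yl > pl:  # part sticking out on the left
                    remaining.append((pl, yl))
                if yr < pr:  # part sticking out on the right
                    remaining.append((yr, pr))
            pieces = remaining
        result.extend(pieces)
    return result


print(difference([(1, 2), (5, 9)], [(3, 4), (6, 7)]))  # [(1, 2), (5, 6), (7, 9)]
print(difference([(6, 7)], [(1, 2), (5, 9)]))          # []
```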

sort

(I'm guessing this is already implicitly implemented. Does it work for a MultiIndex in which one of the levels is an IntervalIndex?)

shift

>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]).shift(2)
IntervalIndex.from_tuples([(3, 4), (7, 11)])

(Could be really useful for datetimes?)
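In the meantime, the same result can probably be obtained by rebuilding the index from its endpoints. A small sketch using `IntervalIndex.from_arrays` with the `left`, `right` and `closed` attributes; the variable names are just for illustration.

```python
import pandas as pd

idx = pd.IntervalIndex.from_tuples([(1, 2), (5, 9)])

# Move both endpoints of every interval by the same amount.
shifted = pd.IntervalIndex.from_arrays(idx.left + 2, idx.right + 2, closed=idx.closed)

print(list(shifted.to_tuples()))  # [(3, 4), (7, 11)]
```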

slop|grow|window|better name

>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]).slop(1)
IntervalIndex.from_tuples([(0, 3), (4, 10)])
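
A comparable workaround sketch for this one, widening each interval at both ends. The helper name `grow` is hypothetical (one of the candidate names above), not pandas API.

```python
import pandas as pd


def grow(index, amount):
    """Return a new IntervalIndex with each interval widened by `amount` on both sides."""
    return pd.IntervalIndex.from_arrays(
        index.left - amount, index.right + amount, closed=index.closed
    )


idx = pd.IntervalIndex.from_tuples([(1, 2), (5, 9)])
print(list(grow(idx, 1).to_tuples()))  # [(0, 3), (4, 10)]
```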

@jschendel
Member

Note that union/intersection/difference/shift will likely need new names, as they already have more generic uses (e.g. IntervalIndex.union is already implemented with set-theoretic behavior on individual elements).
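
To illustrate the existing element-wise behavior: each interval is treated as one element of a set, so overlapping intervals are kept side by side rather than merged. A small sketch:

```python
import pandas as pd

a = pd.IntervalIndex.from_tuples([(1, 2), (5, 9)])
b = pd.IntervalIndex.from_tuples([(3, 4), (6, 7)])

# Set union over whole intervals: (5, 9] and (6, 7] both survive.
combined = a.union(b)
print(len(combined))  # 4
```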

Some more ideas, all of which should return booleans, mostly taken from postgres range types:

  • is_adjacent_to
  • strictly_left_of
  • strictly_right_of
  • is_empty

(see the postgres docs for examples)
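
A rough sketch of how three of these could behave for scalar `Interval` objects. The functions are hypothetical helpers, not existing pandas methods, and the handling of closed endpoints is an assumption modelled on the postgres semantics.

```python
import pandas as pd


def strictly_left_of(a, b):
    """True if every point of `a` lies before every point of `b`."""
    if a.right != b.left:
        return a.right < b.left
    # Shared endpoint: they collide only if both intervals include it.
    return not (a.closed_right and b.closed_left)


def strictly_right_of(a, b):
    return strictly_left_of(b, a)


def is_adjacent_to(a, b):
    """True if the intervals share an endpoint with no gap and no overlap."""
    def touches(x, y):
        return x.right == y.left and x.closed_right != y.closed_left
    return touches(a, b) or touches(b, a)


print(is_adjacent_to(pd.Interval(0, 1), pd.Interval(1, 2)))    # True
print(strictly_left_of(pd.Interval(0, 1), pd.Interval(3, 4)))  # True
```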

Another postgres function that could be nice is range_merge, which we'd almost certainly want to rename to something like combine. It gives the smallest interval containing two given intervals, e.g. Interval(0, 1).combine(Interval(3, 4)) would return Interval(0, 4).
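
A minimal sketch of what `combine` could look like as a standalone helper. The function is hypothetical, and it assumes both intervals share the same `closed` setting.

```python
import pandas as pd


def combine(a, b):
    """Smallest interval containing both `a` and `b` (same `closed` assumed)."""
    return pd.Interval(min(a.left, b.left), max(a.right, b.right), closed=a.closed)


print(combine(pd.Interval(0, 1), pd.Interval(3, 4)))  # (0, 4]
```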

@tsabsch

tsabsch commented Feb 6, 2018

I am currently working with an IntervalIndex and needed to get the interval ranges. Maybe this would be useful for others as well?

>>> IntervalIndex.from_tuples([(1, 2), (5, 9)]).ranges()
array([1, 4])

@jschendel
Member

@tsabsch : By ranges, you're referring to the length of each interval in the IntervalIndex, right?

If so, note that this was implemented as the length property in #18805, and will be present in 0.23.0:

In [2]: pd.__version__
Out[2]: '0.23.0.dev0+230.gf391cbf'

In [3]: pd.IntervalIndex.from_tuples([(1, 2), (5, 9)]).length
Out[3]: Int64Index([1, 4], dtype='int64')

@tsabsch

tsabsch commented Feb 6, 2018

@jschendel Yes, this is exactly what I meant. Thank you for pointing it out. I wasn't aware that it was already implemented for the next release. I'm on pandas 0.22.0.

@jschendel
Member

@tsabsch : No worries. Glad to hear length will be useful!

@mezzode

mezzode commented Apr 27, 2018

It might also be nice to be able to check if an Interval is in another Interval.
Also, this issue could use the Interval label for discoverability.
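
A sketch of how the containment check could work, taking closed endpoints into account. `contains_interval` is a hypothetical helper written for illustration only.

```python
import pandas as pd


def contains_interval(outer, inner):
    """True if every point of `inner` also lies in `outer`."""
    left_ok = outer.left < inner.left or (
        outer.left == inner.left and (outer.closed_left or inner.open_left)
    )
    right_ok = outer.right > inner.right or (
        outer.right == inner.right and (outer.closed_right or inner.open_right)
    )
    return left_ok and right_ok


print(contains_interval(pd.Interval(0, 10), pd.Interval(2, 3)))  # True
print(contains_interval(pd.Interval(2, 3), pd.Interval(0, 10)))  # False
```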

5 participants