Skip to content

[WIP] ENH: interval accessor #19502

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

jschendel
Copy link
Member

@jschendel jschendel commented Feb 2, 2018

Still have a quite a few things to do, but wanted to make sure I'm on the right track, and have this PR in place to motivate any additional issues/PR that need to be done as a prereq for this.

TODO:

  • Add IntervalIndex methods to the accessor
    • I don't immediately see any existing methods that are appropriate for this
    • There are a few open issues/PR related to methods that could be appropriate here
  • Documentation
  • More tests
    • Tests for IntervalIndex methods once they've been added
    • Anything else I'm missing?

Example usage:

In [2]: s = pd.Series(pd.IntervalIndex.from_breaks([0, 1, 3, 6, 10]))

In [3]: s
Out[3]:
0     (0, 1]
1     (1, 3]
2     (3, 6]
3    (6, 10]
dtype: object

In [4]: s.iv.length
Out[4]:
0    1
1    2
2    3
3    4
dtype: int64

In [5]: s.iv.mid
Out[5]:
0    0.5
1    2.0
2    4.5
3    8.0
dtype: float64

cc @TomAugspurger : thought you'd be interested as this relates to the new extension accessor code

@@ -510,6 +510,39 @@ def is_interval_dtype(arr_or_dtype):
return IntervalDtype.is_dtype(arr_or_dtype)


def is_interval_arraylike(arr):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be needed in the long-term; once #19453 is closed is_interval_dtype should be sufficient anywhere I've used this, so haven't added related tests. Can add tests if this PR looks like it will get closed first. Or is there a reason we'd want to keep this after #19453, in which case tests should be added regardless?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure as part of adding first class dtypes we would want to remove this, but no problem add it here now. you can add a couple of basic test for this though.

@codecov
Copy link

codecov bot commented Feb 2, 2018

Codecov Report

Merging #19502 into master will decrease coverage by 0.01%.
The diff coverage is 77.96%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19502      +/-   ##
==========================================
- Coverage   91.67%   91.66%   -0.02%     
==========================================
  Files         148      148              
  Lines       48553    48610      +57     
==========================================
+ Hits        44513    44558      +45     
- Misses       4040     4052      +12
Flag Coverage Δ
#multiple 90.03% <77.96%> (-0.02%) ⬇️
#single 41.82% <27.11%> (+0.1%) ⬆️
Impacted Files Coverage Δ
pandas/core/indexes/interval.py 92.41% <100%> (+0.02%) ⬆️
pandas/core/series.py 94.6% <100%> (ø) ⬆️
pandas/core/dtypes/common.py 94.9% <66.66%> (-0.42%) ⬇️
pandas/core/indexes/accessors.py 86% <77.08%> (-4.2%) ⬇️
pandas/core/dtypes/cast.py 88.15% <0%> (+0.16%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 78ba063...2f06ea2. Read the comment docs.

@TomAugspurger
Copy link
Contributor

Looks good to me (I wonder if any bits can be put into PandasDelegate, and shared between subclasses?)

What do people think about the name .iv, versus the longer .interval?

@jreback
Copy link
Contributor

jreback commented Feb 3, 2018

should be interval i think

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Feb 3, 2018 via email

@jschendel
Copy link
Member Author

FWIW my choice of using an abbreviated name was that it seems more consistent with the existing accessor naming conventions; we use .str instead of .string, and likewise for .cat and .dt. Another option is .itv, but willing to go with .interval if that's the consensus.

@@ -510,6 +510,39 @@ def is_interval_dtype(arr_or_dtype):
return IntervalDtype.is_dtype(arr_or_dtype)


def is_interval_arraylike(arr):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure as part of adding first class dtypes we would want to remove this, but no problem add it here now. you can add a couple of basic test for this though.

@jreback
Copy link
Contributor

jreback commented Sep 25, 2018

@jschendel happy to have this. if you can close for now, or update

@jreback
Copy link
Contributor

jreback commented Dec 9, 2018

closing as stale for now, happy to have this when you have time @jschendel

@jreback jreback closed this Dec 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Interval Interval data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: .interval accessor
3 participants