Skip to content

ENH: Add length attribute to Interval and IntervalIndex #18805

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 23, 2017

Conversation

jschendel
Copy link
Member

@codecov
Copy link

codecov bot commented Dec 16, 2017

Codecov Report

Merging #18805 into master will increase coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18805      +/-   ##
==========================================
+ Coverage   91.65%   91.67%   +0.02%     
==========================================
  Files         154      154              
  Lines       51368    51373       +5     
==========================================
+ Hits        47080    47095      +15     
+ Misses       4288     4278      -10
Flag Coverage Δ
#multiple 89.54% <100%> (+0.02%) ⬆️
#single 40.84% <12.5%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/interval.py 93.88% <100%> (+0.05%) ⬆️
pandas/util/testing.py 84.68% <0%> (-0.22%) ⬇️
pandas/plotting/_converter.py 66.95% <0%> (+1.73%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 316acbf...6a719ef. Read the comment docs.

Return an Index with entries denoting the length of each Interval in
the IntervalIndex
"""
return self.right - self.left
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some intervals don't have a length, if values are categorical-like, e.g. letters (pd.Interval('b', 'e')).

I think maybe the length calculation should be wrapped in a try/except statement, with a helpful error message for when length cannot be calculated (e.g. "not possible to calculate length between 'b' and 'e'" in the above example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@topper-123 : I agree with this assessment. @jschendel : Do wrap the length property as such and add tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Should probably audit the existing code for more instances of this, as I imagine there are a few places where similar failures occur, e.g. Interval.mid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably a good idea. Depend on how big that change is, you may want to leave that as a follow-up PR though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, planning to do it in a separate PR so I don't delay this one with unrelated work while I look for more instances. Cleaner than having the instances I immediately see go in here, and the others in a separate PR too.

@jreback
Copy link
Contributor

jreback commented Dec 18, 2017

can you make sure this is in api.rst (though IIRC we are listing these attributes directly in the class).

return self.right - self.left
except TypeError:
# length not defined for some types, e.g. string
msg = 'cannot compute length between {left} and {right}'
Copy link
Contributor

@topper-123 topper-123 Dec 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nitpick perhaps: IMO the form msg = 'cannot compute length between {left!r} and {right!r}' (note the !r) is better, so variables get outputted as raw strings in error messages.

In general IMO variables in error messages should use !r, though this is not very common in pandas or in general. Does pandas have a policy/view on this?

@jschendel jschendel force-pushed the interval-length branch 2 times, most recently from 9e54733 to c891a80 Compare December 19, 2017 08:24
@jschendel
Copy link
Member Author

@jreback : added this to api.rst, and made minor changes to the IntervalIndex docstring. Generated the docs and changes look good.

@jschendel jschendel force-pushed the interval-length branch 2 times, most recently from feda6a7 to dd534b7 Compare December 21, 2017 15:55
@@ -141,6 +141,7 @@ Other Enhancements
- ``IntervalIndex.to_tuples()`` has gained the ``na_tuple`` parameter to control whether NA is returned as a tuple of NA, or NA itself (:issue:`18756`)
- ``Categorical.rename_categories``, ``CategoricalIndex.rename_categories`` and :attr:`Series.cat.rename_categories`
can now take a callable as their argument (:issue:`18862`)
- :class:`Interval` and :class:`IntervalIndex` have gained a ``length`` attribute (:issue:`18789`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebase on master. can you move to 0.23 (docs were renamed), prob easiest to just check this file from master and past in new one

@jreback jreback added this to the 0.23.0 milestone Dec 23, 2017
@jreback jreback merged commit 507157d into pandas-dev:master Dec 23, 2017
@jreback
Copy link
Contributor

jreback commented Dec 23, 2017

thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Interval Interval data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Add a length attribute to Interval and IntervalIndex
4 participants