-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: Implement RangeIndex min/max using RangeIndex properties #17611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #17611 +/- ##
==========================================
- Coverage 91.2% 91.18% -0.02%
==========================================
Files 163 163
Lines 49637 49648 +11
==========================================
+ Hits 45269 45271 +2
- Misses 4368 4377 +9
Continue to review full report at Codecov.
|
I'm wondering it something similar, but more general, couldn't be done that works with both indices and series using E.g.: def max(self):
if self.is_monotonic_increasing:
return self[-1]
elif self.is_monotonic_decreasing:
return self[0]
else:
return nanops.nanmax(self.values) Seems to me this should work. What do you think, have I missed something nonobvious? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc comment otherwise lgtm
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -472,6 +472,7 @@ Performance Improvements | |||
- Improved performance of :meth:`Categorical.set_categories` by not materializing the values (:issue:`17508`) | |||
- :attr:`Timestamp.microsecond` no longer re-computes on attribute access (:issue:`17331`) | |||
- Improved performance of the :class:`CategoricalIndex` for data that is already categorical dtype (:issue:`17513`) | |||
- Improved performance of ``RangeIndex.min`` and ``RangeIndex.max`` by using ``RangeIndex`` properties to perform the computations (:issue:`17607`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u use :meth: here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will using :meth:
work in this case? It doesn't look like RangeIndex
is in api.rst
, and I can't find any RangeIndex
related items in the existing HTML API reference. Not super well versed on how the docs work though, so could be mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so should prob add RangeIndex
to api.rst :>
(and prob any other Indexes that are missing). We don't need to repeat the methods though (they are inherited).
pandas/tests/indexes/test_range.py
Outdated
@@ -994,3 +994,22 @@ def test_append(self): | |||
# Append single item rather than list | |||
result2 = indices[0].append(indices[1]) | |||
tm.assert_index_equal(result2, expected, exact=True) | |||
|
|||
def test_max_min(self): | |||
params = [(0, 400, 3), (500, 0, -6), (-10**6, 10**6, 4), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pytest.mark.parametrize
this- Reference issue number above this
@topper-123 : I think that will work in general, but I wonder if there are cases where the additional overhead of computing |
Implemented RangeIndex min/max in terms of RangeIndex properties.
4cb0de8
to
f7e0bfa
Compare
doc/source/api.rst
Outdated
|
||
.. autosummary:: | ||
:toctree: generated/ | ||
:template: autosummary/class_without_autosummary.rst |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just list them all in a single autosummary
directive?
f7e0bfa
to
8aa59c3
Compare
thanks @jschendel |
@@ -473,6 +473,7 @@ Performance Improvements | |||
- Improved performance of :meth:`Categorical.set_categories` by not materializing the values (:issue:`17508`) | |||
- :attr:`Timestamp.microsecond` no longer re-computes on attribute access (:issue:`17331`) | |||
- Improved performance of the :class:`CategoricalIndex` for data that is already categorical dtype (:issue:`17513`) | |||
- Improved performance of :meth:`RangeIndex.min` and :meth:`RangeIndex.max` by using ``RangeIndex`` properties to perform the computations (:issue:`17607`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those links won't work because the pages are not generated (only the class docstring is included in the api).
To solve this, we would need to find a way to specify which of the methods to include of which not (or list them manually in the api.rst, that is probably easier)
git diff upstream/master -u -- "*.py" | flake8 --diff
Benchmarks are essentially the same as what's referenced in the issue. Maybe a few ns slower since I combined logic into a helper function vs having identical code (other than
elif
) for both min/max, which was the case I did the benchmarks posted in the issue. Updated benchmarks: