[PERF] Get rid of MultiIndex conversion in IntervalIndex.is_unique #26391
Changes from 3 commits
4ec1fe9
51d6910
d11acd6
202b2cf
8e8384b
8dde393
d3af9c9
`self.values.left` should be equivalent to `self.left`, so I think we can get by without needing to define these, and just refer to them as `self.left`/`self.right` where needed.
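The equivalence can be checked quickly on a small index. This is a minimal sketch, assuming a recent pandas release where `IntervalIndex.values` returns the underlying `IntervalArray`; the sample endpoints are made up for illustration.

```python
import pandas as pd

# Small illustrative index; the endpoints are arbitrary.
idx = pd.IntervalIndex.from_arrays([0, 1, 2], [1, 2, 3])

# Compare the endpoints exposed on the index with those on its values.
left_match = idx.left.equals(idx.values.left)
right_match = idx.right.equals(idx.values.right)
print(left_match, right_match)
```

If both comparisons hold, local aliases for the endpoints add nothing over `self.left` and `self.right`.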
If my previous comment is correct, I don't think we need this to be a function anymore since it's only called once, so you can just put the function's logic at the end of the method.
Can you also test out the following variant of `_is_unique`:
I did a sample run of this, and it appears to be a bit more efficient:
I haven't fully tested this in all scenarios though.
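For context, one way to check interval uniqueness without building a MultiIndex can be sketched as below. This is not the variant proposed in this thread (that snippet is not reproduced here); `intervals_are_unique` is a hypothetical standalone helper that takes the left and right endpoints directly, and it assumes missing intervals have NaN on both sides.

```python
import numpy as np
import pandas as pd


def intervals_are_unique(left, right):
    """Hypothetical helper: True if no (left, right) pair repeats."""
    left = pd.Index(left)
    right = pd.Index(right)

    # Two or more missing intervals are duplicates of each other.
    if left.isna().sum() > 1:
        return False

    # If either side has no repeats, no pair can repeat.
    if left.is_unique or right.is_unique:
        return True

    # Only positions whose left endpoint repeats can hold duplicate pairs.
    seen_pairs = set()
    for pos in np.flatnonzero(left.duplicated(keep=False)):
        pair = (left[pos], right[pos])
        if pair in seen_pairs:
            return False
        seen_pairs.add(pair)
    return True


print(intervals_are_unique([0, 0, 1], [1, 2, 2]))
print(intervals_are_unique([0, 0, 1], [1, 1, 2]))
```

The fast paths keep the Python-level loop confined to the case where both endpoint arrays contain repeats.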
HEAD adopts `_is_unique2` and HEAD~3 adopts `_is_unique`. The performance is slightly worse but the code is more explanatory.
Are these asv benchmarks really short? Maybe add a longer one and see how this scales.
Yeah, I was wondering this too; `is_unique` is cached, so I wonder if the asv is just timing the cache lookup? Does anything special need to be done to handle things that are cached?
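The concern can be illustrated with a toy class. This is only an analogy: `ToyIndex` is hypothetical and uses the standard library's `functools.cached_property`, whereas pandas uses its own caching decorator, but the effect on repeated access is the same in spirit.

```python
from functools import cached_property


class ToyIndex:
    """Hypothetical stand-in for an index with a cached is_unique."""

    def __init__(self, values):
        self._values = list(values)
        self.computations = 0  # counts how often the real work runs

    @cached_property
    def is_unique(self):
        self.computations += 1
        return len(set(self._values)) == len(self._values)


idx = ToyIndex(range(1000))
first = idx.is_unique   # does the real work
second = idx.is_unique  # served from the cache
print(idx.computations)
```

A benchmark that reads the property many times on the same object would therefore mostly measure the cached lookup, not the uniqueness check itself.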
HEAD~6 adopts `_is_unique` while HEAD adopts `_is_unique2`.
I think `self.isna().sum() > 1` is a little more idiomatic and performant.
Doing single runs to avoid caching:
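The timings from this comment are not reproduced here. A single-run measurement of this kind might look like the sketch below, where `time_first_access` is a hypothetical helper and the index size is arbitrary. It builds a fresh index for every measurement so that the cached value cannot be reused.

```python
import timeit

import numpy as np
import pandas as pd


def time_first_access(n):
    """Hypothetical helper: time one uncached is_unique access."""
    # A fresh index per call, so nothing has been cached yet.
    idx = pd.IntervalIndex.from_breaks(np.arange(n + 1))
    # number=1: the property is evaluated exactly once while timed.
    elapsed = timeit.timeit(lambda: idx.is_unique, number=1)
    return idx.is_unique, elapsed


unique, elapsed = time_first_access(100_000)
print(unique, f"{elapsed:.6f}s")
```

For asv itself, setting `number = 1` on the benchmark so that `setup` rebuilds the index before each timed call may achieve the same thing, though that is worth checking against the asv documentation.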