-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: improve is_unique check in Index.intersection #38154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ok let's merge this for 1.2, thought it would have more of an impact. |
pandas/core/indexes/base.py
Outdated
if not result.is_unique: | ||
result = result.unique()._values | ||
else: | ||
result = result._values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im surprised this makes a difference. Index.unique should already be doing this check internally i guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback suggested to check for uniqueness before calling unique() to save a bit of time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually @jbrockmendel makes a very good point here. Let's repurpose this PR to actually do a uniquess check internal to unique (of course this may only make sense for > 1000 elements or something).
moving this off 1.2 for a refactor |
Added the unique check to Index.unique. Categeorical.unique does remove values which are not within the categories. If we want to keep this, we can not add a check for uniqueness |
can you run relevant benchmarks and see if this makes any difference? |
pandas/core/indexes/base.py
Outdated
@@ -2379,6 +2379,10 @@ def unique(self, level=None): | |||
""" | |||
if level is not None: | |||
self._validate_index_level(level) | |||
|
|||
if self.is_unique: | |||
return self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs to return self._shallow_copy()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx.
asvs in index_object:
|
LGTM |
thanks @phofl |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Seems to have no impact.
cc @jreback