PERF: Series.nunique can compute unique, then remove na #40865
Take
@taytzehao - Apologies, I should have indicated in the issue itself. This was added for a sprint today, would you be okay with finding a different issue?
is this issue open for contributions?
all issues are open for contribution
KenilMehta pushed a commit to KenilMehta/pandas that referenced this issue (Apr 28, 2021)
KenilMehta pushed a commit to KenilMehta/pandas that referenced this issue (Apr 29, 2021)
jreback pushed a commit that referenced this issue (May 3, 2021)
yeshsurya pushed a commit to yeshsurya/pandas that referenced this issue (May 6, 2021)
JulianWgs pushed a commit to JulianWgs/pandas that referenced this issue (Jul 3, 2021)
Currently we first remove NaNs, then use `len` on the result of `Series.unique`. Except for Series that are mostly null values, it is more performant to switch the order of these operations:
gives
Changing part_nan to 100 gives
On my machine, they are about equal when part_nan is 250 (~70% null values).
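
To make the two orderings concrete, here is a minimal sketch rather than the benchmark from the issue or the code in pandas itself. The helper names `nunique_dropna_first`, `nunique_unique_first`, `make_series` and `time_both`, and the `null_fraction` parameter, are hypothetical and chosen only for illustration; `null_fraction` is not the same thing as `part_nan` above. The sketch uses only public pandas calls (`Series.dropna`, `Series.unique`, `pd.notna`).

```python
import timeit

import numpy as np
import pandas as pd


def nunique_dropna_first(ser: pd.Series) -> int:
    """Order described as current: drop nulls from the full Series, then count uniques."""
    return len(ser.dropna().unique())


def nunique_unique_first(ser: pd.Series) -> int:
    """Proposed order: compute uniques first, then drop nulls from the (smaller) result."""
    uniques = ser.unique()
    return int(pd.notna(uniques).sum())


def make_series(n: int, null_fraction: float, seed: int = 0) -> pd.Series:
    """Hypothetical test data: float values with roughly `null_fraction` of them set to NaN."""
    rng = np.random.default_rng(seed)
    values = rng.integers(0, 1000, size=n).astype("float64")
    values[rng.random(n) < null_fraction] = np.nan
    return pd.Series(values)


def time_both(ser: pd.Series, number: int = 20) -> dict:
    """Return total seconds for `number` calls of each ordering (results vary by machine)."""
    return {
        "dropna_first": timeit.timeit(lambda: nunique_dropna_first(ser), number=number),
        "unique_first": timeit.timeit(lambda: nunique_unique_first(ser), number=number),
    }


if __name__ == "__main__":
    for frac in (0.1, 0.7, 0.95):
        ser = make_series(100_000, frac)
        # Both orderings must agree with the built-in result.
        assert nunique_dropna_first(ser) == nunique_unique_first(ser) == ser.nunique()
        print(frac, time_both(ser))
```

Filtering nulls after `unique` touches at most one entry per distinct value instead of every row, which is presumably why the reordering helps unless most rows are null. The crossover point depends on dtype, cardinality and hardware, so the timings printed here should not be expected to match the figures above.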