PERF: MultiIndex.equals #43589

Merged 3 commits on Sep 16, 2021

jbrockmendel (Member)

from dateutil import tz
import pandas as pd

dates = pd.date_range('2010-01-01', periods=1000, tz=tz.tzutc())
index = pd.MultiIndex.from_product([range(100), dates])
index2 = index.copy()

%timeit index.equals(index2)
330 ms ± 6.44 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)  # <- master
1.79 ms ± 7.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)  # <- PR
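
For anyone reproducing this outside IPython, where `%timeit` is not available, the same measurement can be approximated with the standard-library `timeit` module. This is a rough sketch rather than the benchmark used for the numbers above: the repeat count `number` is an arbitrary choice, and absolute timings will depend on the machine and the installed pandas version.

```python
import timeit

import pandas as pd
from dateutil import tz

# Same setup as the snippet above: 100 x 1000 = 100,000 entries,
# with a timezone-aware datetime level.
dates = pd.date_range("2010-01-01", periods=1000, tz=tz.tzutc())
index = pd.MultiIndex.from_product([range(100), dates])
index2 = index.copy()

# Sanity check: a copy should compare equal.
assert index.equals(index2)

# `number` is an arbitrary choice for this sketch, not taken from the PR.
number = 10
elapsed = timeit.timeit(lambda: index.equals(index2), number=number)
print(f"{elapsed / number * 1e3:.2f} ms per call")
```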

jbrockmendel added the MultiIndex and Performance (memory or execution speed performance) labels on Sep 15, 2021
@@ -3544,8 +3544,13 @@ def equals(self, other: object) -> bool:
    if len(self_values) == 0 and len(other_values) == 0:
        continue

    if not array_equivalent(self_values, other_values):
        return False
    if not isinstance(self_values, np.ndarray):
Contributor

Maybe this check should be done in array_equivalent?

Member Author

At some point, probably; we need to check whether other usages would be affected by the semantics of EA.equals (ExtensionArray.equals).
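
The dispatch discussed in this thread can be sketched roughly as follows. This is an illustration of the idea, not the code merged in the PR: `values_equal` is a hypothetical helper name, and `np.array_equal` stands in for the pandas-internal `array_equivalent`, which, unlike this stand-in, also treats NaNs in matching positions as equal.

```python
import numpy as np
import pandas as pd


def values_equal(left, right) -> bool:
    """Hypothetical helper: compare two level-value arrays."""
    if not isinstance(left, np.ndarray):
        # Not a plain ndarray, so assume an ExtensionArray and use its
        # own `equals` method instead of a generic comparison.
        return bool(left.equals(right))
    # Stand-in for pandas' internal array_equivalent (assumption);
    # np.array_equal does not treat NaN == NaN as equal.
    return bool(np.array_equal(left, right))


# A timezone-aware DatetimeArray is an ExtensionArray, not an ndarray.
tz_values = pd.date_range("2010-01-01", periods=3, tz="UTC").array
int_values = np.array([1, 2, 3])

print(values_equal(tz_values, tz_values.copy()))    # ExtensionArray path
print(values_equal(int_values, int_values.copy()))  # ndarray path
```

Keeping the `isinstance` check at the call site, as in the diff, limits the change to `MultiIndex.equals`; moving it into `array_equivalent` would affect every caller, which is the concern raised in the reply above.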

jreback modified the milestones: 1.4, 1.3.4 on Sep 16, 2021
jreback (Contributor) commented Sep 16, 2021

Probably worth backporting this. Can you add a whatsnew note, @jbrockmendel? Ping on green.

jreback merged commit 0a35903 into pandas-dev:master on Sep 16, 2021
jreback (Contributor) commented Sep 16, 2021

@meeseeksdev backport 1.3.x

lumberbot-app bot commented Sep 16, 2021

Something went wrong ... Please have a look at my logs.

jbrockmendel deleted the perf-iter branch on September 16, 2021 21:07
simonjayhawkins pushed a commit that referenced this pull request Sep 17, 2021
Linked issue (successfully merging this pull request may close it): PERF: MultiIndex equals method got 20x slower