PERF: Indexing a multi-index is a lot slower #31648
Comments
Mmm, thanks for bisecting things. I'm a bit surprised to see that the cc @topper-123.
Changing
diff --git a/pandas/core/indexes/multi.py b/pandas/core/indexes/multi.py
index c560d81ba9..95921bd90e 100644
--- a/pandas/core/indexes/multi.py
+++ b/pandas/core/indexes/multi.py
@@ -677,7 +677,7 @@ class MultiIndex(Index):
     # --------------------------------------------------------------------
     # Levels Methods
 
-    @property
+    @cache_readonly
     def levels(self):
         result = [
             x._shallow_copy(name=name) for x, name in zip(self._levels, self._names)
With that, I see the timings
These are about 1.5x the 0.25 timings on my machine. So slower, but by a constant factor I think. I'm not sure, but I suspect that the IndexEngine was previously cached, and
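To make the suspicion above concrete, here is a minimal sketch in plain Python, not pandas code. `Level`, `UncachedLevels`, and `CachedLevels` are hypothetical toy classes, and `functools.cached_property` stands in for pandas' `cache_readonly` (an assumption made only for illustration). It shows how a property that returns a fresh copy on every access throws away any lookup engine cached on the previous copy, so each lookup pays the O(N) build again.

```python
from functools import cached_property


class Level:
    """Toy stand-in for an index level with a lazily built lookup engine."""

    engines_built = 0  # class-wide counter, for demonstration only

    def __init__(self, values):
        self.values = list(values)

    @cached_property
    def engine(self):
        # O(N) construction; the result is cached on this Level object only.
        Level.engines_built += 1
        return {v: i for i, v in enumerate(self.values)}

    def get_loc(self, key):
        return self.engine[key]


class UncachedLevels:
    """Returns a fresh copy on every access, like a plain @property."""

    def __init__(self, values):
        self._level = Level(values)

    @property
    def level(self):
        # The new copy starts with an empty engine cache.
        return Level(self._level.values)


class CachedLevels:
    """Copies once and reuses the copy, like a cached read-only property."""

    def __init__(self, values):
        self._level = Level(values)

    @cached_property
    def level(self):
        return Level(self._level.values)


def count_engine_builds(holder, lookups=100):
    """Count how many times the O(N) engine is rebuilt over repeated lookups."""
    Level.engines_built = 0
    for _ in range(lookups):
        holder.level.get_loc(5)
    return Level.engines_built
```

With these toy classes, `count_engine_builds(UncachedLevels(range(1000)))` rebuilds the engine once per lookup, while `count_engine_builds(CachedLevels(range(1000)))` builds it a single time. Whether pandas behaves exactly this way internally is not established by this sketch.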
Thanks for the report @valtron.
If _shallow_engine doesn't move the reference to the indexing engine over to the new index, I'd think that should be considered a performance bug. I.e. we should probably have that @TomAugspurger, wouldn't you agree that
Yep, that's #28584. Caching those makes sense to me.
IIRC there was an issue a few weeks ago with MultiIndex names propagating to levels. If we cache
That sounds like a good idea.
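One possible reading of the caching concern above is that a cached `levels` value could go stale when the names change. The sketch below illustrates that general pitfall and one way to handle it. It is not pandas code: `ToyMultiIndex` and its `set_names` method are hypothetical, and `functools.cached_property` is again only a stand-in for `cache_readonly`.

```python
from functools import cached_property


class ToyMultiIndex:
    """Toy object whose cached `levels` depends on mutable names."""

    def __init__(self, levels, names):
        self._levels = [list(lv) for lv in levels]
        self._names = list(names)

    @cached_property
    def levels(self):
        # Pair each level's values with its current name; computed once.
        return [(name, tuple(lv)) for lv, name in zip(self._levels, self._names)]

    def set_names(self, names):
        self._names = list(names)
        # Drop the cached value so the next access rebuilds it with the new names.
        self.__dict__.pop("levels", None)


mi = ToyMultiIndex([[1, 2], ["a", "b"]], ["x", "y"])
first = mi.levels
mi.set_names(["p", "q"])
second = mi.levels
```

Here `first` carries the names `x` and `y`, and `second` carries `p` and `q`. Without the `pop` in `set_names`, the second access would return the cached list with the old names.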
See #31651.
Right, I had forgotten about that issue. Shouldn't be too difficult to fix. EDIT: Want to try your hand at this one, @valtron? It will fix the underlying issue you're reporting, and I'm sure it will have a large impact on performance in general for pandas.
* PERF: Cache MultiIndex.levels Closes #31648 * fixup tests
Indexing a multi-index seemingly went from O(1) to O(N):
I did a bisect, and found this was caused by the _shallow_copy here: b0f33b3#diff-4ffd1c69d47e0ac9f2de4f9e3e4a118cR643.
Code Sample