PERF: concatenation of MultiIndexed objects (MultiIndex.append) #53697
Thanks for picking this up! And it seems that all tests pass, so that's nice ;)
Do you think it would be worth testing this with more MIs? (not necessarily in the asvs, can also be just here in the PR) Or are we confident this should be faster (or at least not a big slowdown) for most cases?
pandas/core/indexes/multi.py
Outdated
    mi.codes[i],
    mi.levels[i],
    level_values,
    copy=False,
Suggested change:
-    mi.codes[i],
-    mi.levels[i],
-    level_values,
-    copy=False,
+    mi.codes[i], mi.levels[i], level_values, copy=False
(styling nitpick, this can fit on one line I think)
updated, thanks
Did a quick test with a case of completely unique values in the MI levels (so no repetition even within a level):
and even here the new algo is faster (just a smaller difference, "only" around 2x faster).
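The snippet used for that quick test did not survive in this thread. A rough sketch of what such a check could look like is below; the size `n` and the way the two indexes are built are assumptions for illustration, not the reviewer's actual code, and the timing it prints will depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

n = 10_000  # hypothetical size, chosen only for illustration

# Every level holds n distinct values, so nothing repeats within a level
# and the two indexes share no values at all.
mi1 = pd.MultiIndex.from_arrays([np.arange(n), np.arange(n, 2 * n)])
mi2 = pd.MultiIndex.from_arrays(
    [np.arange(2 * n, 3 * n), np.arange(3 * n, 4 * n)]
)

result = mi1.append(mi2)

# Sanity checks: length and order are preserved by the append.
assert len(result) == 2 * n
assert result[0] == (0, n)
assert result[-1] == (3 * n - 1, 4 * n - 1)

# Average wall-clock time per call over a few repetitions.
repeats = 20
elapsed = timeit.timeit(lambda: mi1.append(mi2), number=repeats) / repeats
print(f"MultiIndex.append, unique levels: {elapsed * 1e3:.2f} ms per call")
```

Running the same script on the main branch and on the PR branch gives the two numbers to compare.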
Very nice, thanks @lukemanley
…as-dev#53697) * improve perf of MultiIndex.append * fix test * style
@lukemanley could you add a test like in the linked dask issue?
doc/source/whatsnew/v2.1.0.rst file if fixing a bug or adding a new feature.
cc: @jorisvandenbossche - nice find/suggestion.
Perf improvement for MultiIndex.append (and pd.concat for objects with MultiIndexes):
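The benchmark output that followed here is not preserved. As a hedged illustration of the two code paths being discussed, the sketch below times `MultiIndex.append` directly and `pd.concat` on frames carrying MultiIndexes; the helper `make_frame`, the level names, and the sizes are hypothetical and are not taken from the PR's asv benchmarks.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(start, n_outer=100, n_inner=100):
    """Build a frame indexed by an (outer, inner) MultiIndex (hypothetical shape)."""
    mi = pd.MultiIndex.from_product(
        [np.arange(start, start + n_outer), np.arange(n_inner)],
        names=["outer", "inner"],
    )
    return pd.DataFrame({"value": np.arange(len(mi))}, index=mi)


# Five frames whose outer-level values do not overlap.
frames = [make_frame(start) for start in range(0, 500, 100)]
indexes = [frame.index for frame in frames]

# Direct call on the index: append accepts a list of indexes.
appended = indexes[0].append(indexes[1:])

# Row-wise concat has to combine the same indexes.
combined = pd.concat(frames)

# Both routes should produce the same index values in the same order.
assert appended.equals(combined.index)
assert len(combined) == 5 * 100 * 100

repeats = 10
t_append = timeit.timeit(lambda: indexes[0].append(indexes[1:]), number=repeats)
t_concat = timeit.timeit(lambda: pd.concat(frames), number=repeats)
print(f"MultiIndex.append: {t_append / repeats * 1e3:.2f} ms per call")
print(f"pd.concat:         {t_concat / repeats * 1e3:.2f} ms per call")
```

The equality assertion is there so that a timing comparison between branches is only made on results that agree.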