-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: groupby iteration #51109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PERF: groupby iteration #51109
Conversation
Without knowing the context of the regression I'm -.05 to doing this - I think it gets a little wonky over time to have overridden constructors like this scattered through the codebase |
Technically, this certainly looks fine (in theory, we might want to also checks But for the timings, can you do those again with actual |
Good point, will update. Also can improve the Series case a little bit by pinning _name instead of passing |
I don't think that's needed, the |
#51784 improved the performance on main on the order of 25%. Updated timings:
Makes this much less compelling. |
No traction here, closing. |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.Lookups on ._constructor and finalize calls add up when iterating over many groups. This does a pre-iteration check to see if we can skip those lookups/calls.
Using the example from #46505
On main this takes 31.655s of which 21.520 are in
_chop
and 8.357 are in__finalize__
. In this PR this takes 20.328s of which 11.237 are in_chop_fast
. (still slower than 1.3.5 which was 6.553sThere's a reasonable argument to be made that having separate paths for subclasses is not ideal, which I'll leave to @jorisvandenbossche to make if he's interested.