-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: specialized 1D take version #40246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PERF: specialized 1D take version #40246
Conversation
can you factor even more, this ultimately should be used in the other functions. |
I don't think more is easily possible. In the existing take_nd, there are several places with additional checks related to 2D, and I don't want to add that back / share that in a common helper, because it's exactly the purpose of this specialized 1D version to not have to care about (and have overhead due) 2D handling. |
It is used in other functions, eg in unstack (added here, and now the other PR is merged, can also use it in reindex) |
pandas/core/array_algos/take.py
Outdated
indexer: np.ndarray, | ||
fill_value=None, | ||
allow_fill: bool = True, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> ArrayLike
?
|
Remaining failure also happens on master. |
@jorisvandenbossche pls resist the temptation to self-merge. i'll sleep better. (no complaints about the PRs you've merged recently, just as you've commented elsewhere about premature merging, better to have good habits and stay on the safe side) can this new function be used anywhere else, e.g. in Series/Index? |
To be clear, I only do that for PRs that are trivial and/or have been reviewed and are approved or have no active discussion (and thus not for eg PRs that impact the API). Of course that's always a bit subjective, so I have no problem in being called out for in case you think I made a wrong judgement call. And we are using git, so it's always easy to revert something if needed.
Maybe? That would need to be investigated for each individual case whether the constraints are met (eg the input already validated etc). And I am currently not planning to do such investigation (in general, unless it's within a performance bottleneck, I would personally mostly limit the use of this function to the internals. |
All the case so far i think have been benign.
In practice I doubt it will make a difference, but I will be moderately more comfortable if you ping me/jeff. On the other hand if we're not reasonably responsive (timezones notwithstanding), do let me know. Really the failure mode that I'm most worried about is something slipping through the cracks of my inbox. |
A part that I took out of #39692, doing it separately here.
This gives a bit of code duplication (but all large parts of the function were already factored out into shared helper functions before, like
_take_preprocess_indexer_and_fill_value
), but having a 1D version gives less overhead compared totake_nd
.Using the same benchmark case (unstack ASV case) as in #39692, this gives:
It's a small difference (around 10% improvement), but I repeated the timings above multiple times it was quite consistently a bit faster.