-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: optimize algos.take for repeated calls #39692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 23 commits
Commits
Show all changes
24 commits
Select commit
Hold shift + click to select a range
4512f9c
PERF: optimize algos.take for repeated calls
jorisvandenbossche 36c3ed2
fix nd check + fix cache differentiation of int / bool
jorisvandenbossche 6d52932
fix non-scalar fill_value case
jorisvandenbossche ded773a
fix mypy
jorisvandenbossche 2ee2543
try fix mypy
jorisvandenbossche f489ba5
fix annotation
jorisvandenbossche 96305c5
improve docstrings
jorisvandenbossche 480d2b4
Merge remote-tracking branch 'upstream/master' into am-perf-take
jorisvandenbossche c70ac4d
faster EA check
jorisvandenbossche 9fba887
Merge remote-tracking branch 'upstream/master' into am-perf-take
jorisvandenbossche 5273cd5
rename take_1d_array to take_1d
jorisvandenbossche d3dd4e4
add comment about being useful for array manager
jorisvandenbossche 288c6f2
Merge remote-tracking branch 'upstream/master' into am-perf-take
jorisvandenbossche 06a3901
Merge remote-tracking branch 'upstream/master' into am-perf-take
jorisvandenbossche ca30487
use take_nd for now
jorisvandenbossche bf598a7
Merge remote-tracking branch 'upstream/master' into am-perf-take
jorisvandenbossche 05b6b87
move caching of maybe_promote to cast.py
jorisvandenbossche 2284813
move type comment
jorisvandenbossche 4861fdb
Merge remote-tracking branch 'upstream/master' into am-perf-take
jorisvandenbossche a41ee6b
typo
jorisvandenbossche b52e1ec
Merge remote-tracking branch 'upstream/master' into am-perf-take
jorisvandenbossche 76371cf
ensure deprecation warning is always raised
jorisvandenbossche 2faf70b
single underscore
jorisvandenbossche 1c19732
Merge remote-tracking branch 'upstream/master' into am-perf-take
jorisvandenbossche File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit unfortunate, but to ensure the warning is always shown, we can't use the cached version for datetime data.
I check what would be the fastest option. The most specific check would be
if isinstance(fill_value, date) and not isinstance(fill_value, datetime)
, butif dtype.kind == "M"
is a bit faster.So the trade-off was between faster for all non-M8 dtypes vs faster for M8 (by being able to use the cached version in most cases) but a bit slower for all other dtypes. So I went with the first (fastest for numeric dtypes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how big the is the perf tradeoff?
since stacklevels are a constant hassle, one option would be to take the find_stacklevel function and change it so that instead of hard-coding "astype" it just looks for the first call that isn't from inside (non-test) pandas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not the stacklevel as such, it's the warning itself. With caching, it occurs only once, while otherwise this warning is raised every time you use it.
The other option would be to check for this case / raise the warning a level higher up (so eg the line we are commenting up), so that other cases still use the cached version.