PERF: improve perf. of Categorical.searchsorted #28795
doc/source/whatsnew/v1.0.0.rst
@@ -162,6 +162,7 @@ Performance improvements
- Performance improvement in :meth:`DataFrame.corr` when ``method`` is ``"spearman"`` (:issue:`28139`)
- Performance improvement in :meth:`DataFrame.replace` when provided a list of values to replace (:issue:`28099`)
- Performance improvement in :meth:`DataFrame.select_dtypes` by using vectorization instead of iterating over a loop (:issue:`28317`)
- Performance improvement in :meth:`Categorical.searchsorted` and :meth:`CategoricalIndex.searchsorted` when searching for a single scalar value (:issue:`XXXXX`)
Just reference the PR as the issue
Yeah, fixed.
0f46d60 to 27bd6f7
lgtm, small comment, ping on green.
codes = codes[0] if is_scalar(value) else codes

if is_scalar(value):
lgtm, i would add a comment here that this is perf sensitive
Comments addressed.
thanks @topper-123
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Improves performance of Categorical.searchsorted by avoiding expensive data conversions. Also, CategoricalIndex.searchsorted now calls self.values.searchsorted directly instead of going through algorithms.searchsorted, which always ends up calling self.values.searchsorted anyway. This brings the timing down to 5.5 µs from 12 µs.
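
To make the idea concrete, here is a minimal sketch of a scalar fast path that searches the integer codes rather than the category values. It is not the code in this PR: the helper name `searchsorted_sketch` and the sample data are hypothetical, and the sketch assumes the categories are unique and every searched value is one of them.

```python
import timeit

import numpy as np
import pandas as pd
from pandas.api.types import is_scalar

# Small ordered categorical; its codes are integer positions into categories.
cat = pd.Categorical(
    ["a", "a", "b", "b", "c"], categories=["a", "b", "c"], ordered=True
)


def searchsorted_sketch(cat, value, side="left"):
    """Hypothetical helper: search the integer codes instead of the values."""
    if is_scalar(value):
        # Scalar path: a single category lookup, no intermediate array.
        code = cat.codes.dtype.type(cat.categories.get_loc(value))
    else:
        # List-like path: translate every value to its code first.
        code = np.array(
            [cat.categories.get_loc(v) for v in value], dtype=cat.codes.dtype
        )
    return cat.codes.searchsorted(code, side=side)


scalar_pos = searchsorted_sketch(cat, "b")
array_pos = searchsorted_sketch(cat, ["a", "c"])

# Time the public method for a scalar; absolute numbers depend on the
# machine and the pandas version, so none are quoted here.
elapsed = timeit.timeit(lambda: cat.searchsorted("b"), number=1000)
```

The scalar branch is kept separate because it is the performance-sensitive case: it skips building a list and an array for a single value.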