Place the calculation of mask prior to the calls of comp in replace_list to improve performance #35229
I've included the change in the whatsnew document for the 1.1 milestone. Do you think we can include it in that release if this PR is approved in time? It would be nice, since this change is related to #32890, which is also in 1.1.
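For readers following along, the idea in the title (compute the mask once, before the repeated calls to `comp`) can be sketched roughly as below. This is a simplified illustration, not the code in `replace_list`: the helper names `is_missing`, `compare`, `masks_recomputed` and `masks_hoisted` are hypothetical and assume a NumPy object array of strings with `None`/NaN as missing values.

```python
import numpy as np


def is_missing(values):
    # Hypothetical stand-in for a missing-value check on an object array.
    return np.array(
        [v is None or (isinstance(v, float) and v != v) for v in values],
        dtype=bool,
    )


def compare(values, target, valid):
    # Compare only the positions flagged as valid; missing ones stay False.
    result = np.zeros(len(values), dtype=bool)
    result[valid] = values[valid] == target
    return result


def masks_recomputed(values, targets):
    # The validity mask is rebuilt for every target (repeated work).
    return [compare(values, t, ~is_missing(values)) for t in targets]


def masks_hoisted(values, targets):
    # The validity mask is built once, before the per-target comparisons.
    valid = ~is_missing(values)
    return [compare(values, t, valid) for t in targets]


values = np.array(["a", None, "b", "a", float("nan")], dtype=object)
hoisted = masks_hoisted(values, ["a", "b"])
recomputed = masks_recomputed(values, ["a", "b"])
```

Both variants produce the same masks; the hoisted one simply avoids repeating the missing-value scan once per replacement key, which matters when `to_replace` is a dict with many entries.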
Can you show the asv results?
…prove-replace_list-performance
…prove-replace_list-performance
Minor comment; please ping when green.
doc/source/whatsnew/v1.1.0.rst
@@ -856,6 +856,7 @@ Performance improvements
 - Performance improvement in arithmetic operations (sub, add, mul, div) for MultiIndex (:issue:`34297`)
 - Performance improvement in `DataFrame[bool_indexer]` when `bool_indexer` is a list (:issue:`33924`)
 - Significant performance improvement of :meth:`io.formats.style.Styler.render` with styles added with various ways such as :meth:`io.formats.style.Styler.apply`, :meth:`io.formats.style.Styler.applymap` or :meth:`io.formats.style.Styler.bar` (:issue:`19917`)
+- Performance improvement in the :func:`Series.replace` when `to_replace` is a dict (:issue:`33920`)
Does this improve performance relative to 1.0.5, or just master? We only need this if it's faster relative to 1.0.5.
Locally, I have this at ~1.5x slower than 1.0.5 on the example from #33920. So it seems we could still use some work to get back to 1.0.x speeds, as long as we don't sacrifice correctness.
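A comparison like this can be reproduced by running the same small timing script under each pandas version. The sketch below is only an assumed workload, not the example from #33920: the function name `time_dict_replace`, the data sizes, and the mapping are all hypothetical.

```python
import timeit

import numpy as np
import pandas as pd


def time_dict_replace(n_rows=10_000, n_keys=50, repeat=3):
    # Hypothetical workload: a Series of short strings and a dict
    # that maps every distinct value to a replacement.
    rng = np.random.default_rng(0)
    ser = pd.Series(rng.integers(0, n_keys, size=n_rows).astype(str))
    mapping = {str(i): f"v{i}" for i in range(n_keys)}
    # Best-of-`repeat` wall time, in seconds, for one replace call.
    return min(
        timeit.repeat(lambda: ser.replace(mapping), number=1, repeat=repeat)
    )


elapsed = time_dict_replace(n_rows=1_000, n_keys=5, repeat=1)
print(pd.__version__, elapsed)
```

Running it once per environment (for example 1.0.5, master, and this branch) gives numbers that can be compared directly, since the seed and sizes are fixed.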
Yes indeed, those improvements are only relative to master, so maybe it doesn't make sense to list this change in the whatsnew document at all. I'll have it removed. However, this should still be merged into 1.1.0 (if we manage it in time), right? And then we keep improving on it?
Thanks @chrispe92, happy to take additional perf improvements in strings or anywhere :->
…ist to improve performance (pandas-dev#35229)
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff