PERF: significant speedups in tz-aware operations #24491
Conversation
Force-pushed from 5a5ed18 to 6ffad7c
Codecov Report
@@ Coverage Diff @@
## master #24491 +/- ##
=======================================
Coverage 92.31% 92.31%
=======================================
Files 166 166
Lines 52412 52412
=======================================
Hits 48382 48382
Misses 4030 4030
Continue to review full report at Codecov.
Codecov Report
@@ Coverage Diff @@
## master #24491 +/- ##
=======================================
Coverage 92.38% 92.38%
=======================================
Files 166 166
Lines 52478 52478
=======================================
Hits 48483 48483
Misses 3995 3995
Continue to review full report at Codecov.
Force-pushed from 6ffad7c to 5996d17
Overall LGTM. Could you also add a benchmark using tzlocal?
There are, I think, 7 places in tslibs where we do this iteration. Can they all be optimized?
Another perf improvement waiting in these files is calling searchsorted via the NumPy C API.
a bit -1 on actually using the numpy C API for searchsorted as it adds complexity; I suspect it won't actually speed things up for us (and it is out of scope for this PR anyhow).
lgtm. ex comments about adding some additional benchmarks including tzlocal.
Definitely agree on the scope, and like I said it is a PITA. But it's also one of the few python-space calls left in these files that we do at each stage of a loop.
yep.
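As a rough illustration of the kind of per-iteration Python-space call being discussed, here is a simplified sketch of doing a check inside the loop versus hoisting it out (plain Python, not the actual tslibs code; `is_local_like` is a hypothetical stand-in for a check such as `is_tzlocal`):

```python
import numpy as np

def is_local_like(tz):
    # Hypothetical stand-in for a Python-space check such as is_tzlocal(tz).
    return getattr(tz, "zone", None) is None

def convert_per_element(values, tz):
    # The Python-level check is re-evaluated on every iteration.
    out = np.empty_like(values)
    for i, v in enumerate(values):
        out[i] = v + 1 if is_local_like(tz) else v  # "+ 1" is a placeholder conversion
    return out

def convert_hoisted(values, tz):
    # The check is done once, before the loop.
    local = is_local_like(tz)
    out = np.empty_like(values)
    for i, v in enumerate(values):
        out[i] = v + 1 if local else v
    return out
```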
@jbrockmendel Not all, but at least one or two more. I just got back into town and have some errands to run, so will have new asv results this evening. I have an (unsubmitted) PR for …
@qwhelan can you update (small comment), ping on green.
@qwhelan can you merge master and update that comment
@mroeschke Looks like the …
But it's still ~200x slower:
@jreback digging into a regression I thought was spurious, will have an update in an hour or two
sure!
Force-pushed from 5996d17 to 360415c
@jreback Good to merge, can't reproduce the spurious regression.
thanks @qwhelan keep em coming!
Operations involving tz-aware data currently incur a pretty substantial penalty:
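A minimal sketch of how such a comparison might be run; the operation, array size, and timezone below are illustrative assumptions rather than the benchmark actually used in this PR:

```python
import pandas as pd
from timeit import timeit

naive = pd.date_range("2018-01-01", periods=1_000_000, freq="s")
aware = pd.date_range("2018-01-01", periods=1_000_000, freq="s", tz="US/Eastern")

# Adding a Timedelta produces a new index, a stand-in for the PR's benchmark operation.
delta = pd.Timedelta(minutes=5)
print("tz-naive:", timeit(lambda: naive + delta, number=20))
print("tz-aware:", timeit(lambda: aware + delta, number=20))
```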
This PR improves the performance of tz-aware operations to near that of tz-naive ones through a couple of approaches:

- Setting `_freq` directly rather than through the setter: `freq.setter` calls a validation check that compares the input value against `to_offset(self.inferred_freq)`, which is exactly what we are passing. `inferred_freq` requires converting the entire array to the appropriate tz, and the time to do so dominates the runtime for our benchmark. Simply eliminating 1 of 2 calls cuts runtime by 50%.
- Speeding up `pandas._libs.tslibs.conversion` functions by batching `searchsorted` calls rather than doing them piecewise (see the sketch after this list). The overall complexity is still `O(n log(k))`; however, call overhead appears to be substantially larger than the actual search time for the `N` of our existing benchmark. The piecewise loop does get to skip `iNaT`s, but this ignores the fact that `searchsorted` has some optimizations for when the `needle` array is also sorted (which allows it to incrementally shrink the search region).
- Avoiding repeated `is_tzlocal()` calls inside per-element loops.
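As a toy illustration of the batching idea above (plain NumPy, not the actual `pandas._libs.tslibs.conversion` code; the array contents are made up):

```python
import numpy as np

# Stand-ins: a small sorted array of "transition times" and a large sorted array
# of i8 timestamp values to locate within it.
trans = np.sort(np.random.randint(0, 10**9, size=50))
values = np.sort(np.random.randint(0, 10**9, size=100_000))

# Piecewise: one Python-level searchsorted call per element.
pos_piecewise = np.array([trans.searchsorted(v, side="right") - 1 for v in values])

# Batched: a single call doing the same O(n log k) work, without per-element call
# overhead; searchsorted can also exploit the needle array being sorted.
pos_batched = trans.searchsorted(values, side="right") - 1

assert (pos_piecewise == pos_batched).all()
```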
Here's the same comparison with the PR:
And `asv` output:

- `git diff upstream/master -u -- "*.py" | flake8 --diff`