PERF: tz_convert/tz_convert_single #35087

Merged
3 commits merged into pandas-dev:master from ref-tz_convert-2
Jul 2, 2020

jbrockmendel
Member

Making these follow the same pattern we use elsewhere, we get a perf bump:

In [2]: dti = pd.date_range("2016-01-01", periods=10000, tz="US/Pacific")
In [3]: %timeit dti.tz_localize(None)
91.4 µs ± 3.06 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)  # <-- PR
102 µs ± 6.43 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)  # <-- master

In [4]: ts = pd.Timestamp.now("US/Pacific")
In [5]: %timeit ts.tz_localize(None)
17.2 µs ± 307 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)   # <-- PR
19.6 µs ± 233 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)   # <-- master

Next up is making sure we have full asv coverage for tz_convert/tz_convert_single, analogous to #35075. That can either be separate or added to this PR.
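
A rough sketch of what such asv coverage could look like. asv times the `time_*` methods of plain classes after calling `setup` with each entry in `params`. The class name, the timezone list, and the array size below are illustrative assumptions, not the benchmarks that live in pandas' asv suite, and the sketch goes through the public API rather than calling `tzconversion` directly.

```python
import pandas as pd


class TimeTzConvertSketch:
    # Hypothetical benchmark class: the name and parameters are assumptions
    # made for illustration only.
    params = ["UTC", "US/Pacific", "Asia/Tokyo"]
    param_names = ["tz"]

    def setup(self, tz):
        # Array input (DatetimeIndex) and scalar input (Timestamp).
        self.dti = pd.date_range("2016-01-01", periods=10_000, tz=tz)
        self.ts = pd.Timestamp("2016-01-01", tz=tz)

    def time_array_tz_localize_none(self, tz):
        # Same call as the array timing in the PR description.
        self.dti.tz_localize(None)

    def time_scalar_tz_localize_none(self, tz):
        # Same call as the scalar timing in the PR description.
        self.ts.tz_localize(None)

    def time_array_tz_convert_utc(self, tz):
        # Conversion of an aware index to UTC.
        self.dti.tz_convert("UTC")
```

Parametrizing over `"UTC"` as well as non-UTC zones keeps any UTC fast path and the general path in separate timings.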

@@ -57,9 +57,10 @@ def test_tz_convert_single_matches_tz_convert(tz_aware_fixture, freq):
     ],
 )
 def test_tz_convert_corner(arr):
     result = tzconversion.tz_convert(
         arr, timezones.maybe_get_tz("US/Eastern"), timezones.maybe_get_tz("Asia/Tokyo")
     )
Member

is this test no longer relevant?

Member Author

this PR requires that one of the two args be UTC

Member

hmm, that's what tz_convert_single was doing before? so what's the difference now between tz_convert and tz_convert_single? and is converting between 2 tzs no longer needed?

Member Author

> so what's the difference now between tz_convert and tz_convert_single?

One handles a scalar (i.e. a Timestamp), the other handles an array. I'm trying to de-duplicate these further, but haven't found a way to do so without a perf hit.

> and is converting between 2 tzs no longer needed?

correct.
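
A small sketch of why a UTC-only helper can be enough, using only the public API (the internal `tzconversion` functions are deliberately not called here, since their signatures are not shown in this thread). It relies on the fact that tz-aware values are stored as UTC instants, so changing zones leaves the underlying integers untouched and only the wall-time rendering goes through UTC. The variable names are illustrative.

```python
import pandas as pd

dti = pd.date_range("2016-01-01", periods=3, tz="US/Eastern")

# Zone-to-zone conversion, directly and via an explicit UTC hop.
direct = dti.tz_convert("Asia/Tokyo")
via_utc = dti.tz_convert("UTC").tz_convert("Asia/Tokyo")

# The stored UTC instants are identical before and after conversion.
same_instants = bool((dti.asi8 == direct.asi8).all())
same_result = direct.equals(via_utc)

# The scalar (Timestamp) path and the array (DatetimeIndex) path should agree
# on the wall times they produce.
scalar_wall = [ts.tz_localize(None) for ts in dti]
array_wall = dti.tz_localize(None)
paths_agree = list(array_wall) == scalar_wall

print(same_instants, same_result, paths_agree)
```

The last comparison mirrors the intent of `test_tz_convert_single_matches_tz_convert` from the diff context, only at the public-API level.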

jreback added the Timezones (Timezone data dtype) label on Jul 2, 2020
jreback added this to the 1.1 milestone on Jul 2, 2020
jreback added the Performance (Memory or execution speed performance) label on Jul 2, 2020
Contributor

jreback left a comment

looks fine, @simonjayhawkins comment

jreback merged commit 7412c5a into pandas-dev:master on Jul 2, 2020
jbrockmendel deleted the ref-tz_convert-2 branch on July 2, 2020 17:17