Skip to content

standardize post-call treatment of get_dst_info, delay sorting calls #21960

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 20, 2018

Conversation

jbrockmendel
Copy link
Member

@jreback we discussed how there are a bunch of functions that do really similar things with get_dst_info but have their slight idiosyncrasies. This standardizes them, and is really verbose so as to delay certain calls until absolutely necessary.

After this we can see about de-duplicating the 6ish occurrences of really similar code by passing a function pointer or something.

@gfyoung gfyoung added Refactor Internal refactoring of code Internals Related to non-user accessible pandas implementation Datetime Datetime data dtype labels Jul 18, 2018
@codecov
Copy link

codecov bot commented Jul 18, 2018

Codecov Report

Merging #21960 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #21960   +/-   ##
=======================================
  Coverage   91.96%   91.96%           
=======================================
  Files         166      166           
  Lines       50329    50329           
=======================================
  Hits        46287    46287           
  Misses       4042     4042
Flag Coverage Δ
#multiple 90.36% <ø> (ø) ⬆️
#single 42.23% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3cb64bd...76e0c08. Read the comment docs.

# functions would then hit an IndexError because they assume
# `deltas` is non-empty.
# (under the just-deleted code that returned empty arrays)
raise ValueError(tz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe an AssertionError for paths that are not hit (or can just use an assert)

@jreback
Copy link
Contributor

jreback commented Jul 18, 2018

looks fine. is there any perf diff?

@jbrockmendel
Copy link
Member Author

is there any perf diff?

It should be a very small improvement (I'll run asv after the current batch is done, but we know it'll be swamped by noise). The main difference is that calls to trans.searchsorted are avoided in a few cases where we know len(deltas) == 1

@jbrockmendel
Copy link
Member Author

Noise, but at least good noise I guess.

asv continuous -f 1.1 -E virtualenv master HEAD -b timeseries
[...]
       before           after         ratio
     [3cb64bd2]       [76e0c080]
-     3.92±0.03μs      3.55±0.01μs     0.91  timeseries.DatetimeIndex.time_get('dst')
-       136±0.1ms       120±0.06ms     0.89  timeseries.DatetimeIndex.time_to_pydatetime('tz_aware')
-     29.8±0.07ms       26.0±0.1ms     0.87  timeseries.Iteration.time_iter_preexit(<function period_range at 0x7f36ac823de8>)
-         935±1μs          731±2μs     0.78  timeseries.DatetimeIndex.time_to_pydatetime('dst')
taskset 6 asv continuous -f 1.1 -E virtualenv master HEAD -b timeseries
[...]
       before           after         ratio
     [3cb64bd2]       [76e0c080]
-       294±0.2ms          263±1ms     0.89  timeseries.ToDatetimeISO8601.time_iso8601_tz_spaceformat
-     7.38±0.03μs      6.50±0.04μs     0.88  timeseries.DatetimeIndex.time_get('tz_aware')
-      10.6±0.1ms         7.75±0ms     0.73  timeseries.ToDatetimeISO8601.time_iso8601

@jreback jreback added this to the 0.24.0 milestone Jul 20, 2018
@jreback jreback merged commit 652a3de into pandas-dev:master Jul 20, 2018
@jreback
Copy link
Contributor

jreback commented Jul 20, 2018

thanks @jbrockmendel happy to have cleanups

@jreback
Copy link
Contributor

jreback commented Jul 24, 2018

on build getting

warning: pandas/_libs/tslibs/conversion.pyx:614:39: Buffer unpacking not optimized away.
warning: pandas/_libs/tslibs/conversion.pyx:614:39: Buffer unpacking not optimized away.

if you have chance

@jbrockmendel jbrockmendel deleted the tzverbose branch July 25, 2018 00:42
@jbrockmendel
Copy link
Member Author

warning: pandas/_libs/tslibs/conversion.pyx:614:39: Buffer unpacking not optimized away

I'm pretty sure this has to do with the return type being int64_t[:], will see if there's anything that can be done about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Internals Related to non-user accessible pandas implementation Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants