Skip to content

move src/datetime.pxd funcs to np_datetime and fix misleading names #18045

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 31, 2017

Conversation

jbrockmendel
Copy link
Member

The name _pydatetime_to_dts is misleading, since dts is the variable names used for pandas_datetimestruct objects, while the func actually returns an int64. This renames it to _pydatetime_to_dt64.

The analogous function for datetime.date has the non-misleading name _date_to_datetime64. For consistency this is renamed to _pydate_to_dt64.

Both are moved to np_datetime, and have their inputs typed as datetime and date instead of object.

@codecov
Copy link

codecov bot commented Oct 31, 2017

Codecov Report

Merging #18045 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18045      +/-   ##
==========================================
- Coverage   91.24%   91.23%   -0.02%     
==========================================
  Files         163      163              
  Lines       50114    50114              
==========================================
- Hits        45729    45720       -9     
- Misses       4385     4394       +9
Flag Coverage Δ
#multiple 89.04% <ø> (ø) ⬆️
#single 40.24% <ø> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c65a0f5...075574d. Read the comment docs.

@codecov
Copy link

codecov bot commented Oct 31, 2017

Codecov Report

Merging #18045 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18045      +/-   ##
==========================================
- Coverage   91.24%   91.23%   -0.02%     
==========================================
  Files         163      163              
  Lines       50114    50114              
==========================================
- Hits        45729    45720       -9     
- Misses       4385     4394       +9
Flag Coverage Δ
#multiple 89.04% <ø> (ø) ⬆️
#single 40.24% <ø> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c65a0f5...ccf793e. Read the comment docs.

@@ -14,3 +16,6 @@ cdef check_dts_bounds(pandas_datetimestruct *dts)

cdef int64_t dtstruct_to_dt64(pandas_datetimestruct* dts) nogil
cdef void dt64_to_dtstruct(int64_t dt64, pandas_datetimestruct* out) nogil

cdef int64_t _pydatetime_to_dt64(datetime val, pandas_datetimestruct *dts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would de-privatize

@jreback jreback added Clean Datetime Datetime data dtype labels Oct 31, 2017
@jbrockmendel
Copy link
Member Author

taskset 3 asv continuous -E virtualenv -f 1.1 master HEAD -b timeseries
[...]
       before           after         ratio
     [4578a039]       [075574d7]
+           1.61s            1.95s     1.22  timeseries.AsOfDataFrame.time_asof_nan
+        124±20ms            145ms     1.17  timeseries.DatetimeIndex.time_dti_tz_factorize
-      21.6±0.2μs       19.3±0.1μs     0.89  timeseries.Offsets.time_custom_bday_incr
-      27.1±0.2μs       23.9±0.1μs     0.88  timeseries.Offsets.time_custom_bday_cal_incr
-           1.89s            1.66s     0.88  timeseries.AsOfDataFrame.time_asof
-       149±0.2ms        130±0.1ms     0.88  timeseries.ToDatetime.time_iso8601_tz_spaceformat
-      17.6±0.1μs       14.7±0.1μs     0.83  timeseries.Offsets.time_custom_bday_apply
-        14.4±1ms      11.8±0.01ms     0.82  timeseries.DatetimeIndex.time_infer_freq_none
-        21.0±2ms         11.8±1ms     0.56  timeseries.TimeSeries.time_add_irregular

Any idea why time_asof_nan and time_dti_tz_factorize might slow down? There should be small bumps from specifying types for pydate_to_dt64 and pydatetime_to_dt64, and from cimporting tz_convert_single in offsets. The only drag I can think of is maybe having funcs in a .pyx instead of .pxd affects how they are inlined?

@jreback jreback added this to the 0.22.0 milestone Oct 31, 2017
@jreback jreback merged commit d80a4b8 into pandas-dev:master Oct 31, 2017
@jreback
Copy link
Contributor

jreback commented Oct 31, 2017

I benchmarked this and it showed very little changes (as expected). something is still going on with your asv. you should always use affinity.

@jbrockmendel jbrockmendel deleted the tslibs-conversion8 branch October 31, 2017 19:57
@jbrockmendel
Copy link
Member Author

Thanks for double-checking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Datetime Datetime data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants