Skip to content

modernize syntax in Timestamp #18327

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 19, 2017

Conversation

jbrockmendel
Copy link
Member

Small cleanups to Timestamp+_Timestamp, use @property instead of older syntax.

The only substantive change is in __add__, where result = Timestamp(normalize_date(result)) is changed to result = result.normalize(). The latter has the benefits

  1. is robust for tzaware timestamps,

  2. does not require normalize_date in the namespace, so after this Timestamp+_Timestamp can be cut/paste into a new tslibs.timestamps and

  3. is robust to subclassing of Timestamp.

@gfyoung gfyoung added Internals Related to non-user accessible pandas implementation Code Style Code style, linting, code_checks labels Nov 16, 2017
@@ -92,6 +92,7 @@ from tslibs.nattype import NaT, nat_strings, iNaT
from tslibs.nattype cimport _checknull_with_nat, NPY_NAT


# TODO: Can this become a classmethod?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, see how its used, it is for compat on creating datetime, date, Timestamp

@codecov
Copy link

codecov bot commented Nov 16, 2017

Codecov Report

Merging #18327 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18327      +/-   ##
==========================================
- Coverage   91.38%   91.36%   -0.02%     
==========================================
  Files         164      164              
  Lines       49780    49790      +10     
==========================================
+ Hits        45491    45492       +1     
- Misses       4289     4298       +9
Flag Coverage Δ
#multiple 89.17% <ø> (ø) ⬆️
#single 39.49% <ø> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/core/generic.py 95.73% <0%> (ø) ⬆️
pandas/core/series.py 95.02% <0%> (+0.01%) ⬆️
pandas/core/sparse/series.py 95.33% <0%> (+0.05%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dd38eea...fcaf3c3. Read the comment docs.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this change any perf?

@@ -92,6 +92,7 @@ from tslibs.nattype import NaT, nat_strings, iNaT
from tslibs.nattype cimport _checknull_with_nat, NPY_NAT


# TODO: Can this become a classmethod?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, see how its used, it is for compat on creating datetime, date, Timestamp

return '%s %s' % (self._date_repr, self._time_repr)
@property
def _repr_base(self):
return '%s %s' % (self._date_repr, self._time_repr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change what you can to use format syntax

@jbrockmendel
Copy link
Member Author

no, see how its used, it is for compat on creating datetime, date, Timestamp

I'll remove the comment, but I don't see why the compat usage affects the answer. There are a handful of classmethods that in principle should use cls instead of a hard-coded Timestamp. If create_timestamp_from_ts were a (cdef'd, not python-visible) classmethod, then those could be fixed/improved.

@jbrockmendel
Copy link
Member Author

does this change any perf?

I didn't benchmark it. The @property should be treated by cython identically to the older syntax. __repr__ might be slightly improved because the self.freq is None case should be more common than not, so will short-circuit that evaluation. But that will be miniscule.

Using result.normalize() instead of normalize_date(result) is a toss-up.

@jreback
Copy link
Contributor

jreback commented Nov 19, 2017

actually pls give this a quick benchmark, I recall this syntax as somewhat preferable, though that maybe outdated.

@jbrockmendel
Copy link
Member Author

Looks like noise:

asv continuous -f 1.1 -E virtualenv master HEAD -b timestamp -b timeseries
[...]
       before           after         ratio
     [6405919e]       [fcaf3c33]
+     5.75±0.01μs      7.41±0.01μs     1.29  timestamp.TimestampProperties.time_days_in_month
+     5.89±0.02μs      7.30±0.01μs     1.24  timestamp.TimestampProperties.time_is_month_start
+         443±2ms         544±30ms     1.23  timeseries.ToDatetime.time_format_no_exact
+     18.5±0.09μs       21.3±0.1μs     1.15  timeseries.Offsets.time_custom_bday_incr
+      17.5±0.2μs         19.7±1μs     1.12  timeseries.Offsets.time_timeseries_day_apply
+      392±0.06ms       437±0.03ms     1.12  timeseries.DatetimeIndex.time_add_offset_slow
-         116±3ns        105±0.2ns     0.90  timestamp.TimestampProperties.time_microsecond
-         447±2ms        401±0.8ms     0.90  timeseries.SemiMonthOffset.time_end_incr_rng
-        22.2±1μs       19.7±0.1μs     0.89  timeseries.Offsets.time_timeseries_day_incr
-        14.4±1μs      12.7±0.03μs     0.88  timeseries.SemiMonthOffset.time_begin_apply
-      6.39±0.3μs      5.62±0.01μs     0.88  timestamp.TimestampProperties.time_quarter
-      24.9±0.6μs       21.5±0.1μs     0.86  timeseries.SemiMonthOffset.time_begin_decr_n
-        116±20ms         33.9±3ms     0.29  timeseries.DatetimeIndex.time_dti_factorize
-        444±10ms         102±20ms     0.23  binary_ops.TimeseriesTZ.time_timestamp_ops_diff2
-        129±10ms         29.0±1ms     0.23  timeseries.DatetimeIndex.time_dti_tz_factorize
-           2.00s          406±4ms     0.20  timeseries.AsOfDataFrame.time_asof
-       929±100ms         169±30ms     0.18  binary_ops.Timeseries.time_timestamp_ops_diff2
-         388±7ms         69.1±2ms     0.18  binary_ops.TimeseriesTZ.time_timestamp_ops_diff1

and with affinity

taskset 4 asv continuous -f 1.1 -E virtualenv master HEAD -b timestamp -b timeseries
[...]
       before           after         ratio
     [6405919e]       [fcaf3c33]
+     5.69±0.02μs      6.68±0.01μs     1.17  timestamp.TimestampProperties.time_is_quarter_end
+     17.5±0.08μs       20.1±0.2μs     1.15  timeseries.Offsets.time_timeseries_day_apply
+     29.8±0.09μs       34.1±0.2μs     1.15  timeseries.Offsets.time_custom_bday_decr
+     5.89±0.02μs      6.50±0.02μs     1.10  timestamp.TimestampProperties.time_is_year_start
-     20.7±0.07μs      18.8±0.05μs     0.91  timeseries.SemiMonthOffset.time_begin_decr
-      27.9±0.1μs       24.3±0.1μs     0.87  timeseries.Offsets.time_custom_bday_cal_incr
-       53.4±10ms         26.6±2ms     0.50  timeseries.DatetimeIndex.time_dti_factorize

@jreback jreback added this to the 0.22.0 milestone Nov 19, 2017
@jreback
Copy link
Contributor

jreback commented Nov 19, 2017

your asv runs are still very wild.

@jreback jreback merged commit b315e7d into pandas-dev:master Nov 19, 2017
@jreback
Copy link
Contributor

jreback commented Nov 19, 2017

thanks

@jbrockmendel jbrockmendel deleted the tslibs-timestamps2 branch November 19, 2017 16:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks Internals Related to non-user accessible pandas implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants