Skip to content

BUG: GH29461 Strftime #34668

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 47 commits into from
Closed

BUG: GH29461 Strftime #34668

wants to merge 47 commits into from

Conversation

matteosantama
Copy link
Contributor

Closed the other PR while I sorted some things out.

matteosantama and others added 30 commits May 19, 2020 23:24
result += f'.{self.microsecond:06d}'

return result
fmt = '%H:%M:%S'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this change any of the benchmarks?

Copy link
Contributor Author

@matteosantama matteosantama Jun 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we suffer a penalty for the additional super() function call in strftime(), and then there's an additional performance impact when we actually need to process nanoseconds.

       before           after         ratio
     [6efb0b20]       [0c4ef36f]
     <master>         <strftime>
+      3.61±0.1μs       4.40±0.2μs     1.22  tslibs.timestamp.TimestampStrftimeMethod.time_strftime('%Y-%m-%d %H:%M:%S.%f')
+      3.29±0.1μs      3.68±0.05μs     1.12  tslibs.timestamp.TimestampStrftimeMethod.time_strftime('%Y-%m-%d %H:%M:%S')
+      8.91±0.2μs       9.90±0.4μs     1.11  tslibs.timestamp.TimestampProperties.time_month_name(None, None)
-        409±20ns         308±10ns     0.75  tslibs.timestamp.TimestampProperties.time_days_in_month(<UTC>, 'B')

"""
if '%f' in format and self.nanosecond:
replacement = f'{self.microsecond * 1000 + self.nanosecond:09d}'
format = re.sub('%f', replacement, format)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use string replacement rather than re? doesn't this change perf?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Tested both and string replace is WAY more performant.

@jreback jreback added Output-Formatting __repr__ of pandas objects, to_string Datetime Datetime data dtype labels Jun 9, 2020
@TomAugspurger
Copy link
Contributor

@matteosantama can you merge master and fix the conflicts?

@jreback
Copy link
Contributor

jreback commented Jul 17, 2020

can u merge master, and show benchmarks again; ping on green

@matteosantama
Copy link
Contributor Author

@jreback All checks passing. ASV output is a little confusing. Not entirely sure why so many tests were affected.

       before           after         ratio
     [bfac1362]       [b0de2c4c]
     <master>         <strftime>
+      6.44±0.4μs        34.8±30μs     5.41  tslibs.timestamp.TimestampProperties.time_is_month_end(tzfile('/usr/share/zoneinfo/US/Central'), 'B')
+         204±4ns       1.06±0.8μs     5.18  tslibs.timestamp.TimestampProperties.time_microsecond(tzlocal(), 'B')
+      8.78±0.3μs        42.3±30μs     4.81  tslibs.timestamp.TimestampProperties.time_month_name(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>, None)
+      7.77±0.2μs        19.6±10μs     2.53  tslibs.timestamp.TimestampProperties.time_month_name(<UTC>, 'B')
+      3.52±0.1μs       4.24±0.1μs     1.20  tslibs.timestamp.TimestampStrftimeMethod.time_strftime('%Y-%m-%d %H:%M:%S.%f')
+     3.16±0.06μs       3.60±0.1μs     1.14  tslibs.timestamp.TimestampStrftimeMethod.time_strftime('%Y-%m-%d %H:%M:%S')
-        314±10ns          284±3ns     0.90  tslibs.timestamp.TimestampProperties.time_week(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>, None)
-         277±5ns          250±9ns     0.90  tslibs.timestamp.TimestampProperties.time_days_in_month(<UTC>, None)
-        279±20ns          251±5ns     0.90  tslibs.timestamp.TimestampProperties.time_days_in_month(datetime.timezone(datetime.timedelta(seconds=3600)), 'B')
-        320±20ns          284±2ns     0.89  tslibs.timestamp.TimestampProperties.time_week(tzlocal(), None)
-        279±40ns          248±4ns     0.89  tslibs.timestamp.TimestampProperties.time_is_year_start(tzlocal(), None)
-        283±10ns          250±4ns     0.88  tslibs.timestamp.TimestampProperties.time_dayofweek(<UTC>, None)
-        137±10μs          120±2μs     0.88  tslibs.timestamp.TimestampOps.time_ceil(tzfile('/usr/share/zoneinfo/US/Central'))
-        548±10ns          475±4ns     0.87  tslibs.timestamp.TimestampProperties.time_freqstr(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>, None)
-        79.4±4μs       68.7±0.5μs     0.87  tslibs.timestamp.TimestampOps.time_ceil(<UTC>)
-        227±30ns          195±7ns     0.86  tslibs.timestamp.TimestampProperties.time_microsecond(tzfile('/usr/share/zoneinfo/US/Central'), None)
-      9.13±0.9μs       7.51±0.1μs     0.82  tslibs.timestamp.TimestampConstruction.time_parse_iso8601_tz
-       81.3±20μs       52.9±0.9μs     0.65  tslibs.timestamp.TimestampOps.time_replace_tz(tzlocal())
-        18.1±5μs       11.5±0.3μs     0.63  tslibs.timestamp.TimestampOps.time_to_julian_date(tzutc())
-       43.8±20μs       25.5±0.5μs     0.58  tslibs.timestamp.TimestampOps.time_replace_tz(tzfile('/usr/share/zoneinfo/US/Central'))
-        20.9±7μs       11.5±0.3μs     0.55  tslibs.timestamp.TimestampOps.time_to_julian_date(None)
-       70.4±60μs       31.1±0.5μs     0.44  tslibs.timestamp.TimestampOps.time_replace_None(tzlocal())
-        5.45±5μs      2.28±0.06μs     0.42  tslibs.timestamp.TimestampOps.time_replace_None(tzutc())
-       539±300ns          219±4ns     0.41  tslibs.timestamp.TimestampProperties.time_is_quarter_start(tzfile('/usr/share/zoneinfo/US/Central'), None)
-        15.8±8μs       6.10±0.1μs     0.39  tslibs.timestamp.TimestampProperties.time_is_year_start(tzutc(), 'B')
-        15.2±8μs      5.82±0.08μs     0.38  tslibs.timestamp.TimestampProperties.time_is_month_start(None, 'B')
-        9.42±5μs       3.60±0.1μs     0.38  tslibs.timestamp.TimestampOps.time_normalize(None)
-       52.7±30μs       20.1±0.7μs     0.38  tslibs.timestamp.TimestampOps.time_replace_tz(<UTC>)
-       575±300ns          214±8ns     0.37  tslibs.timestamp.TimestampProperties.time_is_month_start(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>, None)
-        108±70μs       34.8±0.6μs     0.32  tslibs.timestamp.TimestampOps.time_normalize(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-        7.23±5μs      2.33±0.05μs     0.32  tslibs.timestamp.TimestampOps.time_replace_None(<UTC>)
-       142±100μs       40.7±0.6μs     0.29  tslibs.timestamp.TimestampOps.time_normalize(tzlocal())
-       137±100μs         36.8±1μs     0.27  tslibs.timestamp.TimestampOps.time_normalize(tzfile('/usr/share/zoneinfo/US/Central'))
-       13.1±10μs      3.47±0.09μs     0.27  tslibs.timestamp.TimestampOps.time_normalize(<UTC>)
-        1.28±1μs          245±5ns     0.19  tslibs.timestamp.TimestampProperties.time_is_year_start(tzfile('/usr/share/zoneinfo/US/Central'), None)
-       647±500μs          123±1μs     0.19  tslibs.timestamp.TimestampOps.time_floor(tzfile('/usr/share/zoneinfo/US/Central'))
-      1.15±0.9μs          217±5ns     0.19  tslibs.timestamp.TimestampProperties.time_is_month_start(None, None)
-        2.54±2μs          476±3ns     0.19  tslibs.timestamp.TimestampProperties.time_freqstr(tzutc(), None)
-       362±300μs         67.5±2μs     0.19  tslibs.timestamp.TimestampOps.time_floor(<UTC>)
-        1.63±1μs          304±2ns     0.19  tslibs.timestamp.TimestampProperties.time_freqstr(datetime.timezone(datetime.timedelta(seconds=3600)), 'B')
-        1.33±1μs          248±3ns     0.19  tslibs.timestamp.TimestampProperties.time_is_year_start(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>, None)
-        1.29±1μs          240±3ns     0.19  tslibs.timestamp.TimestampOps.time_tz_localize(<UTC>)
-        1.64±1μs          305±6ns     0.19  tslibs.timestamp.TimestampProperties.time_freqstr(tzutc(), 'B')
-        1.28±1μs          236±4ns     0.18  tslibs.timestamp.TimestampProperties.time_dayofweek(None, 'B')
-        1.21±1μs          222±3ns     0.18  tslibs.timestamp.TimestampProperties.time_is_quarter_end(tzlocal(), None)
-        2.57±2μs          470±5ns     0.18  tslibs.timestamp.TimestampProperties.time_freqstr(tzlocal(), None)
-       859±700μs          154±4μs     0.18  tslibs.timestamp.TimestampOps.time_ceil(tzlocal())
-       668±500μs          119±1μs     0.18  tslibs.timestamp.TimestampOps.time_floor(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-       386±300μs         67.0±2μs     0.17  tslibs.timestamp.TimestampOps.time_floor(tzutc())
-       564±500μs         91.5±2μs     0.16  tslibs.timestamp.TimestampOps.time_floor(datetime.timezone(datetime.timedelta(seconds=3600)))
-       945±900μs          149±3μs     0.16  tslibs.timestamp.TimestampOps.time_floor(tzlocal())

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a merge conflict - can you move note to 1.2?

str
String representation of Timestamp
"""
if self.nanosecond and '%f' in format:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this check and the one on L649? Can there just be one check done here?

@@ -915,6 +915,7 @@ Datetimelike
- Bug in :meth:`DatetimeIndex.intersection` and :meth:`TimedeltaIndex.intersection` with results not having the correct ``name`` attribute (:issue:`33904`)
- Bug in :meth:`DatetimeArray.__setitem__`, :meth:`TimedeltaArray.__setitem__`, :meth:`PeriodArray.__setitem__` incorrectly allowing values with ``int64`` dtype to be silently cast (:issue:`33717`)
- Bug in subtracting :class:`TimedeltaIndex` from :class:`Period` incorrectly raising ``TypeError`` in some cases where it should succeed and ``IncompatibleFrequency`` in some cases where it should raise ``TypeError`` (:issue:`33883`)
- Bug in :meth:`Timestamp.strftime` did not display full nanosecond precision (:issue:`29461`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to 1.2

@MarcoGorelli
Copy link
Member

Hi @matteosantama - is this still active? If so, could you address Jeff's and Will's comments please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Output-Formatting __repr__ of pandas objects, to_string
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Timestamp.strftime(): missing support for nanoseconds
5 participants