BUG: GH29461 Strftime #34668

Closed
wants to merge 47 commits into from
47 commits
12d0b4d
TST: GH28813 test .diff() on Sparse dtype
matteosantama May 19, 2020
02c4a85
TST: GH28813 test .diff() on Sparse dtype
matteosantama May 20, 2020
7e3256b
TST: GH28813 pull sparse diff() test into its own function
matteosantama May 20, 2020
f00e48a
Merge branch 'sparse_diff'
matteosantama May 22, 2020
a90f928
Merge branch 'master' of https://github.com/pandas-dev/pandas
matteosantama May 22, 2020
26a920c
BUG: GH29461 display nanoseconds with strftime()
matteosantama May 22, 2020
3a529ac
BUG: GH29461 don't display nanoseconds in strftime if none exists for…
matteosantama May 22, 2020
d0124aa
update whatsnew doc
matteosantama May 22, 2020
0c0aaf2
black formatting
matteosantama May 22, 2020
abdbe4e
remove trailing whitespace to conform to CI
matteosantama May 22, 2020
97ae6b3
still trying to pass linting checks
matteosantama May 22, 2020
1132aba
Merge branch 'master' of https://github.com/pandas-dev/pandas
matteosantama May 23, 2020
0e42026
Merge branch 'master' of github.com:matteosantama/pandas into strftime
matteosantama May 23, 2020
ab0c9d4
Add strftime benchmarks
matteosantama May 23, 2020
532b19c
Early exit strftime if no nanoseconds
matteosantama May 23, 2020
88fba5e
Make loop more pythonic
matteosantama May 24, 2020
2ac690c
Fix benchmark test
matteosantama May 24, 2020
3b533c9
Remove whitespace
matteosantama May 24, 2020
f6ba1c9
Remove extra function call
matteosantama May 26, 2020
07b27e2
Benchmark series.strftime()
matteosantama May 26, 2020
4b57c4f
Merge branch 'master' of https://github.com/pandas-dev/pandas into st…
matteosantama May 27, 2020
d72222a
Use explicitly named parameters in testing
Jun 8, 2020
e920d20
Use regex for replacing %f
Jun 8, 2020
34db469
Clean up Timestamp._time_repr to use new strftime functionality
Jun 8, 2020
fbe286e
Commiting so I can merge master
Jun 8, 2020
54d30eb
Merge branch 'master' of https://github.com/pandas-dev/pandas into st…
Jun 8, 2020
90629c5
Use regex for replacing %f
Jun 9, 2020
821dfbb
Clean up _time_repr to use new strftime functionality
Jun 9, 2020
16b0f9f
Test for all datetime strftime directives
matteosantama Jun 9, 2020
b8565f2
Test for all datetime strftime directives
matteosantama Jun 9, 2020
6866a3d
Test for all datetime strftime directives
matteosantama Jun 9, 2020
3806567
Merge branch 'master' of https://github.com/pandas-dev/pandas into st…
matteosantama Jun 9, 2020
a4ddcbb
Call super strftime instead of time strftime
matteosantama Jun 9, 2020
7fe0a5e
Improve docstring
matteosantama Jun 9, 2020
ce9ef3d
Remove whitespace in docstring
matteosantama Jun 9, 2020
7b69abb
Only show fractional seconds if they exist
matteosantama Jun 9, 2020
858b3fb
Docstring not building correctly
matteosantama Jun 9, 2020
e7b8525
Fixed typo
matteosantama Jun 9, 2020
6a2f3d2
Check nanoseconds first
matteosantama Jun 9, 2020
d62ff1d
Merge branch 'master' of https://github.com/pandas-dev/pandas into st…
matteosantama Jun 9, 2020
cbb735e
Rename testing classes
matteosantama Jun 9, 2020
0c4ef36
Use string replace instead of re package
matteosantama Jun 11, 2020
62c1126
Fix test parametrization
matteosantama Jun 11, 2020
4468168
Fix test parametrization
matteosantama Jun 11, 2020
88c7e26
Resolve merge conflicts
matteosantama Jul 7, 2020
4d433f9
Update docstring for NaT to match Timestamp
matteosantama Jul 7, 2020
b0de2c4
Merge branch 'master' of https://github.com/pandas-dev/pandas into st…
matteosantama Jul 17, 2020
19 changes: 19 additions & 0 deletions asv_bench/benchmarks/timeseries.py
@@ -423,4 +423,23 @@ def time_dt_accessor_year(self, tz):
        self.series.dt.year


class DateTimeAccessorStrftime:

    params = (
        [None, "US/Eastern", "UTC", dateutil.tz.tzutc()],
        ["%Y-%m-%d %H:%M:%S.%f%z", "%Y-%m-%d %H:%M:%S%z"],
        ["T", "S", "NS"],
    )
    param_names = ["tz", "fmt", "frequency"]

    def setup(self, tz, fmt, frequency):
        N = 100000
        self.series = Series(
            date_range(start="1/1/2000", periods=N, freq=frequency, tz=tz)
        )

    def time_dt_accessor_strftime(self, tz, fmt, frequency):
        self.series.dt.strftime(fmt)


from .pandas_vb_common import setup # noqa: F401 isort:skip
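
A note on how the parametrized benchmark above expands: asv is documented to run a benchmark once per combination of the `params` lists, passing the values positionally to `setup` and to each `time_*` method. The sketch below only enumerates those combinations with the standard library. The string `"tzutc()"` is a placeholder for `dateutil.tz.tzutc()` so the snippet needs no third-party imports; it is an illustration, not asv's own code.

```python
from itertools import product

# Parameter grid copied from DateTimeAccessorStrftime; "tzutc()" is a
# placeholder string standing in for dateutil.tz.tzutc().
params = (
    [None, "US/Eastern", "UTC", "tzutc()"],
    ["%Y-%m-%d %H:%M:%S.%f%z", "%Y-%m-%d %H:%M:%S%z"],
    ["T", "S", "NS"],
)
param_names = ["tz", "fmt", "frequency"]

# One entry per element of the Cartesian product of the three lists.
combinations = [dict(zip(param_names, combo)) for combo in product(*params)]

print(len(combinations))
print(combinations[0])
```

The grid has 4 × 2 × 3 = 24 combinations, and `setup` builds a 100,000-element series for each of them.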
11 changes: 11 additions & 0 deletions asv_bench/benchmarks/tslibs/timestamp.py
@@ -120,6 +120,17 @@ def time_weekday_name(self, tz, freq):
        self.ts.day_name()


class TimestampStrftimeMethod:
    params = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M:%S.%f"]
    param_names = ["fmt"]

    def setup(self, fmt):
        self.ts = Timestamp("2020-05-23 18:06:13.123456789")

    def time_strftime(self, fmt):
        self.ts.strftime(fmt)


class TimestampOps:
    params = _tzs
    param_names = ["tz"]
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.1.0.rst
@@ -915,6 +915,7 @@ Datetimelike
- Bug in :meth:`DatetimeIndex.intersection` and :meth:`TimedeltaIndex.intersection` with results not having the correct ``name`` attribute (:issue:`33904`)
- Bug in :meth:`DatetimeArray.__setitem__`, :meth:`TimedeltaArray.__setitem__`, :meth:`PeriodArray.__setitem__` incorrectly allowing values with ``int64`` dtype to be silently cast (:issue:`33717`)
- Bug in subtracting :class:`TimedeltaIndex` from :class:`Period` incorrectly raising ``TypeError`` in some cases where it should succeed and ``IncompatibleFrequency`` in some cases where it should raise ``TypeError`` (:issue:`33883`)
- Bug in :meth:`Timestamp.strftime` did not display full nanosecond precision (:issue:`29461`)
Contributor commented:

move to 1.2

- Bug in constructing a Series or Index from a read-only NumPy array with non-ns
  resolution which converted to object dtype instead of coercing to ``datetime64[ns]``
  dtype when within the timestamp bounds (:issue:`34843`).
18 changes: 18 additions & 0 deletions pandas/_libs/tslibs/nattype.pyx
@@ -452,7 +452,25 @@ class NaTType(_NaT):
        Function is not implemented. Use pd.to_datetime().
        """,
    )
    strftime = _make_error_func(
        "strftime",
        """
        Constructs datetime style `format` string from Timestamp.

        See `datetime <https://docs.python.org/3/library/datetime\
.html#strftime-and-strptime-format-codes>`_ module for all available directives.

        Parameters
        ----------
        format : str
            String of formatting directives

        Returns
        -------
        str
            String representation of Timestamp
        """,
    )
    utcfromtimestamp = _make_error_func(
        "utcfromtimestamp",
        """
34 changes: 26 additions & 8 deletions pandas/_libs/tslibs/timestamps.pyx
@@ -645,14 +645,10 @@ cdef class _Timestamp(ABCTimestamp):

     @property
     def _time_repr(self) -> str:
-        result = f'{self.hour:02d}:{self.minute:02d}:{self.second:02d}'
-
-        if self.nanosecond != 0:
-            result += f'.{self.nanosecond + 1000 * self.microsecond:09d}'
-        elif self.microsecond != 0:
-            result += f'.{self.microsecond:06d}'
-
-        return result
+        fmt = '%H:%M:%S'
Contributor commented:

does this change any of the benchmarks?

matteosantama (Contributor Author) commented on Jun 11, 2020:

Looks like we suffer a penalty for the additional super() function call in strftime(), and then there's an additional performance impact when we actually need to process nanoseconds.

       before           after         ratio
     [6efb0b20]       [0c4ef36f]
     <master>         <strftime>
+      3.61±0.1μs       4.40±0.2μs     1.22  tslibs.timestamp.TimestampStrftimeMethod.time_strftime('%Y-%m-%d %H:%M:%S.%f')
+      3.29±0.1μs      3.68±0.05μs     1.12  tslibs.timestamp.TimestampStrftimeMethod.time_strftime('%Y-%m-%d %H:%M:%S')
+      8.91±0.2μs       9.90±0.4μs     1.11  tslibs.timestamp.TimestampProperties.time_month_name(None, None)
-        409±20ns         308±10ns     0.75  tslibs.timestamp.TimestampProperties.time_days_in_month(<UTC>, 'B')
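
A quick way to sanity-check the `ratio` column in the output above: it appears to be the after time divided by the before time, so values above 1 are slowdowns and values below 1 are speedups. The sketch below recomputes it from the mean timings transcribed from the table (the ± spreads are ignored, and the shortened benchmark names are only labels for this snippet).

```python
# (before, after) mean timings transcribed from the asv output above.
# The first three rows are in microseconds, the last in nanoseconds;
# units cancel in the ratio.
timings = {
    "time_strftime('%Y-%m-%d %H:%M:%S.%f')": (3.61, 4.40),
    "time_strftime('%Y-%m-%d %H:%M:%S')": (3.29, 3.68),
    "time_month_name(None, None)": (8.91, 9.90),
    "time_days_in_month(<UTC>, 'B')": (409.0, 308.0),
}

# ratio = after / before, rounded to two decimals as in the table.
ratios = {
    name: round(after / before, 2) for name, (before, after) in timings.items()
}

for name, ratio in ratios.items():
    print(f"{ratio:.2f}  {name}")
```

The recomputed values match the reported 1.22, 1.12, 1.11 and 0.75.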

+        if self.microsecond or self.nanosecond:
+            fmt = '%H:%M:%S.%f'
+        return self.strftime(fmt)

     @property
     def _short_repr(self) -> str:
@@ -1473,6 +1469,28 @@ default 'raise'
                 self.nanosecond / 3600.0 / 1e+9
                 ) / 24.0)

    def strftime(self, format: str) -> str:
        """
        Constructs datetime style `format` string from Timestamp.

        See `datetime <https://docs.python.org/3/library/datetime\
.html#strftime-and-strptime-format-codes>`_ module for all available directives.

        Parameters
        ----------
        format : str
            String of formatting directives

        Returns
        -------
        str
            String representation of Timestamp
        """
        if self.nanosecond and '%f' in format:
Member commented:

Is there a reason for this check and the one on L649? Can there just be one check done here?

            replacement = f'{self.microsecond * 1000 + self.nanosecond:09d}'
            format = format.replace('%f', replacement)
        return super().strftime(format)


# Aliases
Timestamp.weekofyear = Timestamp.week
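
To make the `%f` handling in the `strftime` hunk above easier to follow outside of Cython, here is a minimal standard-library sketch of the same idea. The helper name `strftime_with_nanoseconds` and its separate `nanosecond` argument are hypothetical (a plain `datetime` has no nanosecond field); the sketch mirrors the logic shown in the diff and is not the pandas implementation.

```python
from datetime import datetime


def strftime_with_nanoseconds(dt: datetime, nanosecond: int, fmt: str) -> str:
    """Hypothetical stand-alone helper; not part of pandas.

    `nanosecond` is the 0-999 remainder below the microsecond, playing the
    role of Timestamp.nanosecond in the diff.
    """
    if nanosecond and "%f" in fmt:
        # Nine-digit fraction: microseconds shifted up three digits, plus nanoseconds.
        fraction = f"{dt.microsecond * 1000 + nanosecond:09d}"
        fmt = fmt.replace("%f", fraction)
    # Without nanoseconds the format is untouched, so %f keeps its six-digit meaning.
    return dt.strftime(fmt)


dt = datetime(2020, 5, 22, 11, 7, 30, 123456)
with_ns = strftime_with_nanoseconds(dt, 789, "%Y-%m-%d %H:%M:%S.%f")
without_ns = strftime_with_nanoseconds(dt, 0, "%Y-%m-%d %f")
print(with_ns)
print(without_ns)
```

Because the sketch uses plain `str.replace`, an escaped `%%f` in the format string would also be rewritten; it makes no attempt to handle that case.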
65 changes: 64 additions & 1 deletion pandas/tests/scalar/timestamp/test_timestamp.py
@@ -15,7 +15,7 @@
 from pandas.compat.numpy import np_datetime64_compat
 import pandas.util._test_decorators as td

-from pandas import NaT, Timedelta, Timestamp
+from pandas import NaT, Timedelta, Timestamp, to_datetime
 import pandas._testing as tm

 from pandas.tseries import offsets
@@ -381,6 +381,55 @@ def test_tz_conversion_freq(self, tz_naive_fixture):
        t2 = Timestamp("2019-01-02 12:00", tz="UTC", freq="T")
        assert t2.tz_convert(tz="UTC").freq == t2.freq

    @pytest.mark.parametrize(
        "_input,fmt,_output",
        [
            ("2020-05-22 11:07:30", "%Y-%m-%d", "2020-05-22"),
            ("2020-05-22 11:07:30.123456", "%Y-%m-%d %f", "2020-05-22 123456"),
            ("2020-05-22 11:07:30.123456789", "%f", "123456789"),
        ],
    )
    def test_strftime(self, _input, fmt, _output):
        ts = Timestamp(_input)
        result = ts.strftime(fmt)
        assert result == _output

    @pytest.mark.parametrize(
        "fmt",
        [
            "%a",
            "%A",
            "%w",
            "%d",
            "%b",
            "%B",
            "%m",
            "%y",
            "%Y",
            "%H",
            "%I",
            "%p",
            "%M",
            "%S",
            "%f",
            "%z",
            "%Z",
            "%j",
            "%U",
            "%W",
            "%c",
            "%x",
            "%X",
            "%G",
            "%u",
            "%V",
        ],
    )
    def test_strftime_components(self, fmt):
        ts = Timestamp("2020-06-09 09:04:11.123456", tzinfo=utc)
        dt = datetime(2020, 6, 9, 9, 4, 11, 123456, tzinfo=utc)
        assert ts.strftime(fmt) == dt.strftime(fmt)


class TestTimestampNsOperations:
    def test_nanosecond_string_parsing(self):
@@ -442,6 +491,20 @@ def test_nanosecond_timestamp(self):
        assert t.value == expected
        assert t.nanosecond == 10

    @pytest.mark.parametrize(
        "date",
        [
            "2020-05-22 08:53:19.123456789",
            "2020-05-22 08:53:19.123456",
            "2020-05-22 08:53:19",
        ],
    )
    @pytest.mark.parametrize("fmt", ["%m/%d/%Y %H:%M:%S.%f", "%m%d%Y%H%M%S%f"])
    def test_nanosecond_roundtrip(self, date, fmt):
        ts = Timestamp(date)
        string = ts.strftime(fmt)
        assert ts == to_datetime(string, format=fmt)


class TestTimestampToJulianDate:
    def test_compare_1700(self):
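
The round-trip test in the hunk above depends on the fraction being written as nine digits when nanoseconds are present and six digits otherwise, and on the parser reading either width back to the same value. The sketch below illustrates that property with two hypothetical helpers, `format_fraction` and `parse_fraction`, under the assumption that the parser treats the digits as a decimal fraction of a second; it does not call pandas.

```python
def format_fraction(microsecond: int, nanosecond: int) -> str:
    """Hypothetical helper: nine digits if nanoseconds are present, else six."""
    if nanosecond:
        return f"{microsecond * 1000 + nanosecond:09d}"
    return f"{microsecond:06d}"


def parse_fraction(digits: str) -> tuple:
    """Inverse of format_fraction: right-pad to nine digits, then split.

    Returns (microsecond, nanosecond).
    """
    total_ns = int(digits.ljust(9, "0"))
    return divmod(total_ns, 1000)


# Mirrors the three dates in test_nanosecond_roundtrip: with nanoseconds,
# with microseconds only, and with no fractional part.
cases = [(123456, 789), (123456, 0), (0, 0)]
results = [parse_fraction(format_fraction(us, ns)) for us, ns in cases]
print(results)
```

Right-padding is what lets the six-digit and nine-digit forms share one parser: `"123456"` and `"123456000"` denote the same fraction of a second.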