ENH: Support timespec argument in Timestamp.isoformat() #44397

Merged
merged 5 commits into from
Nov 14, 2021
Changes from 3 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.4.0.rst
@@ -184,6 +184,7 @@ Other enhancements
- :meth:`DataFrame.dropna` now accepts a single label as ``subset`` along with array-like (:issue:`41021`)
- :meth:`read_excel` now accepts a ``decimal`` argument that allow the user to specify the decimal point when parsing string columns to numeric (:issue:`14403`)
- :meth:`.GroupBy.mean` now supports `Numba <http://numba.pydata.org/>`_ execution with the ``engine`` keyword (:issue:`43731`)
- :meth:`Timestamp.isoformat` now handles the ``timespec`` argument from the base :class:`datetime` class (:issue:`26131`)

.. ---------------------------------------------------------------------------
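For context, the ``timespec`` argument that this entry refers to comes from the standard library's `datetime.isoformat`. A minimal sketch of the base-class behaviour, using only the standard library (the sample value is an arbitrary illustration, not taken from the PR):

```python
from datetime import datetime

dt = datetime(2020, 3, 14, 15, 32, 52, 192548)

# "auto" (the default) omits the fractional part only when microsecond == 0.
auto = dt.isoformat()
# Coarser values truncate the time; they do not round.
hours = dt.isoformat(timespec="hours")
millis = dt.isoformat(timespec="milliseconds")
micros = dt.isoformat(timespec="microseconds")

print(auto)    # 2020-03-14T15:32:52.192548
print(hours)   # 2020-03-14T15
print(millis)  # 2020-03-14T15:32:52.192
print(micros)  # 2020-03-14T15:32:52.192548

# The standard library has no nanosecond field, so it rejects "nanoseconds".
stdlib_rejects_nanos = False
try:
    dt.isoformat(timespec="nanoseconds")
except ValueError:
    stdlib_rejects_nanos = True
print(stdlib_rejects_nanos)  # True
```

Because the base class stops at microseconds, a nanosecond-aware subclass has to handle the ``'nanoseconds'`` value itself.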

34 changes: 33 additions & 1 deletion pandas/_libs/tslibs/nattype.pyx
@@ -295,7 +295,39 @@ cdef class _NaT(datetime):
def __str__(self) -> str:
return "NaT"

def isoformat(self, sep="T") -> str:
def isoformat(self, sep: str = "T", timespec: str = "auto") -> str:
"""
Return the time formatted according to ISO.

The full format looks like 'YYYY-MM-DD HH:MM:SS.mmmmmmnnn'.
By default, the fractional part is omitted if self.microsecond == 0
and self.nanosecond == 0.

If self.tzinfo is not None, the UTC offset is also attached,
giving a full format of 'YYYY-MM-DD HH:MM:SS.mmmmmmnnn+HH:MM'.

Parameters
----------
sep : str, default 'T'
String used as the separator between the date and time.

timespec : str, default 'auto'
Specifies the number of additional terms of the time to include.
The valid values are 'auto', 'hours', 'minutes', 'seconds',
'milliseconds', 'microseconds', and 'nanoseconds'.

Returns
-------
str

Examples
Contributor: this example should be for NaT (to be honest don't need it here though)

Contributor Author: The problem I ran into - there was a test failure because the docstrings differed between NaT and Timestamp, so that's why I copied the docstring into NaT. Do you have any advice about that?

Contributor: right we have tests that confirm this, but since we are not explicitly making these different you can change the test

Contributor Author: OK, so I ended up reverting the docstring change to NaT and updated the test to ignore the docstring changes with Timestamp.

--------
>>> ts = pd.Timestamp('2020-03-14T15:32:52.192548651')
>>> ts.isoformat()
'2020-03-14T15:32:52.192548651'
>>> ts.isoformat(timespec='microseconds')
'2020-03-14T15:32:52.192548'
"""
# This allows Timestamp(ts.isoformat()) to always correctly roundtrip.
return "NaT"
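The point of this hunk is that the NaT method accepts the same arguments as `Timestamp.isoformat` while ignoring them. A minimal sketch of that pattern, using a hypothetical `MissingTime` class built on the standard library (this is an illustration, not the pandas implementation):

```python
from datetime import datetime


class MissingTime(datetime):
    """Hypothetical stand-in for a NaT-like sentinel (not pandas code)."""

    def isoformat(self, sep: str = "T", timespec: str = "auto") -> str:
        # Accept the same arguments as datetime.isoformat but ignore them:
        # a missing value has no date or time fields to format.
        return "NaT"


missing = MissingTime(1, 1, 1)
default_text = missing.isoformat()
custom_text = missing.isoformat(sep=" ", timespec="seconds")

print(default_text)  # NaT
print(custom_text)   # NaT
```

Keeping the signatures aligned means callers can pass ``timespec`` without first checking whether the value is missing.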

48 changes: 41 additions & 7 deletions pandas/_libs/tslibs/timestamps.pyx
@@ -737,20 +737,54 @@ cdef class _Timestamp(ABCTimestamp):
# -----------------------------------------------------------------
# Rendering Methods

def isoformat(self, sep: str = "T") -> str:
base = super(_Timestamp, self).isoformat(sep=sep)
if self.nanosecond == 0:
def isoformat(self, sep: str = "T", timespec: str = "auto") -> str:
"""
Return the time formatted according to ISO.

The full format looks like 'YYYY-MM-DD HH:MM:SS.mmmmmmnnn'.
By default, the fractional part is omitted if self.microsecond == 0
and self.nanosecond == 0.

If self.tzinfo is not None, the UTC offset is also attached,
giving a full format of 'YYYY-MM-DD HH:MM:SS.mmmmmmnnn+HH:MM'.

Parameters
----------
sep : str, default 'T'
String used as the separator between the date and time.

timespec : str, default 'auto'
Specifies the number of additional terms of the time to include.
The valid values are 'auto', 'hours', 'minutes', 'seconds',
'milliseconds', 'microseconds', and 'nanoseconds'.

Returns
-------
str

Examples
--------
>>> ts = pd.Timestamp('2020-03-14T15:32:52.192548651')
>>> ts.isoformat()
'2020-03-14T15:32:52.192548651'
>>> ts.isoformat(timespec='microseconds')
'2020-03-14T15:32:52.192548'
"""
base_ts = "microseconds" if timespec == "nanoseconds" else timespec
base = super(_Timestamp, self).isoformat(sep=sep, timespec=base_ts)
if self.nanosecond == 0 and timespec != "nanoseconds":
return base

if self.tzinfo is not None:
base1, base2 = base[:-6], base[-6:]
else:
base1, base2 = base, ""

if self.microsecond != 0:
base1 += f"{self.nanosecond:03d}"
else:
base1 += f".{self.nanosecond:09d}"
if timespec == "nanoseconds" or (timespec == "auto" and self.nanosecond):
if self.microsecond:
base1 += f"{self.nanosecond:03d}"
else:
base1 += f".{self.nanosecond:09d}"

return base1 + base2
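The logic in this hunk can be tried outside Cython with a plain-Python sketch. The helper name `iso_with_nanos` and its separate `nanosecond` argument are hypothetical, and the slicing assumes the offset is always rendered as six characters (`+HH:MM`). One deliberate difference from the lines shown above: the inner check also tests `timespec == "nanoseconds"`. As shown in the hunk, a value with `microsecond == 0` and `timespec="nanoseconds"` appears to take the `else` branch and append a second decimal point after the `.000000` that the base class already emitted.

```python
from datetime import datetime, timezone


def iso_with_nanos(dt, nanosecond=0, sep="T", timespec="auto"):
    """Hypothetical helper: ISO string with a separate 0-999 nanosecond field."""
    # The standard library has no "nanoseconds" timespec, so ask for microseconds.
    base_ts = "microseconds" if timespec == "nanoseconds" else timespec
    base = dt.isoformat(sep=sep, timespec=base_ts)
    if nanosecond == 0 and timespec != "nanoseconds":
        return base

    # Split off a trailing "+HH:MM" offset so the extra digits go before it.
    if dt.tzinfo is not None:
        head, offset = base[:-6], base[-6:]
    else:
        head, offset = base, ""

    if timespec == "nanoseconds" or (timespec == "auto" and nanosecond):
        # Assumption (differs from the hunk above): with "nanoseconds" the
        # base string already ends in ".ffffff", so only three digits are added.
        if dt.microsecond or timespec == "nanoseconds":
            head += f"{nanosecond:03d}"
        else:
            head += f".{nanosecond:09d}"
    return head + offset


with_us = datetime(2019, 5, 18, 15, 17, 8, 132263)
no_us = datetime(2019, 5, 18, 15, 17, 8)
with_tz = with_us.replace(tzinfo=timezone.utc)

print(iso_with_nanos(with_us, 123))                    # 2019-05-18T15:17:08.132263123
print(iso_with_nanos(with_tz, 123, timespec="hours"))  # 2019-05-18T15+00:00
print(iso_with_nanos(no_us, 123))                      # 2019-05-18T15:17:08.000000123
print(iso_with_nanos(no_us, 123, timespec="nanoseconds"))
```

The last call is the edge case described above; with the extra check it yields a single fractional part of nine digits.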

71 changes: 71 additions & 0 deletions pandas/tests/scalar/timestamp/test_formats.py
@@ -0,0 +1,71 @@
import pytest

from pandas import Timestamp

ts_no_ns = Timestamp(
    year=2019,
    month=5,
    day=18,
    hour=15,
    minute=17,
    second=8,
    microsecond=132263,
)
ts_ns = Timestamp(
    year=2019,
    month=5,
    day=18,
    hour=15,
    minute=17,
    second=8,
    microsecond=132263,
    nanosecond=123,
)
ts_ns_tz = Timestamp(
    year=2019,
    month=5,
    day=18,
    hour=15,
    minute=17,
    second=8,
    microsecond=132263,
    nanosecond=123,
    tz="UTC",
)
ts_no_us = Timestamp(
    year=2019,
    month=5,
    day=18,
    hour=15,
    minute=17,
    second=8,
    microsecond=0,
    nanosecond=123,
)


@pytest.mark.parametrize(
    "ts, timespec, expected_iso",
    [
        (ts_no_ns, "auto", "2019-05-18T15:17:08.132263"),
        (ts_no_ns, "seconds", "2019-05-18T15:17:08"),
        (ts_no_ns, "nanoseconds", "2019-05-18T15:17:08.132263000"),
        (ts_ns, "auto", "2019-05-18T15:17:08.132263123"),
        (ts_ns, "hours", "2019-05-18T15"),
Contributor: can you also run on NaT

Contributor Author: Not sure it made sense to run the same tests on NaT(?) but I did add a test to the NaT tests to check that it at least accepts the timespec parameter.

        (ts_ns, "minutes", "2019-05-18T15:17"),
        (ts_ns, "seconds", "2019-05-18T15:17:08"),
        (ts_ns, "milliseconds", "2019-05-18T15:17:08.132"),
        (ts_ns, "microseconds", "2019-05-18T15:17:08.132263"),
        (ts_ns, "nanoseconds", "2019-05-18T15:17:08.132263123"),
        (ts_ns_tz, "auto", "2019-05-18T15:17:08.132263123+00:00"),
        (ts_ns_tz, "hours", "2019-05-18T15+00:00"),
        (ts_ns_tz, "minutes", "2019-05-18T15:17+00:00"),
        (ts_ns_tz, "seconds", "2019-05-18T15:17:08+00:00"),
        (ts_ns_tz, "milliseconds", "2019-05-18T15:17:08.132+00:00"),
        (ts_ns_tz, "microseconds", "2019-05-18T15:17:08.132263+00:00"),
        (ts_ns_tz, "nanoseconds", "2019-05-18T15:17:08.132263123+00:00"),
        (ts_no_us, "auto", "2019-05-18T15:17:08.000000123"),
    ],
)
def test_isoformat(ts, timespec, expected_iso):
    assert ts.isoformat(timespec=timespec) == expected_iso