Timestamp.strftime(): missing support for nanoseconds #29461

Open
jgehrcke opened this issue Nov 7, 2019 · 5 comments
Labels
Bug Datetime Datetime data dtype Output-Formatting __repr__ of pandas objects, to_string

Comments

@jgehrcke
Contributor

jgehrcke commented Nov 7, 2019

When parsing text into a Timestamp object we can specify a format string. Currently %f is documented with

note that "%f" will parse all the way up to nanoseconds

See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html, in particular the description of the format parameter. The note about %f was added in this patch: #8904

The fact that we can parse text using nanosecond precision is great, and here I make use of that behavior (showing two methods yielding the same result):

# Implicit format:
>>> t = pd.to_datetime('2019-10-03T09:30:12.133333337')
>>> t
Timestamp('2019-10-03 09:30:12.133333337')


# Explicit format using %f:
>>> t = pd.to_datetime('2019-10-03T09:30:12.133333337', format='%Y-%m-%dT%H:%M:%S.%f')
>>> t
Timestamp('2019-10-03 09:30:12.133333337')

But when I now want to invert that process using strftime() then the fractional part is truncated to microsecond precision:

>>> t.strftime('%Y-%m-%dT%H:%M:%S.%f')
'2019-10-03T09:30:12.133333'

On the one hand this is inconsistent with the meaning of %f while parsing. On the other hand it corresponds to what's documented in Python's stdlib documentation (which says that %f means "Microsecond as a decimal number, zero-padded on the left.").

In any case, I think it would make sense to have a format string specifier that allows us to turn the timestamp into a string with nanosecond precision.

If I am not mistaken, we otherwise have to work around the absence of that format specifier by using the nanosecond property:

>>> t.nanosecond
337

>>> # Zero-pad to three digits so that e.g. 7 ns is rendered as '007'
>>> t.strftime('%Y-%m-%dT%H:%M:%S.%f') + f'{t.nanosecond:03d}'
'2019-10-03T09:30:12.133333337'
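
A minimal sketch of the workaround above as a reusable function. The helper name `iso_with_nanos` is hypothetical and not part of pandas. It takes a plain `datetime` plus the sub-microsecond remainder (0 to 999) and pads that remainder to three digits, so the fractional part is always nine digits wide:

```python
from datetime import datetime


def iso_with_nanos(dt: datetime, nanosecond: int) -> str:
    """Format dt as ISO 8601 with a fixed nine-digit fractional part.

    `nanosecond` is the sub-microsecond remainder (0..999), i.e. the kind of
    value that Timestamp.nanosecond reports. Hypothetical helper, not a
    pandas API.
    """
    if not 0 <= nanosecond <= 999:
        raise ValueError("nanosecond must be in the range 0..999")
    # %f always yields six zero-padded digits; append three more for the ns part.
    return dt.strftime("%Y-%m-%dT%H:%M:%S.%f") + f"{nanosecond:03d}"


dt = datetime(2019, 10, 3, 9, 30, 12, 133333)
print(iso_with_nanos(dt, 337))  # 2019-10-03T09:30:12.133333337
print(iso_with_nanos(dt, 7))    # 2019-10-03T09:30:12.133333007
```

Since a pandas Timestamp is a `datetime` subclass, the call for the example above would presumably be `iso_with_nanos(t, t.nanosecond)`.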

Do you agree that we should have a format specifier for that? Or do we have one, and it's just not documented?

INSTALLED VERSIONS

commit : None
python : 3.7.4.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.7-200.fc30.x86_64
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 0.25.2
numpy : 1.17.3
pytz : 2019.3
dateutil : 2.8.0
pip : 19.0.3
setuptools : 40.8.0
Cython : 0.29.13
pytest : 5.2.1
hypothesis : 4.41.3
sphinx : 2.2.0
blosc : 1.8.1
feather : None
xlsxwriter : 1.2.2
lxml.etree : 4.4.1
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.8.0
pandas_datareader: None
bs4 : 4.8.1
bottleneck : 1.2.1
fastparquet : 0.3.2
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.0
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : 0.3.5
scipy : 1.3.1
sqlalchemy : 1.3.10
tables : 3.6.0
xarray : 0.14.0
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.2

@jgehrcke
Contributor Author

jgehrcke commented Nov 7, 2019

Note to self: interestingly, I just saw that Ruby's strftime has a %N directive ("Fractional seconds digits, default is 9 digits (nanosecond)"): https://ruby-doc.org/core-2.6.4/Time.html. Go has no strftime, but it has a Nanosecond method (https://golang.org/pkg/time/#Time.Nanosecond) whose result can be zero-padded.

Another note to self: Timestamp.timestamp() returns microsecond precision, instead of nanosecond precision:

>>> t = pd.to_datetime('2019-10-03T09:30:12.133333337')
>>> t.timestamp()
1570095012.133333
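
A short sketch of why a float cannot carry this precision, and of an integer-based alternative. It assumes the timestamp is available as an integer count of nanoseconds since the epoch (pandas exposes this as `Timestamp.value`); the variable names are illustrative only. A float64 near 1.57e9 seconds has a spacing of roughly 0.24 microseconds, so nanoseconds cannot survive the conversion:

```python
# Integer nanoseconds since the epoch for 2019-10-03T09:30:12.133333337 (UTC).
ns_since_epoch = 1570095012133333337

# Converting the integer to float64 loses the low digits: at this magnitude
# float64 can only represent multiples of 256, and the value is odd.
print(int(float(ns_since_epoch)) == ns_since_epoch)  # False

# Integer arithmetic keeps all nine fractional digits.
seconds, fraction = divmod(ns_since_epoch, 10**9)
exact = f"{seconds}.{fraction:09d}"
print(exact)  # 1570095012.133333337
```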

Third note to self: for the specific format I want to emit in my example above there is this working shortcut: t.isoformat() yields '2019-11-07T12:29:23.444348736'.

@jbrockmendel jbrockmendel added the Datetime Datetime data dtype label Dec 1, 2019
@mroeschke mroeschke added the Bug label Apr 2, 2020
@matteosantama
Contributor

matteosantama commented May 22, 2020

Ran into this issue today. Proposed solution is to add .strftime() method to Timestamp object in pandas/_libs/tslibs/timestamps.pyx.

Is the best route to fully reimplement the method? I think ideally it would look something like this

def strftime(self, fmt: str) -> str:
    if self.nanosecond == 0:
        # No sub-microsecond part: the inherited datetime.strftime suffices
        return super().strftime(fmt)

    # else

but I can't think of an elegant else clause (outside of reimplementing the entire method).

@jreback jreback added the Output-Formatting __repr__ of pandas objects, to_string label May 22, 2020
@jreback jreback added this to the 1.1 milestone Jun 9, 2020
@TomAugspurger TomAugspurger removed this from the 1.1 milestone Jul 6, 2020
@AlexeyDmitriev

  1. I don't think you need the if, because the user may want all 9 digits printed even when they are zeros.
  2. You can avoid reimplementing the whole method this way (supposing %9 prints the 9 digits of nanoseconds):
    Find every %9 in the format string, replace each with the nine digits of nanoseconds, and pass the resulting string to datetime.strftime. The parsing needs care: for example, "%%9" is a literal percent sign followed by a literal 9.
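
The idea in point 2 can be sketched roughly as follows. The `%9` directive is the hypothetical one proposed in this comment, and the function name and signature are assumptions for illustration, not a pandas API. Scanning the format string as `%` plus one character at a time means `%%` is consumed as a unit, so the `9` after it stays literal:

```python
import re
from datetime import datetime


def strftime_with_nanos(dt: datetime, nanosecond: int, fmt: str) -> str:
    """Expand a hypothetical %9 directive, then defer to datetime.strftime.

    `nanosecond` is the sub-microsecond remainder (0..999).
    """
    # Six microsecond digits plus three nanosecond digits, all zero-padded.
    nine_digits = f"{dt.microsecond:06d}{nanosecond:03d}"

    def expand(match: re.Match) -> str:
        # Only %9 is rewritten; every other pair (including %%) is left alone.
        return nine_digits if match.group(1) == "9" else match.group(0)

    # Each match consumes '%' and the following character, so in "%%9" the
    # pair "%%" is matched first and the trailing "9" remains plain text.
    expanded = re.sub(r"%(.)", expand, fmt, flags=re.DOTALL)
    return dt.strftime(expanded)


dt = datetime(2019, 10, 3, 9, 30, 12, 133333)
print(strftime_with_nanos(dt, 337, "%H:%M:%S.%9"))  # 09:30:12.133333337
print(strftime_with_nanos(dt, 337, "%%9 %S"))       # %9 12
```

The inserted digits contain no percent signs, so the second pass through `datetime.strftime` cannot misread them as directives.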

@AlexeyDmitriev

@jgehrcke

"Third note to self: for the specific format I want to emit in my example above there is this working shortcut: t.isoformat() yields '2019-11-07T12:29:23.444348736'."

The problem with your shortcut, though, is that if t happens to be a whole number of microseconds, the trailing 000 is not printed.

@smarie
Contributor

smarie commented May 20, 2022

Note that for Periods there is a different convention in pandas: %u means microseconds and %n means nanoseconds. This is mostly because for such objects, %f means "fiscal year".

8 participants