BUG: Timedelta components rounded by float imprecision #31354

pganssle · 2020-01-27T16:19:06Z

Problem description

There appears to be some premature rounding in the Timedelta constructor, so that the unit-adjusted days, seconds, microseconds and nanoseconds attributes do not sum to the total number of nanoseconds. A fundamental assumption of the datetime.timedelta type is that the total time difference, at microsecond precision, can be recovered by summing the unit-adjusted days, seconds and microseconds attributes. This is how timedelta.total_seconds() works, and breaking the assumption breaks Liskov substitutability.

I believe that this is the root cause of issue #31043, which was "fixed" with what is essentially a workaround in PR #31155, as I mentioned in this comment.

At the moment the most obvious effect is that bug #31043 is only fixed for recent versions of dateutil, but presumably the same problem will show up anywhere else that standard datetime arithmetic is used on pandas timestamps.

Code Sample, a copy-pastable example if possible

import pandas as pd

def to_ns(td):
  # Rebuild the total nanoseconds from the public components
  ns = td.days * 86400
  ns += td.seconds
  ns *= 1000000
  ns += td.microseconds
  ns *= 1000
  ns += td.nanoseconds
  return ns

td = pd.Timedelta(1552211999999999872, unit="ns")
print(td.value)  # 1552211999999999872
print(to_ns(td))  # 1552212000000000872

Actual output:

1552211999999999872
1552212000000000872

Expected output

1552211999999999872
1552211999999999872

mroeschke · 2020-01-28T06:01:14Z

I suspect this line might be responsible for premature rounding:

pandas/pandas/_libs/tslibs/timedeltas.pyx

Line 1253 in 0a099d8

td_base = _Timedelta.__new__(cls, microseconds=int(value) / 1000)

In [15]: from datetime import timedelta

In [17]: timedelta(microseconds=int(1552211999999999872) / 1000).seconds
Out[17]: 36000
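
To illustrate where the extra microsecond comes from, here is a minimal standard-library sketch (no pandas involved) comparing true division with floor division on the same value. The variable names are illustrative only and do not come from the pandas source.

```python
from datetime import timedelta

value_ns = 1552211999999999872

# True division produces a float. Near 1.5e15, adjacent floats are 0.25 apart,
# so the fractional microseconds (.872) cannot be stored exactly.
us_float = value_ns / 1000
# Floor division stays in exact integer arithmetic.
us_int = value_ns // 1000

td_float = timedelta(microseconds=us_float)  # fractional part is rounded up
td_int = timedelta(microseconds=us_int)

print(td_float.seconds, td_float.microseconds)  # 36000 0
print(td_int.seconds, td_int.microseconds)      # 35999 999999
print(td_float - td_int)                        # 0:00:00.000001
```

The float path lands one microsecond too high, which matches the discrepancy between the two numbers in the report above.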

pd.Timedelta actually has private attributes that compose value accurately; we just need to ensure that the corresponding attributes in datetime.timedelta are passed those values:

In [23]: ((td._d * 86400 + td._seconds) * 1000000 + td._microseconds) * 1000 + td._ns
Out[23]: 1552211999999999872
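
As a rough sketch of the integer-only approach (not the actual pandas fix), the components can be split out with divmod so that no float is ever involved. The helper names split_ns and join_ns are hypothetical and exist only for this illustration.

```python
def split_ns(value_ns):
    """Split a nanosecond count into (days, seconds, microseconds, nanoseconds)
    using exact integer arithmetic. Hypothetical helper, not pandas API."""
    us, ns = divmod(value_ns, 1000)
    s, us = divmod(us, 1_000_000)
    d, s = divmod(s, 86400)
    return d, s, us, ns


def join_ns(d, s, us, ns):
    """Inverse of split_ns: recompose the total nanoseconds."""
    return ((d * 86400 + s) * 1_000_000 + us) * 1000 + ns


value_ns = 1552211999999999872
parts = split_ns(value_ns)
print(parts)                      # (17965, 35999, 999999, 872)
print(join_ns(*parts) == value_ns)  # True
```

Because every step is a divmod on integers, the round trip is exact for any nanosecond count, which is the property the to_ns check in the report relies on.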

pganssle changed the title ~~BUG: Timedelta~~ BUG: Timedelta components rounded due by float imprecision Jan 27, 2020

pganssle changed the title ~~BUG: Timedelta components rounded due by float imprecision~~ BUG: Timedelta components rounded by float imprecision Jan 27, 2020

mroeschke added Bug Timedelta Timedelta data type labels Jan 28, 2020

mroeschke mentioned this issue Jan 28, 2020

BUG: Timedelta components no longer rounded with high precision integers #31380

Merged

5 tasks

mroeschke closed this as completed in #31380 Feb 2, 2020

simonjayhawkins mentioned this issue May 29, 2022

BUG: Timedelta.total_seconds method is returning wrong values in nanosecond intervals #46819

Open

3 tasks
