BUG: Timedelta components rounded by float imprecision #31354


Closed
pganssle opened this issue Jan 27, 2020 · 1 comment · Fixed by #31380
Labels
Bug Timedelta Timedelta data type

Comments

@pganssle
Contributor

Problem description

It appears that some premature rounding happens in the Timedelta constructor, so that the unit-adjusted days, seconds, microseconds and nanoseconds attributes do not sum to the total number of nanoseconds. A fundamental assumption of the datetime.timedelta type (and breaking this assumption breaks Liskov substitutability) is that the total time difference, at microsecond precision, can be recovered by summing the unit-adjusted days, seconds and microseconds attributes; this is how timedelta.total_seconds() works.
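To make that assumption concrete, here is a minimal sketch using only the standard library. The helper name `total_microseconds` is hypothetical and introduced purely for illustration; it checks that the three stored components of a `datetime.timedelta` reproduce the full duration exactly.

```python
from datetime import timedelta


def total_microseconds(td):
    """Rebuild the total duration (in microseconds) from the stored components."""
    return (td.days * 86400 + td.seconds) * 10**6 + td.microseconds


td = timedelta(days=17965, seconds=35999, microseconds=999999)

# Floor-dividing by one microsecond gives the exact integer duration,
# avoiding the float returned by total_seconds().
assert total_microseconds(td) == td // timedelta(microseconds=1)
print(total_microseconds(td))
```

Code that treats a subclass instance as a plain `timedelta` relies on this identity, which is why inconsistent components on a subclass are a substitutability problem.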

I believe that this is the root cause of issue #31043, which was "fixed" with what is essentially a workaround in PR #31155, as I mentioned in this comment.

At the moment the most obvious effect is that bug #31043 is only fixed for recent versions of dateutil, but presumably the problem will show up in other places where standard datetime arithmetic is used on pandas timestamps.

Code Sample, a copy-pastable example if possible

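import pandas as pd

def to_ns(td):
    # Rebuild total nanoseconds from the public components.
    ns = td.days * 86400     # days -> seconds
    ns += td.seconds
    ns *= 1000000            # seconds -> microseconds
    ns += td.microseconds
    ns *= 1000               # microseconds -> nanoseconds
    ns += td.nanoseconds
    return ns

td = pd.Timedelta(1552211999999999872, unit="ns")
print(td.value)  # 1552211999999999872
print(to_ns(td))  # 1552212000000000872 on affected versions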

Actual output:

1552211999999999872
1552212000000000872

Expected output

1552211999999999872
1552211999999999872

@pganssle changed the title from "BUG: Timedelta" to "BUG: Timedelta components rounded due by float imprecision" on Jan 27, 2020
@pganssle changed the title from "BUG: Timedelta components rounded due by float imprecision" to "BUG: Timedelta components rounded by float imprecision" on Jan 27, 2020

@mroeschke
Member

I suspect this line might be responsible for premature rounding:

td_base = _Timedelta.__new__(cls, microseconds=int(value) / 1000)

In [15]: from datetime import timedelta

In [17]: timedelta(microseconds=int(1552211999999999872) / 1000).seconds
Out[17]: 36000
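
A small standard-library comparison may help show where the precision goes. This is only an illustration of the float behaviour, not the pandas code path itself: true division produces a float that cannot hold all the digits, whereas floor division stays in exact integer arithmetic.

```python
from datetime import timedelta

value = 1552211999999999872  # nanoseconds, taken from the report above

# True division yields a float; at this magnitude it cannot represent the
# fractional microseconds exactly, and timedelta then rounds to a whole
# microsecond, carrying over into the seconds field.
lossy = timedelta(microseconds=value / 1000)

# Floor division keeps everything as exact integers.
exact = timedelta(microseconds=value // 1000)

print(lossy.seconds, lossy.microseconds)  # 36000 0
print(exact.seconds, exact.microseconds)  # 35999 999999
```

The nanosecond remainder (`value % 1000`) is not representable in `datetime.timedelta` at all, so it would still have to be tracked separately.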

pd.Timedelta actually has private attributes that can compose value accurately. We just need to ensure that the corresponding attributes in datetime.timedelta are passed those:

In [23]: ((td._d * 86400 + td._seconds) * 1000000 + td._microseconds) * 1000 + td._ns
Out[23]: 1552211999999999872
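
One way to picture such a fix, as a sketch rather than the actual pandas change: split the integer nanosecond value with `divmod` and hand the exact components to `datetime.timedelta`. The helper `split_ns` is hypothetical and exists only for this example.

```python
from datetime import timedelta


def split_ns(value):
    """Split integer nanoseconds into (days, seconds, microseconds, nanoseconds)."""
    total_us, ns = divmod(value, 1000)
    total_s, us = divmod(total_us, 10**6)
    days, s = divmod(total_s, 86400)
    return days, s, us, ns


value = 1552211999999999872
days, s, us, ns = split_ns(value)

# Build the base timedelta from exact integer components (no float involved).
base = timedelta(days=days, seconds=s, microseconds=us)

# The components now round-trip to the original nanosecond count.
rebuilt = ((base.days * 86400 + base.seconds) * 10**6 + base.microseconds) * 1000 + ns
assert rebuilt == value
print(days, s, us, ns)  # 17965 35999 999999 872
```

Because `divmod` floors, negative values come out with a negative day count and non-negative remaining fields, which matches how `datetime.timedelta` normalizes its own components.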
