BUG: to_timedelta overflows without raising in some very particular cases #17037

Closed
cchwala opened this issue Jul 20, 2017 · 8 comments
Labels
Bug Timedelta Timedelta data type

@cchwala
Contributor

cchwala commented Jul 20, 2017

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

int_min = np.iinfo(np.int64).min
int_max = np.iinfo(np.int64).max

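def float_array_with_smallest_increments(initial_float, N_points_in_one_direction):
    # Collect the N nearest representable floats on each side of initial_float.
    floats_upward = [initial_float]
    floats_downward = [initial_float]
    for i in range(N_points_in_one_direction):
        # np.nextafter(x, y) returns the next representable float after x towards y
        floats_upward.append(np.nextafter(floats_upward[-1], int_max))
        floats_downward.append(np.nextafter(floats_downward[-1], int_min))
    # Ascending order, with initial_float in the middle
    return np.array(floats_downward[::-1] + floats_upward[1:])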

seconds_as_floats = float_array_with_smallest_increments(int_max/1e9, 5)

for v in np.nditer(seconds_as_floats):
    print('%.20f' % v)
    
pd.to_timedelta(seconds_as_floats, unit='s')

Output:

9223372036.85476684570312500000
9223372036.85476875305175781250
9223372036.85477066040039062500
9223372036.85477256774902343750
9223372036.85477447509765625000
9223372036.85477638244628906250
9223372036.85477828979492187500
9223372036.85478019714355468750
9223372036.85478210449218750000
9223372036.85478401184082031250
9223372036.85478591918945312500

TimedeltaIndex([  '106751 days 23:47:16.854767',
                  '106751 days 23:47:16.854769',
                  '106751 days 23:47:16.854771',
                  '106751 days 23:47:16.854773',
                  '106751 days 23:47:16.854774',
                '-106752 days +00:12:43.145224',
                '-106752 days +00:12:43.145226',
                '-106752 days +00:12:43.145228',
                '-106752 days +00:12:43.145230',
                '-106752 days +00:12:43.145232',
                '-106752 days +00:12:43.145234'],
               dtype='timedelta64[ns]', freq=None)

Here is a more detailed notebook showing the problem

Problem description

If you pass floating-point values close to the overflow boundary to to_timedelta, it can silently return an incorrect Timedelta (wrapped around to a large negative value, as in the output above) instead of raising an OverflowError.
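For illustration, here is a minimal sketch of the behavior the report asks for, written in plain Python. `seconds_to_ns_checked` is a hypothetical helper, not a pandas function, and it assumes the conversion amounts to multiplying seconds by 1e9 and storing the result as int64 nanoseconds; the actual pandas code path may differ.

```python
import math

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def seconds_to_ns_checked(seconds):
    """Hypothetical helper (not part of pandas): float seconds -> int nanoseconds.

    Raises OverflowError instead of returning a wrapped-around value.
    """
    ns = seconds * 1e9
    if not math.isfinite(ns):
        raise OverflowError("non-finite nanosecond value: %r" % ns)
    # Python ints have arbitrary precision, so int() cannot wrap here.
    ns_int = int(ns)
    if not INT64_MIN <= ns_int <= INT64_MAX:
        raise OverflowError("%r s does not fit into int64 nanoseconds" % seconds)
    return ns_int


# A value just below the boundary converts normally ...
print(seconds_to_ns_checked(9223372036.854774))

# ... while the next representable float above it is rejected.
try:
    seconds_to_ns_checked(9223372036.854776)
except OverflowError as exc:
    print("OverflowError:", exc)
```

The range check runs on an arbitrary-precision Python int, so it cannot itself be affected by the overflow it is guarding against.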

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Darwin
OS-release: 14.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: None.None

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 33.1.1.post20170320
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: 0.9.5-11-gff2e4dd
IPython: 5.3.0
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.0
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None

@gfyoung gfyoung added Bug Timedelta Timedelta data type labels Jul 20, 2017
@gfyoung
Member

gfyoung commented Jul 20, 2017

@cchwala : Weird... I'm inclined to believe there are some machine rounding issues going on, but this certainly merits further investigation. If you can pinpoint the cause, feel free to share and / or submit a PR to patch this behavior!
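The thread does not pin down the cause, but one reading consistent with the reported output can be sketched in plain Python. It assumes the nanosecond value is computed as a float64 and then stored as int64 with two's-complement wrap-around; `wrap_to_int64` is a hypothetical helper that simulates that wrap, not pandas code.

```python
from datetime import timedelta

INT64_MAX = 2**63 - 1

# int64 max is not representable as a float64; it rounds up to 2**63.
print(float(INT64_MAX) == 2.0**63)


def wrap_to_int64(value):
    """Simulate a two's-complement wrap of an arbitrary Python int into int64."""
    return (value + 2**63) % 2**64 - 2**63


# Floats at and just above 2**63 are spaced 2048 apart.
for k in range(3):
    ns = wrap_to_int64(2**63 + k * 2048)
    # timedelta has microsecond resolution, so the sub-microsecond part is dropped.
    print(timedelta(microseconds=ns // 1000))
```

Under these assumptions the wrapped values land at about -106752 days plus 12 minutes 43.1452 seconds, which lines up with the negative entries in the TimedeltaIndex from the report.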

@cchwala
Contributor Author

cchwala commented Jul 28, 2017

@gfyoung : Right now, I have no time to dive into this and fix it, but I could at least provide a test that fails on this bug, probably as a PR. This would be a starting point. Or would you rather leave this as it is until somebody really wants to tackle the problem?

@gfyoung
Member

gfyoung commented Jul 28, 2017

@cchwala : We'll just leave it open for now in case anyone wants to tackle it. Thanks for reporting it!

@jbrockmendel
Member

@jreback is this taken care of by #17640?

@jreback
Contributor

jreback commented Sep 26, 2017

hmm might be

can u check and if so issue a PR with validation tests?

@jbrockmendel jbrockmendel mentioned this issue Sep 29, 2017
59 tasks
@ron819

ron819 commented Nov 5, 2018

@jbrockmendel according to your todo list this issue has been fixed.

@jbrockmendel
Member

@ron819 is the list accurate? Let's be sure to defend against the failure mode of me messing up

@jbrockmendel jbrockmendel self-assigned this Jan 28, 2019
@rhshadrach
Member

I now get pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: cannot convert input 9223372036.854776 with the unit 's'. Closing.

6 participants