BUG: out-of-bounds datetime parsing #15829

Closed
jreback opened this issue Mar 29, 2017 · 13 comments · Fixed by #41607
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions
Milestone

Comments

@jreback
Contributor

jreback commented Mar 29, 2017

In [2]: pd.__version__
Out[2]: '0.19.0+695.g2e64614'

Surprisingly, [2] and [3] return the same result as [1] on linux-64 (meaning they don't raise).
The output below is from macOS, and it shows the correct behavior (raising).

In [1]: pd.Timestamp('01-01-01')
Out[1]: Timestamp('2001-01-01 00:00:00')

In [2]: pd.Timestamp('001-01-01')
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00

In [3]: pd.Timestamp('0001-01-01')
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00
@jreback jreback added Difficulty Intermediate Error Reporting Incorrect or improved errors from pandas Datetime Datetime data dtype labels Mar 29, 2017
@jreback jreback added this to the Next Major Release milestone Mar 29, 2017
@jorisvandenbossche
Member

I get the correct behaviour (so raising) on linux-64 (Ubuntu 16.04, python 2 and 3)

In [51]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-70-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

@jreback
Contributor Author

jreback commented Mar 29, 2017

Very weird. The reason I noticed this was that #15828 failed on Travis with this, and it also failed on a local linux-64 VM (on Python 2.7).

macOS is on Python 3.6. It's possible this is a Python 2/3 issue.

@jorisvandenbossche
Member

I tried both Python 3.5 and 2.7, and both worked correctly.

@jbrockmendel
Member

@jreback in 0.23.3 I get OutOfBoundsDatetime in both [2] and [3] on both OSX[py27,py37] and Ubuntu16.04[py27,py35]. Do you still get the original behavior?

@mzamyatina

Hi, I get that error too on Ubuntu 14.04, Python 3.6.8 Anaconda custom (64-bit):

In [1]: import pandas

In [2]: pandas.__version__
Out[2]: '0.24.1'

In [3]: pandas.Timestamp('0001-01-01')
---------------------------------------------------------------------------
OutOfBoundsDatetime                       Traceback (most recent call last)
<ipython-input-3-fbea3956a1c7> in <module>
----> 1 pandas.Timestamp('0001-01-01')
pandas/_libs/tslibs/timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.__new__()
pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_to_tsobject()
pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_str_to_tsobject()
pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_str_to_tsobject()
pandas/_libs/tslibs/np_datetime.pyx in pandas._libs.tslibs.np_datetime.check_dts_bounds()
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1-01-01 00:00:00

Any updates on a fix?

@jbrockmendel
Member

@justmeteomary4 the behavior in [3] you posted is correct; we expect that to raise because pandas Timestamp doesn't support dates before 1677-09-21. If you need to represent an earlier date, use pd.Period.
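
The 1677-09-21 limit mentioned above can be reproduced with a little arithmetic. The sketch below uses only the standard library and assumes that a nanosecond timestamp is stored as a signed 64-bit count of nanoseconds since 1970-01-01; the names `lower_bound`, `upper_bound`, and `fits_in_ns_range` are made up for illustration and are not pandas APIs.

```python
from datetime import datetime, timedelta

# Assumption: timestamps are a signed 64-bit count of nanoseconds
# since the Unix epoch, so the largest usable offset is 2**63 - 1 ns.
INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)

# datetime only has microsecond resolution, so drop the last three digits.
span = timedelta(microseconds=INT64_MAX // 1000)

lower_bound = EPOCH - span
upper_bound = EPOCH + span

print(lower_bound.date())  # 1677-09-21
print(upper_bound.date())  # 2262-04-11


def fits_in_ns_range(year, month, day):
    """Return True if midnight on the given date lies inside the bounds."""
    return lower_bound <= datetime(year, month, day) <= upper_bound


print(fits_in_ns_range(1, 1, 1))     # False: year 1 is far outside the range
print(fits_in_ns_range(2001, 1, 1))  # True
```

Roughly 292 years fit on either side of 1970, which is why year 1 cannot be represented at nanosecond resolution while 2001 can.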

@mroeschke
Member

This should be fully correct now and could use a test

@mroeschke mroeschke added good first issue Needs Tests Unit test(s) needed to prevent regressions and removed 2/3 Compat Error Reporting Incorrect or improved errors from pandas Datetime Datetime data dtype labels Mar 31, 2020
@ghost

ghost commented Apr 19, 2020

For me the test fails on macOS Catalina 10.15.3:

import pytest
from pandas import Timestamp
from pandas._libs.tslibs.np_datetime import OutOfBoundsDatetime


@pytest.mark.parametrize(
    "s",
    [
        "01-01-01",  # Fails: parsed as 2001-01-01, so nothing is raised
        "001-01-01",  # Passes
        "0001-01-01",  # Passes
    ],
)
def test_15829(s):
    # GH 15829
    with pytest.raises(OutOfBoundsDatetime):
        Timestamp(s)

pandas is a fresh master checkout.

Update: it appears that the first string, "01-01-01", is being interpreted as 2001-01-01, while the others are interpreted as 0001-01-01. Is 01 -> 2001 an expected year conversion?

@jbrockmendel
Member

> Is 01 -> 2001 an expected year conversion?

I think so, we're going to be following dateutil convention on this. What dateutil do you have installed?
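
To make the 01 -> 2001 conversion concrete, here is a minimal pure-Python sketch of a "nearest century" rule for short year tokens. It is modeled on my understanding of how dateutil treats two-digit years and is an assumption, not dateutil's or pandas' actual code; `expand_year` and its `current_year` parameter are hypothetical names.

```python
def expand_year(token, current_year):
    """Hypothetical helper: expand a year token the way a
    'nearest century' rule would."""
    year = int(token)
    # Assumption: tokens longer than two digits are taken literally,
    # so "001" and "0001" both mean year 1.
    if len(token) > 2:
        return year
    # Two-digit tokens start in the current century...
    year += current_year // 100 * 100
    # ...and are shifted so they land within 50 years of current_year.
    if year >= current_year + 50:
        year -= 100
    elif year < current_year - 50:
        year += 100
    return year


for token in ("01", "001", "0001"):
    print(token, "->", expand_year(token, current_year=2020))
```

Under this rule "01" becomes 2001, which is inside the Timestamp bounds, while "001" and "0001" stay at year 1 and are out of bounds. That matches the pass/fail pattern reported in the test above.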

@ghost

ghost commented Apr 23, 2020

@jbrockmendel, no dateutil installed. I observed that the conversion is done by pandas extension code; specifically, it happens before check_dts_bounds(..), somewhere in np_datetime.c. Most likely it is a wrong call to days_to_yearsdays, as that is the only place where I've found 2000 being added to years so far. So no dateutil is being used.

Update: I'm personally surprised that pandas is using its own date/calendar code instead of relying on some other library.

@ghost

ghost commented Apr 29, 2020

After some thought, I believe this could be closed as "works as expected", unless either there's some clever idea for how to emit a warning for this specific case (how specific?), or the pandas core team wants to change the way dates are parsed.

@jbrockmendel
Member

> no dateutil installed

I don't think this is possible. pandas shouldn't be able to install/import without dateutil.
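
One way to check which dateutil is present in the active environment, sketched with the standard library's `importlib.metadata` (Python 3.8+). The helper name `installed_version` is made up for illustration; "python-dateutil" is the distribution name used on PyPI.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the
    distribution is not found in this environment."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if dateutil is installed, otherwise None.
print(installed_version("python-dateutil"))
```

Running `pip show python-dateutil` or calling `pd.show_versions()` should report the same information.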

@ghost

ghost commented Apr 30, 2020

Found it: python-dateutil==2.8.1

@mroeschke mroeschke mentioned this issue May 21, 2021
10 tasks
@jreback jreback modified the milestones: Contributions Welcome, 1.3 May 21, 2021
5 participants