Code Sample, a copy-pastable example if possible

```python
import pandas as pd
import datetime

d0 = datetime.date(1970, 1, 1)    # python epoch
d1 = datetime.date(9999, 12, 31)  # oft-used max-date in SAS / DB2

for days in range(106752, (d1 - d0).days):  # 2932896 is the number of days
    try:
        data = {'days_from_python_epoch': [days]}
        df = pd.DataFrame(data)
        df['date_as_date'] = pd.to_datetime(df['days_from_python_epoch'],
                                            unit='d', origin='1970-01-01')
    except pd._libs.tslibs.np_datetime.OutOfBoundsDatetime as e:
        print(f'exception with {days}')
        z = d0 + datetime.timedelta(days=days - 1)
        print(f'looks like {days} - 1 -> {z} is the max possible')  # '2262-04-11T00:00:00.000000000' is the max possible
        raise e  # 106752 throws exception
```
Problem description
This causes the import of some SAS datasets to fail, and potentially imports/conversions from other sources (such as DB2) where an arbitrarily high date (often 9999-12-31) is used in data warehousing to mark currently valid rows with to_date=9999-12-31.
See related issue #20927 (reposted here because the issue is more generic than just SAS dataset imports), and possibly #15836.
I'm pretty sure the root cause is that to_datetime() converts via nanoseconds (in _libs.tslibs.timedeltas.cast_from_unit), and very large dates expressed in nanoseconds do not fit into int64 space.
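The int64 overflow can be checked with plain arithmetic, without touching pandas internals (a sketch; the 106751/2262-04-11 figures match the loop output above):

```python
import datetime

NS_PER_DAY = 86_400 * 10**9  # nanoseconds in one day
epoch = datetime.date(1970, 1, 1)

# days from the epoch to 9999-12-31, as in the code sample above
days = (datetime.date(9999, 12, 31) - epoch).days  # 2932896

# that many days in nanoseconds overflows a signed 64-bit integer
print(days * NS_PER_DAY > 2**63 - 1)          # True

# the largest whole-day count that still fits in int64 nanoseconds
max_days = (2**63 - 1) // NS_PER_DAY          # 106751
print(epoch + datetime.timedelta(days=max_days))  # 2262-04-11
```

This is why the loop starting at 106752 raises OutOfBoundsDatetime on its very first iteration.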
Expected Output
Correctly parsed dates up to and including 9999-12-31.
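Until then, one possible workaround (not from pandas itself, just a sketch) is to skip the nanosecond path entirely and build Python datetime.date objects, which land in an object-dtype column with no int64-nanosecond limit:

```python
import datetime
import pandas as pd

epoch = datetime.date(1970, 1, 1)
df = pd.DataFrame({'days_from_python_epoch': [0, 106752, 2932896]})

# object-dtype column of datetime.date values; avoids datetime64[ns] bounds
df['date_as_date'] = df['days_from_python_epoch'].apply(
    lambda d: epoch + datetime.timedelta(days=d))

print(df['date_as_date'].iloc[-1])  # 9999-12-31
```

The trade-off is that object-dtype dates lose the vectorized .dt accessor and datetime64 performance, but they round-trip the 9999-12-31 sentinel intact.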
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.1
pytest: 3.6.2
pip: 10.0.1
setuptools: 39.2.0
Cython: 0.28.3
numpy: 1.14.5
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.5
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.4
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.5
lxml: 4.2.2
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.8
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None