pd.to_datetime raises AttributeError with specific inputs when errors='ignore' #12424

Closed
aktiur opened this issue Feb 23, 2016 · 1 comment · Fixed by #13909
Labels
Bug Datetime Datetime data dtype

aktiur commented Feb 23, 2016

I'm trying to import a CSV file into a PostgreSQL table using odo. During import, odo tries to detect date columns automatically by calling pd.to_datetime with errors='ignore'. However, for one of my columns (not a date column, but some kind of zip code), pd.to_datetime raises an AttributeError instead of returning the input unchanged.

I have been able to narrow down the problem to this small snippet:

>>> import numpy as np
>>> import pandas as pd
>>> pd.to_datetime(pd.Series(['01210', np.nan]), errors='ignore')
Traceback (most recent call last):
  File "pandas\tslib.pyx", line 1952, in pandas.tslib.array_to_datetime (pandas\tslib.c:35219)
  File "pandas\tslib.pyx", line 1432, in pandas.tslib._check_dts_bounds (pandas\tslib.c:26809)
pandas.tslib.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1210-01-01 00:00:00

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pandas\tslib.pyx", line 1981, in pandas.tslib.array_to_datetime (pandas\tslib.c:35724)
  File "pandas\tslib.pyx", line 1975, in pandas.tslib.array_to_datetime (pandas\tslib.c:35602)
  File "pandas\tslib.pyx", line 1228, in pandas.tslib.convert_to_tsobject (pandas\tslib.c:23563)
  File "pandas\tslib.pyx", line 1432, in pandas.tslib._check_dts_bounds (pandas\tslib.c:26809)
pandas.tslib.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1210-01-01 00:00:00

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Miniconda3\envs\py35\lib\site-packages\pandas\util\decorators.py", line 89, in wrapper
    return func(*args, **kwargs)
  File "C:\Miniconda3\envs\py35\lib\site-packages\pandas\tseries\tools.py", line 276, in to_datetime
    unit=unit, infer_datetime_format=infer_datetime_format)
  File "C:\Miniconda3\envs\py35\lib\site-packages\pandas\tseries\tools.py", line 390, in _to_datetime
    values = _convert_listlike(arg._values, False, format)
  File "C:\Miniconda3\envs\py35\lib\site-packages\pandas\tseries\tools.py", line 372, in _convert_listlike
    require_iso8601=require_iso8601)
  File "pandas\tslib.pyx", line 1847, in pandas.tslib.array_to_datetime (pandas\tslib.c:37155)
  File "pandas\tslib.pyx", line 2005, in pandas.tslib.array_to_datetime (pandas\tslib.c:36116)
AttributeError: 'float' object has no attribute 'view'
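
Until this is fixed, a caller that only wants best-effort date detection could guard the call itself. The following is a minimal sketch, not what odo or pandas actually does: `to_datetime_or_original` is a hypothetical helper name, and the list of caught exceptions is an assumption based on the traceback above (OutOfBoundsDatetime is a ValueError subclass). It asks pandas to raise on failure and falls back to the untouched input:

```python
import numpy as np
import pandas as pd


def to_datetime_or_original(values):
    """Hypothetical helper: best-effort datetime conversion.

    Returns the converted values when pandas can parse them, otherwise
    returns the input object unchanged.
    """
    try:
        return pd.to_datetime(values, errors="raise")
    except (ValueError, TypeError, OverflowError, AttributeError):
        # ValueError covers parse failures and out-of-bounds timestamps;
        # AttributeError covers the failure reported in this issue.
        return values


zip_codes = pd.Series(["01210", np.nan])
result = to_datetime_or_original(zip_codes)
print(result.dtype)  # the dtype depends on the installed pandas version
```

Whether '01210' is rejected or parsed as a year depends on the pandas and dateutil versions, so the printed dtype is not guaranteed; the point is only that the call no longer propagates an exception.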

Output of pd.show_versions():

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.17.1
nose: None
pip: 8.0.2
setuptools: 19.6.2
Cython: None
numpy: 1.10.1
scipy: 0.16.0
statsmodels: None
IPython: 4.0.1
sphinx: 1.3.1
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
Jinja2: 2.8

@jorisvandenbossche (Member)

On master, the out-of-bounds error is no longer shown, but the AttributeError: 'float' object has no attribute 'view' is still there.

It seems this error is only triggered when the input contains a NaN together with a string that cannot be converted to a datetime (here, an out-of-bounds date):

In [7]: pd.to_datetime(pd.Series(['2012-01-01', np.nan]), errors='ignore')
Out[7]:
0   2012-01-01
1          NaT
dtype: datetime64[ns]

In [8]: pd.to_datetime(pd.Series([np.nan]), errors='ignore')
Out[8]:
0   NaT
dtype: datetime64[ns]

In [9]: pd.to_datetime(pd.Series(['1012-01-01', np.nan]), errors='ignore')
AttributeError: 'float' object has no attribute 'view'

In [10]: pd.to_datetime(pd.Series(['1012-01-01']), errors='ignore')
Out[10]:
0    1012-01-01
dtype: object

In [11]: pd.to_datetime(pd.Series(['1012-01-01', '2012-01-01']), errors='ignore')
Out[11]:
0    1012-01-01
1    2012-01-01
dtype: object
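
The five cases above can be re-run in one go on any pandas version with a small probe. This is only a sketch: `CASES` and `probe` are names made up for illustration, and the `errors` argument defaults to 'raise' on the assumption that errors='ignore' may be deprecated or unavailable in newer pandas releases; pass errors='ignore' explicitly to reproduce the calls from this issue on an older version.

```python
import numpy as np
import pandas as pd

# The same inputs as In [7] to In [11], keyed by a short description.
CASES = {
    "valid + NaN": ["2012-01-01", np.nan],
    "NaN only": [np.nan],
    "out of bounds + NaN": ["1012-01-01", np.nan],
    "out of bounds only": ["1012-01-01"],
    "out of bounds + valid": ["1012-01-01", "2012-01-01"],
}


def probe(values, errors="raise"):
    """Return the result dtype as a string, or the exception class name."""
    try:
        return str(pd.to_datetime(pd.Series(values), errors=errors).dtype)
    except Exception as exc:  # deliberately broad: we want to see what leaks out
        return type(exc).__name__


outcomes = {label: probe(values) for label, values in CASES.items()}
for label, outcome in outcomes.items():
    print(f"{label:>24}: {outcome}")
```

With errors='ignore' on an affected version, an "AttributeError" entry in the output marks a case where the exception escapes even though errors were supposed to be suppressed.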
