better error message for to_datetime #4928

Closed
cpcloud opened this issue Sep 22, 2013 · 6 comments · Fixed by #5157
Labels: Datetime (datetime data dtype), Docs, Error Reporting (incorrect or improved errors from pandas)
Comments

@cpcloud
Member

cpcloud commented Sep 22, 2013

cc @danbirken

In [2]: to_datetime([1,'1'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-1d4cd9e078aa> in <module>()
----> 1 to_datetime([1,'1'])

/home/phillip/Documents/code/py/pandas/pandas/tseries/tools.pyc in to_datetime(arg, errors, dayfirst, utc, box, format, coerce, unit)
    137         return Series(values, index=arg.index, name=arg.name)
    138     elif com.is_list_like(arg):
--> 139         return _convert_listlike(arg, box=box)
    140
    141     return _convert_listlike(np.array([ arg ]), box=box)[0]

/home/phillip/Documents/code/py/pandas/pandas/tseries/tools.pyc in _convert_listlike(arg, box)
    117                 result = tslib.array_to_datetime(arg, raise_=errors == 'raise',
    118                                                  utc=utc, dayfirst=dayfirst,
--> 119                                                  coerce=coerce, unit=unit)
    120             if com.is_datetime64_dtype(result) and box:
    121                 result = DatetimeIndex(result, tz='utc' if utc else None)

/home/phillip/Documents/code/py/pandas/pandas/tslib.so in pandas.tslib.array_to_datetime (pandas/tslib.c:16487)()

TypeError: object of type 'int' has no len()
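
The message above gives no hint that the mixed int/str list is the problem. As a rough illustration only, the same message appears whenever code assumes every element is a string and calls `len()` on it. The helper `naive_lengths` below is hypothetical and is not taken from `tslib`; whether `array_to_datetime` fails for this exact reason is an assumption.

```python
def naive_lengths(values):
    # Hypothetical helper, not pandas code: assumes every element is a string.
    return [len(v) for v in values]


try:
    naive_lengths([1, "1"])
except TypeError as exc:
    # The error names the offending type but not the position or the
    # fact that a list of strings was expected.
    message = str(exc)
    print(message)
```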
@danbirken
Contributor

Yeah this is a little strange. I don't know enough about the history of these functions, but in the documentation of to_datetime() it says:

arg : string, datetime, array of strings (with possible NAs)

However, array_to_datetime(), and by proxy to_datetime(), supports certain cases of being passed arrays of ints or floats:

In [2]: pd.to_datetime([1, 2])
Out[2]:
<class 'pandas.tseries.index.DatetimeIndex'>
[1970-01-01 00:00:00.000000001, 1970-01-01 00:00:00.000000002]
Length: 2, Freq: None, Timezone: None

In [3]: pd.to_datetime([1.5, 2.5])
Out[3]:
<class 'pandas.tseries.index.DatetimeIndex'>
[1970-01-01 00:00:00.000000001, 1970-01-01 00:00:00.000000002]
Length: 2, Freq: None, Timezone: None

So I can:

a) Make it so you must pass an array of strings to to_datetime(), otherwise throw a nice exception that says you passed in a list that wasn't strings

b) Make it so array_to_datetime() (and by proxy to_datetime()) returns the same array back in case it cannot be processed. So like:

# Current behavior for two un-processable strings
In [4]: pd.to_datetime(['1', '2'])
Out[4]: array(['1', '2'], dtype=object)

# Potential behavior for your input.  Un-processable, just return it back
In [4]: pd.to_datetime([1, '1'])
Out[4]: array([1, '1'], dtype=object)

c) some other option I didn't think of
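
A minimal sketch of option a), assuming a stand-alone validation step that runs before any parsing. The name `require_all_strings` and the wording of the message are hypothetical and are not part of pandas.

```python
def require_all_strings(values):
    """Hypothetical check for option a): reject any non-string element."""
    for position, value in enumerate(values):
        if not isinstance(value, str):
            raise TypeError(
                "expected a list of strings, but element %d is of type %s"
                % (position, type(value).__name__)
            )
    return list(values)


print(require_all_strings(["1", "2"]))

try:
    require_all_strings([1, "1"])
except TypeError as exc:
    print(exc)
```

Note the trade-off: a strict check like this would also reject the all-int and all-float inputs shown earlier, which currently convert without error.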

@jreback
Contributor

jreback commented Sep 22, 2013

There are a bunch of test cases, I think in tseries/test/test_timeseries, IIRC.

On errors it will just return the input, unless errors='raise' is passed; if coerce=True, it will turn errors into NaT.
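
The three behaviours described here can be sketched in plain Python. This is an illustration under stated assumptions, not the pandas implementation: `parse_all` and its `fmt` parameter are hypothetical, `None` stands in for NaT, and letting `coerce` take precedence over `errors='raise'` is a choice made for this sketch.

```python
from datetime import datetime


def parse_all(values, errors="ignore", coerce=False, fmt="%Y-%m-%d"):
    """Hypothetical stand-in for the error handling discussed above."""
    parsed = []
    for value in values:
        try:
            parsed.append(datetime.strptime(value, fmt))
        except (TypeError, ValueError):
            if coerce:
                # Bad element becomes a missing value (None in place of NaT).
                parsed.append(None)
                continue
            if errors == "raise":
                raise ValueError("cannot parse %r as a datetime" % (value,))
            # Default: give the caller back the input, unconverted.
            return list(values)
    return parsed


print(parse_all([1, "1"]))
print(parse_all([1, "2013-09-22"], coerce=True))
```

With the defaults, the mixed list from the original report comes back unchanged instead of surfacing a low-level TypeError, which matches option b).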

@jreback
Contributor

jreback commented Oct 7, 2013

@danbirken can you address this one? I think that b) is the correct response here (it will just return it to you), unless errors='raise' is passed in which case it will raise

although, maybe in this case as the input is completely bogus, you could raise....

@danbirken
Contributor

Sorry for the delay in responding -- I'm still on a very long vacation :)

I think this change should do it.

@jreback
Contributor

jreback commented Oct 8, 2013

gr8... just roll that into a PR and add a release notes entry... thanks

@danbirken
Contributor

Put into PR, release notes should be in there.

jreback added a commit that referenced this issue Oct 8, 2013
BUG: Fix to_datetime() uncaught error with unparseable inputs #4928