Surprising results of pandas.to_datetime() #11725

Closed
jankatins opened this issue Nov 30, 2015 · 4 comments
Labels
Datetime Datetime data dtype Usage Question

Comments

@jankatins
Contributor

I just used pd.to_datetime on a string column and inspected the head of the dataframe, and it looked good. Unfortunately, I had German dates (day.month.year) with high day values in the first rows. These were converted correctly, but later rows had dates (in the next month) with low day values, and for those the conversion switched to American style (month.day.year). I would expect at least a warning that the date formats differ, or that the first date format found is used for all dates:

>>> pd.to_datetime(["29.01.1945","1.3.1945", "02.03.1945"])
DatetimeIndex(['1945-01-29', '1945-01-03', '1945-02-03'], dtype='datetime64[ns]', freq=None)
@jreback
Contributor

jreback commented Nov 30, 2015

You need to specify dayfirst (which defaults to False):

In [4]: pd.to_datetime(["29.01.1945","1.3.1945", "02.03.1945"],dayfirst=True)
Out[4]: DatetimeIndex(['1945-01-29', '1945-03-01', '1945-03-02'], dtype='datetime64[ns]', freq=None)
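
When every value shares one layout, an explicit format string is another way to avoid per-value guessing. This is not suggested in the thread itself; the sketch below assumes all inputs are day.month.year and uses the documented `format` argument of `pd.to_datetime`. The names `values`, `iso_dates` and `mismatch_raised` are illustrative only.

```python
import pandas as pd

# Same strings as in the report; assumed to all be day.month.year.
values = ["29.01.1945", "1.3.1945", "02.03.1945"]

# An explicit format applies one interpretation to every value.
parsed = pd.to_datetime(values, format="%d.%m.%Y")
iso_dates = parsed.strftime("%Y-%m-%d").tolist()
print(iso_dates)

# A value that does not match the format raises instead of being reinterpreted.
try:
    pd.to_datetime(["01/29/1945"], format="%d.%m.%Y")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```

With `format` given, a string in a different layout fails loudly rather than being silently read as month-first.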

@jreback jreback closed this as completed Nov 30, 2015
@jreback jreback added Datetime Datetime data dtype Usage Question labels Nov 30, 2015
@jankatins
Contributor Author

I know how to work around it; I'm just surprised that pd.to_datetime() can change its interpretation between values of the same Series.

E.g. if dayfirst == False, then why does it sometimes (as with "29.01.1945") treat the first field as the day, even though that position should hold the month, and not raise/warn?

@jorisvandenbossche
Member

@JanSchulz You are entirely right that this is confusing and problematic behaviour, but it is a limitation imposed by the dependency on dateutil, as explained (briefly) in the docstring: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_datetime.html
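
To make the per-value switch concrete, here is a toy model using only the standard library. It is not dateutil's actual code. It assumes a parser that tries month.day.year first and falls back to day.month.year only when the first reading is impossible. The helper name `parse_month_first_with_fallback` is hypothetical.

```python
from datetime import datetime


def parse_month_first_with_fallback(text):
    """Try month.day.year; fall back to day.month.year if that is invalid."""
    try:
        return datetime.strptime(text, "%m.%d.%Y")
    except ValueError:
        # Only reached when the first field cannot be a month (e.g. 29).
        return datetime.strptime(text, "%d.%m.%Y")


values = ["29.01.1945", "1.3.1945", "02.03.1945"]
parsed = [parse_month_first_with_fallback(v).date().isoformat() for v in values]
print(parsed)
```

Under this assumption each string is resolved independently, so the list reproduces the mixed result from the original report: only the first value is read day-first.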

@jorisvandenbossche
Member

See also #3341

3 participants