-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
pd.to_datetime/pd.DatetimeIndex unexpected behavior #12501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
pls show a copy-pastable example, not a screen shot. |
Sure, you can use this:
will make difference choices regarding day or month first:
|
This is a known issue with See also #7348 |
Well you have an option to control this,
|
@marcomayer But I agree that this can be very confusing! Note that using |
yes, a warning or so might be something to consider I think. The way I work is that I read in millions of records, then convert the dates etc. and usually just do a head() to check if things are fine. My guess is that others do the same thing... In a perfect world, it would check if there is any ambiguity and if so at least warn or just stop and require format. |
I don't know if this is a bug or not (I guess not), but I think this behavior might be very confusing to many users who like me until now might not be aware of this.
When converting a string column to Date using pd.to_datetime or pd.DatetimeIndex the way the date is interpreted might change from row to row. I experienced this today where March 1st was interpreted as January 3rd. Probably pandas does this simply on a per row set considering only one sample at a time. One way to improve this would be to first go through the whole
At least I'd suggest to implement a warning if none of the format/dayfirst parameters are used.
Here's a screenshot to clarify:
http://snag.gy/mXT7g.jpg
output of
pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
pandas: 0.17.1
nose: 1.3.7
pip: 8.0.2
setuptools: 19.6.2
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
IPython: 4.1.1
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.6
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.5.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: None
Jinja2: 2.8
The text was updated successfully, but these errors were encountered: