auto convert from string to datetime64 in iterrows. #19671
Comments
We had an old issue about this, but can't seem to find it. so
is a legit (though maybe weird) parse by love for you to take a look.
Hope you don't mind if I jump into this issue. Since iterrows constructs a Series for each row of the frame, the dtype is inferred row by row. As @jreback rightly put it, 'M1701' is inferred as datetime whereas 'M1609' stays object. So I think the issue is that the dtype upcast for each Series is not always consistent, because each row is evaluated individually. A potential fix could be to pass the numpy dtype (self.values.dtype in iterrows) when creating each Series, so that all Series (i.e. rows of the DataFrame) generated by iterrows get a consistent upcast across columns. Although numpy does upcast numerics (e.g. int to float), it doesn't do datetime inference.
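To make the suggestion concrete, here is a minimal sketch of building the row Series with an explicit dtype. It is not the actual iterrows implementation; the `rows` list and the example frame are only for illustration, and it assumes that passing `dtype=object` is enough to suppress inference on the row values.

```python
import pandas as pd

df = pd.DataFrame({
    "vict_shop_id": ["414", "1809"],
    "cann_open_dt": pd.to_datetime(["2017-12-25", "2017-12-25"]),
})

# For a frame with mixed column types, .values is a 2D object array.
values = df.values

# Illustration only: build one Series per row, passing the array dtype
# explicitly so each row is not inferred on its own.
rows = [
    pd.Series(v, index=df.columns, name=k, dtype=values.dtype)
    for k, v in zip(df.index, values)
]

for row in rows:
    print(row.dtype, type(row["vict_shop_id"]).__name__)
```

With an explicit dtype every row comes back with the same dtype, and the shop id stays a string regardless of what it happens to look like.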
Here's the basic problem. We do inference on mixed strings & datetimes upon construction. This allows one to pass 'NaT and/or mixed datetimelike (e.g. you can also pass
So @minggli, it's the logic actually in
Sorry for the delay. I raised a PR exposing require_iso8601 in DataFrame.iterrows(), to_datetime, and the Series APIs.
In [1]: import pandas as pd

In [2]: pd.__version__
Out[2]: '1.1.2'

In [3]: pd.DataFrame({'vict_shop_id': ['414', '1809'], 'cann_open_dt': pd.to_datetime(['2017-12-25', '2017-12-25'])})
Out[3]:
  vict_shop_id cann_open_dt
0          414   2017-12-25
1         1809   2017-12-25

In [4]: df = pd.DataFrame({'vict_shop_id': ['414', '1809'], 'cann_open_dt': pd.to_datetime(['2017-12-25', '2017-12-25'])})

In [5]: for i, row in df.iterrows():
   ...:     print(row)
   ...:
vict_shop_id                    414
cann_open_dt    2017-12-25 00:00:00
Name: 0, dtype: object
vict_shop_id   1809-01-01
cann_open_dt   2017-12-25
Name: 1, dtype: datetime64[ns]
Behavior is still occurring in 1.5.3 as well.
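In the meantime, one way to sidestep the per-row inference is to iterate with itertuples() instead of iterrows(). This is only a workaround sketch, not a fix for the issue. itertuples() yields namedtuples rather than building a Series per row, so no dtype is inferred for the row. The `shop_ids` name is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "vict_shop_id": ["414", "1809"],
    "cann_open_dt": pd.to_datetime(["2017-12-25", "2017-12-25"]),
})

# Each row is a namedtuple holding the original per-column values,
# so the string ids are never run through datetime inference.
shop_ids = [row.vict_shop_id for row in df.itertuples(index=False)]

print(shop_ids)
```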
Any update?
Problem description
Hi, I found an automatic conversion issue in iterrows (or the index): the string 'M1701' is converted to '1701-01-01', which is not supposed to happen. Is this a bug?
Thanks.
Expected Output
Output of pd.show_versions()
pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.27.3
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None