BUG: in read_csv, keep_date_cols doesn't result in correct dtype #13378
Comments
@Thomasillo not sure the docstring actually says that it should be parsed first, as it only applies to a list-of-lists of dates. There are not very many tests for this either, so it's not clear whether this is an issue or not. cc @gfyoung
@jreback : I consider this behaviour a bug. Casting should still be done if the column is being kept, because it is still a data column afterwards. I think we'll need to remove this function here and just do the date casting first before doing other data conversions, but I'm not sure yet. @Thomasillo : For the time being, try using the Python engine instead. It will correctly convert to
Again, this issue highlights a major discrepancy between the two engines: the order in which operations are applied does not match. As we can see here, the C engine applies the data conversions first before doing date conversions, whereas the Python engine does it the other way around. This is another reason why the ordering needs to be aligned so that the behaviour can be fixed.
@jreback : IMO the difficulty is at least "intermediate," as this issue is another manifestation of the larger issue I described above.
whoever fixes it will then get 'intermediate' points (to be spent just like virtual cash :>) 😉
@gfyoung : Thanks. However, I had to solve the problem differently for now. On 06.06.2016 at 16:46, gfyoung wrote:
@jreback : fair enough 😃 @Thomasillo : cool - thanks for bringing up the issue!
The reported bug is still present in version 0.24.1, but there has been no activity here for the last three years. Shall we close it for now?
@holy-motors : It just means that no good solution has been found for it yet. You are more than welcome to investigate if you like!
>>> import pandas
>>> import io
>>> data = """A
20150908
20150909
"""
>>> t=pandas.read_csv(io.StringIO(data))
>>> t.dtypes
A int64
dtype: object
>>> t=pandas.read_csv(io.StringIO(data),parse_dates={'date':['A']},keep_date_col=True)
>>> t.dtypes
date datetime64[ns]
A object
dtype: object
The second time, the datatype of the column 'A' should also be int64.
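One possible workaround, sketched here as an assumption and not a fix proposed anywhere in this thread, is to skip `parse_dates`/`keep_date_col` entirely: read the file normally so that 'A' keeps its inferred int64 dtype, then build the 'date' column afterwards with `pandas.to_datetime`. The helper name `read_with_date_copy` and the `%Y%m%d` format are illustrative choices based on the sample data above.

```python
import io

import pandas as pd

data = """A
20150908
20150909
"""


def read_with_date_copy(text):
    """Hypothetical helper: keep 'A' as integers and add a parsed 'date' column."""
    # Plain read: no date parsing, so 'A' is inferred as int64.
    df = pd.read_csv(io.StringIO(text))
    # Parse the dates from a string view of 'A'; the original column is untouched.
    parsed = pd.to_datetime(df["A"].astype(str), format="%Y%m%d")
    # Put 'date' first to mirror the column order in the report above.
    df.insert(0, "date", parsed)
    return df


t = read_with_date_copy(data)
print(t.dtypes)
```

This sidesteps the ordering difference between the engines because the date conversion happens after `read_csv` has finished, at the cost of one extra pass over the column.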