DOC: read_csv: date_format #59557

Closed
1 task done
igurin-invn opened this issue Aug 20, 2024 · 2 comments · Fixed by #59566
Labels
Docs, IO CSV (read_csv, to_csv)

Comments

igurin-invn commented Aug 20, 2024

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/dev/reference/api/pandas.read_csv.html

Documentation problem

When using read_csv, I came across a UserWarning: Could not infer format.... I tried to resolve it by providing a date_format, but the warning wouldn't go away. Then I realized that my problem was that I was providing specifically the date format but not the time format.
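A minimal sketch of the situation described above, assuming pandas 2.0 or later (where `date_format` is available). The sample data and the name `full_format` are illustrative only; the point is that the format string has to describe the time portion as well as the date portion.

```python
from io import StringIO

import pandas as pd

# Illustrative data: the index column holds a date AND a 12-hour time.
csv_text = ",A\n7/12/2024 5:40:41 PM,0\n7/12/2024 5:41:41 PM,0"

# The format covers the date part (%m/%d/%Y) and the time part (%I:%M:%S %p).
full_format = "%m/%d/%Y %I:%M:%S %p"

df = pd.read_csv(
    StringIO(csv_text),
    index_col=0,
    parse_dates=True,
    date_format=full_format,
)
print(df.index)
```

Because the format is given explicitly and matches every value, pandas has nothing to infer for this column.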

The documentation for this parameter is confusing. It starts with what appear to be two alternate and contradictory definitions of the parameter:

  • "Format to use for parsing dates when used in conjunction with parse_dates."
  • "The strftime to parse time, e.g. "%d/%m/%Y"."

Suggested fix for documentation

Say something like this:

"Format to use for parsing dates and/or times. Used in conjunction with parse_dates. See strftime documentation for more information on choices..."
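To illustrate why "dates and/or times" matters in the wording, here is a small sketch using Python's standard-library `datetime.strptime` as a stand-in. pandas uses its own parser, so treat this only as an illustration of the shared directive syntax (`%m`, `%d`, `%Y`, `%I`, `%M`, `%S`, `%p`), not of pandas internals. The variable names are hypothetical.

```python
from datetime import datetime

raw = "7/12/2024 5:40:41 PM"

# A date-only format does not consume the time portion of the string.
try:
    datetime.strptime(raw, "%m/%d/%Y")
    date_only_ok = True
except ValueError as exc:
    date_only_ok = False
    print(f"date-only format rejected: {exc}")

# A format that also describes the time portion parses the whole string.
parsed = datetime.strptime(raw, "%m/%d/%Y %I:%M:%S %p")
print(parsed)
```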

igurin-invn added the Docs and Needs Triage labels Aug 20, 2024
maymunashah added a commit to maymunashah/pandas that referenced this issue Aug 20, 2024
rhshadrach added the IO CSV label and removed the Needs Triage label Aug 20, 2024
@rhshadrach
Member

Thanks for the report. Does this always result in a datetime, or can it result in a date and/or time? I think that would determine whether the documentation should be refined to say datetime or date/time/datetime.

> It looks like someone didn't proofread enough.

Reports and contributions are welcome, this is not.

@igurin-invn
Author

> Does this always result in a datetime, or can it result in a date and/or time? I think that would determine whether the documentation should be refined to say datetime or date/time/datetime.

Looks like the result is always a DatetimeIndex.

>>> from io import StringIO
>>> import pandas as pd
>>> pd.read_csv(StringIO(',A\n7/12/2024 5:40:41 PM,0\n7/12/2024 5:41:41 PM,0'),  # date and time
...             parse_dates=True, index_col=0).index
DatetimeIndex(['2024-07-12 17:40:41', '2024-07-12 17:41:41'], dtype='datetime64[ns]', freq=None)
>>> pd.read_csv(StringIO(',A\n7/12/2024,0\n7/12/2024,0'),  # date only
...             parse_dates=True, index_col=0).index
DatetimeIndex(['2024-07-12', '2024-07-12'], dtype='datetime64[ns]', freq=None)
>>> pd.read_csv(StringIO(',A\n5:40:41 PM,0\n5:41:41 PM,0'),  # time only (the current date is filled in)
...             parse_dates=True, index_col=0).index
DatetimeIndex(['2024-08-20 17:40:41', '2024-08-20 17:41:41'], dtype='datetime64[ns]', freq=None)
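The session above relies on format inference. As a hedged companion sketch, the same check can be repeated with an explicit `date_format` (assuming pandas 2.0 or later). With a time-only format, the missing date fields should fall back to the usual strptime default of 1900-01-01 rather than today's date, so the result is still a full datetime. The variable names are illustrative.

```python
from io import StringIO

import pandas as pd

# Time-only values, parsed with an explicit time-only format.
time_only = pd.read_csv(
    StringIO(",A\n5:40:41 PM,0\n5:41:41 PM,0"),
    index_col=0,
    parse_dates=True,
    date_format="%I:%M:%S %p",
).index

# Date-only values, parsed with an explicit date-only format.
date_only = pd.read_csv(
    StringIO(",A\n7/12/2024,0\n7/12/2024,0"),
    index_col=0,
    parse_dates=True,
    date_format="%m/%d/%Y",
).index

print(type(time_only).__name__, time_only[0])
print(type(date_only).__name__, date_only[0])
```

In both cases the index is a DatetimeIndex; the date-only values carry a midnight time component.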

> Reports and contributions are welcome, this is not.

Sorry. No offense intended. I took it out.

mroeschke added a commit that referenced this issue Aug 21, 2024
#59566)

DOC: Clarify date_format usage in read_csv documentation specifically pandas/io/parsers/readers.py issue #59557

Co-authored-by: Matthew Roeschke <[email protected]>