BUG: datetime parsing: error message indicating position of conflicting string is wrong for larger data #55345

Open
jorisvandenbossche opened this issue Oct 1, 2023 · 7 comments
Labels: Bug, Datetime (Datetime data dtype), Error Reporting (Incorrect or improved errors from pandas)

Comments

@jorisvandenbossche
Member

Using the latest pandas main (this also happens on the released version 2.1.1):

In [1]: pd.to_datetime(["2012-01-01"] * 49 + ["2012-01-02 09"])
...
ValueError: unconverted data remains when parsing with format "%Y-%m-%d": " 09", at position 49. You might want to try:
    - passing `format` if your strings have a consistent format;
    - passing `format='ISO8601'` if your strings are all ISO8601 but not necessarily in exactly the same format;
    - passing `format='mixed'`, and the format will be inferred for each element individually. You might want to use `dayfirst` alongside this.

In [2]: pd.to_datetime(["2012-01-01"] * 50 + ["2012-01-02 09"])
...
ValueError: unconverted data remains when parsing with format "%Y-%m-%d": " 09", at position 1. You might want to try:
...

In the first case, it correctly says "position 49", while in the second case (n > 50), it confusingly says "position 1".
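To compare the two cases side by side, the reported position can be pulled out of the error message and printed for each list length. This is a minimal sketch, not a diagnosis: the helper names `reported_position`, `position_from_to_datetime`, and `compare_positions` are hypothetical, and trying `cache=False` is only an assumption about where the discrepancy might come from (`cache` is a documented `pd.to_datetime` parameter, but nothing in this thread identifies it as the cause).

```python
import re
from typing import Optional


def reported_position(message: str) -> Optional[int]:
    """Return the integer that follows 'at position' in an error message."""
    match = re.search(r"at position (\d+)", message)
    return int(match.group(1)) if match else None


def position_from_to_datetime(values, **kwargs) -> Optional[int]:
    """Call pd.to_datetime and return the position named in its ValueError.

    Returns None if no ValueError is raised or no position is mentioned.
    """
    import pandas as pd  # imported lazily; only this helper needs pandas

    try:
        pd.to_datetime(values, **kwargs)
    except ValueError as exc:
        return reported_position(str(exc))
    return None


def compare_positions() -> None:
    """Print the reported position for the two list lengths from the report."""
    good, bad = "2012-01-01", "2012-01-02 09"
    for n_good in (49, 50):
        values = [good] * n_good + [bad]
        default = position_from_to_datetime(values)
        # Assumption: disabling the cache is just one variable worth testing.
        uncached = position_from_to_datetime(values, cache=False)
        print(f"{n_good} good values: default={default}, cache=False={uncached}")
```

Calling `compare_positions()` in an environment with pandas installed prints one line per case, which makes it easy to see whether the reported position matches the index of the offending string.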

@jorisvandenbossche added the Bug, Datetime, and Error Reporting labels Oct 1, 2023
@KartikeyBartwal

starting to brawl with this issue

@KartikeyBartwal

no issues on my machine:
[image attachment not preserved]

@paulreece
Contributor

I can confirm this occurs on the main development branch:

>>> pd.to_datetime(["2012-01-01"] * 50 + ["2012-01-02 09"])
Traceback (most recent call last):
...
ValueError: unconverted data remains when parsing with format "%Y-%m-%d": " 09", at position 1. You might want to try:
...

>>> pd.to_datetime(["2012-01-01"] * 49 + ["2012-01-02 09"])
Traceback (most recent call last):
...
ValueError: unconverted data remains when parsing with format "%Y-%m-%d": " 09", at position 49. You might want to try:
    ...

@KartikeyBartwal

It might be clashing with some other package. Could you share the contents of your requirements.txt file?

@jorisvandenbossche
Member Author

@KartikeyBartwal my guess is that you are using an older version of pandas (starting with pandas 2.0, the datetime parsing got stricter, and we now parse all values using the same format by default, see https://pandas.pydata.org/pdeps/0004-consistent-to-datetime-parsing.html)

@rsm-23
Contributor

rsm-23 commented Oct 5, 2023

take

@Kartikey-Bartwal

> @KartikeyBartwal my guess is that you are using an older version of pandas (starting with pandas 2.0, the datetime parsing got stricter, and we now parse all values using the same format by default, see https://pandas.pydata.org/pdeps/0004-consistent-to-datetime-parsing.html)

You got it right! My version was '1.3.4'
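The version check that settled this exchange can be sketched as follows. The function name `has_consistent_format_parsing` is hypothetical, and the 2.0 cutoff is taken from the comment quoted above; the sketch assumes a plain `major.minor.patch` version string such as the one reported by `pd.__version__`.

```python
def has_consistent_format_parsing(version: str) -> bool:
    """Return True if the version string is pandas 2.0 or later.

    Per the comment above, pandas 2.0 started parsing all values with a
    single format by default. Only the major component is inspected.
    """
    major = int(version.split(".")[0])
    return major >= 2


# Versions mentioned in this thread.
for version in ("1.3.4", "2.1.1"):
    print(version, has_consistent_format_parsing(version))
```

With pandas installed, `has_consistent_format_parsing(pd.__version__)` gives a quick check before trying to reproduce the error.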


5 participants