BUG: read_csv with trailing comma, specified header names and usecols gives confusing error #29042
Comments
Hi, the file has a header, so I'm not sure whether header=0 and the names argument can coexist. I did not hit the issue when I either changed header to None or removed the names argument. The error seems to come from combining header and names with usecols.
As far as I understand, they can co-exist. The goal here was to overwrite the existing names. The header keyword explanation mentions this use case:
I just realized that I now understand the error message: if you specify both custom
works correctly.
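For reference, a minimal sketch of how header=0, names and positional usecols combine on a well-formed file (no trailing commas). The sample data and the replacement names are made up for illustration; only the header, names and usecols keywords come from the discussion.

```python
from io import StringIO

import pandas as pd

# Hypothetical well-formed sample: three header fields, three fields per row.
data = "a,b,c\n1,2,3\n4,5,6\n"

# header=0 marks the first line as a header row; names then replaces it.
renamed = pd.read_csv(StringIO(data), header=0, names=["A", "B", "C"])

# Integer usecols are positional: keep the 2nd and 3rd columns and name them.
subset = pd.read_csv(
    StringIO(data), header=0, names=["foo", "bar"], usecols=[1, 2]
)

print(renamed.columns.tolist())
print(subset)
```

Here names has one entry per selected column, which avoids any ambiguity about how names and usecols line up.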
usecols: Return a subset of the columns. Alternatively, you can write code like this:
That works, but the purpose was to only select a subset of the columns.
If you want to select a subset of the columns, you can try this:
Ah, yes, that works (also specifying
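A hedged sketch of one way to get a renamed subset without passing names and usecols together (not necessarily what was posted above): read every column, then select by position and rename afterwards. The sample data and the new column names are assumptions for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample with a trailing comma on every data row.
data = "a,b,c,d\n1,2,3,4,\n5,6,7,8,\n"

# index_col=False is the documented switch that stops pandas from treating
# the first column as the index when rows end with a delimiter.
full = pd.read_csv(StringIO(data), index_col=False)

# Select the 2nd to 4th columns by position, then rename them.
subset = full.iloc[:, [1, 2, 3]].copy()
subset.columns = ["B", "C", "D"]

print(subset)
```

This reads more data than strictly needed, so it trades some memory for a call that does not depend on how names and usecols interact.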
I've encountered a similar issue:
Expected behavior: pandas uses the first column as the index and returns a data frame with one column "a" and values "2, b".
Actual behavior: pandas 0.25.3
The error is quite confusing, especially since in my actual situation I had a long list of columns and most of them were also mentioned in usecols. I spent some time looking for the mismatch mentioned in the error.
The same problem is discussed on SO: https://stackoverflow.com/questions/31017823/pandas-returns-passed-header-names-mismatches-usecols-error
Apart from a bad error message, I think the behavior displayed by @jorisvandenbossche is actually a bug.
works while leaving 'E' out does not.
Encountered this somewhat strange case: if you have a malformed file (trailing commas), we typically set the first column as the index. But if you also pass custom names and use usecols, you get a cryptic error message:

I am not fully sure what the behaviour should be, but the current error message is not very helpful (since the usecols argument only uses integers, they are positional and don't need to match the header names).
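To illustrate the "first column as the index" behaviour mentioned above, here is a small sketch based on the trailing-delimiter example in the pandas IO documentation. It does not reproduce the error from this report; the sample data is illustrative only.

```python
from io import StringIO

import pandas as pd

# Three header fields, but each data row has a trailing comma (four fields).
data = "a,b,c\n4,apple,bat,\n8,orange,cow,"

# Default: one more data field than header names, so pandas uses the
# first field of each row as the index.
implicit = pd.read_csv(StringIO(data))

# index_col=False keeps the first field as a regular column instead.
explicit = pd.read_csv(StringIO(data), index_col=False)

print(implicit)
print(explicit)
```

In the default call the values shift one column to the left relative to the header, which is why positional usecols and custom names can end up referring to different columns than expected.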