io.read_csv: ValueError due to long lines #12203
Comments
Here's a longer but self-contained example, based on pandas read_csv, that filters columns with usecols.
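The example itself was not preserved in this thread. A minimal sketch of the kind of usecols call under discussion (sample data assumed, not from the original post) might look like:

```python
import io
import pandas as pd

# Assumed sample data: a header-less CSV where every row has three fields.
data = "1,2,3\n4,5,6\n7,8,9\n"

# usecols with integer positions keeps only the requested columns.
df = pd.read_csv(io.StringIO(data), header=None, usecols=[0, 1])
print(df.shape)  # (3, 2)
```

This works as expected when every line has the same number of fields; the thread is about what happens when they do not.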
So why does the version where the columns are selected based on …
Looks like a dupe of this: #9755. Please confirm.
I cannot confirm. The reason: the examples in #9755 all rely on column names. In my case, the data has no column names, so I would like to be able to specify the columns to use without relying on names.
OK, I'll mark it, though I suspect it's a very similar solution. Pull requests welcome.
I noted that …
The whole problem is that the extra fields break parsing before any column selection happens. E.g. when reading such a file, usecols=[0, 1] would allow getting a DataFrame with columns 1 and 2. What I seek, as do others on SO, is to limit the number of columns that are parsed in the first place. In this case, usecols does not apply and does not work. But real-world data, often from data loggers, produces such artefacts on a random basis...
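A common workaround, not given in this thread but sketched here under the assumption that an upper bound on the field count is known, is to pass enough names to cover the longest possible line. The C engine then tokenizes every row, and the unwanted trailing columns can be dropped afterwards:

```python
import io
import pandas as pd

# Ragged CSV: the last line has more fields than the others.
data = "1,2\n3,4\n5,6,7,8\n"

# Supplying names for the maximum expected field count (4 here) lets
# the C engine parse every row; short rows are padded with NaN.
df = pd.read_csv(io.StringIO(data), header=None, names=range(4))

# Keep only the first two columns, mimicking usecols=[0, 1].
trimmed = df[[0, 1]]
print(trimmed.shape)  # (3, 2)
```

The same idea also works with usecols directly, e.g. `names=range(4), usecols=[0, 1]`, as long as the names cover the longest line.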
I am trying to read a CSV file where some lines are longer than the rest. The pandas C engine throws an error on these lines, but I do not want to skip them, as discussed in a comment on a similar issue. I would prefer to "cut off" the "bad columns" using usecols, but I get the following error:

Throws:
ValueError: Expected 10 fields in line 100, saw 20
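For completeness, the skip-based behaviour the reporter wants to avoid can be requested explicitly. This is a sketch for recent pandas versions, where `on_bad_lines` (pandas >= 1.3) replaced the older `error_bad_lines` flag:

```python
import io
import pandas as pd

# Ragged CSV: the last line has one field too many.
data = "1,2\n3,4\n5,6,7\n"

# on_bad_lines="skip" silently drops rows with too many fields
# (older pandas versions used error_bad_lines=False instead).
df = pd.read_csv(io.StringIO(data), header=None, on_bad_lines="skip")
print(len(df))  # 2
```

This discards the long rows entirely, which is exactly what the reporter says is unacceptable here; it is shown only to contrast with the usecols approach.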
Why is the ValueError raised although I explicitly defined the columns to use?

References: