read_csv() crashes if engine='c', header=None, and 2+ extra columns #26218
Comments
Specifying the columns to use here would give you the result you want:
stream.seek(0)
pd.read_csv(
    stream,
    header=None,
    names=['one', 'two', 'three'],
    index_col=False,
    engine='c',
    usecols=[0, 1, 2]
)
Behavior should be consistent, though I'm not sure dropping the remaining columns is really the correct approach regardless. Let's see what others think.
It isn't; it's a known bug that has been hanging around for months. See #22144.
OK, thanks. The linked issue seems related, though not quite the same. If you'd like to take a look and submit a PR for this or the other, that would certainly be appreciated!
I think I found the problem. Do I fork off of
Fork off master.
Hi, we were relying on the pre-1.1 behavior to raise an error when the provided list of names mismatched the data, and the exception is no longer caught: https://travis-ci.org/github/bccp/nbodykit/jobs/725070683#L3274
What about adding a read_csv() argument to strictly enforce the matching between the provided schema and the data? Or is there a canonical way for the caller to check this?
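Until something like that exists, one option is for the caller to check the field counts before handing the stream to read_csv(). The snippet below is only a rough sketch using the standard library csv module: check_field_counts is a hypothetical helper, not a pandas API, and the sample data is made up for illustration.

```python
import csv
import io


def check_field_counts(stream, expected_names, delimiter=","):
    """Raise ValueError if a non-empty row's field count differs from the schema.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    expected = len(expected_names)
    start = stream.tell()  # remember where we started reading
    try:
        reader = csv.reader(stream, delimiter=delimiter)
        for row_number, row in enumerate(reader, start=1):
            # csv.reader yields an empty list for blank lines; skip those.
            if row and len(row) != expected:
                raise ValueError(
                    f"row {row_number}: expected {expected} fields, found {len(row)}"
                )
    finally:
        # Rewind so the same stream can still be passed to a parser afterwards.
        stream.seek(start)


names = ["one", "two", "three"]
good = io.StringIO("1,2,3\n4,5,6\n")          # made-up data matching the schema
bad = io.StringIO("1,2,3,4,5\n6,7,8,9,10\n")  # made-up data with two extra columns

check_field_counts(good, names)  # passes silently

try:
    check_field_counts(bad, names)
except ValueError as exc:
    print(exc)
```

This reads the input twice, so it may not suit very large files; for those, checking only the first few rows might be an acceptable compromise.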
Code Sample, a copy-pastable example if possible
Problem description
The normal behavior for read_csv() is to ignore extra columns if it's given names. However, if the CSV has two or more extra columns and the engine is c then it crashes. The exact same CSV can be read correctly if engine is python.
Expected Output
Behavior of reading a CSV with the C and Python engines should be identical.
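As a rough way to check this for the usecols workaround suggested in the comments, the sketch below reads the same input with both engines and compares the results. The data string and the read_first_three helper are made up for illustration and are not the original code sample from this issue.

```python
import io

import pandas as pd

# Made-up input: five fields per row, but only three names are supplied.
data = "1,2,3,4,5\n6,7,8,9,10\n"
names = ["one", "two", "three"]


def read_first_three(engine):
    """Read only the first three columns with the given parser engine."""
    return pd.read_csv(
        io.StringIO(data),
        header=None,
        names=names,
        index_col=False,
        usecols=[0, 1, 2],  # restrict parsing to the named columns
        engine=engine,
    )


frame_c = read_first_three("c")
frame_python = read_first_three("python")

# Report whether the two engines produced identical frames.
print(frame_c.equals(frame_python))
```

Selecting the columns explicitly avoids relying on how either engine treats unnamed extra columns, which is the part this issue shows to be inconsistent.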
Output of pd.show_versions()
I tested this on Python 3.7.3 as well, but for some reason pd.show_versions() keeps blowing up with: