Loading CSV files (using read_csv) with blank lines between header and data rows quits Python interpreter
#28071
Comments
Thanks for the report. Self-contained example
I believe I've narrowed it down to this line: pandas/pandas/_libs/src/parser/tokenizer.c Line 1192 in b68a9bb
Debugging @TomAugspurger's example above, I think we could add a check that
take
I do not claim to understand exactly what happens here, but I created a test that reproduces the problem and, with this fix, produces the expected results.
Code Sample, a copy-pastable example if possible
Problem description
I have been trying to load a test CSV file ("my_csv.txt", attached), which is structured so that there is an information text on the second row, the header line on the fifth row, and the data starting at the ninth row. As you can see in the Python code above, read_csv fails when nrows=1, but doesn't when nrows>1.

I think there's an uncaught bug in pandas' read_csv when a CSV file has blank lines between the header and the start of the data rows. Thank you for your hard work maintaining and extending this very useful library.

my_csv.txt