BUG: Improve error message for skipfooter malformed rows in Python engine #14749

Merged

gfyoung (Member) commented Nov 26, 2016

Python's native CSV library does not respect the skipfooter parameter, so if one of the skipped
rows is malformed, it will still raise an error. This PR appends a hint to the error message when skipfooter is in use and that situation may apply.

Closes #13879.
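
The behavior described above can be sketched with the standard library alone. This is a minimal illustration, not pandas' actual code: `parse_then_drop_footer` is a hypothetical helper, and `strict=True` is an assumption made here so that the csv module raises `csv.Error` on malformed quoting.

```python
import csv
import io


def parse_then_drop_footer(text, skipfooter=0):
    """Parse every row with the csv module, then discard the footer rows.

    Hypothetical helper for illustration only; not pandas' implementation.
    """
    # Assumption: strict=True makes csv raise csv.Error on malformed
    # quoting instead of accepting it silently.
    rows = list(csv.reader(io.StringIO(text), strict=True))
    # The footer is dropped only after all rows have been parsed.
    return rows[:max(len(rows) - skipfooter, 0)]


good = 'a,b\n1,2\ntotal: 1 row\n'
bad = 'a\n1\n"b"a'  # stray character after the closing quote in the last row

# A well-formed footer is parsed and then discarded.
print(parse_then_drop_footer(good, skipfooter=1))

# A malformed footer still fails, even though it would have been skipped.
try:
    parse_then_drop_footer(bad, skipfooter=1)
except csv.Error as err:
    print("csv.Error:", err)
```

The second call fails because the parser reaches the footer row before any rows are dropped, which is the situation the new error message hint is meant to explain.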

@jreback added the "Error Reporting" (Incorrect or improved errors from pandas) and "IO CSV" (read_csv, to_csv) labels Nov 26, 2016
if self.skipfooter > 0:
    reason = ('Error could possibly be due to '
              'skipfooter being ignored in Python\'s '
              'native csv library.')
Member

I am wondering if we can word this a little differently. As written, it seems to imply that skipfooter is ignored altogether ("so why can I specify it then?").
Do I understand correctly that the skipped lines are still parsed, but not used when converting the parsed lines to a DataFrame (so, e.g., the number of values in the skipped lines does not need to match the rest of the file)?

Member Author

Yes, Python's native CSV library does not care about skipfooter, so any malformed lines in the skipfooter section will raise an error. However, if there are no such errors, these rows are then discarded afterwards.

Member Author

@jorisvandenbossche : Any thoughts on rewording (if any)?

Member

Trying a long version:

Error could possibly be due to parsing errors in the skipped footer rows (the skipfooter keyword is only applied after Python's csv library has parsed all rows)

(is that correct?)
If you have ideas to shorten it a bit but keep it clear, welcome :-)

Member Author

I like your suggestion! I'll change it to that.
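
A rough sketch of how the agreed wording could be attached to a parsing error, using only the standard library. The names `FOOTER_HINT` and `read_rows`, the use of `ValueError`, and `strict=True` are assumptions for illustration; the merged pandas code may differ.

```python
import csv
import io

# Wording taken from the suggestion in this thread.
FOOTER_HINT = ("Error could possibly be due to parsing errors in the "
               "skipped footer rows (the skipfooter keyword is only "
               "applied after Python's csv library has parsed all rows)")


def read_rows(text, skipfooter=0):
    """Hypothetical reader that appends the hint only when skipfooter > 0."""
    try:
        # Assumption: strict=True so malformed quoting raises csv.Error.
        rows = list(csv.reader(io.StringIO(text), strict=True))
    except csv.Error as err:
        msg = str(err)
        if skipfooter > 0:
            # The hint is relevant only if footer rows were requested to be skipped.
            msg += ". " + FOOTER_HINT
        raise ValueError(msg) from err
    return rows[:max(len(rows) - skipfooter, 0)]


try:
    read_rows('a\n1\n"b"a', skipfooter=1)
except ValueError as err:
    print(err)
```

Keeping the hint conditional on `skipfooter > 0` means users who never set the keyword see the original csv error unchanged.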

codecov-io commented Nov 26, 2016

Current coverage is 85.27% (diff: 100%)

Merging #14749 into master will increase coverage by <.01%

@@             master     #14749   diff @@
==========================================
  Files           144        144          
  Lines         50911      50915     +4   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43413      43419     +6   
+ Misses         7498       7496     -2   
  Partials          0          0          

Powered by Codecov. Last update 2f43ac4...8aae4fe

@gfyoung gfyoung force-pushed the skipfooter-python-error branch 2 times, most recently from 8bcfb77 to 9b1d065 Compare November 28, 2016 21:43
…gine

Python's native CSV library does not respect the
skipfooter parameter, so if one of those skipped
rows is malformed, it will still raise an error.

Closes gh-13879.
@gfyoung gfyoung force-pushed the skipfooter-python-error branch from 9b1d065 to 8aae4fe Compare November 29, 2016 01:00
Member Author

gfyoung commented Nov 29, 2016

@jreback , @jorisvandenbossche : Updated the error message, and everything is happy. Ready to merge if there are no other concerns.

@jorisvandenbossche jorisvandenbossche added this to the 0.19.2 milestone Nov 29, 2016
@jorisvandenbossche jorisvandenbossche merged commit dfeae39 into pandas-dev:master Nov 29, 2016
@jorisvandenbossche
Member

@gfyoung Thanks!

@gfyoung gfyoung deleted the skipfooter-python-error branch November 29, 2016 21:24
jorisvandenbossche pushed a commit that referenced this pull request Dec 15, 2016
… rows in Python engine (#14749)

Python's native CSV library does not respect the
skipfooter parameter, so if one of those skipped
rows is malformed, it will still raise an error.

Closes gh-13879.
(cherry picked from commit dfeae39)