ERR: Fail-fast with incompatible skipfooter combos #23711

Merged

Conversation

gfyoung
Member

@gfyoung gfyoung commented Nov 14, 2018

Setup:

from pandas import read_csv
from io import StringIO

data = "a\n1\n2\n3\n4\n5"

# Case 1:
read_csv(StringIO(data), skipfooter=1, nrows=2)

# Case 2:
read_csv(StringIO(data), skipfooter=1, chunksize=2)

Currently, we get:

# Case 1:
...
ValueError: skipfooter not supported for iteration

# Case 2:
...
<pandas.io.parsers.TextFileReader object at ...>

In Case 1, the error message is misleading. Passing nrows together with skipfooter is indeed unsupported: skipfooter should skip lines at the end of the file, NOT at the end of the nrows rows that we read in (which is what happens currently). But the error should say that rather than mention iteration.

(BTW, the only way to make this combo work would be to read in the entire file before cutting off the last skipfooter rows, but we can look into that subsequently...)
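To illustrate the "read everything, then trim" approach mentioned above, here is a minimal sketch using only the standard library. The helper `read_rows_skipping_footer` is hypothetical and is not part of pandas; it just shows the order of operations (drop the footer from the true end of the input first, then apply the row limit).

```python
import csv
from io import StringIO


def read_rows_skipping_footer(text, skipfooter=0, nrows=None):
    """Hypothetical helper (not a pandas API): trim the footer, then cap rows."""
    rows = list(csv.reader(StringIO(text)))  # read the entire input up front
    header, body = rows[0], rows[1:]
    if skipfooter:
        body = body[:-skipfooter]  # cut from the real end of the file
    if nrows is not None:
        body = body[:nrows]  # only now apply the row limit
    return header, body


data = "a\n1\n2\n3\n4\n5"
header, body = read_rows_skipping_footer(data, skipfooter=1, nrows=2)
print(header, body)
```

The cost of this ordering is that the whole input has to be held in memory before any rows are returned, which is presumably why it was left as a follow-up.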

In Case 2, creating this reader is deceptive: any attempt to call .read() on it will raise the ValueError seen in Case 1. It is better to alert end-users immediately that this reader doesn't work. After this PR, we get more useful error messages out of the gate:

# Case 1:
...
ValueError: 'skipfooter' not supported with 'nrows'

# Case 2:
...
ValueError: 'skipfooter' not supported for iteration
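The fail-fast idea can be sketched as an up-front argument check that runs before any reader object is built. This is an illustrative stand-in, not the actual code in pandas/io/parsers.py: the function name `check_skipfooter_combo` is hypothetical, and the precedence between the two checks when both apply is an assumption. Only the two error messages are taken from this PR.

```python
def check_skipfooter_combo(skipfooter=0, nrows=None, chunksize=None, iterator=False):
    """Hypothetical up-front validation; not the real pandas implementation."""
    if skipfooter > 0:
        # Assumption: the iteration check is performed before the nrows check.
        if iterator or chunksize is not None:
            raise ValueError("'skipfooter' not supported for iteration")
        if nrows is not None:
            raise ValueError("'skipfooter' not supported with 'nrows'")


# Both problem cases now fail at call time instead of on a later .read().
for kwargs in ({"skipfooter": 1, "nrows": 2}, {"skipfooter": 1, "chunksize": 2}):
    try:
        check_skipfooter_combo(**kwargs)
    except ValueError as err:
        print(err)
```

Raising at construction time means a caller never receives a reader that is guaranteed to fail later.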

@gfyoung gfyoung added the Error Reporting (Incorrect or improved errors from pandas) and IO CSV (read_csv, to_csv) labels Nov 14, 2018
@gfyoung gfyoung added this to the 0.24.0 milestone Nov 14, 2018
@pep8speaks

Hello @gfyoung! Thanks for submitting the PR.

@codecov

codecov bot commented Nov 15, 2018

Codecov Report

Merging #23711 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #23711      +/-   ##
==========================================
+ Coverage   92.25%   92.25%   +<.01%     
==========================================
  Files         161      161              
  Lines       51383    51385       +2     
==========================================
+ Hits        47404    47406       +2     
  Misses       3979     3979

Flag         Coverage Δ
#multiple    90.64% <100%> (ø) ⬆️
#single      42.31% <20%> (-0.01%) ⬇️

Impacted Files          Coverage Δ
pandas/io/parsers.py    95.55% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a23f901...cca6501.

@jreback
Contributor

jreback commented Nov 15, 2018

lgtm. i guess this warrants a whatsnew as well, bug fix section is ok

* Don't create the iterator and error immediately
if the skipfooter parameter is passed in.
* Raise the correct error message when nrows is
passed in with skipfooter.
@gfyoung gfyoung force-pushed the skipfooter-csv-invalid-combos branch from 94351db to 39ab28a Compare November 15, 2018 18:18
@gfyoung
Member Author

gfyoung commented Nov 15, 2018

@jreback : Added the whatsnew, and all is green. PTAL.

@jreback
Contributor

jreback commented Nov 16, 2018

thanks @gfyoung

@jreback jreback merged commit 57347e8 into pandas-dev:master Nov 16, 2018
@gfyoung gfyoung deleted the skipfooter-csv-invalid-combos branch November 16, 2018 21:27
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
* ERR: Fail-fast with incompatible skipfooter combos

* Don't create the iterator and error immediately
if the skipfooter parameter is passed in.
* Raise the correct error message when nrows is
passed in with skipfooter.

* Fix doc lint errors
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
* ERR: Fail-fast with incompatible skipfooter combos

* Don't create the iterator and error immediately
if the skipfooter parameter is passed in.
* Raise the correct error message when nrows is
passed in with skipfooter.

* Fix doc lint errors