ENH/BUG: pd.read_csv(): usecols does not work when also using skipfooter #4335

Closed
89465127 opened this issue Jul 23, 2013 · 8 comments · Fixed by #5211
Labels
Enhancement IO CSV read_csv, to_csv IO Data IO issues that don't fit into a more specific label
Comments

@89465127

Steps to reproduce:

>>> import pandas as pd
>>> pd.__version__
'0.10.1'
>>> import StringIO
>>> data = 'a,b,c,d\n1,2,3,foo\n4,5,6,bar\n7,8,9,baz'
>>> df = pd.read_csv(StringIO.StringIO(data), usecols=[1])
>>> df
   b
0  2
1  5
2  8
>>> df = pd.read_csv(StringIO.StringIO(data), usecols=[1], skipfooter=1)
>>> df
   a  b  c    d
0  1  2  3  foo
1  4  5  6  bar
@jreback
Contributor

jreback commented Jul 23, 2013

Right now (even in master and 0.12), skip_footer forces the Python parser, and usecols is not implemented there. So it's best to read in all of your data with usecols (and without skip_footer), then drop the footer rows afterwards, e.g.

df.iloc[0:-1]
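
A minimal sketch of that workaround, using the data from the report. The helper name `read_without_footer` and its `n_footer` parameter are hypothetical, introduced only for illustration; the sketch also assumes the footer lines parse as ordinary rows, so the default parser can read them before they are dropped.

```python
import io

import pandas as pd


def read_without_footer(buf, usecols, n_footer):
    """Read only the requested columns, then drop the last n_footer rows.

    Hypothetical helper: it avoids passing skipfooter to read_csv and
    trims the footer from the resulting DataFrame instead.
    """
    df = pd.read_csv(buf, usecols=usecols)
    # Compute the end position explicitly so n_footer == 0 keeps every row.
    end = max(len(df) - n_footer, 0)
    return df.iloc[:end]


data = "a,b,c,d\n1,2,3,foo\n4,5,6,bar\n7,8,9,baz"
trimmed = read_without_footer(io.StringIO(data), usecols=[1], n_footer=1)
print(trimmed)
```

Slicing by an explicit end position rather than `iloc[0:-n]` is a deliberate choice here: a negative-zero slice would return an empty frame when there is no footer to remove.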

@guyrt
Contributor

guyrt commented Oct 4, 2013

@jreback I've been looking at this one. I'm close, but it's become a game of whack-a-mole. There are a lot of interactions between header, names, and usecols. I'll post a bigger example of those interactions over the weekend.

@jreback
Contributor

jreback commented Oct 4, 2013

np @guyrt thanks for the attention! you are really nailing down these csv buggies! lmk

@jreback
Contributor

jreback commented Oct 7, 2013

@guyrt any luck with this?

@guyrt
Contributor

guyrt commented Oct 7, 2013

I've started working, but it's still very much a WIP.

https://github.com/guyrt/pandas/tree/issue-4355-usecols-python-engine

@jreback
Contributor

jreback commented Oct 11, 2013

@guyrt how's this coming?

@guyrt
Contributor

guyrt commented Oct 12, 2013

Still WIP. I moved from Toronto to Seattle over the last two days, so I've been busy :) Should have time to clean it up this afternoon.

@jreback
Contributor

jreback commented Oct 12, 2013

@guyrt big move! gr8 ...thanks

guyrt added a commit to guyrt/pandas that referenced this issue Oct 13, 2013
guyrt added a commit to guyrt/pandas that referenced this issue Oct 16, 2013
Closes pandas-dev#4335

Added release note and fixed py3 compat

Updated docs for consistency