first line comments on a read_csv #4623

hayd · 2013-08-21T20:31:20Z

related #4505

It seems that commenting on the first line is a little buggy (or perhaps not well-defined):

In [11]: s1 = '# notes\na,b,c\n# more notes\n1,2,3'

In [12]: s2 = 'a,b,c\n# more notes\n1,2,3'

In [13]: pd.read_csv(StringIO(s1), comment='#')
Out[13]: 
        Unnamed: 0
a   b            c
NaN NaN        NaN
1   2            3

In [14]: pd.read_csv(StringIO(s2), comment='#')
Out[14]: 
    a   b   c
0 NaN NaN NaN
1   1   2   3

If you ignore the header:

In [15]: pd.read_csv(StringIO(s1), comment='#', header=None)
CParserError: Error tokenizing data. C error: Expected 1 fields in line 2, saw 3

Related to #3001; prompted by this SO question.

jreback · 2013-08-21T20:43:15Z

see also #4505

jreback · 2013-08-21T20:43:24Z

Of course, a comment in the header is a special case.

hayd · 2013-08-21T20:46:43Z

Yeah, now that I think about it, the first behaviour is OK. Only the header=None case is a bug.

gfyoung · 2016-08-02T09:11:19Z

I don't think this is an issue anymore:

>>> from pandas import read_csv
>>> from pandas.compat import StringIO
>>> data = '# notes\na,b,c\n# more notes\n1,2,3'
>>> read_csv(StringIO(data), engine='c', comment='#')
   a  b  c
0  1  2  3
>>> read_csv(StringIO(data), engine='python', comment='#')
   a  b  c
0  1  2  3
>>> read_csv(StringIO(data), engine='c', comment='#', header=None)
   0  1  2
0  a  b  c
1  1  2  3
>>> read_csv(StringIO(data), engine='python', comment='#', header=None)
   0  1  2
0  a  b  c
1  1  2  3

jreback · 2016-08-02T10:13:04Z

OK, this looks closable. Can you add some tests?
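
For illustration, a regression check along these lines could be sketched roughly as follows. This is only a sketch, not the test merged in #13881: it assumes Python 3's `io.StringIO` rather than `pandas.compat.StringIO`, and the names `read_with_comments` and `check_first_line_comment` are hypothetical helpers, not part of pandas.

```python
from io import StringIO

import pandas as pd

# Same input as in the report: a comment on the first line,
# then the header, another comment, and one data row.
DATA = "# notes\na,b,c\n# more notes\n1,2,3"


def read_with_comments(engine, **kwargs):
    """Parse DATA with '#' as the comment character (hypothetical helper)."""
    return pd.read_csv(StringIO(DATA), engine=engine, comment="#", **kwargs)


def check_first_line_comment():
    """Check both parser engines against the output shown in the thread."""
    for engine in ("c", "python"):
        # Default header inference: the comment lines are skipped,
        # so 'a,b,c' becomes the header and '1,2,3' the only row.
        df = read_with_comments(engine)
        assert list(df.columns) == ["a", "b", "c"]
        assert df.values.tolist() == [[1, 2, 3]]

        # header=None: no tokenizing error; 'a,b,c' is kept as a data row,
        # so every value is read as a string.
        df = read_with_comments(engine, header=None)
        assert list(df.columns) == [0, 1, 2]
        assert df.values.tolist() == [["a", "b", "c"], ["1", "2", "3"]]


check_first_line_comment()
```

Looping over both engines mirrors the four calls in the comment above, so the C and Python parsers are held to the same expectation.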

jreback mentioned this issue Aug 21, 2013

[io] comment option ignored in read_csv when sep provided. #3001

Closed

hayd mentioned this issue Aug 21, 2013

ENH: Ignoring empty lines in files #4466

Closed

jreback modified the milestones: 0.15.0, 0.14.0 Feb 26, 2014

jreback mentioned this issue Jun 16, 2014

ENH: ignoring comment lines and empty lines in CSV files #7470

Closed

jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015

foobarbecue mentioned this issue Jul 12, 2015

bug: read_csv inserts NaN row if file ends in comment line #10548

Closed

jreback modified the milestones: 0.19.0, Next Major Release Aug 2, 2016

gfyoung added a commit to forking-repos/pandas that referenced this issue Aug 2, 2016

TST: Add first line comment tests in read_csv

becaff7

Closes pandas-devgh-4623.

gfyoung mentioned this issue Aug 2, 2016

TST: Add first line comment tests in read_csv #13881

Merged

jreback closed this as completed in #13881 Aug 3, 2016

jreback pushed a commit that referenced this issue Aug 3, 2016

TST: Add first line comment tests in read_csv (#13881)

66c3b46

Closes gh-4623.
