[io] comment option ignored in read_csv when sep provided. #3001

Closed
michaelaye opened this issue Mar 10, 2013 · 5 comments
Labels
Bug IO Data IO issues that don't fit into a more specific label

Comments

@michaelaye
Contributor

In [8]: import pandas as pd
In [9]: from StringIO import StringIO
In [10]: s = """#        date,            utc,             jdate, orbit, sundist,   sunlat,    sunlon,             sclk,     sclat,     sclon,       scrad,       scalt,  el_cmd,  az_cmd,   af, orientlat, orientlon, c, det,    vlookx,    vlooky,    vlookz,   radiance,       tb,      clat,      clon,     cemis,   csunzen,   csunazi, cloctime, qca, qge, qmi
"16-Apr-2011", "00:00:00.025", 2455667.500000290,  8352, 1.00538,  1.33678,  27.55755, 0324604799.65011, -35.60774, 186.36107,  1798.15230,    60.70145, 180.000, 240.000,  110,   0.42378, 275.92764, 1,   1,  0.790839,  0.052956,  0.609729,     0.0077,    0.022, -35.53812, 186.44733,   2.91887, 140.57684,  10.78604, 22.59250, 000, 012, 000"""

In [11]: s = StringIO(s)
In [12]: columns = ['date', 'utc', 'jdate', 'orbit', 'sundist', 'sunlat', 'sunlon', 'sclk', 'sclat', 'sclon', 'scrad', 'scalt', 'el_cmd', 'az_cmd', 'af', 'orientlat', 'orientlon', 'c', 'det', 'vlookx', 'vlooky', 'vlookz', 'radiance', 'tb', 'clat', 'clon', 'cemis', 'csunzen', 'csunazi', 'cloctime', 'qca', 'qge', 'qmi']
In [13]: pd.io.parsers.read_csv(s,sep='\s+',names=columns,comment='#')
Out[13]: <class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns:
date         1  non-null values
utc          1  non-null values
jdate        1  non-null values
orbit        1  non-null values
sundist      1  non-null values
sunlat       1  non-null values
sunlon       1  non-null values
sclk         1  non-null values
sclat        1  non-null values
sclon        1  non-null values
scrad        1  non-null values
scalt        1  non-null values
el_cmd       1  non-null values
az_cmd       1  non-null values
af           1  non-null values
orientlat    1  non-null values
orientlon    1  non-null values
c            1  non-null values
det          1  non-null values
vlookx       1  non-null values
vlooky       1  non-null values
vlookz       1  non-null values
radiance     1  non-null values
tb           1  non-null values
clat         1  non-null values
clon         1  non-null values
cemis        1  non-null values
csunzen      1  non-null values
csunazi      1  non-null values
cloctime     1  non-null values
qca          1  non-null values
qge          1  non-null values
qmi          1  non-null values
dtypes: float64(1), object(32)

In [14]: pd.__version__
Out[14]: '0.11.0.dev-b3e696c'

expected: 1 entry (comment line should have been ignored)
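
For comparison, here is a much smaller sketch of the expected behaviour. It assumes a recent pandas release, where a line that starts with the comment character is dropped entirely. The three-column sample and its column names are made up for illustration and are not the data from the report.

```python
from io import StringIO

import pandas as pd

# Hypothetical miniature of the file above: one commented header line
# followed by one whitespace-separated data row.
text = "# date orbit sundist\n16-Apr-2011 8352 1.00538\n"

df = pd.read_csv(
    StringIO(text),
    sep=r"\s+",                      # whitespace-delimited, as in the report
    names=["date", "orbit", "sundist"],
    comment="#",                     # the commented line should be ignored
)

print(len(df))
```

On an installation that behaves as assumed, the frame holds a single row, which is the result asked for here.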

@michaelaye
Contributor Author

After reading the docstring carefully, I realise that line commenting is actually not supported. However, I would consider this an important missing feature.
My current workaround is to call df.dropna(how='all') after reading the file in.
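
A minimal sketch of that workaround, assuming the commented line comes back as a row in which every column is NaN (as the 2-entry output above suggests). The frame is built by hand so the example does not depend on any particular parser version, and the two columns are illustrative only.

```python
import numpy as np
import pandas as pd

# Stand-in for the parsed result: row 0 plays the role of the commented
# line (all NaN), row 1 is the real data row.
df = pd.DataFrame(
    {
        "date": [np.nan, "16-Apr-2011"],
        "orbit": [np.nan, 8352.0],
    }
)

# Drop only rows where every column is missing; rows with some
# missing values are kept.
cleaned = df.dropna(how="all")

print(len(df), len(cleaned))
```

Note that this would also remove any genuine data row that happens to be entirely empty, so it is a stopgap rather than real comment handling.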

@hayd
Contributor

hayd commented Aug 21, 2013

Should this be closed? Or could you elaborate on what you're after? (I don't really follow this issue; maybe it's just confusing because the example is big :s)

@jreback
Contributor

jreback commented Aug 21, 2013

see also #4505

@jreback
Contributor

jreback commented Aug 21, 2013

@michaelaye this is the purpose of skiprows=1 (notwithstanding that maybe comments should be skipped in the first row, which #4623 is about).

@hayd I think let's close in favor of the above issue
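
A rough sketch of the skiprows=1 suggestion, using a made-up three-column sample rather than the file from the report. It assumes the commented header is always exactly the first line; skipinitialspace=True is added here only to strip the padding after each comma.

```python
from io import StringIO

import pandas as pd

# Hypothetical miniature: commented header on line 1, data on line 2.
text = "# date, orbit, sundist\n16-Apr-2011, 8352, 1.00538\n"

df = pd.read_csv(
    StringIO(text),
    names=["date", "orbit", "sundist"],  # column names supplied by hand
    skiprows=1,                          # skip the commented header line
    skipinitialspace=True,               # ignore spaces after each comma
)

print(df.shape)
```

This skips by position, not by content, so it does not help with comment lines that appear further down the file.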

@hayd
Contributor

hayd commented Aug 21, 2013

Also, running the above example now gives:

ValueError: Expected 33 fields in line 1, saw 0

@hayd closed this as completed Aug 21, 2013