read_csv reads in CSV file as all NaNs if header is a list of length 1 #17082

Closed
threecgreen opened this issue Jul 26, 2017 · 2 comments
Labels
Duplicate Report Duplicate issue or pull request

Comments

@threecgreen
Code Sample

If test.csv file looks like:

a,b,c
0,1,2
1,2,3

Reading the file with the header given as a list of length 1 produces no warnings or errors, but every value is read as NaN.

>>> import pandas as pd
>>> pd.read_csv("test.csv", header=[0])
      a     b     c
0   NaN   NaN   NaN
1   NaN   NaN   NaN

Problem description

Single-element lists are not a problem elsewhere in pandas or within read_csv. For example, passing index_col=[0] does not cause pandas to read a CSV file incorrectly. Preferably, pandas would read the CSV file correctly when given a list of header rows with one element. Raising an error or warning would also be an improvement over the current behavior.
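
A possible workaround for the behavior described above is to collapse a one-element header list to a plain integer before calling read_csv, since header=0 is read correctly. This is only a minimal sketch: `normalize_header` and `read_csv_normalized` are hypothetical helper names, not part of pandas.

```python
def normalize_header(header):
    """Collapse a one-element list/tuple to its single value.

    Hypothetical helper (not a pandas API). Longer lists, scalars,
    and other values such as "infer" or None are returned unchanged.
    """
    if isinstance(header, (list, tuple)) and len(header) == 1:
        return header[0]
    return header


def read_csv_normalized(path_or_buf, header="infer", **kwargs):
    """Hypothetical wrapper around pd.read_csv that normalizes `header` first."""
    import pandas as pd  # imported here so the helper above has no dependencies

    return pd.read_csv(path_or_buf, header=normalize_header(header), **kwargs)
```

With this wrapper, `read_csv_normalized("test.csv", header=[0])` makes the same call as `pd.read_csv("test.csv", header=0)`, while a multi-row header such as `[0, 1]` is passed through untouched.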

Expected Output

>>> import pandas as pd
>>> pd.read_csv("test.csv", header=[0])
    a   b   c
0   0   1   2
1   1   2   3

Output of pd.show_versions()

OS: macOS Sierra
Python: 2.7.13
pandas: 0.20.2
pytest: None
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 5.3.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None
@chris-b1
Contributor

Thanks - duplicate of #7757, agree this is surprising!

@chris-b1 chris-b1 marked this as a duplicate of #7757 Jul 26, 2017
@chris-b1 chris-b1 added the Duplicate Report Duplicate issue or pull request label Jul 26, 2017
@chris-b1 chris-b1 added this to the No action milestone Jul 26, 2017
@gfyoung
Member

gfyoung commented Jul 26, 2017

Because logic: passing in header=[0, 1] has no issues. 😢