read_csv not respecting MultiIndex names #12518

toobaz · 2016-03-03T08:33:20Z

Code Sample, a copy-pastable example if possible

In [2]: pd.read_csv('/tmp/csv.csv', header=None)
Out[2]: 
   0  1  2
0  0  2  2
1  1  2  2
2  2  2  2

In [3]: mi = pd.MultiIndex.from_tuples([(1,2), (3,4)], names=['l1', 'l2'])

In [4]: pd.read_csv('/tmp/csv.csv', header=None, names=mi)
Out[4]: 
   1  3
   2  4
0  2  2
1  2  2
2  2  2

Expected Output

In [4]: pd.read_csv('/tmp/csv.csv', header=None, names=mi)
Out[4]: 
l1  1  3
l2  2  4
0   2  2
1   2  2
2   2  2

Might be worth fixing together with #7589 (just a rough guess, didn't look at the code)

output of `pd.show_versions()`

In [8]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: eba78032e927d2680850c88190f3bafe18cac6ba
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.3.0-1-amd64
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: it_IT.utf8

pandas: 0.18.0rc1+66.geba7803
nose: 1.3.6
pip: 1.5.6
setuptools: 20.1.1
Cython: 0.23.2
numpy: 1.10.0.post2
scipy: 0.16.0
statsmodels: 0.8.0.dev0+755fa81
xarray: None
IPython: 4.1.1
sphinx: 1.3.1
patsy: 0.3.0-dev
dateutil: 2.4.2
pytz: 2015.6
blosc: None
bottleneck: None
tables: 3.2.2
numexpr: 2.4.3
matplotlib: 1.5.dev1
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: 0.7.3
lxml: None
bs4: 4.4.0
html5lib: 0.999
httplib2: 0.9.1
apiclient: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: None
jinja2: 2.8


jreback · 2016-03-03T12:23:04Z

you need to specify a list for the header, see http://pandas.pydata.org/pandas-docs/stable/io.html#reading-columns-with-a-multiindex
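For reference, a minimal sketch of the header-list approach described in that docs section. The CSV text here is made up for illustration (it is not the file from the report): the first two rows hold the two column levels, and the first cell of each header row is assumed to carry the level name.

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV: two header rows, with the level names ("l1", "l2")
# stored in the first cell of each header row.
csv_text = "l1,1,3\nl2,2,4\n0,2,2\n1,2,2\n2,2,2\n"

# header=[0, 1] tells read_csv to build the columns from the first two rows;
# index_col=0 uses the first column as the row index.
df = pd.read_csv(StringIO(csv_text), header=[0, 1], index_col=0)

print(df.columns.names)
print(df)
```

In this form the level names come from the file itself rather than from the `names` argument.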

jreback · 2016-03-03T12:24:04Z

in the future pls supply examples as copy-pastable (use StringIO)

toobaz · 2016-03-03T13:49:07Z

Sorry, the example could have also been simpler:

In [1]: import pandas as pd

In [2]: from io import StringIO

In [3]: mi = pd.MultiIndex.from_tuples([(1,2), (3,4)], names=['l1', 'l2'])

In [4]: pd.read_csv(StringIO('0,2,2\n1,2,2\n2,2,2'), names=mi)
Out[4]: 
   1  3
   2  4
0  2  2
1  2  2
2  2  2

(the problem did not lie in header)

toobaz · 2016-03-03T21:58:26Z

(... which means that this is a bug, I think)

jreback · 2016-03-03T22:04:47Z

@toobaz that doesn't make any sense

names : array-like, default None
    List of column names to use. If file contains no header row, then you
    should explicitly pass header=None

what you are writing doesn't match the doc-string, nor the expected behavior.
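If the goal is the output shown under "Expected Output", one possible workaround (a sketch, not something read_csv does for you) is to read the data without a header and assign the MultiIndex to the columns afterwards. Passing `index_col=0` is an assumption here, chosen to mirror the report, where the first column ended up as the index.

```python
from io import StringIO

import pandas as pd

mi = pd.MultiIndex.from_tuples([(1, 2), (3, 4)], names=["l1", "l2"])

# Read the raw values; the first column becomes the row index.
df = pd.read_csv(StringIO("0,2,2\n1,2,2\n2,2,2"), header=None, index_col=0)

# Drop the positional label that read_csv leaves as the index name.
df.index.name = None

# Assigning the MultiIndex directly keeps its level names.
df.columns = mi

print(df)
```

Setting `df.columns` after the read keeps `names` out of it entirely, so the documented behavior of `names` as a plain list of column names is not relied on.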

toobaz · 2016-03-03T23:05:49Z

Ooops, right, sorry. Too bad.

jreback closed this as completed Mar 3, 2016

jreback added the Usage Question, IO CSV (read_csv, to_csv), and MultiIndex labels on Mar 3, 2016
