You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
c1 foo bar
c2 a b a b
0 0 1 2.5 2015-01-01 True
1 1 2 3.5 2015-01-02 False
2 2 3 4.5 2015-01-03 False
3 3 4 5.5 2015-01-04 True
Problem description
This behavior is due to pandas.io.excel._pop_header_name().
def_pop_header_name(row, index_col):
""" (header, new_data) for header rows in MultiIndex parsing"""none_fill=lambdax: Noneifx==''elsexifindex_colisNone:
# no index col specified, trim data for inference pathreturnnone_fill(row[0]), row[1:]
else:
# pop out header name and fill w/ blanki=index_colifnotis_list_like(index_col) elsemax(index_col)
returnnone_fill(row[i]), row[:i] + [''] +row[i+1:]
There are two things in the function that I do not understand
index_col is None pulls out the first column and then treats it as an index.
index_col == 0 returns something similar but adds a column of '' to the front of the return row. Why are these almost the same?
Output of pd.show_versions()
# Paste the output here pd.show_versions() here
INSTALLED VERSIONS
commit: None
python: 2.7.13.final.0
python-bits: 32
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 69 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
Thanks for the report, this is a duplicate of #11733, definitely would like to solve this.
index_col=None, contrary to what you could might expect, is treated as "infer whether or not there is an index column" and is also the default for read_excel. So as mentioned in that issue, I think the right approach is change the default for index_col to a sentinel (e.g. 'infer') so that index_col=None can mean something. PRs / ideas for the api welcome!
read_excel()
seems to give the same results for a multi-index column file ifindex_col=None
orindex_col=0
. Using the file from /pandas/test/io/data:Expected Output
With
index_col=None
I would have expected the result to look something like:Problem description
This behavior is due to
pandas.io.excel._pop_header_name()
.There are two things in the function that I do not understand
index_col is None
pulls out the first column and then treats it as an index.index_col == 0
returns something similar but adds a column of''
to the front of the return row. Why are these almost the same?Output of
pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.13.final.0
python-bits: 32
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 69 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 30.3.0
Cython: 0.24
numpy: 1.12.0
scipy: 0.13.3
xarray: None
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: 1.3.2
bottleneck: 1.0.0
tables: 3.1.0
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: 2.3.4
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: 3.6.0
pandas_datareader: 0.3.0.post
The text was updated successfully, but these errors were encountered: