NaN label in MultiIndex is assigned a non NaN value when writing to excel file #13511
Comments
This indeed seems like a bug. Your example is a bit strange, as you end up with an empty dataframe with a multi-index (since you set all columns as the index), but it occurs for a non-empty dataframe as well. Interested in trying to look for a fix?
I think the bug is in the method pandas.formats.format.ExcelFormatter._format_hierarchical_rows(), which is called by the public method pandas.formats.format.ExcelFormatter.get_formatted_cells(). More precisely, the problem is the statement
(here) where each of To fix the bug, the above statement could be replaced by
What follows is an example:
yields
yields
yields
I'd like to fix that bug, but I haven't found any unit tests for
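The statements this comment refers to did not survive in this copy of the thread, so what follows is only a sketch of one mechanism that would produce the reported symptom, not a reconstruction of the actual pandas code. It assumes that a MultiIndex marks a missing label with the integer code -1 and that the labels are looked up positionally in the level; a plain positional lookup treats -1 as "last element". The names mi, level, codes, naive and aware are illustrative, and .codes is the current attribute name (the 0.18 series called it .labels).

```python
import numpy as np
import pandas as pd

# Two-level index whose second level contains a missing label.
mi = pd.MultiIndex.from_arrays([["a", "a"], [np.nan, "b"]], names=["c1", "c2"])

level = mi.levels[1]             # distinct non-missing labels of the second level
codes = np.asarray(mi.codes[1])  # integer positions into `level`; -1 marks NaN

# Plain positional lookup: -1 wraps around to the last label of the level.
naive = level.take(codes)

# NaN-aware lookup: passing a fill_value makes take() treat -1 as missing.
aware = level.take(codes, allow_fill=True, fill_value=np.nan)
```

In this sketch, naive carries 'b' where the original label was NaN, which matches the symptom described in the report, while aware keeps the missing label.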
That looks like a very reasonable explanation! PR very welcome. For the tests, I don't think we have tests for ExcelFormatter directly, but tests for read_excel/to_excel are in https://github.com/pydata/pandas/blob/master/pandas/io/tests/test_excel.py. The basic tests are, e.g., here: https://github.com/pydata/pandas/blob/master/pandas/io/tests/test_excel.py#L1311. Those typically read the written file back in to check its correctness. That approach should be possible here as well.
Given a DataFrame that has a MultiIndex: when a label of the MultiIndex is NaN and the DataFrame is written to an Excel file, the label has a non-NaN value in the Excel file.
Code Sample, a copy-pastable example if possible
returns a DataFrame where the first element of column 'c2' is NaN:
Set both columns as the index:
returns
Write DataFrame to excel file and read it back in:
returns
The first element of column 'c2' is now set to 'b' instead of NaN.
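The code snippets of this report were lost in this copy, so the steps above can only be sketched under assumptions. The column name 'c1', the values 'a' and 'b' in the frame, the extra 'value' column (added so the frame is not empty), and the helper names build_frame and excel_round_trip are all hypothetical; only 'c2', its leading NaN, and the write-then-read sequence come from the report.

```python
import numpy as np
import pandas as pd


def build_frame():
    """Build a frame indexed by two label columns; the first 'c2' entry is NaN."""
    df = pd.DataFrame({"c1": ["a", "a"], "c2": [np.nan, "b"], "value": [1, 2]})
    return df.set_index(["c1", "c2"])


def excel_round_trip(df, path):
    """Write df to an .xlsx file and read it back with the same two index columns.

    Requires an Excel engine such as openpyxl to be installed.
    """
    df.to_excel(path)
    return pd.read_excel(path, index_col=[0, 1])


df = build_frame()
# Before writing, the first 'c2' label is missing.
first_c2_label = df.index.get_level_values("c2")[0]
```

Comparing the index of excel_round_trip(df, "out.xlsx") with df.index is the check the report describes. On the affected version the first 'c2' label is reported to come back as 'b'; no claim is made here about other versions.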
Expected Output
output of
pd.show_versions()
INSTALLED VERSIONS
commit: ac174349b0e1525475c2354e1c0b8ee1ed1cabad
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-88-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.1
nose: None
pip: 1.5.4
setuptools: 2.2
Cython: None
numpy: 1.11.0
scipy: 0.16.1
statsmodels: 0.6.1
xarray: None
IPython: 4.0.1
sphinx: None
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: 2.5.2
matplotlib: 1.5.0
openpyxl: 2.3.5
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None