.to_excel() cuts off columns #10982

balzer82 · 2015-09-03T14:55:26Z

Today I discovered a strange behavior: When I am writing a DataFrame with .to_excel(), it cuts columns. Compared with the same DataFrame with .to_csv() or .head(), you can see the difference, that the last 8 columns are missing.

You can reproduce this by downloading Features.pkl from here and then:

import pandas as pd
df = pd.read_pickle('Features.pkl')
df.head() # see the last 8 columns!
df.to_excel('Features.xlsx', index=False, header=False)
# see the Excel, you do not have these last 8 columns
# in a .to_csv() you have them

Funny part: If you df.ix[:,-71:].to_excel('Features.xlsx', index=False, header=False) you have one of the missing columns. If you do df.ix[:,-70:].to_excel('Features.xlsx', index=False, header=False) you have two and so on...

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8

pandas: 0.16.2
nose: 1.3.7
Cython: 0.22.1
numpy: 1.9.2
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 3.2.0
sphinx: 1.3.1
patsy: 0.3.0
dateutil: 2.4.2
pytz: 2015.4
bottleneck: 1.0.0
tables: 3.2.0
numexpr: 2.4.3
matplotlib: 1.4.3
openpyxl: 1.8.6
xlrd: 0.9.3
xlwt: 1.0.0
xlsxwriter: 0.7.3
lxml: 3.4.4
bs4: 4.3.2
html5lib: None
httplib2: 0.9
apiclient: None
sqlalchemy: 1.0.5
pymysql: None
psycopg2: None

The text was updated successfully, but these errors were encountered:

dsm054 · 2015-09-03T19:10:49Z

In an odd coincidence, I think this is another Excel-dup-column issue:

>>> len(df.columns)
380
>>> len(set(df.columns))
372

and there are eight columns missing. By changing the window into df, you change which columns (if any) are duplicated.

balzer82 · 2015-09-04T07:28:10Z

Thats it! There are columns with exactly the same name on this position [-70:]. So no export of duplicated column names to_excel?

jorisvandenbossche added Bug Prio-high IO Excel read_excel, to_excel labels Sep 4, 2015

terrytangyuan mentioned this issue Sep 4, 2015

ENH: Added warning when excel file contains duplicate column names #10991

Closed

jorisvandenbossche added this to the 0.17.0 milestone Sep 5, 2015

jorisvandenbossche mentioned this issue Sep 5, 2015

BUG: to_excel swaps order of values of duplicate columns #11007

Closed

jreback modified the milestones: Next Major Release, 0.17.0 Sep 8, 2015

chris-b1 mentioned this issue Oct 4, 2015

BUG: to_excel duplicate columns #11237

Merged

jreback modified the milestones: 0.17.1, Next Major Release Oct 5, 2015

jreback closed this as completed in #11237 Oct 10, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.to_excel() cuts off columns #10982

.to_excel() cuts off columns #10982

balzer82 commented Sep 3, 2015

dsm054 commented Sep 3, 2015

balzer82 commented Sep 4, 2015

.to_excel() cuts off columns #10982

.to_excel() cuts off columns #10982

Comments

balzer82 commented Sep 3, 2015

dsm054 commented Sep 3, 2015

balzer82 commented Sep 4, 2015