Skip to content

df.to_excel gives TypeError #30106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
chhanganivarun opened this issue Dec 6, 2019 · 4 comments
Closed

df.to_excel gives TypeError #30106

chhanganivarun opened this issue Dec 6, 2019 · 4 comments
Labels
IO Excel read_excel, to_excel

Comments

@chhanganivarun
Copy link

chhanganivarun commented Dec 6, 2019

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd
df = pd.read_csv('example')
df.to_excel('Excel_Sample.xlsx',sheet_name='Sheet1')

where output of df is:

    a   b   c   d
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15

and file example is:

a,b,c,d
0,1,2,3
4,5,6,7
8,9,10,11
12,13,14,15

Problem description

The to_excel function of class dataframe must write the dataframe in the specified excel file instead it returns error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 2256, in to_excel
    engine=engine,
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/formats/excel.py", line 742, in write
    writer.save()
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/excel/_openpyxl.py", line 44, in save
    return self.book.save(self.path)
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/workbook/workbook.py", line 408, in save
    save_workbook(self, filename)
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/writer/excel.py", line 293, in save_workbook
    writer.save()
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/writer/excel.py", line 275, in save
    self.write_data()
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/writer/excel.py", line 75, in write_data
    self._write_worksheets()
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/writer/excel.py", line 215, in _write_worksheets
    self.write_worksheet(ws)
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/writer/excel.py", line 200, in write_worksheet
    writer.write()
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/worksheet/_writer.py", line 354, in write
    self.write_top()
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/worksheet/_writer.py", line 98, in write_top
    self.write_properties()
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/worksheet/_writer.py", line 60, in write_properties
    self.xf.send(props.to_tree())
  File "/usr/local/lib/python3.6/dist-packages/openpyxl/worksheet/_writer.py", line 294, in get_stream
    xf.write(el)
  File "src/lxml/serializer.pxi", line 1230, in lxml.etree._IncrementalFileWriter.write
TypeError: got invalid input value of type <class 'xml.etree.ElementTree.Element'>, expected string or Element

trying to open written file using df = pd.read_excel('Excel_Sample.xlsx',sheet_name='Sheet1')
returns error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/pandas/util/_decorators.py", line 208, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/excel/_base.py", line 310, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/excel/_base.py", line 819, in __init__
    self._reader = self._engines[engine](self._io)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/excel/_xlrd.py", line 21, in __init__
    super().__init__(filepath_or_buffer)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/excel/_base.py", line 359, in __init__
    self.book = self.load_workbook(filepath_or_buffer)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/excel/_xlrd.py", line 36, in load_workbook
    return open_workbook(filepath_or_buffer)
  File "/usr/local/lib/python3.6/dist-packages/xlrd/__init__.py", line 145, in open_workbook
    raise XLRDError('ZIP file contents not a known type of workbook')
xlrd.biffh.XLRDError: ZIP file contents not a known type of workbook

Expected Output

The excel file should be written correctly with the dataframe's data.

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.6.8.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-70-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_IN
LOCALE : en_IN.ISO8859-1

pandas : 0.25.3
numpy : 1.16.3
pytz : 2018.7
dateutil : 2.8.0
pip : 19.3.1
setuptools : 41.0.1
Cython : None
pytest : 3.9.3
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.3.3
html5lib : 1.0.1
pymysql : 0.9.3
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.0.1
pandas_datareader: None
bs4 : 4.8.0
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.3.3
matplotlib : 3.0.3
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.2
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.1.0
sqlalchemy : 1.3.11
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None

@Mxchaeltrxn
Copy link

Hi could I take this up as a first issue? I'm not quite sure how to assign it to myself. I didn't understand "For this reason, we implemented a workaround consisting of adding a comment with the exact text take. When you do it, a GitHub action will automatically assign you the issue" when reading the contributions page.

@jbrockmendel
Copy link
Member

This is a problem in openpyxl 3.0.2, you should be OK if you downgrade to 3.0.1

@jbrockmendel jbrockmendel added the IO Excel read_excel, to_excel label Dec 6, 2019
@chhanganivarun
Copy link
Author

chhanganivarun commented Dec 10, 2019

Hey @jbrockmendel sorry for delayed reply
My take on this is that shouldn't pandas be compatible with the newer library versions instead of asking users to use older version of a library?

@jbrockmendel
Copy link
Member

My take on this is that shouldn't pandas be compatible with the newer library versions instead of asking users to use older version of a library?

As a general rule, yes, but in this case there is a bug specific to openpyxl 3.0.2, see #29862

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Excel read_excel, to_excel
Projects
None yet
Development

No branches or pull requests

3 participants