BUG: AttributeError raised when reading from excel file containing chart sheet #41448

Qrox · 2021-05-13T03:46:35Z

[Y] I have checked that this issue has not already been reported.
[Y] I have confirmed this bug exists on the latest version of pandas.
[N] (optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

import pandas
pandas.read_excel('chartsheet.xlsx', sheet_name=None)

Example sheet file:
chartsheet.xlsx

Problem description

When reading an excel file that contains a chart sheet (a sheet that contains a chart and no table), the following exception was raised:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-9-89c95e3cb7b1> in <module>
      1 import pandas
----> 2 pandas.read_excel('chartsheet.xlsx', sheet_name=None)

D:\Softwares\Anaconda\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
    297                 )
    298                 warnings.warn(msg, FutureWarning, stacklevel=stacklevel)
--> 299             return func(*args, **kwargs)
    300 
    301         return wrapper

D:\Softwares\Anaconda\lib\site-packages\pandas\io\excel\_base.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, storage_options)
    365             skipfooter=skipfooter,
    366             convert_float=convert_float,
--> 367             mangle_dupe_cols=mangle_dupe_cols,
    368         )
    369     finally:

D:\Softwares\Anaconda\lib\site-packages\pandas\io\excel\_base.py in parse(self, sheet_name, header, names, index_col, usecols, squeeze, converters, true_values, false_values, skiprows, nrows, na_values, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, **kwds)
   1188             convert_float=convert_float,
   1189             mangle_dupe_cols=mangle_dupe_cols,
-> 1190             **kwds,
   1191         )
   1192 

D:\Softwares\Anaconda\lib\site-packages\pandas\io\excel\_base.py in parse(self, sheet_name, header, names, index_col, usecols, squeeze, dtype, true_values, false_values, skiprows, nrows, na_values, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, **kwds)
    490                 sheet = self.get_sheet_by_index(asheetname)
    491 
--> 492             data = self.get_sheet_data(sheet, convert_float)
    493             usecols = maybe_convert_usecols(usecols)
    494 

D:\Softwares\Anaconda\lib\site-packages\pandas\io\excel\_openpyxl.py in get_sheet_data(self, sheet, convert_float)
    546         data: List[List[Scalar]] = []
    547         last_row_with_data = -1
--> 548         for row_number, row in enumerate(sheet.rows):
    549             converted_row = [self._convert_cell(cell, convert_float) for cell in row]
    550             if not all(cell == "" for cell in converted_row):

AttributeError: 'Chartsheet' object has no attribute 'rows'

Expected Output

I expect that either the chart sheet is ignored, returned as some appropriate object, or a warning is shown that the chart sheet cannot be read, instead of raising the exception.

In my case I'm reading from many zipped archives of excel files, so manually removing the charts from the files is not an option.

Output of `pd.show_versions()`

INSTALLED VERSIONS ------------------ commit : 2cb9652 python : 3.7.4.final.0 python-bits : 64 OS : Windows OS-release : 10 Version : 10.0.19041 machine : AMD64 processor : Intel64 Family 6 Model 158 Stepping 9, GenuineIntel byteorder : little LC_ALL : None LANG : zh_CN.UTF-8 LOCALE : None.None

pandas : 1.2.4
numpy : 1.19.4
pytz : 2019.3
dateutil : 2.8.1
pip : 21.0.1
setuptools : 51.0.0
Cython : 0.29.14
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.22.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.1
numexpr : None
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : None
pyxlsb : 1.0.8
s3fs : None
scipy : 1.5.4
sqlalchemy : None
tables : None
tabulate : 0.8.7
xarray : None
xlrd : 2.0.1
xlwt : None
numba : None

The text was updated successfully, but these errors were encountered:

ahawryluk · 2021-05-26T17:08:46Z

take

ahawryluk · 2021-05-26T17:19:39Z

I expect that either the chart sheet is ignored, returned as some appropriate object, or a warning is shown that the chart sheet cannot be read, instead of raising the exception.

For consistency between readers I believe we should always ignore chartsheets. Currently, we have the following ExcelFile behaviours:

	xls	xlsx/m	xlsb
chart sheets in sheet_names	n	y	y
chart sheets counted in index	n	n	y

Changing our xlsx engine to ignore chart sheets is easy. Changing our xlsb engine will have to wait until this bug is fixed: willtrnr/pyxlsb#33. This bug doesn't apply to to .ods files because the OpenDocument specification doesn't support chart sheets. Saving an excel file with a chart sheet as .ods produces an ordinary worksheet with a large graph and no data.

Qrox added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels May 13, 2021

github-actions bot assigned ahawryluk May 26, 2021

ahawryluk mentioned this issue May 28, 2021

BUG: Ignore chartsheets #41698

Merged

4 tasks

jbrockmendel added the IO Excel read_excel, to_excel label Jun 6, 2021

rhshadrach added this to the 1.4 milestone Jun 25, 2021

rhshadrach removed the Needs Triage Issue that has not been reviewed by a pandas team member label Jun 26, 2021

rhshadrach closed this as completed in #41698 Jul 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: AttributeError raised when reading from excel file containing chart sheet #41448

BUG: AttributeError raised when reading from excel file containing chart sheet #41448

Qrox commented May 13, 2021

ahawryluk commented May 26, 2021

ahawryluk commented May 26, 2021

BUG: AttributeError raised when reading from excel file containing chart sheet #41448

BUG: AttributeError raised when reading from excel file containing chart sheet #41448

Comments

Qrox commented May 13, 2021

Code Sample, a copy-pastable example

Problem description

Expected Output

Output of pd.show_versions()

ahawryluk commented May 26, 2021

ahawryluk commented May 26, 2021

Output of `pd.show_versions()`