BUG: openpyxl 3.1.1 breaks pandas read_excel #51537

Closed
3 tasks done
jonathanloske opened this issue Feb 21, 2023 · 2 comments
Labels
Bug, Needs Triage (issue that has not been reviewed by a pandas team member)

Comments

@jonathanloske

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

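from io import BytesIO

import pandas as pd

file = ...  # elided in the original report; an object whose .file attribute is a readable binary stream

pd.read_excel(
    io=BytesIO(file.file.read()),
    sheet_name="sheet_name",  # placeholder sheet name
    header=0,  # placeholder; header takes a row index (int), not a string
)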

Issue Description

Since openpyxl 3.1.1, the code given above results in the following error:

Failed to process file: 'ReadOnlyWorksheet' object has no attribute 'defined_names'

Downgrading to openpyxl 3.1.0 fixes this issue. From the outside, it is not clear whether pandas or openpyxl needs to be fixed, although it looks like a bug on the openpyxl side. It is likely related to this issue, which was marked as fixed in openpyxl 3.1.1: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1947
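For anyone who wants to check whether their environment has the release named above, here is a minimal sketch. It assumes only openpyxl 3.1.1 is affected, since that is the only version this report mentions. The names `AFFECTED_OPENPYXL_VERSIONS`, `is_affected`, and `installed_openpyxl_version` are hypothetical helpers made up for illustration, not part of pandas or openpyxl.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: only 3.1.1 is flagged, because it is the only release this report names.
AFFECTED_OPENPYXL_VERSIONS = {"3.1.1"}


def is_affected(openpyxl_version):
    """Return True if the version string is one this report names as broken."""
    return openpyxl_version in AFFECTED_OPENPYXL_VERSIONS


def installed_openpyxl_version():
    """Return the installed openpyxl version string, or None if it is not installed."""
    try:
        return version("openpyxl")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_openpyxl_version()
    if found is None:
        print("openpyxl is not installed")
    elif is_affected(found):
        # The workaround from this report: pin the previous release.
        print(f"openpyxl {found} is affected; try: pip install openpyxl==3.1.0")
    else:
        print(f"openpyxl {found} is not the version named in this report")
```

The check reads package metadata rather than importing openpyxl, so it also works when openpyxl is missing from the environment.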

Link to openpyxl changelog: https://openpyxl.readthedocs.io/en/stable/changes.html#id1

There is also a report on StackOverflow which has already gotten more than 4000 views: https://stackoverflow.com/questions/75440354/why-does-pandas-read-excel-fail-on-an-openpyxl-error-saying-readonlyworksheet

Expected Behavior

Reading a file works just as it does with openpyxl 3.1.0.

Installed Versions

INSTALLED VERSIONS
------------------
commit : 2e218d1
python : 3.11.1.final.0
python-bits : 64
OS : Darwin
OS-release : 22.3.0
Version : Darwin Kernel Version 22.3.0: Mon Jan 30 20:38:37 PST 2023; root:xnu-8792.81.3~2/RELEASE_ARM64_T6000
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.5.3
numpy : 1.24.2
pytz : 2020.5
dateutil : 2.8.2
setuptools : 67.3.3
pip : 22.3.1
Cython : None
pytest : 7.2.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : 2.9.5
jinja2 : 3.1.2
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : 1.4.46
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None

jonathanloske added the Bug and Needs Triage labels Feb 21, 2023
@phofl
Member

phofl commented Feb 21, 2023

Hi, thanks for your report. We already have an issue for this; the bug is indeed in openpyxl and not on our side.

@phofl phofl closed this as completed Feb 21, 2023
@jonathanloske
Author

This is the issue that tracks this: #51392
