BUG: read_excel() fails when checking __version__ of older xlrd versions #38955

Closed
2 of 3 tasks
kcharlie2 opened this issue Jan 4, 2021 · 3 comments · Fixed by #39355
Labels: Bug, IO Excel (read_excel, to_excel)


@kcharlie2

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


When xlrd is installed but is an "older" version (e.g. 1.1.0):

pandas.read_excel("file.xlsx", engine="openpyxl")

raises

AttributeError: module 'xlrd' has no attribute '__version__'

This specifically occurs in pandas/io/excel/_base.py:336

Problem description

This bug occurs regardless of the engine argument because pandas.read_excel() always checks xlrd.__version__ if any version of xlrd is installed. Older versions of xlrd expose __VERSION__ instead of __version__, which results in the AttributeError.

This exception does not occur and everything behaves as expected if:

  • xlrd is not installed
  • A "newer" version of xlrd is installed (that has the __version__ attribute)
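
The failure mode can be illustrated without installing an old xlrd. The following is a minimal sketch that assumes, as reported above, that the older module exposes only `__VERSION__`; `fake_xlrd` is a hypothetical stand-in module, not the real library.

```python
import types

# Hypothetical stand-in for an older xlrd that defines only __VERSION__
# (assumption taken from this report, not verified against every xlrd release).
fake_xlrd = types.ModuleType("xlrd")
fake_xlrd.__VERSION__ = "1.1.0"

try:
    # Mirrors the kind of unconditional attribute access described above.
    fake_xlrd.__version__
except AttributeError as exc:
    message = str(exc)
    print(message)
```

Accessing the lowercase attribute fails on the stand-in in the same way the traceback above describes, independent of which engine was requested.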

Expected Output

According to the documentation,

Changed in version 1.2.0: When engine=None, the following logic will be used to determine the engine:
    
- If path_or_buffer is an OpenDocument format (.odf, .ods, .odt), then odf will be used.
- Otherwise if path_or_buffer is an xls format, xlrd will be used.
- Otherwise if openpyxl is installed, then openpyxl will be used.
- Otherwise if xlrd >= 2.0 is installed, a ValueError will be raised.
- Otherwise xlrd will be used and a FutureWarning will be raised. This case will raise a ValueError in a future version of pandas.
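
For readers who prefer code, the documented rules can be sketched roughly as follows. This is an illustration of the quoted documentation only, not the pandas implementation; `choose_engine` and its parameters are hypothetical names, and the major-version parsing is a simplification.

```python
import warnings
from typing import Optional


def choose_engine(ext: str, openpyxl_installed: bool, xlrd_version: Optional[str]) -> str:
    """Hypothetical helper mirroring the documented engine=None rules."""
    ext = ext.lower()
    # OpenDocument formats always go to odf.
    if ext in {".odf", ".ods", ".odt"}:
        return "odf"
    # Legacy .xls files go to xlrd.
    if ext == ".xls":
        return "xlrd"
    # Any other format prefers openpyxl when it is available.
    if openpyxl_installed:
        return "openpyxl"
    # Simplified check: compare only the major version number.
    if xlrd_version is not None and int(xlrd_version.split(".")[0]) >= 2:
        raise ValueError("xlrd >= 2.0 is installed and openpyxl is not")
    # Remaining case: fall back to xlrd with a FutureWarning.
    # (This sketch does not check whether xlrd is installed at all.)
    warnings.warn("falling back to xlrd for a non-.xls file", FutureWarning)
    return "xlrd"


print(choose_engine(".xlsx", True, "1.1.0"))
```

Note that only the last two rules depend on the xlrd version, which is why an explicit engine="openpyxl" should not need it.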

Based on this, I believe the intended behavior is:

  • read_excel() should not check the xlrd version if engine is "openpyxl"
  • If read_excel() does need to know the version of xlrd, it should try both __version__ and __VERSION__
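
A minimal sketch of the two suggestions above, under the assumption that the version lives in either `__version__` or `__VERSION__`. The helper names `get_module_version` and `xlrd_version_if_needed` are hypothetical and are not part of pandas; the modules below are stand-ins built with `types.ModuleType`.

```python
import types
from typing import Optional


def get_module_version(module: types.ModuleType) -> Optional[str]:
    """Return the version string, trying both attribute spellings."""
    for attr in ("__version__", "__VERSION__"):
        version = getattr(module, attr, None)
        if version is not None:
            return str(version)
    return None


def xlrd_version_if_needed(
    engine: Optional[str], xlrd_module: Optional[types.ModuleType]
) -> Optional[str]:
    """Look up the xlrd version only when xlrd could actually be used."""
    # An explicit non-xlrd engine (e.g. "openpyxl") skips the check entirely.
    if engine not in (None, "xlrd") or xlrd_module is None:
        return None
    return get_module_version(xlrd_module)


# Stand-ins for an older and a newer xlrd (hypothetical, for illustration).
old_xlrd = types.ModuleType("xlrd")
old_xlrd.__VERSION__ = "1.1.0"
new_xlrd = types.ModuleType("xlrd")
new_xlrd.__version__ = "2.0.1"

print(get_module_version(old_xlrd))
print(xlrd_version_if_needed("openpyxl", old_xlrd))
```

Using `getattr` with a default avoids the AttributeError in both cases, and the engine guard keeps xlrd out of the picture when the caller has chosen another reader.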

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 3e89b4c
python : 3.7.4.final.0
python-bits : 64
OS : Linux
OS-release : 3.10.0-1062.4.1.el7.x86_64
Version : #1 SMP Wed Sep 25 09:42:57 EDT 2019
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US
LOCALE : en_US.ISO8859-1

pandas : 1.2.0
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.3.3
setuptools : 40.8.0
Cython : None
pytest : 6.0.1
hypothesis : None
sphinx : 3.2.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.19.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.2
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.5
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : None
tables : 3.6.1
tabulate : 0.8.7
xarray : None
xlrd : 1.1.0
xlwt : None
numba : None

kcharlie2 added the Bug and Needs Triage labels Jan 4, 2021
@kcharlie2
Author

kcharlie2 commented Jan 4, 2021

This is related to the changes in #35029.

phofl added the IO Excel label and removed the Needs Triage label Jan 4, 2021
@phofl
Member

phofl commented Jan 4, 2021

I can reproduce this with an arbitrary Excel file.

@lithomas1
Member

take
