Skip to content

read_excel and Excel File Engine conflict #26736

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 10, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.25.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -665,6 +665,7 @@ I/O
- Bug in :func:`read_json` where date strings with ``Z`` were not converted to a UTC timezone (:issue:`26168`)
- Added ``cache_dates=True`` parameter to :meth:`read_csv`, which allows to cache unique dates when they are parsed (:issue:`25990`)
- :meth:`DataFrame.to_excel` now raises a ``ValueError`` when the caller's dimensions exceed the limitations of Excel (:issue:`26051`)
- :func:`read_excel` now raises a ``ValueError`` when input is of type :class:`pandas.io.excel.ExcelFile` and ``engine`` param is passed since :class:`pandas.io.excel.ExcelFile` has an engine defined (:issue:`26566`)

Plotting
^^^^^^^^
Expand Down
4 changes: 4 additions & 0 deletions pandas/io/excel/_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -289,6 +289,9 @@ def read_excel(io,

if not isinstance(io, ExcelFile):
io = ExcelFile(io, engine=engine)
elif engine and engine != io.engine:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is io.engine ever None? Coverage here might be a little suspect for cases like:

read_excel(ExcelFile(), engine='xlrd')

which I suppose should work fine (assuming xlrd is default) but may not be covered here explicitly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current behaviour is io.engine in this line will default to 'xlrd' since that's in the ExcelFile constuctor -> added a test case to make this more explicit.

raise ValueError("Engine should not be specified when passing "
"an ExcelFile - ExcelFile already has the engine set")

return io.parse(
sheet_name=sheet_name,
Expand Down Expand Up @@ -777,6 +780,7 @@ def __init__(self, io, engine=None):
if engine not in self._engines:
raise ValueError("Unknown engine: {engine}".format(engine=engine))

self.engine = engine
# could be a str, ExcelFile, Book, etc.
self.io = io
# Always a string
Expand Down
11 changes: 11 additions & 0 deletions pandas/tests/io/excel/test_readers.py
Original file line number Diff line number Diff line change
Expand Up @@ -834,3 +834,14 @@ def test_reader_closes_file(self, read_ext):
pd.read_excel(xlsx, 'Sheet1', index_col=0)

assert f.closed

@pytest.mark.parametrize('excel_engine', [
'xlrd',
None
])
def test_read_excel_engine_value(self, read_ext, excel_engine):
# GH 26566
xl = ExcelFile("test1" + read_ext, engine=excel_engine)
msg = "Engine should not be specified when passing an ExcelFile"
with pytest.raises(ValueError, match=msg):
pd.read_excel(xl, engine='openpyxl')