Skip to content

ENH: Add sheet name to error message when reading Excel #49186

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Oct 24, 2022
Merged

ENH: Add sheet name to error message when reading Excel #49186

merged 9 commits into from
Oct 24, 2022

Conversation

aaossa
Copy link
Contributor

@aaossa aaossa commented Oct 19, 2022


Description of changes

When reading Excel files with BaseExcelReader, is possible that TextParser raises an Exception on different situations. As the reader works on a sheet basis, including the name of the sheet that caused the exception can be useful for debugging.

The implemented solution captures any Exception that gets raised while parsing and reading a sheet from an Excel file, then adds the name of the sheet to the exception message and "re-raises" the exception.

@aaossa aaossa marked this pull request as draft October 19, 2022 16:49
When reading Excel files with `BaseExcelReader`, is possible that
`TextParser` raises `ValueError` on different situations. As the
reader works on a sheet basis, including the name of the sheet that
caused the exception can be useful for debugging.

The implemented solution captures any `ValueError` that gets raised
while parsing and reading a sheet from an Excel file, then adds the
name of the sheet to the exception message and "re-raises" the
exception using the `Exception.with_traceback` method.

Why use `with_traceback`? This allow-us to maintin the focus on the
actual line that raised the ValueError instead of directing the
attention to the introduced block of code.

Signed-off-by: Antonio Ossa Guerra <[email protected]>
The test performs a `read_excel` operation on a blank Excel file while
also passing `[1]` as header value. This operations results in a
`ValueError`.

Before the fix, the error message did not display any information
about the offending sheet, making it harder to debug Excel files with
several sheets. For this reason, the value of `sheet_name` was
explicitly set to its default value (`None`).

The test compares the `ValueError` message with the expected message
that contains the offending sheet name.

Signed-off-by: Antonio Ossa Guerra <[email protected]>
@mroeschke mroeschke added Error Reporting Incorrect or improved errors from pandas IO Excel read_excel, to_excel labels Oct 19, 2022
@aaossa aaossa changed the title WIP: Add sheet to error message when reading Excel ENH: Add sheet to error message when reading Excel Oct 20, 2022
The introduced enhancement of the error message on `read_excel` is now
included in the whatsnew

Signed-off-by: Antonio Ossa Guerra <[email protected]>
@aaossa aaossa marked this pull request as ready for review October 20, 2022 01:35
Instead of handling traceback of the capture exception manually, this
simpler syntax looks better, is clearer, and produces the same result

Signed-off-by: Antonio Ossa Guerra <[email protected]>
Instead of including the sheet name on the traceback when a ValueError
is raised, we'll do it for any kind of Exception. In particular, this
is because some of the arguments for `read_excel` are callables, and
they can raise any Exception. Including the sheet name might be
helpful in some of those cases, so we'll capture them too

Signed-off-by: Antonio Ossa Guerra <[email protected]>
The previous message only included `ValueError` as the messages that
displayed the sheet name on their traceback, but that behavior is now
present on any exception

Signed-off-by: Antonio Ossa Guerra <[email protected]>
@aaossa aaossa changed the title ENH: Add sheet to error message when reading Excel ENH: Add sheet name to error message when reading Excel Oct 24, 2022
@aaossa aaossa requested review from phofl and rhshadrach and removed request for phofl and rhshadrach October 24, 2022 15:54
Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Merge when ready @phofl or @rhshadrach

@phofl phofl added this to the 2.0 milestone Oct 24, 2022
@phofl phofl merged commit 0bd52be into pandas-dev:main Oct 24, 2022
@phofl
Copy link
Member

phofl commented Oct 24, 2022

thx @aaossa

@aaossa aaossa deleted the issue-48706 branch October 25, 2022 12:35
noatamir pushed a commit to noatamir/pandas that referenced this pull request Nov 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas IO Excel read_excel, to_excel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Excel import crashes on simple file
4 participants