Skip to content

BUG: Ignore chartsheets #41698

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jul 2, 2021
Merged

Conversation

ahawryluk
Copy link
Contributor

Issue #41448 relates specifically to our openpyxl engine, and is resolved by this commit. However, the bug also exists in our pyxlsb engine and cannot be resolved until the upstream willtrnr/pyxlsb#33 is addressed. I propose closing #41448 and leaving the pyxlsb case as an xfail for now. Let me know if there's a better way to track this.

@datapythonista datapythonista added IO Excel read_excel, to_excel Bug labels May 28, 2021
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ahawryluk, looks good. Can you have a look at the CI, and see if you can leave it green please?

@ahawryluk
Copy link
Contributor Author

@datapythonista I think #41665 tries to address the CI failure on the numpydev build

@ahawryluk ahawryluk changed the title Ignore chartsheets BUG: Ignore chartsheets Jun 2, 2021
@ahawryluk
Copy link
Contributor Author

@datapythonista CI is all green now. Thanks for taking a look at this.

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good - I wonder if we should be changing the documentation for the sheet_names argument:

Strings are used for sheet names. Integers are used in zero-indexed sheet positions. Lists of strings/integers are used to request multiple sheets. Specify None to get all sheets.

Get all worksheets instead?

Also, what happens if the index specified by sheetnames is a chart-sheet? What is a previous sheet is a chart sheet - is the chart sheet counted in the indexing?

@ahawryluk
Copy link
Contributor Author

Hi @rhshadrach, thanks for the questions. I've attempted to address them in the docstring without getting too wordy. String and integer inputs are interpreted with respect to ExcelFile.sheet_names, which now excludes chart sheets, so the accepted inputs should be internally consistent and the default sheet_name=0 won't crash if there's a chart sheet in the first position.

@rhshadrach
Copy link
Member

Thanks @ahawryluk - makes sense. One last question - what is the error raised if the sheet name (as a string) is a chart sheet? If it's sensible, can you add a test for this.

@ahawryluk
Copy link
Contributor Author

@rhshadrach Good idea. It raises the same ValueError you would get for an invalid sheet name, so I've added the tests for loading by str or by an int out of range.

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - can you resolve conflict and move whatsnew note to 1.4

@rhshadrach rhshadrach added this to the 1.4 milestone Jun 25, 2021
And move the whatsnew entry to 1.4.0
Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@rhshadrach rhshadrach merged commit 7178fbb into pandas-dev:master Jul 2, 2021
@rhshadrach
Copy link
Member

Thanks @ahawryluk!

JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
@ahawryluk ahawryluk deleted the ignore_chartsheets branch June 7, 2022 03:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO Excel read_excel, to_excel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: AttributeError raised when reading from excel file containing chart sheet
3 participants