Commit b022a3b

BUG: read_excel raising uncontrolled IndexError when header references non-existing rows (#47399)
* BUG: read_excel raising uncontrolled IndexError when header references non-existing rows
* Fix issue number
* Change message
1 parent c5a640d commit b022a3b

4 files changed, +13 -0 lines changed

doc/source/whatsnew/v1.5.0.rst

+1
@@ -864,6 +864,7 @@ I/O
 - Bug in :func:`read_csv` interpreting second row as :class:`Index` names even when ``index_col=False`` (:issue:`46569`)
 - Bug in :func:`read_parquet` when ``engine="pyarrow"`` which caused partial write to disk when column of unsupported datatype was passed (:issue:`44914`)
 - Bug in :func:`DataFrame.to_excel` and :class:`ExcelWriter` would raise when writing an empty DataFrame to a ``.ods`` file (:issue:`45793`)
+- Bug in :func:`read_excel` raising uncontrolled ``IndexError`` when ``header`` references non-existing rows (:issue:`43143`)
 - Bug in :func:`read_html` where elements surrounding ``<br>`` were joined without a space between them (:issue:`29528`)
 - Bug in :func:`read_csv` when data is longer than header leading to issues with callables in ``usecols`` expecting strings (:issue:`46997`)
 - Bug in Parquet roundtrip for Interval dtype with ``datetime64[ns]`` subtype (:issue:`45881`)

pandas/io/excel/_base.py

+6
@@ -774,6 +774,12 @@ def parse(
                         assert isinstance(skiprows, int)
                         row += skiprows
 
+                    if row > len(data) - 1:
+                        raise ValueError(
+                            f"header index {row} exceeds maximum index "
+                            f"{len(data) - 1} of data.",
+                        )
+
                     data[row], control_row = fill_mi_header(data[row], control_row)
 
                     if index_col is not None:
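
The bounds check added in this hunk can be tried in isolation. The following is a minimal sketch, not the pandas implementation: `check_header_rows` is a hypothetical helper, and representing the parsed sheet as a plain list of rows is an assumption made for illustration. It mirrors only the new guard, which turns an out-of-range header index into a `ValueError` before the row is used to index `data`.

```python
from typing import List, Sequence


def check_header_rows(
    data: List[list], header: Sequence[int], skiprows: int = 0
) -> List[int]:
    """Hypothetical helper: resolve header row indices or raise ValueError."""
    resolved = []
    for row in header:
        # Header positions are offset by the number of skipped rows.
        row += skiprows
        # Same guard as in the diff: the largest valid index is len(data) - 1.
        if row > len(data) - 1:
            raise ValueError(
                f"header index {row} exceeds maximum index "
                f"{len(data) - 1} of data."
            )
        resolved.append(row)
    return resolved


# A sheet with a single row cannot supply a two-level header.
rows = [["a", "b"]]
try:
    check_header_rows(rows, header=[0, 1])
except ValueError as err:
    message = str(err)
```

Without the guard, indexing `data[row]` with such an index would surface as a bare `IndexError`, which is the behavior the commit title describes.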
5.47 KB
Binary file not shown.

pandas/tests/io/excel/test_readers.py

+6
@@ -1556,6 +1556,12 @@ def test_excel_read_binary_via_read_excel(self, read_ext, engine):
         expected = pd.read_excel("test1" + read_ext, engine=engine)
         tm.assert_frame_equal(result, expected)
 
+    def test_read_excel_header_index_out_of_range(self, engine):
+        # GH#43143
+        with open("df_header_oob.xlsx", "rb") as f:
+            with pytest.raises(ValueError, match="exceeds maximum"):
+                pd.read_excel(f, header=[0, 1])
+
     @pytest.mark.parametrize("filename", ["df_empty.xlsx", "df_equals.xlsx"])
     def test_header_with_index_col(self, filename):
         # GH 33476
