Commit 17386d1

Backport PR pandas-dev#47121 on branch 1.4.x (BUG: read_excel loading some xlsx ints as floats) (pandas-dev#47271)
Backport PR pandas-dev#47121: BUG: read_excel loading some xlsx ints as floats

Co-authored-by: Andrew Hawyrluk <[email protected]>
1 parent 3fdfa66 commit 17386d1

4 files changed: +17 -2 lines changed

doc/source/whatsnew/v1.4.3.rst

+1
@@ -24,6 +24,7 @@ Fixed regressions
 - Fixed regression in :func:`read_csv` with ``index_col=False`` identifying first row as index names when ``header=None`` (:issue:`46955`)
 - Fixed regression in :meth:`.DataFrameGroupBy.agg` when used with list-likes or dict-likes and ``axis=1`` that would give incorrect results; now raises ``NotImplementedError`` (:issue:`46995`)
 - Fixed regression in :meth:`DataFrame.resample` and :meth:`DataFrame.rolling` when used with list-likes or dict-likes and ``axis=1`` that would raise an unintuitive error message; now raises ``NotImplementedError`` (:issue:`46904`)
+- Fixed regression in :func:`read_excel` returning ints as floats on certain input sheets (:issue:`46988`)
 - Fixed regression in :meth:`DataFrame.shift` when ``axis`` is ``columns`` and ``fill_value`` is absent, ``freq`` is ignored (:issue:`47039`)
 
 .. ---------------------------------------------------------------------------

pandas/io/excel/_openpyxl.py

+8 -2
@@ -560,8 +560,14 @@ def _convert_cell(self, cell, convert_float: bool) -> Scalar:
             return ""  # compat with xlrd
         elif cell.data_type == TYPE_ERROR:
             return np.nan
-        elif not convert_float and cell.data_type == TYPE_NUMERIC:
-            return float(cell.value)
+        elif cell.data_type == TYPE_NUMERIC:
+            # GH5394, GH46988
+            if convert_float:
+                val = int(cell.value)
+                if val == cell.value:
+                    return val
+            else:
+                return float(cell.value)
 
         return cell.value
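
The numeric branch added above can be tried outside pandas. The following is a minimal sketch rather than the pandas implementation: `convert_numeric` is a hypothetical stand-in that takes the raw cell value directly instead of an openpyxl cell object, and it covers only the `TYPE_NUMERIC` case.

```python
def convert_numeric(value, convert_float: bool):
    """Hypothetical stand-in for the TYPE_NUMERIC branch of _convert_cell."""
    if convert_float:
        # Return an int only when the conversion loses no information.
        val = int(value)
        if val == value:
            return val
        # Non-integral values fall through and are returned unchanged.
        return value
    # With convert_float disabled, every numeric cell becomes a float.
    return float(value)


print(repr(convert_numeric(2.0, True)))   # 2
print(repr(convert_numeric(2.5, True)))   # 2.5
print(repr(convert_numeric(2, False)))    # 2.0
```

The `val == cell.value` check is what keeps values such as 2.5 from being truncated: only values that survive the round trip through `int` are returned as integers.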

Binary file not shown.

pandas/tests/io/excel/test_openpyxl.py

+8
@@ -375,3 +375,11 @@ def test_read_empty_with_blank_row(datapath, ext, read_only):
         wb.close()
     expected = DataFrame()
     tm.assert_frame_equal(result, expected)
+
+
+def test_ints_spelled_with_decimals(datapath, ext):
+    # GH 46988 - openpyxl returns this sheet with floats
+    path = datapath("io", "data", "excel", f"ints_spelled_with_decimals{ext}")
+    result = pd.read_excel(path)
+    expected = DataFrame(range(2, 12), columns=[1])
+    tm.assert_frame_equal(result, expected)
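
To see why the test expects integers, the old and new branch logic can be compared on a column like the one in the test. This is a rough sketch under assumptions: `old_convert` and `new_convert` are hypothetical helpers that mirror only the numeric branch before and after the change, and `raw_column` assumes the sheet's values arrive as the floats 2.0 through 11.0, as the test comment suggests.

```python
def old_convert(value, convert_float: bool):
    # Pre-fix logic: only the convert_float=False case was handled,
    # so with convert_float=True the raw value passed through untouched.
    if not convert_float:
        return float(value)
    return value


def new_convert(value, convert_float: bool):
    # Post-fix logic: integral values become ints when convert_float is set.
    if convert_float:
        val = int(value)
        if val == value:
            return val
        return value
    return float(value)


# Assumed raw values for the test sheet (hypothetical, not read from the file).
raw_column = [float(n) for n in range(2, 12)]

old_types = {type(old_convert(v, True)) for v in raw_column}
new_types = {type(new_convert(v, True)) for v in raw_column}
print(old_types)  # {<class 'float'>}
print(new_types)  # {<class 'int'>}
```

Under these assumptions the old branch leaves every value as a float, while the new branch yields the integers 2 through 11 that the expected frame is built from.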
