Commit 5229259

ahawryluk authored and jreback committed
BUG: read_excel loading some xlsx ints as floats (pandas-dev#47121)
* Reintroduce integer conversion when reading xlsx
* whats new 1.4.3

Co-authored-by: Jeff Reback <[email protected]>
1 parent fc76f1a commit 5229259

4 files changed, +17 -2 lines changed

doc/source/whatsnew/v1.4.3.rst

+1
@@ -24,6 +24,7 @@ Fixed regressions
 - Fixed regression in :func:`read_csv` with ``index_col=False`` identifying first row as index names when ``header=None`` (:issue:`46955`)
 - Fixed regression in :meth:`.DataFrameGroupBy.agg` when used with list-likes or dict-likes and ``axis=1`` that would give incorrect results; now raises ``NotImplementedError`` (:issue:`46995`)
 - Fixed regression in :meth:`DataFrame.resample` and :meth:`DataFrame.rolling` when used with list-likes or dict-likes and ``axis=1`` that would raise an unintuitive error message; now raises ``NotImplementedError`` (:issue:`46904`)
+- Fixed regression in :func:`read_excel` returning ints as floats on certain input sheets (:issue:`46988`)
 - Fixed regression in :meth:`DataFrame.shift` when ``axis`` is ``columns`` and ``fill_value`` is absent, ``freq`` is ignored (:issue:`47039`)

 .. ---------------------------------------------------------------------------

pandas/io/excel/_openpyxl.py

+8 -2
@@ -583,8 +583,14 @@ def _convert_cell(self, cell, convert_float: bool) -> Scalar:
             return ""  # compat with xlrd
         elif cell.data_type == TYPE_ERROR:
             return np.nan
-        elif not convert_float and cell.data_type == TYPE_NUMERIC:
-            return float(cell.value)
+        elif cell.data_type == TYPE_NUMERIC:
+            # GH5394, GH46988
+            if convert_float:
+                val = int(cell.value)
+                if val == cell.value:
+                    return val
+            else:
+                return float(cell.value)

         return cell.value
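The numeric branch added above can be tried in isolation. The following is a minimal sketch, not the pandas implementation: `convert_numeric` is a hypothetical standalone helper that takes the raw cell value directly instead of an openpyxl cell object, and it assumes the value is a finite `int` or `float`.

```python
from typing import Union

Number = Union[int, float]


def convert_numeric(value: Number, convert_float: bool) -> Number:
    """Hypothetical stand-in for the TYPE_NUMERIC branch of _convert_cell."""
    if convert_float:
        # Collapse floats that carry no fractional part to int (2.0 -> 2).
        as_int = int(value)
        if as_int == value:
            return as_int
        # Non-integral values fall through unchanged, mirroring the
        # trailing `return cell.value` in the patched method.
        return value
    # With convert_float disabled, every numeric cell is reported as a float.
    return float(value)


print(convert_numeric(2.0, True))   # 2
print(convert_numeric(2.5, True))   # 2.5
print(convert_numeric(3, False))    # 3.0
```

Before this commit, the `convert_float=True` case had no dedicated branch, so a value that reached this method as `2.0` was returned as a float.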

Binary file not shown.

pandas/tests/io/excel/test_openpyxl.py

+8
@@ -388,3 +388,11 @@ def test_book_and_sheets_consistent(ext):
             assert writer.sheets == {}
             sheet = writer.book.create_sheet("test_name", 0)
             assert writer.sheets == {"test_name": sheet}
+
+
+def test_ints_spelled_with_decimals(datapath, ext):
+    # GH 46988 - openpyxl returns this sheet with floats
+    path = datapath("io", "data", "excel", f"ints_spelled_with_decimals{ext}")
+    result = pd.read_excel(path)
+    expected = DataFrame(range(2, 12), columns=[1])
+    tm.assert_frame_equal(result, expected)
