Commit d9d981d

lithomas1 authored and im-vinicius committed
BUG/CoW: is_range_indexer can't handle very large arrays (pandas-dev#53672)
* BUG: is_range_indexer can't handle very large arrays
* fix test on 32-bit
1 parent 2eebde9 commit d9d981d

3 files changed: +15 -2 lines changed

doc/source/whatsnew/v2.1.0.rst

+1 -1
@@ -406,7 +406,7 @@ Indexing
 ^^^^^^^^
 - Bug in :meth:`DataFrame.__setitem__` losing dtype when setting a :class:`DataFrame` into duplicated columns (:issue:`53143`)
 - Bug in :meth:`DataFrame.__setitem__` with a boolean mask and :meth:`DataFrame.putmask` with mixed non-numeric dtypes and a value other than ``NaN`` incorrectly raising ``TypeError`` (:issue:`53291`)
--
+- Bug in indexing methods (e.g. :meth:`DataFrame.__getitem__`) where taking the entire :class:`DataFrame`/:class:`Series` would raise an ``OverflowError`` when Copy on Write was enabled and the length of the array was over the maximum size a 32-bit integer can hold (:issue:`53616`)
 
 Missing
 ^^^^^^^

pandas/_libs/lib.pyx

+1 -1
@@ -668,7 +668,7 @@ ctypedef fused int6432_t:
 
 @cython.wraparound(False)
 @cython.boundscheck(False)
-def is_range_indexer(ndarray[int6432_t, ndim=1] left, int n) -> bool:
+def is_range_indexer(ndarray[int6432_t, ndim=1] left, Py_ssize_t n) -> bool:
     """
     Perform an element by element comparison on 1-d integer arrays, meant for indexer
     comparisons
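
The signature change above swaps a C `int` parameter for `Py_ssize_t`. As a rough illustration of why that matters, the sketch below uses the standard-library `struct` module to check whether a Python integer fits in each native type. This is only an analogy for the integer conversion Cython performs at the function boundary, not the pandas or Cython code itself, and the helper names `fits_c_int` and `fits_py_ssize_t` are hypothetical.

```python
import struct
import sys

# Largest value a signed 32-bit C int can hold.
INT32_MAX = 2**31 - 1


def fits_c_int(value: int) -> bool:
    """Return True if `value` can be packed as a native C int (hypothetical helper)."""
    try:
        struct.pack("i", value)  # "i" is a native C int
    except (struct.error, OverflowError):
        return False
    return True


def fits_py_ssize_t(value: int) -> bool:
    """Return True if `value` can be packed as a native ssize_t (hypothetical helper)."""
    try:
        struct.pack("n", value)  # "n" is a native ssize_t
    except (struct.error, OverflowError):
        return False
    return True


if __name__ == "__main__":
    n = 2**31  # the length used by the new test in this commit
    print("fits in C int:     ", fits_c_int(n))
    print("fits in Py_ssize_t:", fits_py_ssize_t(n))
    print("64-bit interpreter:", sys.maxsize > 2**32)
```

On a 64-bit build, `2**31` is out of range for the C `int` but within range for `Py_ssize_t`, which is consistent with the `OverflowError` described in the whatsnew entry going away once `n` is declared as `Py_ssize_t`.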

pandas/tests/libs/test_lib.py

+13
@@ -6,6 +6,7 @@
     lib,
     writers as libwriters,
 )
+from pandas.compat import IS64
 
 from pandas import Index
 import pandas._testing as tm
@@ -248,6 +249,18 @@ def test_is_range_indexer(self, dtype):
         left = np.arange(0, 100, dtype=dtype)
         assert lib.is_range_indexer(left, 100)
 
+    @pytest.mark.skipif(
+        not IS64,
+        reason="2**31 is too big for Py_ssize_t on 32-bit. "
+        "It doesn't matter though since you cannot create an array that long on 32-bit",
+    )
+    @pytest.mark.parametrize("dtype", ["int64", "int32"])
+    def test_is_range_indexer_big_n(self, dtype):
+        # GH53616
+        left = np.arange(0, 100, dtype=dtype)
+
+        assert not lib.is_range_indexer(left, 2**31)
+
     @pytest.mark.parametrize("dtype", ["int64", "int32"])
     def test_is_range_indexer_not_equal(self, dtype):
         # GH#50592
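
The behaviour the new test expects can be sketched in pure Python. The function below is an assumption reconstructed from the docstring and the tests in this diff (the indexer counts as a range indexer when it has exactly `n` elements equal to `0, 1, ..., n - 1`). It is not the Cython implementation, and both `is_range_indexer_sketch` and `IS_64BIT` are hypothetical names used only for illustration.

```python
import sys
from typing import Sequence

# Hypothetical stand-in for the IS64 flag imported in the test module;
# assumes a 64-bit build can be detected from sys.maxsize.
IS_64BIT = sys.maxsize > 2**32


def is_range_indexer_sketch(left: Sequence[int], n: int) -> bool:
    """Pure-Python sketch: True if `left` equals 0, 1, ..., n - 1."""
    # A length mismatch settles the question before any element is compared,
    # so a huge `n` never has to be iterated over.
    if len(left) != n:
        return False
    return all(value == i for i, value in enumerate(left))


if __name__ == "__main__":
    left = list(range(100))
    print(is_range_indexer_sketch(left, 100))    # matches the existing test
    print(is_range_indexer_sketch(left, 2**31))  # mirrors the new big-n test
```

Python integers are unbounded, so this sketch cannot reproduce the overflow itself. It only shows the result the test asserts once `n` is accepted without error: a 100-element indexer is not a range indexer of length `2**31`.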