Commit 48fc9d6

david-hoffman authored and jreback committed
BUG: Fix overflow error in cartesian_product
When the numbers in `X` are large, the computation can overflow on Windows machines, where the native `int` is 32-bit. Switching to np.intp alleviates this problem. Other fixes would include switching to np.uint32 or np.uint64.

closes pandas-dev#15234

Author: David Hoffman <[email protected]>

Closes pandas-dev#15265 from david-hoffman/patch-1 and squashes the following commits:

c9c8d5e [David Hoffman] Update v0.19.2.txt
d54583e [David Hoffman] Remove `test_large_input` because it's too big
47a6c6c [David Hoffman] Update test so that it will actually run on "normal" machine
7aeee85 [David Hoffman] Added tests for large numbers
b196878 [David Hoffman] Fix overflow error in cartesian_product
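The overflow described in the commit message can be reproduced on any platform with a minimal sketch. The input sizes below are hypothetical, and `np.int32` is used explicitly as a stand-in for the 32-bit native `int` on Windows. `np.cumprod` is used in place of `np.cumproduct`, which was removed in NumPy 2.0, and the accumulator dtype is passed explicitly because NumPy would otherwise promote small integer types to the platform default.

```python
import numpy as np

# Hypothetical input sizes: three factors with 2000 elements each.
# The true size of the Cartesian product is 2000**3 = 8_000_000_000,
# which does not fit in a signed 32-bit integer (max 2_147_483_647).
lengths = [2000, 2000, 2000]

# np.int32 stands in for the 32-bit native int (assumption for illustration).
len32 = np.fromiter(lengths, dtype=np.int32)
# np.int64 stands in for np.intp on a 64-bit build.
len64 = np.fromiter(lengths, dtype=np.int64)

# Pass dtype explicitly so the accumulator is not silently promoted.
cum32 = np.cumprod(len32, dtype=np.int32)
cum64 = np.cumprod(len64, dtype=np.int64)

print("32-bit cumulative product:", cum32)
print("64-bit cumulative product:", cum64)
```

The last entry of the 32-bit result wraps around and no longer equals the true product, while the 64-bit result is correct. Note that `np.intp` is only 64 bits wide on 64-bit builds, so on a 32-bit Python the fix would not widen the accumulator.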
1 parent c26e5bb commit 48fc9d6

2 files changed (+2, -1 lines)

doc/source/whatsnew/v0.20.0.txt

+1
@@ -444,6 +444,7 @@ Bug Fixes
 - Bug in compat for passing long integers to ``Timestamp.replace`` (:issue:`15030`)
 - Bug in ``.loc`` that would not return the correct dtype for scalar access for a DataFrame (:issue:`11617`)
 - Bug in ``GroupBy.get_group()`` failing with a categorical grouper (:issue:`15155`)
+- Bug in ``pandas.tools.utils.cartesian_product()`` with large input can cause overflow on windows (:issue:`15265`)
pandas/tools/util.py

+1 -1
@@ -58,7 +58,7 @@ def cartesian_product(X):
     if len(X) == 0:
         return []

-    lenX = np.fromiter((len(x) for x in X), dtype=int)
+    lenX = np.fromiter((len(x) for x in X), dtype=np.intp)
     cumprodX = np.cumproduct(lenX)

     a = np.roll(cumprodX, 1)
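For context, the hunk above shows only the first few lines of `cartesian_product`. The following is a simplified, self-contained sketch of how the lengths, their cumulative product, and the rolled array might be combined into repeat and tile counts. It is not the pandas implementation: the name `cartesian_product_sketch` and everything after the `np.roll` line are assumptions made for illustration, and `np.cumprod` replaces `np.cumproduct`, which was removed in NumPy 2.0.

```python
import numpy as np


def cartesian_product_sketch(X):
    """Illustrative sketch only; not the pandas implementation."""
    if len(X) == 0:
        return []

    # np.intp is the pointer-sized integer, 64-bit on 64-bit builds,
    # so the cumulative product below has more headroom than a 32-bit int.
    lenX = np.fromiter((len(x) for x in X), dtype=np.intp)
    cumprodX = np.cumprod(lenX)

    # a[i] = product of the lengths of all factors before factor i.
    a = np.roll(cumprodX, 1)
    a[0] = 1

    # b[i] = product of the lengths of all factors after factor i
    # (assumed step; an empty factor makes the whole product empty).
    if cumprodX[-1] != 0:
        b = cumprodX[-1] // cumprodX
    else:
        b = np.zeros_like(cumprodX)

    # Repeat each element b[i] times, then tile the result a[i] times.
    return [
        np.tile(np.repeat(np.asarray(x), int(b[i])), int(a[i]))
        for i, x in enumerate(X)
    ]


result = cartesian_product_sketch([[1, 2], [10, 20, 30]])
print(result)
```

Each returned array has length equal to the total product of the input lengths, which is why that product has to fit in the chosen integer dtype.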
