ENH GH20601 raise error when pivot table's number of levels > int32 #20709
Hello @anhqle! Thanks for updating the PR. Cheers! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on April 16, 2018 at 06:07 Hours UTC
@anhqle : So far so good, but you're missing a |
pandas/core/reshape/reshape.py
Outdated
@@ -162,6 +162,8 @@ def _make_selectors(self):
        self.full_shape = ngroups, stride

        selector = self.sorted_labels[-1] + stride * comp_index + self.lift
        if np.prod(self.full_shape) > (2 ** 31 - 1):
            raise ValueError('Pivot table is too big, causing int32 overflow')
@jreback : Is it okay to catch it here, or should we try to catch earlier as you mentioned before?
I'm happy to make any change, and would love to hear the reasoning for catching it earlier.
Ideally, you want to raise as soon as you know this is out of bounds.
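The advice to raise as soon as the size is known to be out of bounds can be sketched as a standalone check that runs before any reshaping work. This is a minimal illustration and not the code merged into pandas: `check_pivot_shape` and `INT32_MAX` are hypothetical names, and the check assumes the two dimensions are already available as plain integers.

```python
INT32_MAX = 2 ** 31 - 1  # largest value a signed 32-bit integer can hold


def check_pivot_shape(ngroups, stride):
    """Raise if ngroups * stride does not fit in a signed 32-bit integer.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # Python ints have arbitrary precision, so this product is exact
    # and the check itself cannot overflow.
    n_cells = int(ngroups) * int(stride)
    if n_cells > INT32_MAX:
        raise ValueError('Pivot table is too big, causing int32 overflow')
    return n_cells


print(check_pivot_shape(1000, 1000))  # well within bounds

try:
    check_pivot_shape(2 ** 16, 2 ** 16)  # 2**32 cells, too many
except ValueError as err:
    print(err)
```

Doing the multiplication with Python integers rather than fixed-width NumPy integers is a deliberate choice in this sketch: the comparison stays reliable on platforms where the default NumPy integer is 32 bits wide.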
Codecov Report
@@            Coverage Diff             @@
##           master   #20709      +/-   ##
==========================================
- Coverage   91.84%   91.84%   -0.01%
==========================================
  Files         153      153
  Lines       49279    49281       +2
==========================================
+ Hits        45259    45260       +1
- Misses       4020     4021       +1
HDFStore.select_column error reporting
As in #14832, use = (native) instead of < (little-endian)
pandas/tests/reshape/test_pivot.py
Outdated
@pytest.mark.slow
def test_pivot_number_of_levels_larger_than_int32(self):
    # GH 20601
    data = DataFrame({'ind1': list(range(1337600)) * 2,
Instead of using list(range(...)), use np.arange (and array ops).
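As a rough illustration of that suggestion, the list-based construction can be replaced by array operations. This is a sketch under assumptions: `np.tile` is one possible choice for the repetition, and `n` is deliberately small here and does not match the sizes used in the PR's test.

```python
import numpy as np

n = 5  # small size for illustration only

# List-based construction, in the style of the original test line.
from_list = list(range(n)) * 2

# Array-based equivalent: np.tile repeats the whole array end to end.
from_array = np.tile(np.arange(n), 2)

print(from_array)
assert from_array.tolist() == from_list
```

Building the column as an array avoids materialising a large Python list first, which matters for a test that is already marked as slow.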
… larger than int32
…: in pivot_table and unstack
I messed up |
What's New: Raise an error when the number of levels in a pivot table is larger than int32
git diff upstream/master -u -- "*.py" | flake8 --diff