-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Bug: Raise ValueError with interpolate & fillna limit = 0 (#9217) #14994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/core/generic.py
Outdated
@@ -3704,6 +3704,8 @@ def interpolate(self, method='linear', axis=0, limit=None, inplace=False, | |||
""" | |||
Interpolate values according to different methods. | |||
""" | |||
if isinstance(limit, int) and not limit > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually might need this check on fillna as well (if it's not there)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use is_integer instead of this check
Good idea! Added check to |
Codecov Report
@@ Coverage Diff @@
## master #14994 +/- ##
==========================================
+ Coverage 90.42% 90.42% +<.01%
==========================================
Files 134 134
Lines 49357 49365 +8
==========================================
+ Hits 44632 44640 +8
Misses 4725 4725
Continue to review full report at Codecov.
|
Ping all green |
ok these need to be change here instead: https://github.com/pandas-dev/pandas/blob/master/pandas/src/algos_common_helper.pxi.in#L85 (there are 4 places where this is hit). This is more appropriate as we are already checking the limit and its closer to the actual code path. I think you need some more tests for < 0 (which currently should raise), and == 0 (which I think you added) |
Sorry for the delay in response. I am not too familiar with the cython portion of the code base. Interestingly in master (0d3ecfa), if negative limits are used in
Added tests on my branch to test the negative limits. |
it should but might be trapped somewhere |
Ah, possibly lost in a |
@mroeschke somewhere, you will have to trace it. |
So I traced
And
For Line 142 in f3c5a42
For So if I've traced correctly, it doesn't appear that limit is passed through any |
can you rebase / update |
Rebasing. As noted in my previous comment, it doesn't appear that For Line 142 in f3c5a42
For |
pandas/core/generic.py
Outdated
@@ -3272,6 +3272,9 @@ def convert_objects(self, convert_dates=True, convert_numeric=False, | |||
@Appender(_shared_docs['fillna'] % _shared_doc_kwargs) | |||
def fillna(self, value=None, method=None, axis=None, inplace=False, | |||
limit=None, downcast=None): | |||
if is_integer(limit) and not limit > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a) I would like these to be handled in the current cython code. b) Alternatively you can take out the check there and just put an assertion. I don't really like these types of checks being separated from the actual code, though so prefer a)
So I discovered negative limits are caught by cython if you specifically specify the fill method to be
I am guessing this is true if the user specifies the Otherwise, |
9d6c2d2
to
cfc1c92
Compare
can you update? |
Sure thing. I will also rebase to master as well. Circling back to my last comment, I think the |
@mroeschke ideally I'd like to have these validation on args occur when they are actually used. So might need to add these checks in |
Ah I see, that makes sense. Okay I'll look into migrating the check to |
Migrated the |
pandas/core/missing.py
Outdated
if limit: | ||
if limit is not None: | ||
if is_integer(limit): | ||
assert limit > 0, ('`limit` keyword argument must be greater ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no simply raise ValueError
pandas/core/internals.py
Outdated
@@ -372,6 +372,9 @@ def fillna(self, value, limit=None, inplace=False, downcast=None, | |||
original_value = value | |||
mask = isnull(self.values) | |||
if limit is not None: | |||
if is_integer(limit): | |||
assert limit > 0, ('`limit` keyword argument must be greater ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise ValueError, or is this an actual assertion? (IOW it is checked at higher levels)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes this should be a ValueError, it's not checked at a higher level.
pandas/tests/series/test_missing.py
Outdated
# related GH 9217, make sure limit is greater than 0 | ||
for limit in [-1, 0]: | ||
s = Series([1, 2, 3, None]) | ||
tm.assertRaises(AssertionError, lambda: s.fillna(1, limit=limit)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we always raise ValueError, never AssertionError (which is an internal error and should not be user visible)
pandas/core/missing.py
Outdated
if limit: | ||
if limit is not None: | ||
if is_integer(limit) and limit < 1: | ||
raise ValueError('`limit` keyword argument must be greater ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this consistent with the cython:
raise ValueError('Limit must be non-negative')
so make the same message (you pick / combine)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is in algos_common_helper.pxi.in
you need to edit the |
Ah oops, apologies. I tried editing those error messages directly through github. I'll make those changes later tonight. |
add whatsnew Add similar log to fillna Update whatsnew with issue and fillna change Add test for a negative limit change if statements to assert move limit check closer to where the variable is used Changes assertion to ValueError
pandas/core/missing.py
Outdated
@@ -169,7 +169,9 @@ def _interp_limit(invalid, fw_limit, bw_limit): | |||
# the beginning (see issues #9218 and #10420) | |||
violate_limit = sorted(start_nans) | |||
|
|||
if limit: | |||
if limit is not None: | |||
if is_integer(limit) and limit < 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually I think we should make this a stronger check. limit
MUST be an integer and > 0 if its not None. This would need checking both in cython and here.
@@ -83,8 +83,8 @@ def pad_{{name}}(ndarray[{{c_type}}] old, ndarray[{{c_type}}] new, | |||
if limit is None: | |||
lim = nright | |||
else: | |||
if limit < 0: | |||
raise ValueError('Limit must be non-negative') | |||
if limit < 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so here as well, something like
if not util.is_integer_object(limit):
raise ValueError('limit must be an integer')
if limit < 1:
raise ValueError(....)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alternatively you can simply remove the cython check entirely (see if anything breaks), it might not even be hitting it (and instead catching it at a higher level, IOW your new checks in missing.py), not sure.
if limit is None:
.....
else:
lim = int(limit)
which is will fail hard if its not an integer, and will work (incorrrectly) if its a float, so have to be careful here.
Add stricter integer check with tests
Included |
thanks @mroeschke |
…#9217) closes pandas-dev#9217 Author: Matt Roeschke <[email protected]> Closes pandas-dev#14994 from mroeschke/fix_9217 and squashes the following commits: c1790ee [Matt Roeschke] Unify ValueError message and correct cython limits 6f041e6 [Matt Roeschke] Bug: Raise ValueError with interpolate limit = 0
git diff upstream/master | flake8 --diff
Add similar logic to
fillna