-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Fix interpolate -limit Add interpolate limit_direction='inside' #16307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -20,6 +20,12 @@ Check the :ref:`API Changes <whatsnew_0210.api_breaking>` and :ref:`deprecations | |||
New features | |||
~~~~~~~~~~~~ | |||
|
|||
- DataFrame.interpolate() has a new setting: limit_direction='inside'. | |||
This will cause the interpolation to fill missing values only when | |||
the missing value is surounded by valid values. It is useful when |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
surrounded
pls add tests for any changes. |
Sorry for the oversight. I will get the tests added. I also realized I
probably should have done two separate pulls. On for the fix and one for
the enhancement.
Also, Baurzhan suggested a change from limit_direction='inside' to
surrounded.
Any thoughts on this? I'm thinking 'inside' sounds more like a direction,
but I will make it whatever.
…On May 10, 2017 7:15 AM, "Jeff Reback" ***@***.***> wrote:
pls add tests for any changes.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#16307 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/APZUQdS16yqKS4i86LHkyuQ9djmR5tR0ks5r4ZxDgaJpZM4NVrqz>
.
|
I've added required tests for the limit direction feature. I see them in my fork and I see them referenced in the pull request. It should (I think) be ready to go, but I'm not sure if I need to do anything to update the pull. If I've missed anything else, just let me know. This is my first contribution, so please bear with me. Thanks |
Codecov Report
@@ Coverage Diff @@
## master #16307 +/- ##
==========================================
+ Coverage 90.38% 90.41% +0.02%
==========================================
Files 161 161
Lines 50999 51001 +2
==========================================
+ Hits 46096 46111 +15
+ Misses 4903 4890 -13
Continue to review full report at Codecov.
|
Hi Guys, Looks like I got all checks passed, coverage increased with new tests, I got my spelling error fixed, and I currently have no conflicts with the base branch. May I get a review on this? It is actually a fairly simple change. I just had to learn the process for submitting. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new limit_direction='inside'
seems a bit confusing to me. We use that to refer to the number of consecutive NaNs to fill. With this change, it's either that, or a special case for "out of valid bounds".
How about we add a new keyword that mimics fill_value
from https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html#scipy.interpolate.interp1d. We can call it either fill_value
or extrapolate
. I think that will require more discussion that the bugfix, so could you split it off of this change and make a new PR?
This will cause the interpolation to fill missing values only when | ||
the missing value is surrounded by valid values. It is useful when | ||
a series needs to be interpolated, but must not expand into NaN | ||
values that were outside the range of the original series. (GH16284) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reference to the issue should be like
:issue:`16284`
- DataFrame.interpolate was not respecting limit_direction when | ||
limit=0 (unlimited). Specifically, it would always use | ||
limit_direction='forward' even when specified otherwise. Now | ||
default limit=0 will work with other directions. (GH16282) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about the reference
@@ -931,6 +931,40 @@ def test_interp_limit_forward(self): | |||
limit_direction='FORWARD') | |||
assert_series_equal(result, expected) | |||
|
|||
def test_interp_limit_inside(self): | |||
# these test are for issue #16307 'inside' direction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just change this comment to GH #16307
or the link to the issue.
Hi Tom, Please help me understand your comment. There are currently two parameters in use:
The new direction 'inside' fits with the other existing directions. Please help me understand the need to add a new parameter. Thanks! |
Your test kind of shows it (`s.interpolate(method='linear', limit=1, limit_direction='inside'): s = Series([np.nan, 1, 3, np.nan, np.nan, np.nan, 11, np.nan])
expected = Series([np.nan, 1, 3, 5, np.nan, 9, 11, np.nan]) It looks like that's a expected = Series([np.nan, 1, 3, 5, np.nan, np.nan, 11, np.nan])
# ^ this one changed to still be nan Basically, extrapolation is different that interpolation ;) |
Ah... right. Yes, I agree with you. My key issue was actually with the fact that we are not differentiating between extrapolating and interpolating. I was fixing one case, but a general solution would cover all the cases. By default scypi interpolates, and by default pandas interpolates AND extrapolates. Perhaps: Or better yet... maybe: Since I'm in the middle of this, I'd be happy to do this if you guys want it done. Just let me know what the core contributors decide. In the mean time, I will revert my code to master and just apply code required for the limit=n bug fix (I should not have combined these in the first place), then submit changes again. Thanks! |
git diff upstream/master --name-only -- '*.py' | flake8 --diff