ENH: interpolate.limit_area() 16284 #16513
Changes from 4 commits
9852ec4
4bacc45
80d67b7
d83246c
b24e488
61e808f
41af8e3
7c53e78
e91cf4f
596f145
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -330,6 +330,10 @@ Interpolation | |
|
||
The ``limit_direction`` keyword argument was added. | ||
|
||
.. versionadded:: 0.21.0 | ||
|
||
The ``limit_area`` keyword argument was added. | ||
|
||
Both Series and DataFrame objects have an ``interpolate`` method that, by default, | |
performs linear interpolation at missing datapoints. | ||
|
||
|
@@ -454,33 +458,54 @@ at the new values. | |
.. _documentation: http://docs.scipy.org/doc/scipy/reference/interpolate.html#univariate-interpolation | ||
.. _guide: http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html | ||
|
||
.. _missing_data.interp_limits: | ||
|
||
Interpolation Limits | ||
^^^^^^^^^^^^^^^^^^^^ | ||
|
||
Like other pandas fill methods, ``interpolate`` accepts a ``limit`` keyword | ||
argument. Use this argument to limit the number of consecutive interpolations, | ||
keeping ``NaN`` values for interpolations that are too far from the last valid | ||
observation: | ||
argument. Use this argument to limit the number of consecutive ``NaN`` values | ||
filled since the last valid observation: | ||
|
||
.. ipython:: python | ||
|
||
ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13]) | ||
ser.interpolate(limit=2) | ||
ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13, np.nan, np.nan]) | ||
|
||
By default, ``limit`` applies in a forward direction, so that only ``NaN`` | ||
values after a non-``NaN`` value can be filled. If you provide ``'backward'`` or | ||
``'both'`` for the ``limit_direction`` keyword argument, you can fill ``NaN`` | ||
values before non-``NaN`` values, or both before and after non-``NaN`` values, | ||
respectively: | ||
# fill all consecutive values in a forward direction | ||
ser.interpolate() | ||
|
||
.. ipython:: python | ||
# fill one consecutive value in a forward direction | ||
ser.interpolate(limit=1) | ||
|
||
By default, ``NaN`` values are filled in a ``forward`` direction. Use the | |
``limit_direction`` parameter to fill ``backward`` or in ``both`` directions. | |
|
||
ser.interpolate(limit=1) # limit_direction == 'forward' | ||
.. ipython:: python | ||
|
||
# fill one consecutive value backwards | ||
ser.interpolate(limit=1, limit_direction='backward') | ||
|
||
# fill one consecutive value in both directions | ||
ser.interpolate(limit=1, limit_direction='both') | ||
|
||
# fill all consecutive values in both directions | ||
ser.interpolate(limit_direction='both') | ||
|
||
By default, ``NaN`` values are filled whether they are inside (surrounded by) | ||
Review comment: need to update this |
||
existing valid values, or outside existing valid values. Introduced in v0.21 | ||
Review comment: "Introduced in v0.21" -> "Introduced in pandas 0.21, "
Reply: done |
||
the ``limit_area`` parameter restricts filling to either inside or outside values. | ||
Review comment: maybe add some wording about interpolation vs extrapolation here.
Review comment: maybe also when you would want to use / do this. |
||
|
||
.. ipython:: python | ||
|
||
# fill one consecutive inside value in both directions | ||
ser.interpolate(limit=1, limit_area='inside', limit_direction='both') | ||
Review comment: can you put limit_area here also after limit_direction (to have it consistent with the other examples)?
Reply: done |
||
|
||
# fill all consecutive outside values backward | ||
ser.interpolate(limit_direction='backward', limit_area='outside') | ||
|
||
# fill all consecutive outside values in both directions | ||
ser.interpolate(limit_direction='both', limit_area='outside') | ||
|
||
.. _missing_data.replace: | ||
|
||
Replacing Generic Values | ||
|
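The inside/outside distinction that `limit_area` draws in the documentation change above can be illustrated without pandas. The following is a minimal sketch, assuming a plain list of floats; `classify_nans` is a hypothetical helper written for this illustration, not a pandas function.

```python
import math


def classify_nans(values):
    """Label each position as 'valid', 'inside', or 'outside' (hypothetical helper)."""
    # Positions of the non-NaN observations.
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    labels = []
    for i, v in enumerate(values):
        if not math.isnan(v):
            labels.append('valid')
        elif valid and valid[0] < i < valid[-1]:
            # Surrounded by valid values on both sides.
            labels.append('inside')
        else:
            # Before the first or after the last valid value.
            labels.append('outside')
    return labels


nan = float('nan')
# Same values as the ``ser`` example in the documentation hunk.
ser = [nan, nan, 5, nan, nan, nan, 13, nan, nan]
labels = classify_nans(ser)
print(labels)
```

Under this reading, `limit_area='inside'` corresponds to interpolating between known points and `limit_area='outside'` to extrapolating beyond them, which matches the "(interpolate)" and "(extrapolate)" wording used in the docstring change.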
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,6 +24,9 @@ New features | |
<https://www.python.org/dev/peps/pep-0519/>`_ on most readers and writers (:issue:`13823`) | ||
- Added `__fspath__` method to :class:`pandas.HDFStore`, :class:`pandas.ExcelFile`, | |
and :class:`pandas.ExcelWriter` to work properly with the file system path protocol (:issue:`13823`) | ||
- Added `limit_area` parameter to `DataFrame.interpolate()` method allowing further control of which NaNs are replaced. | ||
Review comment: Can you use double backticks around limit_area and DataFrame.interpolate?
Review comment: and just say
Review comment: or you can do a :func: |
||
Use `limit_area='inside'` to fill only NaNs surrounded by valid values, or use `limit_area='outside'` to fill only NaNs outside the existing valid values while preserving those inside. (:issue:`16284`) | |
Full documentation and examples are :ref:`here <missing_data.interp_limits>`. | ||
|
||
.. _whatsnew_0210.enhancements.other: | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3883,11 +3883,16 @@ def replace(self, to_replace=None, value=None, inplace=False, limit=None, | |
limit : int, default None. | ||
Maximum number of consecutive NaNs to fill. Must be greater than 0. | ||
limit_direction : {'forward', 'backward', 'both'}, default 'forward' | ||
If limit is specified, consecutive NaNs will be filled in this | ||
direction. | ||
|
||
Consecutive NaNs will be filled in this direction. | ||
|
||
.. versionadded:: 0.17.0 | ||
|
||
limit_area : {'inside', 'outside'}, default None | ||
* None: (default) no fill restriction | ||
* 'inside' Only fill NaNs surrounded by valid values (interpolate). | ||
Review comment: I would put a colon ( |
||
* 'outside' Only fill NaNs outside valid values (extrapolate). | ||
.. versionadded:: 0.21.0 | ||
Review comment: put the
Reply: @jreback I also noticed and corrected the old .. versionadded tag on 3887, which was not being properly replaced. It needed the blank lines to stop it from being combined with the normal paragraph above.
Review comment: Can you put a blank line above this one |
||
|
||
inplace : bool, default False | ||
Update the NDFrame in place if possible. | ||
downcast : optional, 'infer' or None, defaults to None | ||
|
@@ -3919,7 +3924,8 @@ def replace(self, to_replace=None, value=None, inplace=False, limit=None, | |
|
||
@Appender(_shared_docs['interpolate'] % _shared_doc_kwargs) | ||
def interpolate(self, method='linear', axis=0, limit=None, inplace=False, | ||
limit_direction='forward', downcast=None, **kwargs): | ||
limit_direction='forward', limit_area=None, | ||
downcast=None, **kwargs): | ||
""" | ||
Interpolate values according to different methods. | ||
""" | ||
|
@@ -3968,6 +3974,7 @@ def interpolate(self, method='linear', axis=0, limit=None, inplace=False, | |
new_data = data.interpolate(method=method, axis=ax, index=index, | ||
values=_maybe_transposed_self, limit=limit, | ||
limit_direction=limit_direction, | ||
limit_area=limit_area, | ||
inplace=inplace, downcast=downcast, | ||
**kwargs) | ||
|
||
|
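How `limit`, `limit_direction`, and `limit_area` combine can be sketched in plain Python. This is an illustration under stated assumptions, not the pandas implementation: `interpolate_sketch` and `show` are hypothetical names, only linear interpolation over evenly spaced positions is handled, and NaNs beyond the first or last valid value are held at the nearest valid value, which is what the expected results in this PR's tests imply.

```python
import math


def interpolate_sketch(values, limit=None, limit_direction='forward',
                       limit_area=None):
    """Hypothetical pure-Python model of the three limit keywords."""
    if limit_direction not in ('forward', 'backward', 'both'):
        raise ValueError('invalid limit_direction: %r' % (limit_direction,))
    if limit_area not in (None, 'inside', 'outside'):
        raise ValueError('invalid limit_area: %r' % (limit_area,))

    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not valid:
        return list(values)
    first, last = valid[0], valid[-1]

    result = list(values)
    for i, v in enumerate(values):
        if not math.isnan(v):
            continue
        before = [j for j in valid if j < i]
        after = [j for j in valid if j > i]

        # A NaN is reachable in a direction if a valid value lies behind it
        # in that direction and is no more than `limit` steps away.
        fwd = bool(before) and (limit is None or i - before[-1] <= limit)
        bwd = bool(after) and (limit is None or after[0] - i <= limit)
        reachable = {'forward': fwd, 'backward': bwd,
                     'both': fwd or bwd}[limit_direction]

        # limit_area then removes either the outside or the inside NaNs.
        inside = first < i < last
        if limit_area == 'inside' and not inside:
            reachable = False
        if limit_area == 'outside' and inside:
            reachable = False
        if not reachable:
            continue  # keep the NaN

        if inside:
            lo, hi = before[-1], after[0]
            frac = (i - lo) / (hi - lo)
            result[i] = values[lo] + frac * (values[hi] - values[lo])
        else:
            # Assumption: edge NaNs take the nearest valid value.
            result[i] = values[first] if i < first else values[last]
    return result


def show(values):
    """Replace NaN with None so results compare and print cleanly."""
    return [None if math.isnan(v) else v for v in values]


nan = float('nan')
# Same input as test_interp_limit_area in this PR.
s = [nan, nan, 3, nan, nan, nan, 7, nan, nan]
inside_both = show(interpolate_sketch(s, limit=1, limit_direction='both',
                                      limit_area='inside'))
outside_back = show(interpolate_sketch(s, limit_direction='backward',
                                       limit_area='outside'))
print(inside_both)
print(outside_back)
```

The sketch treats `limit_area` as a final mask applied after the direction and distance checks, so it only ever removes candidates for filling and never adds any.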
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -959,6 +959,45 @@ def test_interp_limit_bad_direction(self): | |
pytest.raises(ValueError, s.interpolate, method='linear', | ||
limit_direction='abc') | ||
|
||
# limit_area introduced GH #16284 | ||
Review comment: can you put the comment inside the function |
||
def test_interp_limit_area(self): | ||
# These tests are for issue #16284 -- restrict filling to inside or outside NaNs. | |
s = Series([nan, nan, 3, nan, nan, nan, 7, nan, nan]) | ||
|
||
expected = Series([nan, nan, 3., 4., 5., 6., 7., nan, nan]) | ||
result = s.interpolate(method='linear', limit_area='inside') | ||
assert_series_equal(result, expected) | ||
|
||
expected = Series([nan, nan, 3., 4., nan, nan, 7., nan, nan]) | |
result = s.interpolate(method='linear', limit_area='inside', | |
limit=1) | |
assert_series_equal(result, expected) | |
|
||
expected = Series([nan, nan, 3., 4., nan, 6., 7., nan, nan]) | ||
result = s.interpolate(method='linear', limit_area='inside', | ||
limit_direction='both', limit=1) | ||
assert_series_equal(result, expected) | ||
|
||
expected = Series([nan, nan, 3., nan, nan, nan, 7., 7., 7.]) | ||
result = s.interpolate(method='linear', limit_area='outside') | ||
assert_series_equal(result, expected) | ||
|
||
expected = Series([nan, nan, 3., nan, nan, nan, 7., 7., nan]) | |
result = s.interpolate(method='linear', limit_area='outside', | |
limit=1) | |
assert_series_equal(result, expected) | |
|
||
expected = Series([nan, 3., 3., nan, nan, nan, 7., 7., nan]) | ||
result = s.interpolate(method='linear', limit_area='outside', | ||
limit_direction='both', limit=1) | ||
assert_series_equal(result, expected) | ||
|
||
expected = Series([3., 3., 3., nan, nan, nan, 7., nan, nan]) | |
result = s.interpolate(method='linear', limit_area='outside', | |
limit_direction='backward') | |
assert_series_equal(result, expected) | |
|
||
# raises an error if limit_area is not a valid option. | |
pytest.raises(ValueError, s.interpolate, method='linear', | ||
limit_area='abc') | ||
|
||
def test_interp_limit_direction(self): | ||
# These tests are for issue #9218 -- fill NaNs in both directions. | ||
s = Series([1, 3, np.nan, np.nan, np.nan, 11]) | ||
|
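One reason the tests above go through a helper such as `assert_series_equal` is that NaN never compares equal to itself, so a plain element-wise `==` fails on any expected result that keeps NaNs. A minimal sketch of a NaN-aware comparison in plain Python follows; `nan_equal` is a hypothetical helper for illustration and is far simpler than the pandas testing utility.

```python
import math


def nan_equal(left, right):
    """Return True if both sequences match, treating NaN as equal to NaN."""
    if len(left) != len(right):
        return False
    for a, b in zip(left, right):
        both_nan = math.isnan(a) and math.isnan(b)
        if not both_nan and a != b:
            return False
    return True


nan = float('nan')
expected = [nan, nan, 3.0, 4.0, nan, nan, 7.0, nan, nan]
result = [nan, nan, 3.0, 4.0, nan, nan, 7.0, nan, nan]

# Element-wise equality is False because nan == nan is False.
plain = all(a == b for a, b in zip(result, expected))
# The NaN-aware comparison accepts NaNs in matching positions.
aware = nan_equal(result, expected)
print(plain, aware)
```

A computed `result` that is never passed to such a comparison is not checked at all, so each expected/result pair needs its own assertion.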
Review comment:
this doesn't make sense w/o an example
Reply:
Hi Jeff,
The examples for both limit_direction and limit_area are below in the "interpolation limits" sub-section.
I'm mostly trying to get the correct style from inference, so I basically reproduced what had been done in the past for limit_direction.
There is a location (.. _missing_data.interp_limits:) below these versionadded references to which both limit_direction and limit_area can be linked if that is the right style.
Honestly, since version added is part of the docstrings, I'm not sure it needs to be reproduced here at all, but again, that is a bigger style question above my pay grade. :-)
Review comment:
A link to below sounds good. You can make a new one specifically for
_missing_data.interp_limit_area
Review comment:
I agree with this, I would just remove it here.