Fix interpolate -limit Add interpolate limit_direction='inside' #16307

Closed
wants to merge 6 commits into from
11 changes: 11 additions & 0 deletions doc/source/whatsnew/v0.21.0.txt
@@ -20,6 +20,12 @@ Check the :ref:`API Changes <whatsnew_0210.api_breaking>` and :ref:`deprecations
New features
~~~~~~~~~~~~

+- DataFrame.interpolate() has a new setting: limit_direction='inside'.
+This will cause the interpolation to fill missing values only when
+the missing value is surounded by valid values. It is useful when
Contributor

surrounded

+a series needs to be interpolated, but must not expand into NaN
+values that were outside the range of the original series. (GH16284)
Contributor

The reference to the issue should be like

:issue:`16284`
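
The behaviour described in the whatsnew entry can be illustrated with a minimal pure-Python sketch. `interpolate_inside` is a hypothetical helper written only for this illustration; it is not pandas API and assumes plain linear interpolation over evenly spaced positions.

```python
import math


def interpolate_inside(values):
    """Linearly fill NaNs only where valid values exist on both sides.

    Hypothetical helper for illustration; not part of pandas.
    """
    result = list(values)
    # Positions that already hold a valid (non-NaN) value.
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    # Walk consecutive pairs of valid positions and fill the gap between them.
    # NaNs before the first or after the last valid value are never touched.
    for left, right in zip(valid, valid[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + step * (i - left)
    return result


nan = float("nan")
filled = interpolate_inside([nan, 1.0, nan, nan, 4.0, nan])
print(filled)
```

Only the two NaNs between 1.0 and 4.0 are filled; the leading and trailing NaNs are left alone, which is the "must not expand outside the original range" property the entry refers to.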




.. _whatsnew_0210.enhancements.other:
@@ -107,6 +113,11 @@ Reshaping
Numeric
^^^^^^^

+- DataFrame.interpolate was not respecting limit_direction when
+limit=0 (unlimited). Specifically, it would always use
+limit_direction='forward' even when specified otherwise. Now
+default limit=0 will work with other directions. (GH16282)
Contributor

Same comment about the reference




Other
37 changes: 19 additions & 18 deletions pandas/core/missing.py
@@ -149,7 +149,7 @@ def _interp_limit(invalid, fw_limit, bw_limit):
             if invalid[max(0, x - fw_limit):x + bw_limit + 1].all():
                 yield x

-    valid_limit_directions = ['forward', 'backward', 'both']
+    valid_limit_directions = ['forward', 'backward', 'both', 'inside']
     limit_direction = limit_direction.lower()
     if limit_direction not in valid_limit_directions:
         raise ValueError('Invalid limit_direction: expecting one of %r, got '
@@ -172,23 +172,24 @@ def _interp_limit(invalid, fw_limit, bw_limit):
     # c) Limit is nonzero and it is further than limit from the nearest non-NaN
     # value (with respect to the limit_direction setting).
     #
-    # The default behavior is to fill forward with no limit, ignoring NaNs at
-    # the beginning (see issues #9218 and #10420)
-    violate_limit = sorted(start_nans)
-
-    if limit is not None:
-        if not is_integer(limit):
-            raise ValueError('Limit must be an integer')
-        if limit < 1:
-            raise ValueError('Limit must be greater than 0')
-        if limit_direction == 'forward':
-            violate_limit = sorted(start_nans | set(_interp_limit(invalid,
-                                                                  limit, 0)))
-        if limit_direction == 'backward':
-            violate_limit = sorted(end_nans | set(_interp_limit(invalid, 0,
-                                                                limit)))
-        if limit_direction == 'both':
-            violate_limit = sorted(_interp_limit(invalid, limit, limit))
+    # If Limit is not an integer greater than 0, then use default behavior
+    # of filling without limit in the direction specified by limit_direction
+
+    if not (is_integer(limit) and limit > 0):
+        limit = len(xvalues)
+
+    # each possible limit_direction
+    if limit_direction == 'forward':
+        violate_limit = sorted(start_nans |
+                               set(_interp_limit(invalid, limit, 0)))
+    elif limit_direction == 'backward':
+        violate_limit = sorted(end_nans |
+                               set(_interp_limit(invalid, 0, limit)))
+    elif limit_direction == 'both':
+        violate_limit = sorted(_interp_limit(invalid, limit, limit))
+    elif limit_direction == 'inside':
+        violate_limit = sorted(start_nans | end_nans |
+                               set(_interp_limit(invalid, limit, limit)))

     xvalues = getattr(xvalues, 'values', xvalues)
     yvalues = getattr(yvalues, 'values', yvalues)
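
To see what the new branch computes, the hunk above can be mirrored in a standalone pure-Python sketch. The function name `limit_violations`, the list-of-bools representation of `invalid`, and the way `start_nans` and `end_nans` are derived (indices before the first and after the last valid value) are assumptions made for this illustration; the real code operates on NumPy arrays inside `pandas/core/missing.py`.

```python
def limit_violations(invalid, limit=None, limit_direction="forward"):
    """Return sorted indices of NaNs that must NOT be filled.

    `invalid` is a list of booleans (True where the value is NaN).
    Hypothetical standalone sketch of the logic in the diff above.
    """
    n = len(invalid)
    valid_idx = [i for i, bad in enumerate(invalid) if not bad]
    if not valid_idx:
        # Assumption: with no valid values at all, nothing can be filled.
        return list(range(n))

    # Assumed definitions: NaNs before the first / after the last valid value.
    start_nans = set(range(valid_idx[0]))
    end_nans = set(range(valid_idx[-1] + 1, n))

    def interp_limit(fw_limit, bw_limit):
        # A NaN at x violates the limit when every entry in the window
        # [x - fw_limit, x + bw_limit] is also NaN.
        for x in range(n):
            window = invalid[max(0, x - fw_limit):x + bw_limit + 1]
            if invalid[x] and all(window):
                yield x

    # As in the diff: anything that is not a positive integer means "no limit".
    if not (isinstance(limit, int) and limit > 0):
        limit = n

    if limit_direction == "forward":
        return sorted(start_nans | set(interp_limit(limit, 0)))
    if limit_direction == "backward":
        return sorted(end_nans | set(interp_limit(0, limit)))
    if limit_direction == "both":
        return sorted(interp_limit(limit, limit))
    if limit_direction == "inside":
        return sorted(start_nans | end_nans | set(interp_limit(limit, limit)))
    raise ValueError("Invalid limit_direction: %r" % (limit_direction,))


# NaN, 1, NaN, NaN, NaN, 5, NaN
mask = [True, False, True, True, True, False, True]
for direction in ("forward", "backward", "both", "inside"):
    print(direction, limit_violations(mask, None, direction))
```

With no limit, `'both'` leaves nothing unfilled, while `'inside'` protects both the leading and the trailing NaN. Passing `limit=1` additionally protects the middle of the three-NaN gap, since it is more than one step from any valid value.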