DOC: update the pandas.Series/DataFrame.interpolate docstring #20270

Merged
202 changes: 152 additions & 50 deletions pandas/core/generic.py
@@ -6081,89 +6081,191 @@ def replace(self, to_replace=None, value=None, inplace=False, limit=None,

_shared_docs['interpolate'] = """
Please note that only ``method='linear'`` is supported for
DataFrames/Series with a MultiIndex.
DataFrame/Series with a MultiIndex.

Parameters
----------
method : {'linear', 'time', 'index', 'values', 'nearest', 'zero',
'slinear', 'quadratic', 'cubic', 'barycentric', 'krogh',
'polynomial', 'spline', 'piecewise_polynomial',
'from_derivatives', 'pchip', 'akima'}
method : str, default 'linear'
Interpolation technique to use. One of:

* 'linear': ignore the index and treat the values as equally
* 'linear': Ignore the index and treat the values as equally
spaced. This is the only method supported on MultiIndexes.
default
* 'time': interpolation works on daily and higher resolution
data to interpolate given length of interval
* 'index', 'values': use the actual numerical values of the index
* 'nearest', 'zero', 'slinear', 'quadratic', 'cubic',
'barycentric', 'polynomial' is passed to
:class:`scipy.interpolate.interp1d`. Both 'polynomial' and
'spline' require that you also specify an `order` (int),
* 'time': Works on daily and higher resolution data to interpolate
given length of interval.
* 'index', 'values': Use the actual numerical values of the index.
* 'pad': Fill in NaNs using existing values.
* 'nearest', 'zero', 'slinear', 'quadratic', 'cubic', 'spline',
'barycentric', 'polynomial': Passed to
`scipy.interpolate.interp1d`. Both 'polynomial' and 'spline'
require that you also specify an `order` (int),
e.g. ``df.interpolate(method='polynomial', order=4)``.
These use the actual numerical values of the index.
* 'krogh', 'piecewise_polynomial', 'spline', 'pchip' and 'akima'
are all wrappers around the scipy interpolation methods of
similar names. These use the actual numerical values of the
index. For more information on their behavior, see the
`scipy documentation
<http://docs.scipy.org/doc/scipy/reference/interpolate.html#univariate-interpolation>`__
and `tutorial documentation
<http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html>`__
* 'from_derivatives' refers to
:meth:`scipy.interpolate.BPoly.from_derivatives` which
These use the numerical values of the index.
* 'krogh', 'piecewise_polynomial', 'spline', 'pchip', 'akima':
Wrappers around the SciPy interpolation methods of similar
names. See `Notes`.
* 'from_derivatives': Refers to
`scipy.interpolate.BPoly.from_derivatives` which
replaces 'piecewise_polynomial' interpolation method in
scipy 0.18
scipy 0.18.

.. versionadded:: 0.18.1

Added support for the 'akima' method.
Added interpolate method 'from_derivatives' which replaces
'piecewise_polynomial' in scipy 0.18; backwards-compatible with
scipy < 0.18

axis : {0, 1}, default 0
* 0: fill column-by-column
* 1: fill row-by-row
limit : int, default None.
Maximum number of consecutive NaNs to fill. Must be greater than 0.
limit_direction : {'forward', 'backward', 'both'}, default 'forward'
limit_area : {'inside', 'outside'}, default None
* None: (default) no fill restriction
* 'inside' Only fill NaNs surrounded by valid values (interpolate).
* 'outside' Only fill NaNs outside valid values (extrapolate).
'piecewise_polynomial' in SciPy 0.18; backwards-compatible with
SciPy < 0.18

axis : {0 or 'index', 1 or 'columns', None}, default None
Axis to interpolate along.
limit : int, optional
Maximum number of consecutive NaNs to fill. Must be greater than
0.
inplace : bool, default False
Update the data in place if possible.
limit_direction : {'forward', 'backward', 'both'}, default 'forward'
If limit is specified, consecutive NaNs will be filled in this
direction.
limit_area : {`None`, 'inside', 'outside'}, default None
If limit is specified, consecutive NaNs will be filled with this
restriction.

* ``None``: No fill restriction.
* 'inside': Only fill NaNs surrounded by valid values
(interpolate).
* 'outside': Only fill NaNs outside valid values (extrapolate).

.. versionadded:: 0.21.0
inplace : bool, default False
Update the NDFrame in place if possible.

downcast : optional, 'infer' or None, defaults to None
Downcast dtypes if possible.
kwargs : keyword arguments to pass on to the interpolating function.
**kwargs
Keyword arguments to pass on to the interpolating function.

Returns
-------
Series or DataFrame of same shape interpolated at the NaNs
Series or DataFrame
Returns the same object type as the caller, interpolated at
some or all ``NaN`` values.

See Also
--------
reindex, replace, fillna
fillna : Fill missing values using different methods.
scipy.interpolate.Akima1DInterpolator : Piecewise cubic polynomials
(Akima interpolator).
scipy.interpolate.BPoly.from_derivatives : Piecewise polynomial in the
Bernstein basis.
scipy.interpolate.interp1d : Interpolate a 1-D function.
scipy.interpolate.KroghInterpolator : Interpolate polynomial (Krogh
interpolator).
scipy.interpolate.PchipInterpolator : PCHIP 1-d monotonic cubic
interpolation.
scipy.interpolate.CubicSpline : Cubic spline data interpolator.

As this method is referenced here, I expected it to be available just like any of the other ones, but I have combed the source code and I am unable to find the place where this method is used. Was it added because it was planned to add future support for CubicSpline (with a wrapper such as Akima's)?
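
For context on the question above, here is a minimal sketch of how `scipy.interpolate.CubicSpline` could be applied directly to fill the gaps in a Series. It is not taken from the pandas source: the helper name `fill_with_cubic_spline` is hypothetical, the sketch assumes SciPy is installed, and it assumes a numeric, strictly increasing index.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import CubicSpline


def fill_with_cubic_spline(s):
    """Fill NaNs in a numeric Series with a cubic spline (hypothetical helper).

    Assumes a numeric, strictly increasing index; this is an illustration,
    not the way pandas dispatches its interpolation methods.
    """
    x = s.index.to_numpy(dtype=float)
    y = s.to_numpy(dtype=float)
    known = ~np.isnan(y)

    # Fit the spline on the known points only.
    spline = CubicSpline(x[known], y[known])

    # Evaluate the spline at the positions that were missing.
    filled = y.copy()
    filled[~known] = spline(x[~known])
    return pd.Series(filled, index=s.index, name=s.name)


s = pd.Series([0.0, 1.0, np.nan, 9.0, 16.0])
result = fill_with_cubic_spline(s)
print(result)
```

Like the SciPy-backed methods described in the docstring, the sketch uses the numerical values of the index as the x-coordinates.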


Notes
-----
The 'krogh', 'piecewise_polynomial', 'spline', 'pchip' and 'akima'
methods are wrappers around the respective SciPy implementations of
similar names. These use the actual numerical values of the index.
For more information on their behavior, see the
`SciPy documentation
<http://docs.scipy.org/doc/scipy/reference/interpolate.html#univariate-interpolation>`__
and `SciPy tutorial
<http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html>`__.

Examples
--------

Filling in NaNs
Filling in ``NaN`` in a :class:`~pandas.Series` via linear
interpolation.

>>> s = pd.Series([0, 1, np.nan, 3])
>>> s
0 0.0
1 1.0
2 NaN
3 3.0
dtype: float64
>>> s.interpolate()
0 0
1 1
2 2
3 3
0 0.0
1 1.0
2 2.0
3 3.0
dtype: float64

Filling in ``NaN`` in a Series by padding, but filling at most two
consecutive ``NaN`` at a time.

>>> s = pd.Series([np.nan, "single_one", np.nan,
... "fill_two_more", np.nan, np.nan, np.nan,
... 4.71, np.nan])
>>> s
0 NaN
1 single_one
2 NaN
3 fill_two_more
4 NaN
5 NaN
6 NaN
7 4.71
8 NaN
dtype: object
>>> s.interpolate(method='pad', limit=2)
0 NaN
1 single_one
2 single_one
3 fill_two_more
4 fill_two_more
5 fill_two_more
6 NaN
7 4.71
8 4.71
dtype: object

Filling in ``NaN`` in a Series via polynomial interpolation or splines:
Both 'polynomial' and 'spline' methods require that you also specify
an ``order`` (int).

>>> s = pd.Series([0, 2, np.nan, 8])
>>> s.interpolate(method='polynomial', order=2)
0 0.000000
1 2.000000
2 4.666667
3 8.000000
dtype: float64

Fill the DataFrame forward (that is, going down) along each column
using linear interpolation.

Note how the last entry in column 'a' is interpolated differently,
because there is no entry after it to use for interpolation.
Note how the first entry in column 'b' remains ``NaN``, because there
is no entry before it to use for interpolation.

>>> df = pd.DataFrame([(0.0, np.nan, -1.0, 1.0),
... (np.nan, 2.0, np.nan, np.nan),
... (2.0, 3.0, np.nan, 9.0),
... (np.nan, 4.0, -4.0, 16.0)],
... columns=list('abcd'))
>>> df
a b c d
0 0.0 NaN -1.0 1.0
1 NaN 2.0 NaN NaN
2 2.0 3.0 NaN 9.0
3 NaN 4.0 -4.0 16.0
>>> df.interpolate(method='linear', limit_direction='forward', axis=0)
a b c d
0 0.0 NaN -1.0 1.0
1 1.0 2.0 -2.0 5.0
2 2.0 3.0 -3.0 9.0
3 2.0 4.0 -4.0 16.0

Using polynomial interpolation.

>>> df['d'].interpolate(method='polynomial', order=2)
0 1.0
1 4.0
2 9.0
3 16.0
Name: d, dtype: float64
"""

@Appender(_shared_docs['interpolate'] % _shared_doc_kwargs)
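
The docstring describes `limit`, `limit_direction` and `limit_area`, but its examples only use `limit` together with `method='pad'`. The following sketch shows how these three parameters interact under the default linear method. The sample data is made up for illustration, and `limit_area` requires a pandas version that supports it.

```python
import numpy as np
import pandas as pd

# Three consecutive NaNs between valid values, plus one trailing NaN.
s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0, np.nan])

# Fill at most one NaN after each valid value (default direction: forward).
forward = s.interpolate(limit=1)

# Fill at most one NaN before each valid value.
backward = s.interpolate(limit=1, limit_direction='backward')

# Fill only NaNs that are surrounded by valid values; the trailing NaN stays.
inside = s.interpolate(limit_area='inside')

print(pd.DataFrame({'forward': forward, 'backward': backward, 'inside': inside}))
```

With `limit=1`, only the NaN nearest to a valid value in the chosen direction is filled; the rest of the gap is left as ``NaN``.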