DOC: update the pandas.Series/DataFrame.interpolate docstring #20270
Changes from 12 commits
b39289f
0f5f666
5358af9
6cee318
ba8cfd2
272d5e2
d64f5f3
0734c3b
1eab0a8
3ca95ec
b123846
3af4306
3f00e93
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6081,89 +6081,191 @@ def replace(self, to_replace=None, value=None, inplace=False, limit=None, | |
|
||
_shared_docs['interpolate'] = """ | ||
Please note that only ``method='linear'`` is supported for | ||
DataFrames/Series with a MultiIndex. | ||
DataFrame/Series with a MultiIndex. | ||
|
||
Parameters | ||
---------- | ||
method : {'linear', 'time', 'index', 'values', 'nearest', 'zero', | ||
'slinear', 'quadratic', 'cubic', 'barycentric', 'krogh', | ||
'polynomial', 'spline', 'piecewise_polynomial', | ||
'from_derivatives', 'pchip', 'akima'} | ||
method : str, default 'linear' | ||
Interpolation technique to use. One of: | ||
|
||
* 'linear': ignore the index and treat the values as equally | ||
* 'linear': Ignore the index and treat the values as equally | ||
spaced. This is the only method supported on MultiIndexes. | ||
Review comment: I understand why you added these, but generally do not put punctuation at the end of bullet points. If you get an error as a result OK to ignore |
||
default | ||
* 'time': interpolation works on daily and higher resolution | ||
data to interpolate given length of interval | ||
* 'index', 'values': use the actual numerical values of the index | ||
* 'nearest', 'zero', 'slinear', 'quadratic', 'cubic', | ||
'barycentric', 'polynomial' is passed to | ||
:class:`scipy.interpolate.interp1d`. Both 'polynomial' and | ||
'spline' require that you also specify an `order` (int), | ||
* 'time': Works on daily and higher resolution data to interpolate | ||
given length of interval. | ||
* 'index', 'values': use the actual numerical values of the index. | ||
* 'pad': Fill in NaNs using existing values. | ||
* 'nearest', 'zero', 'slinear', 'quadratic', 'cubic', 'spline', | ||
'barycentric', 'polynomial': Passed to | ||
``scipy.interpolate.interp1d``. Both 'polynomial' and 'spline' | ||
Review comment: Should have
Review comment: I would do the same thing here you did for 'krogh' and move some of the implementation details down to the Notes section
Review comment: This should just be single backticks no? |
||
require that you also specify an `order` (int), | ||
e.g. ``df.interpolate(method='polynomial', order=4)``. | ||
These use the actual numerical values of the index. | ||
* 'krogh', 'piecewise_polynomial', 'spline', 'pchip' and 'akima' | ||
are all wrappers around the scipy interpolation methods of | ||
similar names. These use the actual numerical values of the | ||
index. For more information on their behavior, see the | ||
`scipy documentation | ||
<http://docs.scipy.org/doc/scipy/reference/interpolate.html#univariate-interpolation>`__ | ||
and `tutorial documentation | ||
<http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html>`__ | ||
* 'from_derivatives' refers to | ||
:meth:`scipy.interpolate.BPoly.from_derivatives` which | ||
These use the numerical values of the index. | ||
* 'krogh', 'piecewise_polynomial', 'spline', 'pchip', 'akima': | ||
Wrappers around the SciPy interpolation methods of similar | ||
names. See `Notes`. | ||
* 'from_derivatives': Refers to | ||
``scipy.interpolate.BPoly.from_derivatives`` which | ||
Review comment: Single backtick too? |
||
replaces 'piecewise_polynomial' interpolation method in | ||
scipy 0.18 | ||
scipy 0.18. | ||
|
||
.. versionadded:: 0.18.1 | ||
|
||
Added support for the 'akima' method. | ||
Added interpolate method 'from_derivatives' which replaces | ||
'piecewise_polynomial' in scipy 0.18; backwards-compatible with | ||
scipy < 0.18 | ||
|
||
axis : {0, 1}, default 0 | ||
* 0: fill column-by-column | ||
* 1: fill row-by-row | ||
limit : int, default None. | ||
Maximum number of consecutive NaNs to fill. Must be greater than 0. | ||
limit_direction : {'forward', 'backward', 'both'}, default 'forward' | ||
limit_area : {'inside', 'outside'}, default None | ||
* None: (default) no fill restriction | ||
* 'inside' Only fill NaNs surrounded by valid values (interpolate). | ||
* 'outside' Only fill NaNs outside valid values (extrapolate). | ||
'piecewise_polynomial' in SciPy 0.18; backwards-compatible with | ||
SciPy < 0.18 | ||
|
||
axis : {0 or 'index', 1 or 'columns', None}, default None | ||
Axis to interpolate along. | ||
limit : int, optional | ||
Maximum number of consecutive NaNs to fill. Must be greater than | ||
0. | ||
inplace : bool, default False | ||
Update the data in place if possible. | ||
limit_direction : {'forward', 'backward', 'both'}, default 'forward' | ||
If limit is specified, consecutive NaNs will be filled in this | ||
Review comment: Put back ticks around `NaN` |
||
direction. | ||
limit_area : {`None`, 'inside', 'outside'}, default None | ||
If limit is specified, consecutive NaNs will be filled with this | ||
restriction. | ||
|
||
* None: No fill restriction. | ||
Review comment: Maybe double backticks here to render literal value? |
||
* 'inside': Only fill NaNs surrounded by valid values | ||
(interpolate). | ||
* 'outside': Only fill NaNs outside valid values (extrapolate). | ||
Review comment: Would be good to add an example for 'outside' |
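Following up on the request above for an 'outside' example: a minimal sketch of how `limit_area` could be illustrated, assuming a pandas version that includes this parameter (0.21.0 or later per the docstring). The series `ser` is made-up sample data, not taken from the PR.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: leading, interior, and trailing NaN runs.
ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13, np.nan, np.nan])

# 'inside': only NaNs surrounded by valid values are filled (interpolation).
inside = ser.interpolate(limit_area='inside')

# 'outside': only NaNs before the first or after the last valid value are
# filled. With the default linear method they take the nearest valid value.
outside = ser.interpolate(limit_direction='both', limit_area='outside')

print(inside)
print(outside)
```

With `limit_area='outside'`, the interior gap (positions 3 to 5) is left untouched, which makes the contrast with 'inside' easy to see side by side.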
||
|
||
.. versionadded:: 0.21.0 | ||
inplace : bool, default False | ||
Update the NDFrame in place if possible. | ||
|
||
downcast : optional, 'infer' or None, defaults to None | ||
Downcast dtypes if possible. | ||
kwargs : keyword arguments to pass on to the interpolating function. | ||
**kwargs | ||
Keyword arguments to pass on to the interpolating function. | ||
|
||
Returns | ||
------- | ||
Series or DataFrame of same shape interpolated at the NaNs | ||
Series or DataFrame | ||
Returns the same object type as the caller, interpolated at | ||
some or all `NaN` values | ||
Review comment: Double ticks |
||
|
||
See Also | ||
-------- | ||
reindex, replace, fillna | ||
fillna : Fill missing values using different methods. | ||
scipy.interpolate.Akima1DInterpolator : Piecewise cubic polynomials | ||
(Akima interpolator). | ||
scipy.interpolate.BPoly.from_derivatives : Piecewise polynomial in the | ||
Bernstein basis. | ||
scipy.interpolate.interp1d : Interpolate a 1-D function. | ||
scipy.interpolate.KroghInterpolator : Interpolate polynomial (Krogh | ||
interpolator). | ||
scipy.interpolate.PchipInterpolator : PCHIP 1-d monotonic cubic | ||
interpolation. | ||
scipy.interpolate.CubicSpline : Cubic spline data interpolator. | ||
Review comment: As this method is referenced here, I expected it to be available just like any of the other ones, but I have combed the source code and I am unable to find the place where this method is used. Was it added because it was planned to add future support for CubicSpline (with a wrapper such as Akima's)? |
||
|
||
Notes | ||
----- | ||
The 'krogh', 'piecewise_polynomial', 'spline', 'pchip' and 'akima' | ||
methods are wrappers around the respective SciPy implementations of | ||
similar names. These use the actual numerical values of the index. | ||
For more information on their behavior, see the | ||
`SciPy documentation | ||
<http://docs.scipy.org/doc/scipy/reference/interpolate.html#univariate-interpolation>`__ | ||
and `SciPy tutorial | ||
<http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html>`__. | ||
|
||
Examples | ||
-------- | ||
|
||
Filling in NaNs | ||
Filling in `NaN` in a :class:`~pandas.Series` via linear | ||
Review comment: Double |
||
interpolation. | ||
|
||
>>> s = pd.Series([0, 1, np.nan, 3]) | ||
>>> s | ||
0 0.0 | ||
1 1.0 | ||
2 NaN | ||
3 3.0 | ||
dtype: float64 | ||
>>> s.interpolate() | ||
0 0 | ||
1 1 | ||
2 2 | ||
3 3 | ||
0 0.0 | ||
1 1.0 | ||
2 2.0 | ||
3 3.0 | ||
dtype: float64 | ||
|
||
Filling in `NaN` in a Series by padding, but filling at most two | ||
Review comment: Double ticks (here and next line) |
||
consecutive `NaN` at a time. | ||
|
||
>>> s = pd.Series([np.nan, "single_one", np.nan, | ||
... "fill_two_more", np.nan, np.nan, np.nan, | ||
... 4.71, np.nan]) | ||
>>> s | ||
0 NaN | ||
1 single_one | ||
2 NaN | ||
3 fill_two_more | ||
4 NaN | ||
5 NaN | ||
6 NaN | ||
7 4.71 | ||
8 NaN | ||
dtype: object | ||
>>> s.interpolate(method='pad', limit=2) | ||
0 NaN | ||
1 single_one | ||
2 single_one | ||
3 fill_two_more | ||
4 fill_two_more | ||
5 fill_two_more | ||
6 NaN | ||
7 4.71 | ||
8 4.71 | ||
dtype: object | ||
|
||
Filling in `NaN` in a Series via polynomial interpolation or splines: | ||
Review comment: Double backtick |
||
Both `polynomial` and `spline` methods require that you also specify | ||
an `order` (int). | ||
|
||
>>> s = pd.Series([0, 2, np.nan, 8]) | ||
>>> s.interpolate(method='polynomial', order=2) | ||
0 0.000000 | ||
1 2.000000 | ||
2 4.666667 | ||
3 8.000000 | ||
dtype: float64 | ||
|
||
Fill the DataFrame forward (that is, going down) along each column | ||
using linear interpolation. | ||
|
||
Note how the last entry in column `a` is interpolated differently, | ||
because there is no entry after it to use for interpolation. | ||
Note how the first entry in column `b` remains `NaN`, because there | ||
Review comment: Double backticks |
||
is no entry before it to use for interpolation. | |
|
||
>>> df = pd.DataFrame([(0.0, np.nan, -1.0, 1.0), | ||
... (np.nan, 2.0, np.nan, np.nan), | ||
... (2.0, 3.0, np.nan, 9.0), | ||
... (np.nan, 4.0, -4.0, 16.0)], | ||
... columns=list('abcd')) | ||
>>> df | ||
a b c d | ||
0 0.0 NaN -1.0 1.0 | ||
1 NaN 2.0 NaN NaN | ||
2 2.0 3.0 NaN 9.0 | ||
3 NaN 4.0 -4.0 16.0 | ||
>>> df.interpolate(method='linear', limit_direction='forward', axis=0) | ||
a b c d | ||
0 0.0 NaN -1.0 1.0 | ||
1 1.0 2.0 -2.0 5.0 | ||
2 2.0 3.0 -3.0 9.0 | ||
3 2.0 4.0 -4.0 16.0 | ||
|
||
Using polynomial interpolation. | ||
|
||
>>> df['d'].interpolate(method='polynomial', order=2) | ||
0 1.0 | ||
1 4.0 | ||
2 9.0 | ||
3 16.0 | ||
Name: d, dtype: float64 | ||
""" | ||
|
||
@Appender(_shared_docs['interpolate'] % _shared_doc_kwargs) | ||
|
Review comment: Generally shouldn't need periods at the end of bullet points
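The docstring in this diff distinguishes 'linear' (which ignores the index and treats values as equally spaced) from 'index'/'values' (which use the numerical values of the index). A small sketch of that difference, using hypothetical data with an unevenly spaced index, might look like this:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the index positions 0, 1, 10 are unevenly spaced.
s = pd.Series([0.0, np.nan, 10.0], index=[0, 1, 10])

# 'linear' ignores the index, so the gap is filled with the midpoint.
linear = s.interpolate(method='linear')

# 'index' weights by the index values: 1 is one tenth of the way from 0 to 10.
by_index = s.interpolate(method='index')

print(linear)    # middle value is 5.0
print(by_index)  # middle value is 1.0
```

This is only an illustration of the documented behaviour, not a proposed addition to the docstring itself.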