Skip to content

DOC: update the pandas.core.resample.Resampler.backfill docstring #20083

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 12, 2018
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
93 changes: 88 additions & 5 deletions pandas/core/resample.py
Original file line number Diff line number Diff line change
Expand Up @@ -519,21 +519,104 @@ def nearest(self, limit=None):

def backfill(self, limit=None):
"""
Backward fill the values
Backward fill the new missing values in the resampled data.

In statistics, imputation is the process of replacing missing data with
substituted values [1]_. When resampling data, missing values may
appear (e.g., when the resampling frequency is higher than the original
frequency). The backward fill will replace NaN values that appeared in
the resampled data with the next value in the original sequence.
Missing values that existed in the orginal data will not be modified.

Parameters
----------
limit : integer, optional
limit of how many values to fill
Limit of how many values to fill.

Returns
-------
an upsampled Series
Series, DataFrame
An upsampled Series or DataFrame with backward filled NaN values.

See Also
--------
Series.fillna
DataFrame.fillna
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a Resampler.pad, nearest, and fillna refs @jorisvandenbossche how do we reference these exactly here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be you can simply refer to them as 'pad', 'nearest', .. because they live on the same class. But need to check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I run python make.py --single pandas.core.resample.Resampler.backfill, the links are never created. But if I run python make.py html, the links are created correctly if I use: pandas.Series.fillna (Series.fillna and pandas.core.series.Series.fillna don't work) and fillna or Resampler.fillna.

bfill : Alias of backfill.
fillna : Fill NaN values using the specified method, which can be
'backfill'.
nearest : Fill NaN values with nearest neighbor starting from center.
pad : Forward fill NaN values.
pandas.Series.fillna : Fill NaN values in the Series using the
specified method, which can be 'backfill'.
pandas.DataFrame.fillna : Fill NaN values in the DataFrame using the
specified method, which can be 'backfill'.

References
----------
.. [1] https://en.wikipedia.org/wiki/Imputation_(statistics)

Examples
--------

Resampling a Series:

>>> s = pd.Series([1, 2, 3],
... index=pd.date_range('20180101', periods=3, freq='h'))
>>> s
2018-01-01 00:00:00 1
2018-01-01 01:00:00 2
2018-01-01 02:00:00 3
Freq: H, dtype: int64

>>> s.resample('30min').backfill()
2018-01-01 00:00:00 1
2018-01-01 00:30:00 2
2018-01-01 01:00:00 2
2018-01-01 01:30:00 3
2018-01-01 02:00:00 3
Freq: 30T, dtype: int64

>>> s.resample('15min').backfill(limit=2)
2018-01-01 00:00:00 1.0
2018-01-01 00:15:00 NaN
2018-01-01 00:30:00 2.0
2018-01-01 00:45:00 2.0
2018-01-01 01:00:00 2.0
2018-01-01 01:15:00 NaN
2018-01-01 01:30:00 3.0
2018-01-01 01:45:00 3.0
2018-01-01 02:00:00 3.0
Freq: 15T, dtype: float64

Resampling a DataFrame that has missing values:

>>> df = pd.DataFrame({'a': [2, np.nan, 6], 'b': [1, 3, 5]},
... index=pd.date_range('20180101', periods=3,
... freq='h'))
>>> df
a b
2018-01-01 00:00:00 2.0 1
2018-01-01 01:00:00 NaN 3
2018-01-01 02:00:00 6.0 5

>>> df.resample('30min').backfill()
a b
2018-01-01 00:00:00 2.0 1
2018-01-01 00:30:00 NaN 3
2018-01-01 01:00:00 NaN 3
2018-01-01 01:30:00 6.0 5
2018-01-01 02:00:00 6.0 5

>>> df.resample('15min').backfill(limit=2)
a b
2018-01-01 00:00:00 2.0 1.0
2018-01-01 00:15:00 NaN NaN
2018-01-01 00:30:00 NaN 3.0
2018-01-01 00:45:00 NaN 3.0
2018-01-01 01:00:00 NaN 3.0
2018-01-01 01:15:00 NaN NaN
2018-01-01 01:30:00 6.0 5.0
2018-01-01 01:45:00 6.0 5.0
2018-01-01 02:00:00 6.0 5.0
"""
return self._upsample('backfill', limit=limit)
bfill = backfill
Expand Down