Commit 1cecfdf

Guilherme Beltramini authored and TomAugspurger committed
DOC: update the pandas.core.resample.Resampler.backfill docstring (#20083)
* DOC: Resampler.backfill docstring
* Add another example and line break
* Add better description, returns description and reference
* Add citation
* DOC: Improve description, add example
1 parent 5d35f0f commit 1cecfdf

1 file changed: +88 -5 lines changed

pandas/core/resample.py

@@ -519,21 +519,104 @@ def nearest(self, limit=None):

     def backfill(self, limit=None):
         """
-        Backward fill the values
+        Backward fill the new missing values in the resampled data.
+
+        In statistics, imputation is the process of replacing missing data with
+        substituted values [1]_. When resampling data, missing values may
+        appear (e.g., when the resampling frequency is higher than the original
+        frequency). The backward fill will replace NaN values that appeared in
+        the resampled data with the next value in the original sequence.
+        Missing values that existed in the original data will not be modified.

         Parameters
         ----------
         limit : integer, optional
-            limit of how many values to fill
+            Limit of how many values to fill.

         Returns
         -------
-        an upsampled Series
+        Series, DataFrame
+            An upsampled Series or DataFrame with backward filled NaN values.

         See Also
         --------
-        Series.fillna
-        DataFrame.fillna
+        bfill : Alias of backfill.
+        fillna : Fill NaN values using the specified method, which can be
+            'backfill'.
+        nearest : Fill NaN values with nearest neighbor starting from center.
+        pad : Forward fill NaN values.
+        pandas.Series.fillna : Fill NaN values in the Series using the
+            specified method, which can be 'backfill'.
+        pandas.DataFrame.fillna : Fill NaN values in the DataFrame using the
+            specified method, which can be 'backfill'.
+
+        References
+        ----------
+        .. [1] https://en.wikipedia.org/wiki/Imputation_(statistics)
+
+        Examples
+        --------
+
+        Resampling a Series:
+
+        >>> s = pd.Series([1, 2, 3],
+        ...               index=pd.date_range('20180101', periods=3, freq='h'))
+        >>> s
+        2018-01-01 00:00:00    1
+        2018-01-01 01:00:00    2
+        2018-01-01 02:00:00    3
+        Freq: H, dtype: int64
+
+        >>> s.resample('30min').backfill()
+        2018-01-01 00:00:00    1
+        2018-01-01 00:30:00    2
+        2018-01-01 01:00:00    2
+        2018-01-01 01:30:00    3
+        2018-01-01 02:00:00    3
+        Freq: 30T, dtype: int64
+
+        >>> s.resample('15min').backfill(limit=2)
+        2018-01-01 00:00:00    1.0
+        2018-01-01 00:15:00    NaN
+        2018-01-01 00:30:00    2.0
+        2018-01-01 00:45:00    2.0
+        2018-01-01 01:00:00    2.0
+        2018-01-01 01:15:00    NaN
+        2018-01-01 01:30:00    3.0
+        2018-01-01 01:45:00    3.0
+        2018-01-01 02:00:00    3.0
+        Freq: 15T, dtype: float64
+
+        Resampling a DataFrame that has missing values:
+
+        >>> df = pd.DataFrame({'a': [2, np.nan, 6], 'b': [1, 3, 5]},
+        ...                   index=pd.date_range('20180101', periods=3,
+        ...                                       freq='h'))
+        >>> df
+                               a  b
+        2018-01-01 00:00:00  2.0  1
+        2018-01-01 01:00:00  NaN  3
+        2018-01-01 02:00:00  6.0  5
+
+        >>> df.resample('30min').backfill()
+                               a  b
+        2018-01-01 00:00:00  2.0  1
+        2018-01-01 00:30:00  NaN  3
+        2018-01-01 01:00:00  NaN  3
+        2018-01-01 01:30:00  6.0  5
+        2018-01-01 02:00:00  6.0  5
+
+        >>> df.resample('15min').backfill(limit=2)
+                               a    b
+        2018-01-01 00:00:00  2.0  1.0
+        2018-01-01 00:15:00  NaN  NaN
+        2018-01-01 00:30:00  NaN  3.0
+        2018-01-01 00:45:00  NaN  3.0
+        2018-01-01 01:00:00  NaN  3.0
+        2018-01-01 01:15:00  NaN  NaN
+        2018-01-01 01:30:00  6.0  5.0
+        2018-01-01 01:45:00  6.0  5.0
+        2018-01-01 02:00:00  6.0  5.0
         """
         return self._upsample('backfill', limit=limit)
     bfill = backfill
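
The fill behavior this docstring describes can be checked with a short standalone script. The sketch below is not part of the commit. It reuses the docstring's example data but calls `bfill`, the alias assigned on the last line of the diff, on the assumption that a recent pandas release is installed where the `backfill` spelling on resamplers may no longer be available. The variable names (`filled`, `limited`, `df_filled`) are illustrative only.

```python
import numpy as np
import pandas as pd

# Hourly index taken from the docstring examples.
idx = pd.date_range("20180101", periods=3, freq="h")
s = pd.Series([1, 2, 3], index=idx)

# Upsampling to 30 minutes creates new slots; each new slot takes the
# next value from the original sequence.
filled = s.resample("30min").bfill()

# With limit=2, at most two consecutive new slots before each original
# value are filled; slots further away stay NaN.
limited = s.resample("15min").bfill(limit=2)

# A NaN that already exists in the original data is left as it is, and
# it is also the "next value" seen by the new 00:30 slot.
df = pd.DataFrame({"a": [2, np.nan, 6], "b": [1, 3, 5]}, index=idx)
df_filled = df.resample("30min").bfill()

print(filled)
print(limited)
print(df_filled)
```

Comparing the printed frames with the docstring output is a quick way to confirm that the NaN in column `a` at 01:00 survives the fill and propagates backward to 00:30.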
