Commit 43928d4

adrian-stepien authored and jorisvandenbossche committed
DOC: Improved links between expanding and cum* (GH12651)
- [x] closes #12651
- [x] passes `git diff upstream/master | flake8 --diff`

Author: adrian-stepien <[email protected]>

Closes #14098 from adrian-stepien/doc/12651 and squashes the following commits:

4427e28 [adrian-stepien] DOC: Improved links between expanding and cum* (#12651)
8466669 [adrian-stepien] DOC: Improved links between expanding and cum* (#12651)
30164f3 [adrian-stepien] DOC: Correct link from b/ffill to fillna
1 parent 510dd67 commit 43928d4
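
The behaviour this commit documents can be reproduced with a short sketch. It assumes a recent pandas and NumPy are installed; the variable names are illustrative, and `ffill()` is used here as an assumed equivalent of the `fillna(method='ffill')` call that appears in the diff.

```python
import numpy as np
import pandas as pd

sn = pd.Series([1, 2, np.nan, 3, np.nan, 4])

# expanding().sum() skips NaN inside the growing window, so a position gets
# a value once at least min_periods non-null observations have been seen.
expanding_sum = sn.expanding().sum()

# cumsum() also skips NaN while accumulating, but it keeps NaN in the output
# at every position where the input was NaN.
cumulative_sum = sn.cumsum()

# Forward-filling the cumsum output carries the last running total across
# the gaps, which lines it up with the expanding result for this series.
filled = cumulative_sum.ffill()

print(expanding_sum)
print(cumulative_sum)
print(filled)
```

The only difference between the first two results is at the positions where the input holds `NaN`, which is the distinction the new cross-references point readers to.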

3 files changed, +52 -17 lines changed

doc/source/basics.rst

+3 -1
@@ -486,7 +486,9 @@ standard deviation 1), very concisely:
    xs_stand.std(1)
 
 Note that methods like :meth:`~DataFrame.cumsum` and :meth:`~DataFrame.cumprod`
-preserve the location of NA values:
+preserve the location of ``NaN`` values. This is somewhat different from
+:meth:`~DataFrame.expanding` and :meth:`~DataFrame.rolling`.
+For more details please see :ref:`this note <stats.moments.expanding.note>`.
 
 .. ipython:: python
 
doc/source/computation.rst

+25 -6
@@ -691,6 +691,8 @@ Method Summary
     :meth:`~Expanding.cov`, Unbiased covariance (binary)
     :meth:`~Expanding.corr`, Correlation (binary)
 
+.. currentmodule:: pandas
+
 Aside from not having a ``window`` parameter, these functions have the same
 interfaces as their ``.rolling`` counterparts. Like above, the parameters they
 all accept are:
@@ -700,18 +702,34 @@ all accept are:
   ``min_periods`` non-null data points have been seen.
 - ``center``: boolean, whether to set the labels at the center (default is False)
 
+.. _stats.moments.expanding.note:
 .. note::
 
    The output of the ``.rolling`` and ``.expanding`` methods do not return a
    ``NaN`` if there are at least ``min_periods`` non-null values in the current
-   window. This differs from ``cumsum``, ``cumprod``, ``cummax``, and
-   ``cummin``, which return ``NaN`` in the output wherever a ``NaN`` is
-   encountered in the input.
+   window. This differs from :meth:`~DataFrame.cumsum`,
+   :meth:`~DataFrame.cumprod`, :meth:`~DataFrame.cummax`,
+   and :meth:`~DataFrame.cummin`, which return ``NaN`` in the output wherever
+   a ``NaN`` is encountered in the input.
+
+   Please see the example below. In order to match the output of ``cumsum``
+   with ``expanding``, use :meth:`~DataFrame.fillna`.
+
+   .. ipython:: python
+
+      sn = pd.Series([1,2,np.nan,3,np.nan,4])
+
+      sn.expanding().sum()
+
+      sn.cumsum()
+
+      sn.cumsum().fillna(method='ffill')
+
 
 An expanding window statistic will be more stable (and less responsive) than
 its rolling window counterpart as the increasing window size decreases the
 relative impact of an individual data point. As an example, here is the
-:meth:`~Expanding.mean` output for the previous time series dataset:
+:meth:`~core.window.Expanding.mean` output for the previous time series dataset:
 
 .. ipython:: python
    :suppress:
@@ -731,13 +749,14 @@ relative impact of an individual data point. As an example, here is the
 Exponentially Weighted Windows
 ------------------------------
 
+.. currentmodule:: pandas.core.window
+
 A related set of functions are exponentially weighted versions of several of
 the above statistics. A similar interface to ``.rolling`` and ``.expanding`` is accessed
-thru the ``.ewm`` method to receive an :class:`~pandas.core.window.EWM` object.
+through the ``.ewm`` method to receive an :class:`~EWM` object.
 A number of expanding EW (exponentially weighted)
 methods are provided:
 
-.. currentmodule:: pandas.core.window
 
 .. csv-table::
     :header: "Function", "Description"

pandas/core/generic.py

+24 -10
@@ -3354,12 +3354,16 @@ def fillna(self, value=None, method=None, axis=None, inplace=False,
             return self._constructor(new_data).__finalize__(self)
 
     def ffill(self, axis=None, inplace=False, limit=None, downcast=None):
-        """Synonym for NDFrame.fillna(method='ffill')"""
+        """
+        Synonym for :meth:`DataFrame.fillna(method='ffill') <DataFrame.fillna>`
+        """
         return self.fillna(method='ffill', axis=axis, inplace=inplace,
                            limit=limit, downcast=downcast)
 
     def bfill(self, axis=None, inplace=False, limit=None, downcast=None):
-        """Synonym for NDFrame.fillna(method='bfill')"""
+        """
+        Synonym for :meth:`DataFrame.fillna(method='bfill') <DataFrame.fillna>`
+        """
         return self.fillna(method='bfill', axis=axis, inplace=inplace,
                            limit=limit, downcast=downcast)
 
@@ -5477,16 +5481,18 @@ def compound(self, axis=None, skipna=None, level=None):
 
         cls.cummin = _make_cum_function(
             cls, 'cummin', name, name2, axis_descr, "cumulative minimum",
-            lambda y, axis: np.minimum.accumulate(y, axis), np.inf, np.nan)
+            lambda y, axis: np.minimum.accumulate(y, axis), "min",
+            np.inf, np.nan)
         cls.cumsum = _make_cum_function(
             cls, 'cumsum', name, name2, axis_descr, "cumulative sum",
-            lambda y, axis: y.cumsum(axis), 0., np.nan)
+            lambda y, axis: y.cumsum(axis), "sum", 0., np.nan)
         cls.cumprod = _make_cum_function(
             cls, 'cumprod', name, name2, axis_descr, "cumulative product",
-            lambda y, axis: y.cumprod(axis), 1., np.nan)
+            lambda y, axis: y.cumprod(axis), "prod", 1., np.nan)
         cls.cummax = _make_cum_function(
             cls, 'cummax', name, name2, axis_descr, "cumulative max",
-            lambda y, axis: np.maximum.accumulate(y, axis), -np.inf, np.nan)
+            lambda y, axis: np.maximum.accumulate(y, axis), "max",
+            -np.inf, np.nan)
 
         cls.sum = _make_stat_function(
             cls, 'sum', name, name2, axis_descr,
@@ -5674,7 +5680,15 @@ def _doc_parms(cls):
 
 Returns
 -------
-%(outname)s : %(name1)s\n"""
+%(outname)s : %(name1)s\n
+
+
+See also
+--------
+pandas.core.window.Expanding.%(accum_func_name)s : Similar functionality
+    but ignores ``NaN`` values.
+
+"""
 
 
 def _make_stat_function(cls, name, name1, name2, axis_descr, desc, f):
@@ -5717,10 +5731,10 @@ def stat_func(self, axis=None, skipna=None, level=None, ddof=1,
     return set_function_name(stat_func, name, cls)
 
 
-def _make_cum_function(cls, name, name1, name2, axis_descr, desc, accum_func,
-                       mask_a, mask_b):
+def _make_cum_function(cls, name, name1, name2, axis_descr, desc,
+                       accum_func, accum_func_name, mask_a, mask_b):
     @Substitution(outname=name, desc=desc, name1=name1, name2=name2,
-                  axis_descr=axis_descr)
+                  axis_descr=axis_descr, accum_func_name=accum_func_name)
     @Appender("Return {0} over requested axis.".format(desc) +
               _cnum_doc)
     def cum_func(self, axis=None, skipna=True, *args, **kwargs):
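
The way the new `accum_func_name` argument reaches the "See also" entry can be illustrated without pandas internals. The sketch below is an assumption-laden stand-in: it uses plain `%`-style formatting instead of the `Substitution` and `Appender` decorators, the names `_cum_doc_template` and `make_cum_doc` are hypothetical, the template is abbreviated from `_cnum_doc`, and `"Series"` is a placeholder for `name1`.

```python
# Abbreviated stand-in for the _cnum_doc template; not the pandas source.
_cum_doc_template = """Return %(desc)s over requested axis.

Returns
-------
%(outname)s : %(name1)s

See also
--------
pandas.core.window.Expanding.%(accum_func_name)s : Similar functionality
    but ignores ``NaN`` values.
"""


def make_cum_doc(outname, name1, desc, accum_func_name):
    """Fill the template the way a per-method substitution would (hypothetical helper)."""
    return _cum_doc_template % {
        "outname": outname,
        "name1": name1,
        "desc": desc,
        "accum_func_name": accum_func_name,
    }


# One docstring per cumulative method, using the short names passed in the diff.
docs = {
    "cummin": make_cum_doc("cummin", "Series", "cumulative minimum", "min"),
    "cumsum": make_cum_doc("cumsum", "Series", "cumulative sum", "sum"),
    "cumprod": make_cum_doc("cumprod", "Series", "cumulative product", "prod"),
    "cummax": make_cum_doc("cummax", "Series", "cumulative max", "max"),
}

print(docs["cumsum"])
```

Passing the short name separately is what lets a single shared template link `cumsum` to `Expanding.sum`, `cummin` to `Expanding.min`, and so on.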
