DOC: update the .agg doc-string with examples #16188

Merged
merged 2 commits on May 2, 2017

6 changes: 3 additions & 3 deletions doc/source/whatsnew/v0.20.0.txt
@@ -9,7 +9,7 @@ users upgrade to this version.

Highlights include:

- new ``.agg()`` API for Series/DataFrame similar to the groupby-rolling-resample API's, see :ref:`here <whatsnew_0200.enhancements.agg>`
- New ``.agg()`` API for Series/DataFrame similar to the groupby-rolling-resample API's, see :ref:`here <whatsnew_0200.enhancements.agg>`
- Integration with the ``feather-format``, including a new top-level ``pd.read_feather()`` and ``DataFrame.to_feather()`` method, see :ref:`here <io.feather>`.
- The ``.ix`` indexer has been deprecated, see :ref:`here <whatsnew_0200.api_breaking.deprecate_ix>`
- ``Panel`` has been deprecated, see :ref:`here <whatsnew_0200.api_breaking.deprecate_panel>`
@@ -45,8 +45,8 @@ New features
^^^^^^^^^^^

Series & DataFrame have been enhanced to support the aggregation API. This is an already familiar API that
is supported for groupby, window operations, and resampling. This allows one to express, possibly multiple,
aggregation operations in a single concise way by using :meth:`~DataFrame.agg`,
is supported for groupby, window operations, and resampling. This allows one to express aggregation operations
in a single concise way by using :meth:`~DataFrame.agg`,
and :meth:`~DataFrame.transform`. The full documentation is :ref:`here <basics.aggregate>` (:issue:`1623`).

Here is a sample
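The sample itself is collapsed in this diff view. Purely as an illustration of the API the paragraph above describes (an editorial sketch, not the PR's own sample), the new calls look roughly like this:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'A': [1.0, 2.0, 3.0], 'B': [10.0, 20.0, 30.0]})

    df.agg('sum')                              # single reducer: one value per column
    df.agg(['sum', 'min'])                     # list: one labelled row per aggregation
    df.agg({'A': ['sum', 'min'], 'B': 'max'})  # dict: different aggregations per column
    df.transform(lambda x: x - x.mean())       # transform preserves the original shape
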
47 changes: 10 additions & 37 deletions pandas/core/base.py
@@ -370,42 +370,6 @@ def _gotitem(self, key, ndim, subset=None):
"""
raise AbstractMethodError(self)

_agg_doc = """Aggregate using input function or dict of {column ->
function}

Parameters
----------
arg : function or dict
Function to use for aggregating groups. If a function, must either
work when passed a DataFrame or when passed to DataFrame.apply. If
passed a dict, the keys must be DataFrame column names.

Accepted Combinations are:
- string cythonized function name
- function
- list of functions
- dict of columns -> functions
- nested dict of names -> dicts of functions

Notes
-----
Numpy functions mean/median/prod/sum/std/var are special cased so the
default behavior is applying the function along axis=0
(e.g., np.mean(arr_2d, axis=0)) as opposed to
mimicking the default Numpy behavior (e.g., np.mean(arr_2d)).

Returns
-------
aggregated : DataFrame
"""

_see_also_template = """
See also
--------
pandas.Series.%(name)s
pandas.DataFrame.%(name)s
"""

def aggregate(self, func, *args, **kwargs):
raise AbstractMethodError(self)

@@ -1150,30 +1114,39 @@ def factorize(self, sort=False, na_sentinel=-1):

Examples
--------

>>> x = pd.Series([1, 2, 3])
>>> x
0 1
1 2
2 3
dtype: int64

>>> x.searchsorted(4)
array([3])

>>> x.searchsorted([0, 4])
array([0, 3])

>>> x.searchsorted([1, 3], side='left')
array([0, 2])

>>> x.searchsorted([1, 3], side='right')
array([1, 3])

>>> x = pd.Categorical(['apple', 'bread', 'bread', 'cheese', 'milk'])
>>> x
[apple, bread, bread, cheese, milk]
Categories (4, object): [apple < bread < cheese < milk]

>>> x.searchsorted('bread')
array([1]) # Note: an array, not a scalar

>>> x.searchsorted(['bread'])
array([1])

>>> x.searchsorted(['bread', 'eggs'])
array([1, 4])

>>> x.searchsorted(['bread', 'eggs'], side='right')
array([3, 4]) # eggs before milk
""")
41 changes: 39 additions & 2 deletions pandas/core/frame.py
@@ -18,6 +18,7 @@
import sys
import types
import warnings
from textwrap import dedent

from numpy import nan as NA
import numpy as np
@@ -4200,7 +4201,43 @@ def _gotitem(self, key, ndim, subset=None):
# TODO: _shallow_copy(subset)?
return self[key]

@Appender(_shared_docs['aggregate'] % _shared_doc_kwargs)
_agg_doc = dedent("""
Examples
--------

>>> df = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],
... index=pd.date_range('1/1/2000', periods=10))
>>> df.iloc[3:7] = np.nan

Aggregate these functions across all columns

>>> df.agg(['sum', 'min'])
A B C
sum -0.182253 -0.614014 -2.909534
min -1.916563 -1.460076 -1.568297

Different aggregations per column

>>> df.agg({'A' : ['sum', 'min'], 'B' : ['min', 'max']})
A B
max NaN 1.514318
min -1.916563 -1.460076
sum -0.182253 NaN

See also
--------
pandas.DataFrame.apply
pandas.DataFrame.transform
pandas.DataFrame.groupby.aggregate
pandas.DataFrame.resample.aggregate
pandas.DataFrame.rolling.aggregate

""")

@Appender(_agg_doc)
@Appender(_shared_docs['aggregate'] % dict(
versionadded='.. versionadded:: 0.20.0',
**_shared_doc_kwargs))
def aggregate(self, func, axis=0, *args, **kwargs):
axis = self._get_axis_number(axis)

@@ -4272,7 +4309,7 @@ def apply(self, func, axis=0, broadcast=False, raw=False, reduce=None,
See also
--------
DataFrame.applymap: For elementwise operations
DataFrame.agg: only perform aggregating type operations
DataFrame.aggregate: only perform aggregating type operations
DataFrame.transform: only perform transformating type operations

Returns
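Editorial note: the stacked @Appender decorators and the %-formatted _shared_docs template used above can be hard to read at a glance. The following is a rough, simplified sketch of that docstring-composition pattern; the helper name append_doc is hypothetical, and this is not the actual pandas.util._decorators implementation.

    _shared_docs = {}
    _shared_docs['aggregate'] = """
    Aggregate using callable, string, dict, or list of string/callables

    %(versionadded)s

    Returns
    -------
    aggregated : %(klass)s
    """

    def append_doc(addendum):
        # Return a decorator that appends `addendum` after the function's __doc__.
        def decorate(func):
            func.__doc__ = (func.__doc__ or '') + addendum
            return func
        return decorate

    _agg_examples = """
    Examples
    --------
    >>> df.agg(['sum', 'min'])
    """

    @append_doc(_agg_examples)                     # applied second: appends the Examples
    @append_doc(_shared_docs['aggregate'] % dict(  # applied first: fills in the template
        versionadded='.. versionadded:: 0.20.0', klass='DataFrame'))
    def aggregate(self, func, axis=0, *args, **kwargs):
        pass

Reading the decorators bottom-up, the final __doc__ is the filled-in shared template followed by the class-specific Examples, which matches the ordering the PR appears to intend for DataFrame.aggregate.
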
43 changes: 32 additions & 11 deletions pandas/core/generic.py
@@ -2854,19 +2854,19 @@ def pipe(self, func, *args, **kwargs):
return func(self, *args, **kwargs)

_shared_docs['aggregate'] = ("""
Aggregate using input function or dict of {column ->
function}
Aggregate using callable, string, dict, or list of string/callables

.. versionadded:: 0.20.0
%(versionadded)s

Parameters
----------
func : callable, string, dictionary, or list of string/callables
Function to use for aggregating the data. If a function, must either
work when passed a DataFrame or when passed to DataFrame.apply. If
passed a dict, the keys must be DataFrame column names.
work when passed a %(klass)s or when passed to %(klass)s.apply. For
a DataFrame, can pass a dict, if the keys are DataFrame column names.

Accepted Combinations are:

- string function name
- function
- list of functions
@@ -2879,12 +2879,11 @@ def pipe(self, func, *args, **kwargs):
(e.g., np.mean(arr_2d, axis=0)) as opposed to
mimicking the default Numpy behavior (e.g., np.mean(arr_2d)).

agg is an alias for aggregate. Use it.

Returns
-------
aggregated : %(klass)s

See also
--------
""")

_shared_docs['transform'] = ("""
@@ -2899,18 +2898,40 @@ def pipe(self, func, *args, **kwargs):
To apply to column

Accepted Combinations are:

- string function name
- function
- list of functions
- dict of column names -> functions (or list of functions)

Returns
-------
transformed : %(klass)s

Examples
--------
>>> df = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],
... index=pd.date_range('1/1/2000', periods=10))
>>> df.iloc[3:7] = np.nan

>>> df.transform(lambda x: (x - x.mean()) / x.std())
A B C
2000-01-01 0.579457 1.236184 0.123424
2000-01-02 0.370357 -0.605875 -1.231325
2000-01-03 1.455756 -0.277446 0.288967
2000-01-04 NaN NaN NaN
2000-01-05 NaN NaN NaN
2000-01-06 NaN NaN NaN
2000-01-07 NaN NaN NaN
2000-01-08 -0.498658 1.274522 1.642524
2000-01-09 -0.540524 -1.012676 -0.828968
2000-01-10 -1.366388 -0.614710 0.005378

See also
--------
pandas.%(klass)s.aggregate
pandas.%(klass)s.apply

Returns
-------
transformed : %(klass)s
""")

# ----------------------------------------------------------------------