Commit 946de44

ajdykavictor authored and victor committed
DOC: update the GroupBy.apply docstring (pandas-dev#20098)
1 parent 7468768 commit 946de44

2 files changed, +54 -49 lines changed


pandas/core/groupby/groupby.py

+53 -48
@@ -55,38 +55,38 @@ class providing the base-class of operations.
 
 _apply_docs = dict(
 template="""
-Apply function ``func`` group-wise and combine the results together.
+Apply function `func` group-wise and combine the results together.
 
-The function passed to ``apply`` must take a {input} as its first
-argument and return a dataframe, a series or a scalar. ``apply`` will
+The function passed to `apply` must take a {input} as its first
+argument and return a DataFrame, Series or scalar. `apply` will
 then take care of combining the results back together into a single
-dataframe or series. ``apply`` is therefore a highly flexible
+dataframe or series. `apply` is therefore a highly flexible
 grouping method.
 
-While ``apply`` is a very flexible method, its downside is that
-using it can be quite a bit slower than using more specific methods.
-Pandas offers a wide range of method that will be much faster
-than using ``apply`` for their specific purposes, so try to use them
-before reaching for ``apply``.
+While `apply` is a very flexible method, its downside is that
+using it can be quite a bit slower than using more specific methods
+like `agg` or `transform`. Pandas offers a wide range of method that will
+be much faster than using `apply` for their specific purposes, so try to
+use them before reaching for `apply`.
 
 Parameters
 ----------
-func : function
+func : callable
 A callable that takes a {input} as its first argument, and
 returns a dataframe, a series or a scalar. In addition the
-callable may take positional and keyword arguments
+callable may take positional and keyword arguments.
 args, kwargs : tuple and dict
-Optional positional and keyword arguments to pass to ``func``
+Optional positional and keyword arguments to pass to `func`.
 
 Returns
 -------
 applied : Series or DataFrame
 
 Notes
 -----
-In the current implementation ``apply`` calls func twice on the
+In the current implementation `apply` calls `func` twice on the
 first group to decide whether it can take a fast or slow code
-path. This can lead to unexpected behavior if func has
+path. This can lead to unexpected behavior if `func` has
 side-effects, as they will take effect twice for the first
 group.
 
@@ -98,38 +98,43 @@ class providing the base-class of operations.
 --------
 pipe : Apply function to the full GroupBy object instead of to each
 group.
-aggregate, transform
+aggregate : Apply aggregate function to the GroupBy object.
+transform : Apply function column-by-column to the GroupBy object.
+Series.apply : Apply a function to a Series.
+DataFrame.apply : Apply a function to each row or column of a DataFrame.
 """,
 dataframe_examples="""
->>> df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1,2,3], 'C': [4,6, 5]})
+>>> df = pd.DataFrame({'A': 'a a b'.split(),
+'B': [1,2,3],
+'C': [4,6, 5]})
 >>> g = df.groupby('A')
 
-From ``df`` above we can see that ``g`` has two groups, ``a``, ``b``.
-Calling ``apply`` in various ways, we can get different grouping results:
+Notice that ``g`` has two groups, ``a`` and ``b``.
+Calling `apply` in various ways, we can get different grouping results:
 
-Example 1: below the function passed to ``apply`` takes a dataframe as
-its argument and returns a dataframe. ``apply`` combines the result for
-each group together into a new dataframe:
+Example 1: below the function passed to `apply` takes a DataFrame as
+its argument and returns a DataFrame. `apply` combines the result for
+each group together into a new DataFrame:
 
->>> g.apply(lambda x: x / x.sum())
+>>> g[['B', 'C']].apply(lambda x: x / x.sum())
 B C
 0 0.333333 0.4
 1 0.666667 0.6
 2 1.000000 1.0
 
-Example 2: The function passed to ``apply`` takes a dataframe as
-its argument and returns a series. ``apply`` combines the result for
-each group together into a new dataframe:
+Example 2: The function passed to `apply` takes a DataFrame as
+its argument and returns a Series. `apply` combines the result for
+each group together into a new DataFrame:
 
->>> g.apply(lambda x: x.max() - x.min())
+>>> g[['B', 'C']].apply(lambda x: x.max() - x.min())
 B C
 A
 a 1 2
 b 0 0
 
-Example 3: The function passed to ``apply`` takes a dataframe as
-its argument and returns a scalar. ``apply`` combines the result for
-each group together into a series, including setting the index as
+Example 3: The function passed to `apply` takes a DataFrame as
+its argument and returns a scalar. `apply` combines the result for
+each group together into a Series, including setting the index as
 appropriate:
 
 >>> g.apply(lambda x: x.C.max() - x.B.min())
@@ -139,25 +144,25 @@ class providing the base-class of operations.
 dtype: int64
 """,
 series_examples="""
->>> ser = pd.Series([0, 1, 2], index='a a b'.split())
->>> g = ser.groupby(ser.index)
+>>> s = pd.Series([0, 1, 2], index='a a b'.split())
+>>> g = s.groupby(s.index)
 
-From ``ser`` above we can see that ``g`` has two groups, ``a``, ``b``.
-Calling ``apply`` in various ways, we can get different grouping results:
+From ``s`` above we can see that ``g`` has two groups, ``a`` and ``b``.
+Calling `apply` in various ways, we can get different grouping results:
 
-Example 1: The function passed to ``apply`` takes a series as
-its argument and returns a series. ``apply`` combines the result for
-each group together into a new series:
+Example 1: The function passed to `apply` takes a Series as
+its argument and returns a Series. `apply` combines the result for
+each group together into a new Series:
 
 >>> g.apply(lambda x: x*2 if x.name == 'b' else x/2)
 0 0.0
 1 0.5
 2 4.0
 dtype: float64
 
-Example 2: The function passed to ``apply`` takes a series as
-its argument and returns a scalar. ``apply`` combines the result for
-each group together into a series, including setting the index as
+Example 2: The function passed to `apply` takes a Series as
+its argument and returns a scalar. `apply` combines the result for
+each group together into a Series, including setting the index as
 appropriate:
 
 >>> g.apply(lambda x: x.max() - x.min())
@@ -167,12 +172,12 @@ class providing the base-class of operations.
 """)
 
 _pipe_template = """\
-Apply a function ``func`` with arguments to this %(klass)s object and return
+Apply a function `func` with arguments to this %(klass)s object and return
 the function's result.
 
 %(versionadded)s
 
-Use ``.pipe`` when you want to improve readability by chaining together
+Use `.pipe` when you want to improve readability by chaining together
 functions that expect Series, DataFrames, GroupBy or Resampler objects.
 Instead of writing
 
@@ -191,17 +196,17 @@ class providing the base-class of operations.
 ----------
 func : callable or tuple of (callable, string)
 Function to apply to this %(klass)s object or, alternatively,
-a ``(callable, data_keyword)`` tuple where ``data_keyword`` is a
-string indicating the keyword of ``callable`` that expects the
+a `(callable, data_keyword)` tuple where `data_keyword` is a
+string indicating the keyword of `callable` that expects the
 %(klass)s object.
 args : iterable, optional
-positional arguments passed into ``func``.
+positional arguments passed into `func`.
 kwargs : dict, optional
-a dictionary of keyword arguments passed into ``func``.
+a dictionary of keyword arguments passed into `func`.
 
 Returns
 -------
-object : the return type of ``func``.
+object : the return type of `func`.
 
 Notes
 -----
@@ -1442,7 +1447,7 @@ def nth(self, n, dropna=None):
 2 3.0
 2 5.0
 
-Specifying ``dropna`` allows count ignoring NaN
+Specifying `dropna` allows count ignoring ``NaN``
 
 >>> g.nth(0, dropna='any')
 B
@@ -1458,7 +1463,7 @@ def nth(self, n, dropna=None):
 1 NaN
 2 NaN
 
-Specifying ``as_index=False`` in ``groupby`` keeps the original index.
+Specifying `as_index=False` in `groupby` keeps the original index.
 
 >>> df.groupby('A', as_index=False).nth(1)
 A B
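
The docstring examples touched by this diff can be tried with a short script. The following is a minimal sketch, not part of the commit: it assumes a reasonably recent pandas release, and the variable names (`spread_apply`, `spread_builtin`, `c_minus_b`) are illustrative only. It reproduces DataFrame Example 2, shows a built-in equivalent in the spirit of the docstring's advice to prefer more specific methods over `apply`, and runs a variant of Example 3 on the selected columns.

```python
import pandas as pd

# Same data as the docstring example in the diff.
df = pd.DataFrame({"A": ["a", "a", "b"], "B": [1, 2, 3], "C": [4, 6, 5]})

# Select the value columns explicitly so the grouping column "A"
# is not passed to the applied function.
g = df.groupby("A")[["B", "C"]]

# func returns a Series per group, so apply combines the pieces
# into a DataFrame indexed by the group keys.
spread_apply = g.apply(lambda x: x.max() - x.min())

# The same numbers from built-in reductions, without calling a
# Python function once per group.
spread_builtin = g.max() - g.min()

# func returns a scalar per group, so apply combines the pieces
# into a Series indexed by the group keys.
c_minus_b = g.apply(lambda x: x["C"].max() - x["B"].min())

print(spread_apply)
print(spread_builtin)
print(c_minus_b)
```

On small inputs like this the two approaches are interchangeable; any speed difference only matters with many groups.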

pandas/core/groupby/grouper.py

+1 -1
@@ -59,7 +59,7 @@ class Grouper(object):
 sort : boolean, default to False
 whether to sort the resulting labels
 
-additional kwargs to control time-like groupers (when ``freq`` is passed)
+additional kwargs to control time-like groupers (when `freq` is passed)
 
 closed : closed end of interval; 'left' or 'right'
 label : interval boundary to use for labeling; 'left' or 'right'
