Commit 5f244d8

mdeboc authored and jorisvandenbossche committed
DOC: update the pandas.DataFrame.apply docstring (#20202)
1 parent 6e88d3b commit 5f244d8

1 file changed

+77
-85
lines changed

pandas/core/frame.py

+77-85
Original file line number | Diff line number | Diff line change
@@ -5003,54 +5003,68 @@ def aggregate(self, func, axis=0, *args, **kwargs):
50035003

50045004
def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,
50055005
result_type=None, args=(), **kwds):
5006-
"""Applies function along an axis of the DataFrame.
5006+
"""
5007+
Apply a function along an axis of the DataFrame.
50075008
5008-
Objects passed to functions are Series objects having index
5009-
either the DataFrame's index (axis=0) or the columns (axis=1).
5010-
Final return type depends on the return type of the applied function,
5011-
or on the `result_type` argument.
5009+
Objects passed to the function are Series objects whose index is
5010+
either the DataFrame's index (``axis=0``) or the DataFrame's columns
5011+
(``axis=1``). By default (``result_type=None``), the final return type
5012+
is inferred from the return type of the applied function. Otherwise,
5013+
it depends on the `result_type` argument.
50125014
50135015
Parameters
50145016
----------
50155017
func : function
5016-
Function to apply to each column/row
5018+
Function to apply to each column or row.
50175019
axis : {0 or 'index', 1 or 'columns'}, default 0
5018-
* 0 or 'index': apply function to each column
5019-
* 1 or 'columns': apply function to each row
5020-
broadcast : boolean, optional
5021-
For aggregation functions, return object of same size with values
5022-
propagated
5020+
Axis along which the function is applied:
5021+
5022+
* 0 or 'index': apply function to each column.
5023+
* 1 or 'columns': apply function to each row.
5024+
broadcast : bool, optional
5025+
Only relevant for aggregation functions:
5026+
5027+
* ``False`` or ``None`` : returns a Series whose length is the
5028+
length of the index or the number of columns (based on the
5029+
`axis` parameter)
5030+
* ``True`` : results will be broadcast to the original shape
5031+
of the frame, the original index and columns will be retained.
50235032
50245033
.. deprecated:: 0.23.0
50255034
This argument will be removed in a future version, replaced
50265035
by result_type='broadcast'.
50275036
5028-
raw : boolean, default False
5029-
If False, convert each row or column into a Series. If raw=True the
5030-
passed function will receive ndarray objects instead. If you are
5031-
just applying a NumPy reduction function this will achieve much
5032-
better performance
5033-
reduce : boolean or None, default None
5037+
raw : bool, default False
5038+
* ``False`` : passes each row or column as a Series to the
5039+
function.
5040+
* ``True`` : the passed function will receive ndarray objects
5041+
instead.
5042+
If you are just applying a NumPy reduction function this will
5043+
achieve much better performance.
5044+
reduce : bool or None, default None
50345045
Try to apply reduction procedures. If the DataFrame is empty,
5035-
apply will use reduce to determine whether the result should be a
5036-
Series or a DataFrame. If reduce is None (the default), apply's
5037-
return value will be guessed by calling func an empty Series (note:
5038-
while guessing, exceptions raised by func will be ignored). If
5039-
reduce is True a Series will always be returned, and if False a
5040-
DataFrame will always be returned.
5046+
`apply` will use `reduce` to determine whether the result
5047+
should be a Series or a DataFrame. If ``reduce=None`` (the
5048+
default), `apply`'s return value will be guessed by calling
5049+
`func` on an empty Series
5050+
(note: while guessing, exceptions raised by `func` will be
5051+
ignored).
5052+
If ``reduce=True`` a Series will always be returned, and if
5053+
``reduce=False`` a DataFrame will always be returned.
50415054
50425055
.. deprecated:: 0.23.0
50435056
This argument will be removed in a future version, replaced
5044-
by result_type='reduce'.
5057+
by ``result_type='reduce'``.
50455058
5046-
result_type : {'expand', 'reduce', 'broadcast, None}
5047-
These only act when axis=1 {columns}:
5059+
result_type : {'expand', 'reduce', 'broadcast', None}, default None
5060+
These only act when ``axis=1`` (columns):
50485061
50495062
* 'expand' : list-like results will be turned into columns.
5050-
* 'reduce' : return a Series if possible rather than expanding
5051-
list-like results. This is the opposite to 'expand'.
5063+
* 'reduce' : returns a Series if possible rather than expanding
5064+
list-like results. This is the opposite of 'expand'.
50525065
* 'broadcast' : results will be broadcast to the original shape
5053-
of the frame, the original index & columns will be retained.
5066+
of the DataFrame, the original index and columns will be
5067+
retained.
50545068
50555069
The default behaviour (None) depends on the return value of the
50565070
applied function: list-like results will be returned as a Series
@@ -5060,61 +5074,56 @@ def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,
50605074
.. versionadded:: 0.23.0
50615075
50625076
args : tuple
5063-
Positional arguments to pass to function in addition to the
5064-
array/series
5065-
Additional keyword arguments will be passed as keywords to the function
5077+
Positional arguments to pass to `func` in addition to the
5078+
array/series.
5079+
**kwds
5080+
Additional keyword arguments to pass as keyword arguments to
5081+
`func`.
50665082
50675083
Notes
50685084
-----
5069-
In the current implementation apply calls func twice on the
5085+
In the current implementation apply calls `func` twice on the
50705086
first column/row to decide whether it can take a fast or slow
5071-
code path. This can lead to unexpected behavior if func has
5087+
code path. This can lead to unexpected behavior if `func` has
50725088
side-effects, as they will take effect twice for the first
50735089
column/row.
50745090
5075-
Examples
5091+
See also
50765092
--------
5093+
DataFrame.applymap: For elementwise operations
5094+
DataFrame.aggregate: only perform aggregating type operations
5095+
DataFrame.transform: only perform transforming type operations
50775096
5078-
We use this DataFrame to illustrate
5097+
Examples
5098+
--------
50795099
5080-
>>> df = pd.DataFrame(np.tile(np.arange(3), 6).reshape(6, -1) + 1,
5081-
... columns=['A', 'B', 'C'])
5100+
>>> df = pd.DataFrame([[4, 9],] * 3, columns=['A', 'B'])
50825101
>>> df
5083-
A B C
5084-
0 1 2 3
5085-
1 1 2 3
5086-
2 1 2 3
5087-
3 1 2 3
5088-
4 1 2 3
5089-
5 1 2 3
5102+
A B
5103+
0 4 9
5104+
1 4 9
5105+
2 4 9
50905106
50915107
Using a numpy universal function (in this case the same as
50925108
``np.sqrt(df)``):
50935109
50945110
>>> df.apply(np.sqrt)
5095-
A B C
5096-
0 1.0 1.414214 1.732051
5097-
1 1.0 1.414214 1.732051
5098-
2 1.0 1.414214 1.732051
5099-
3 1.0 1.414214 1.732051
5100-
4 1.0 1.414214 1.732051
5101-
5 1.0 1.414214 1.732051
5111+
A B
5112+
0 2.0 3.0
5113+
1 2.0 3.0
5114+
2 2.0 3.0
51025115
51035116
Using a reducing function on either axis
51045117
51055118
>>> df.apply(np.sum, axis=0)
5106-
A 6
5107-
B 12
5108-
C 18
5119+
A 12
5120+
B 27
51095121
dtype: int64
51105122
51115123
>>> df.apply(np.sum, axis=1)
5112-
0 6
5113-
1 6
5114-
2 6
5115-
3 6
5116-
4 6
5117-
5 6
5124+
0 13
5125+
1 13
5126+
2 13
51185127
dtype: int64
51195128
51205129
Returning a list-like will result in a Series
@@ -5123,9 +5132,7 @@ def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,
51235132
0 [1, 2]
51245133
1 [1, 2]
51255134
2 [1, 2]
5126-
3 [1, 2]
5127-
4 [1, 2]
5128-
5 [1, 2]
5135+
dtype: object
51295136
51305137
Passing result_type='expand' will expand list-like results
51315138
to columns of a DataFrame
@@ -5135,42 +5142,27 @@ def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,
51355142
0 1 2
51365143
1 1 2
51375144
2 1 2
5138-
3 1 2
5139-
4 1 2
5140-
5 1 2
51415145
51425146
Returning a Series inside the function is similar to passing
51435147
``result_type='expand'``. The resulting column names
51445148
will be the Series index.
51455149
5146-
>>> df.apply(lambda x: Series([1, 2], index=['foo', 'bar']), axis=1)
5150+
>>> df.apply(lambda x: pd.Series([1, 2], index=['foo', 'bar']), axis=1)
51475151
foo bar
51485152
0 1 2
51495153
1 1 2
51505154
2 1 2
5151-
3 1 2
5152-
4 1 2
5153-
5 1 2
51545155
51555156
Passing ``result_type='broadcast'`` will ensure the same shape
51565157
result, whether list-like or scalar is returned by the function,
51575158
and broadcast it along the axis. The resulting column names will
51585159
be the originals.
51595160
5160-
>>> df.apply(lambda x: [1, 2, 3], axis=1, result_type='broadcast')
5161-
A B C
5162-
0 1 2 3
5163-
1 1 2 3
5164-
2 1 2 3
5165-
3 1 2 3
5166-
4 1 2 3
5167-
5 1 2 3
5168-
5169-
See also
5170-
--------
5171-
DataFrame.applymap: For elementwise operations
5172-
DataFrame.aggregate: only perform aggregating type operations
5173-
DataFrame.transform: only perform transformating type operations
5161+
>>> df.apply(lambda x: [1, 2], axis=1, result_type='broadcast')
5162+
A B
5163+
0 1 2
5164+
1 1 2
5165+
2 1 2
51745166
51755167
Returns
51765168
-------
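
The updated docstring describes `raw`, `args` and `**kwds` but its examples do not exercise them. The following is a minimal sketch of how those parameters might be used, reusing the small frame from the docstring examples. The helper names `is_ndarray` and `scale_and_shift` are hypothetical and are not part of pandas or of this commit.

```python
import numpy as np
import pandas as pd

# Same small frame as in the updated docstring examples.
df = pd.DataFrame([[4, 9]] * 3, columns=["A", "B"])


def is_ndarray(values):
    # Hypothetical helper: report whether apply handed us a NumPy array.
    return isinstance(values, np.ndarray)


# raw=False (the default): each column is passed as a Series.
as_series = df.apply(is_ndarray)
# raw=True: each column is passed as an ndarray instead.
as_ndarray = df.apply(is_ndarray, raw=True)


def scale_and_shift(col, factor, shift=0):
    # Hypothetical helper: `factor` arrives via `args`,
    # `shift` arrives via the extra keyword arguments (**kwds).
    return col * factor + shift


scaled = df.apply(scale_and_shift, args=(2,), shift=1)
print(as_series.tolist(), as_ndarray.tolist())
print(scaled)
```

Returning a plain boolean keeps the sketch independent of how a given pandas version post-processes other return types under `raw=True`.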
