Commit a10f2e0

jorisvandenbossche authored and harisbal committed
DOC: some clean-up of the apply docs (follow-up pandas-dev#18577) (pandas-dev#19573)
1 parent a44efdb commit a10f2e0

4 files changed, +52 -41 lines changed

doc/source/basics.rst

+9-7
@@ -774,9 +774,9 @@ We encourage you to view the source code of :meth:`~DataFrame.pipe`.
 Row or Column-wise Function Application
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Arbitrary functions can be applied along the axes of a DataFrame or Panel
+Arbitrary functions can be applied along the axes of a DataFrame
 using the :meth:`~DataFrame.apply` method, which, like the descriptive
-statistics methods, take an optional ``axis`` argument:
+statistics methods, takes an optional ``axis`` argument:

 .. ipython:: python

@@ -794,13 +794,15 @@ The :meth:`~DataFrame.apply` method will also dispatch on a string method name.
 df.apply('mean', axis=1)

 The return type of the function passed to :meth:`~DataFrame.apply` affects the
-type of the ultimate output from DataFrame.apply
+type of the final output from ``DataFrame.apply`` for the default behaviour:

-* If the applied function returns a ``Series``, the ultimate output is a ``DataFrame``.
+* If the applied function returns a ``Series``, the final output is a ``DataFrame``.
 The columns match the index of the ``Series`` returned by the applied function.
-* If the applied function returns any other type, the ultimate output is a ``Series``.
-* A ``result_type`` kwarg is accepted with the options: ``reduce``, ``broadcast``, and ``expand``.
-These will determine how list-likes return results expand (or not) to a ``DataFrame``.
+* If the applied function returns any other type, the final output is a ``Series``.
+
+This default behaviour can be overridden using the ``result_type``, which
+accepts three options: ``reduce``, ``broadcast``, and ``expand``.
+These will determine how list-likes return values expand (or not) to a ``DataFrame``.

 :meth:`~DataFrame.apply` combined with some cleverness can be used to answer many questions
 about a data set. For example, suppose we wanted to extract the date where the
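
The default rule in the updated ``basics.rst`` text can be checked with a small sketch. It assumes a recent pandas (0.23 or later) and reuses the 6x3 example frame from this commit. The lambdas and the ``'lo'``/``'hi'`` labels are illustrative only and are not part of the commit.

```python
import numpy as np
import pandas as pd

# Example frame used elsewhere in this commit: six identical rows 1, 2, 3.
df = pd.DataFrame(np.tile(np.arange(3), 6).reshape(6, -1) + 1,
                  columns=['A', 'B', 'C'])

# The applied function returns a scalar per row, so the final output
# is a Series indexed like the frame.
row_sums = df.apply(lambda row: row.sum(), axis=1)

# The applied function returns a Series per row, so the final output is
# a DataFrame whose columns are the index of the returned Series.
# The 'lo' and 'hi' labels are hypothetical names chosen for this sketch.
stats = df.apply(lambda row: pd.Series({'lo': row.min(), 'hi': row.max()}),
                 axis=1)

# A string method name is dispatched to the corresponding method.
row_means = df.apply('mean', axis=1)

print(type(row_sums).__name__, type(stats).__name__)
```

Returning a labelled ``Series`` is the simplest way to control the output column names without touching ``result_type``.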

doc/source/whatsnew/v0.23.0.txt

+12-11
@@ -334,20 +334,20 @@ Convert to an xarray DataArray

 .. _whatsnew_0230.api_breaking.apply:

-Apply Changes
-~~~~~~~~~~~~~
+Changes to make output of ``DataFrame.apply`` consistent
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 :func:`DataFrame.apply` was inconsistent when applying an arbitrary user-defined-function that returned a list-like with ``axis=1``. Several bugs and inconsistencies
 are resolved. If the applied function returns a Series, then pandas will return a DataFrame; otherwise a Series will be returned, this includes the case
-where a list-like (e.g. ``tuple`` or ``list`` is returned), (:issue:`16353`, :issue:`17437`, :issue:`17970`, :issue:`17348`, :issue:`17892`, :issue:`18573`,
-:issue:`17602`, :issue:`18775`, :issue:`18901`, :issue:`18919`)
+where a list-like (e.g. ``tuple`` or ``list`` is returned) (:issue:`16353`, :issue:`17437`, :issue:`17970`, :issue:`17348`, :issue:`17892`, :issue:`18573`,
+:issue:`17602`, :issue:`18775`, :issue:`18901`, :issue:`18919`).

 .. ipython:: python

 df = pd.DataFrame(np.tile(np.arange(3), 6).reshape(6, -1) + 1, columns=['A', 'B', 'C'])
 df

-Previous Behavior. If the returned shape happened to match the original columns, this would return a ``DataFrame``.
+Previous Behavior: if the returned shape happened to match the length of original columns, this would return a ``DataFrame``.
 If the return shape did not match, a ``Series`` with lists was returned.

 .. code-block:: python
@@ -373,7 +373,7 @@ If the return shape did not match, a ``Series`` with lists was returned.
 dtype: object


-New Behavior. The behavior is consistent. These will *always* return a ``Series``.
+New Behavior: When the applied function returns a list-like, this will now *always* return a ``Series``.

 .. ipython:: python

@@ -386,8 +386,9 @@ To have expanded columns, you can use ``result_type='expand'``

 df.apply(lambda x: [1, 2, 3], axis=1, result_type='expand')

-To have broadcast the result across, you can use ``result_type='broadcast'``. The shape
-must match the original columns.
+To broadcast the result across the original columns (the old behaviour for
+list-likes of the correct length), you can use ``result_type='broadcast'``.
+The shape must match the original columns.

 .. ipython:: python

@@ -397,7 +398,7 @@ Returning a ``Series`` allows one to control the exact return structure and colu

 .. ipython:: python

-df.apply(lambda x: Series([1, 2, 3], index=x.index), axis=1)
+df.apply(lambda x: Series([1, 2, 3], index=['D', 'E', 'F']), axis=1)


 .. _whatsnew_0230.api_breaking.build_changes:
@@ -523,8 +524,8 @@ Deprecations
 - The ``is_copy`` attribute is deprecated and will be removed in a future version (:issue:`18801`).
 - ``IntervalIndex.from_intervals`` is deprecated in favor of the :class:`IntervalIndex` constructor (:issue:`19263`)
 - :func:``DataFrame.from_items`` is deprecated. Use :func:``DataFrame.from_dict()`` instead, or :func:``DataFrame.from_dict(OrderedDict())`` if you wish to preserve the key order (:issue:`17320`)
-- The ``broadcast`` parameter of ``.apply()`` is removed in favor of ``result_type='broadcast'`` (:issue:`18577`)
-- The ``reduce`` parameter of ``.apply()`` is removed in favor of ``result_type='reduce'`` (:issue:`18577`)
+- The ``broadcast`` parameter of ``.apply()`` is deprecated in favor of ``result_type='broadcast'`` (:issue:`18577`)
+- The ``reduce`` parameter of ``.apply()`` is deprecated in favor of ``result_type='reduce'`` (:issue:`18577`)

 .. _whatsnew_0230.prior_deprecations:
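
The list-like behaviour described in the whatsnew entry above can be sketched as follows. This assumes pandas 0.23 or later and uses only the ``result_type`` keyword, not the deprecated ``broadcast``/``reduce`` parameters. The variable names are chosen for this sketch and do not appear in the commit.

```python
import numpy as np
import pandas as pd

# Same example frame as in the whatsnew entry.
df = pd.DataFrame(np.tile(np.arange(3), 6).reshape(6, -1) + 1,
                  columns=['A', 'B', 'C'])

# Default (result_type=None): a list-like return value is kept as-is,
# so the final output is a Series whose elements are lists.
as_lists = df.apply(lambda x: [1, 2, 3], axis=1)

# 'expand': each list-like is spread across new columns (0, 1, 2 here).
expanded = df.apply(lambda x: [1, 2, 3], axis=1, result_type='expand')

# 'broadcast': the result is written back into the original shape,
# keeping the original index and column labels.
broadcasted = df.apply(lambda x: [1, 2, 3], axis=1, result_type='broadcast')

# Returning a Series gives explicit control over the column labels.
named = df.apply(lambda x: pd.Series([1, 2, 3], index=['D', 'E', 'F']),
                 axis=1)

print(list(expanded.columns), list(broadcasted.columns), list(named.columns))
```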

pandas/core/frame.py

+20-16
@@ -4822,12 +4822,12 @@ def aggregate(self, func, axis=0, *args, **kwargs):

 def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,
 result_type=None, args=(), **kwds):
-"""Applies function along input axis of DataFrame.
+"""Applies function along an axis of the DataFrame.

 Objects passed to functions are Series objects having index
 either the DataFrame's index (axis=0) or the columns (axis=1).
-Return type depends on whether passed function aggregates, or the
-reduce argument if the DataFrame is empty.
+Final return type depends on the return type of the applied function,
+or on the `result_type` argument.

 Parameters
 ----------
@@ -4863,15 +4863,18 @@ def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,
 by result_type='reduce'.

 result_type : {'expand', 'reduce', 'broadcast', None}
-These only act when axis=1 {columns}
+These only act when axis=1 {columns}:
+
 * 'expand' : list-like results will be turned into columns.
 * 'reduce' : return a Series if possible rather than expanding
 list-like results. This is the opposite to 'expand'.
 * 'broadcast' : results will be broadcast to the original shape
 of the frame, the original index & columns will be retained.
-* None : list-like results will be returned as a list
-in a single column. However if the apply function
-returns a Series these are expanded to columns.
+
+The default behaviour (None) depends on the return value of the
+applied function: list-like results will be returned as a Series
+of those. However if the apply function returns a Series these
+are expanded to columns.

 .. versionadded:: 0.23.0

@@ -4893,8 +4896,8 @@ def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,

 We use this DataFrame to illustrate

->>> df = DataFrame(np.tile(np.arange(3), 6).reshape(6, -1) + 1,
-... columns=['A', 'B', 'C'])
+>>> df = pd.DataFrame(np.tile(np.arange(3), 6).reshape(6, -1) + 1,
+... columns=['A', 'B', 'C'])
 >>> df
 A B C
 0 1 2 3
@@ -4904,7 +4907,8 @@ def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,
 4 1 2 3
 5 1 2 3

-Using a ufunc
+Using a numpy universal function (in this case the same as
+``np.sqrt(df)``):

 >>> df.apply(np.sqrt)
 A B C
@@ -4954,8 +4958,8 @@ def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,
 4 1 2
 5 1 2

-Return a Series inside the function is similar to passing
-Passing result_type='expand'. The resulting column names
+Returning a Series inside the function is similar to passing
+``result_type='expand'``. The resulting column names
 will be the Series index.

 >>> df.apply(lambda x: Series([1, 2], index=['foo', 'bar']), axis=1)
@@ -4967,10 +4971,10 @@ def apply(self, func, axis=0, broadcast=None, raw=False, reduce=None,
 4 1 2
 5 1 2

-
-Passing result_type='broadcast' will take a same shape
-result, whether list-like or scalar and broadcast it
-along the axis. The resulting column names will be the originals.
+Passing ``result_type='broadcast'`` will ensure the same shape
+result, whether list-like or scalar is returned by the function,
+and broadcast it along the axis. The resulting column names will
+be the originals.

 >>> df.apply(lambda x: [1, 2, 3], axis=1, result_type='broadcast')
 A B C
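
The docstring's remark that ``df.apply(np.sqrt)`` is the same as ``np.sqrt(df)`` can be checked with a short sketch. It assumes a recent pandas and NumPy. The variable names ``via_apply`` and ``via_ufunc`` are illustrative only.

```python
import numpy as np
import pandas as pd

# Example frame from the docstring.
df = pd.DataFrame(np.tile(np.arange(3), 6).reshape(6, -1) + 1,
                  columns=['A', 'B', 'C'])

# Passing a NumPy universal function to apply ...
via_apply = df.apply(np.sqrt)

# ... gives the same values as calling the ufunc on the frame directly.
via_ufunc = np.sqrt(df)

# Compare values with a tolerance and labels exactly.
same_values = np.allclose(via_apply.to_numpy(), via_ufunc.to_numpy())
same_labels = list(via_apply.columns) == list(via_ufunc.columns)
print(same_values, same_labels)
```

For element-wise ufuncs the direct call is usually the simpler spelling; ``apply`` matters more when the function works on a whole row or column at a time.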

pandas/core/sparse/frame.py

+11-7
@@ -861,14 +861,18 @@ def apply(self, func, axis=0, broadcast=None, reduce=None,
 by result_type='reduce'.

 result_type : {'expand', 'reduce', 'broadcast', None}
-These only act when axis=1 {columns}
-* 'expand' : list-like results will be turned into columns
+These only act when axis=1 {columns}:
+
+* 'expand' : list-like results will be turned into columns.
 * 'reduce' : return a Series if possible rather than expanding
-list-like results. This is the opposite to 'expand'
-* 'broadcast' : scalar results will be broadcast to all columns
-* None : list-like results will be returned as a list
-in a single column. However if the apply function
-returns a Series these are expanded to columns.
+list-like results. This is the opposite to 'expand'.
+* 'broadcast' : results will be broadcast to the original shape
+of the frame, the original index & columns will be retained.
+
+The default behaviour (None) depends on the return value of the
+applied function: list-like results will be returned as a Series
+of those. However if the apply function returns a Series these
+are expanded to columns.

 .. versionadded:: 0.23.0

