ENH: enhancements to Panel.apply to enable arbitrary functions and multi-dim slicing (GH1148) #5850

Merged 3 commits on Jan 15, 2014

7 changes: 4 additions & 3 deletions doc/source/api.rst
@@ -785,6 +785,7 @@ Attributes and underlying data
Panel.axes
Panel.ndim
Panel.shape
Panel.dtypes

Conversion
~~~~~~~~~~
@@ -1122,7 +1123,7 @@ Indexing, iteration
~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/

GroupBy.__iter__
GroupBy.groups
GroupBy.indices
@@ -1141,7 +1142,7 @@ Computations / Descriptive Stats
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated/

GroupBy.mean
GroupBy.median
GroupBy.std
@@ -1155,7 +1156,7 @@ Computations / Descriptive Stats

.. toctree::
:hidden:

generated/pandas.core.common.isnull
generated/pandas.core.common.notnull
generated/pandas.core.reshape.get_dummies
79 changes: 77 additions & 2 deletions doc/source/basics.rst
@@ -637,6 +637,81 @@ to :ref:`merging/joining functionality <merging>`:
s
s.map(t)


.. _basics.apply_panel:

Applying with a Panel
~~~~~~~~~~~~~~~~~~~~~

``Panel.apply`` passes a ``Series`` to the applied function. If the function
returns a ``Series``, the result is a ``Panel``. If the function reduces the
``Series`` to a scalar, the result is a ``DataFrame``.
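
The shape rule above can be modelled in plain Python. This is only a sketch of the semantics, not how pandas implements ``apply``: the nested list stands in for a ``Panel``, and ``apply_along_items`` is a hypothetical helper introduced for illustration.

```python
# Plain-Python model of single-axis apply semantics; not pandas' implementation.
# The nested list is an assumed stand-in, indexed as panel[item][major][minor].

def apply_along_items(panel, func):
    """Call func on each 1-D slice taken along the items axis (hypothetical helper)."""
    n_items, n_major, n_minor = len(panel), len(panel[0]), len(panel[0][0])
    out = []
    for j in range(n_major):
        row = []
        for k in range(n_minor):
            # One value per item at a fixed (major, minor) position.
            series = [panel[i][j][k] for i in range(n_items)]
            row.append(func(series))
        out.append(row)
    return out

panel = [
    [[1, 2], [3, 4]],      # item 0
    [[10, 20], [30, 40]],  # item 1
]

# A reducing function returns one scalar per slice: a 2-D, DataFrame-like result.
reduced = apply_along_items(panel, sum)

# A transforming function returns one sequence per slice: a 3-D, Panel-like result.
# In this model the transformed slice stays in the innermost position.
doubled = apply_along_items(panel, lambda s: [v * 2 for v in s])
```

Here ``reduced`` has two levels of nesting and ``doubled`` has three, which mirrors the ``DataFrame`` versus ``Panel`` distinction.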

.. note::

Prior to 0.13.1, ``apply`` on a ``Panel`` worked only with ``ufuncs`` (e.g. ``np.sum`` or ``np.max``).

.. ipython:: python

import pandas.util.testing as tm
panel = tm.makePanel(5)
panel
panel['ItemA']

A transformational apply:

.. ipython:: python

result = panel.apply(lambda x: x*2, axis='items')
result
result['ItemA']

A reduction operation:

.. ipython:: python

panel.apply(lambda x: x.dtype, axis='items')

A similar reduction-type operation:

.. ipython:: python

panel.apply(lambda x: x.sum(), axis='major_axis')

This last reduction is equivalent to:

.. ipython:: python

panel.sum('major_axis')

A transformation that returns a ``Panel``, computing the z-score of each
``Series`` along the ``major_axis``:

.. ipython:: python

result = panel.apply(lambda x: (x-x.mean())/x.std(), axis='major_axis')
result
result['ItemA']
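
The z-score arithmetic inside that lambda can be checked by hand on a single slice. The sketch below uses only the standard library; ``zscore`` and ``column`` are hypothetical names. It assumes the sample standard deviation (dividing by ``n - 1``), which is the default for ``std()`` in pandas.

```python
from statistics import mean, stdev

def zscore(values):
    """Standardize values with the sample standard deviation (hypothetical helper)."""
    m = mean(values)
    s = stdev(values)  # divides by n - 1, matching the pandas default ddof=1
    return [(v - m) / s for v in values]

column = [2.0, 4.0, 6.0]  # mean is 4.0, sample std is 2.0
scores = zscore(column)   # [-1.0, 0.0, 1.0]
```

The mean is subtracted before dividing by the standard deviation, so the result has mean 0 and sample standard deviation 1.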

``apply`` also accepts multiple axes in the ``axis`` argument. In that case, each
cross-section is passed to the applied function as a ``DataFrame``.

.. ipython:: python

f = lambda x: ((x.T - x.mean(1)) / x.std(1)).T

result = panel.apply(f, axis=['items', 'major_axis'])
result
result.loc[:,:,'ItemA']

This is equivalent to the following:

.. ipython:: python

result = Panel(dict([(ax, f(panel.loc[:, :, ax])) for ax in panel.minor_axis]))
result
result.loc[:,:,'ItemA']
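
The multi-axis case can also be modelled in plain Python: one 2-D slab is built per label of the remaining axis and handed to the function. This is a sketch of the semantics under assumed names (``apply_cross_sections``, ``row_zscore``) and an assumed slab orientation (rows are major-axis labels, columns are items); it is not the pandas implementation.

```python
from statistics import mean, stdev

def apply_cross_sections(panel, func):
    """Call func on each 2-D slab, one per minor-axis position (hypothetical helper)."""
    n_items, n_major, n_minor = len(panel), len(panel[0]), len(panel[0][0])
    results = []
    for k in range(n_minor):
        # Assumed orientation: rows are major-axis labels, columns are items.
        slab = [[panel[i][j][k] for i in range(n_items)] for j in range(n_major)]
        results.append(func(slab))
    return results

def row_zscore(slab):
    """Standardize each row: subtract the row mean first, then divide by the row std."""
    out = []
    for row in slab:
        m, s = mean(row), stdev(row)
        out.append([(v - m) / s for v in row])
    return out

# Nested list indexed as panel[item][major][minor]: 3 items, 2 major, 2 minor.
panel = [
    [[1.0, 5.0], [4.0, 0.0]],  # item 0
    [[2.0, 7.0], [6.0, 1.0]],  # item 1
    [[3.0, 9.0], [8.0, 2.0]],  # item 2
]

result = apply_cross_sections(panel, row_zscore)
```

Each entry of ``result`` corresponds to one minor-axis position, in the same way the dictionary comprehension above builds one ``DataFrame`` per label in ``panel.minor_axis``.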

.. _basics.reindexing:

Reindexing and altering labels
@@ -1066,7 +1141,7 @@ or match a pattern:

Series(['1', '2', '3a', '3b', '03c']).str.match(pattern, as_indexer=True)

The distinction between ``match`` and ``contains`` is strictness: ``match``
relies on strict ``re.match``, while ``contains`` relies on ``re.search``.
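
The difference between the two underlying functions can be seen with the standard ``re`` module alone. In this small sketch the pattern and the input strings are illustrative assumptions, not the values used elsewhere in these docs.

```python
import re

# Illustrative pattern: a digit followed by a lowercase letter.
pattern = r"[0-9][a-z]"
values = ["1", "2", "3a", "3b", "03c"]

# re.match only succeeds if the pattern matches at the start of the string.
anchored = [bool(re.match(pattern, v)) for v in values]

# re.search succeeds if the pattern matches anywhere in the string.
anywhere = [bool(re.search(pattern, v)) for v in values]
```

The two lists differ only for ``"03c"``: the pattern does not match at position 0, but ``"3c"`` matches further in, so ``re.search`` finds it while ``re.match`` does not.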

.. warning::
@@ -1078,7 +1153,7 @@ relies on strict ``re.match``, while ``contains`` relies on ``re.search``.
This old, deprecated behavior of ``match`` is still the default. As
demonstrated above, use the new behavior by setting ``as_indexer=True``.
In this mode, ``match`` is analogous to ``contains``, returning a boolean
Series. The new behavior will become the default behavior in a future
release.

Methods like ``match``, ``contains``, ``startswith``, and ``endswith`` take
3 changes: 3 additions & 0 deletions doc/source/release.rst
@@ -73,6 +73,9 @@ Improvements to existing features
- ``df.info()`` view now displays dtype info per column (:issue:`5682`)
- perf improvements in DataFrame ``count/dropna`` for ``axis=1``
- ``Series.str.contains`` now has a ``regex=False`` keyword which can be faster for plain (non-regex) string patterns. (:issue:`5879`)
- support ``dtypes`` on ``Panel``
- extend ``Panel.apply`` to allow arbitrary functions (rather than only ufuncs) and to accept multiple axes, so that it can operate on slabs of a ``Panel`` (:issue:`1148`)

.. _release.bug_fixes-0.13.1:

54 changes: 54 additions & 0 deletions doc/source/v0.13.1.txt
@@ -29,6 +29,60 @@ Deprecations
Enhancements
~~~~~~~~~~~~

- ``Panel.apply`` now works with non-ufuncs. See :ref:`the docs <basics.apply_panel>`.

.. ipython:: python

import pandas.util.testing as tm
panel = tm.makePanel(5)
panel
panel['ItemA']

An ``apply`` that operates on a ``Series`` and returns a single element:

.. ipython:: python

panel.apply(lambda x: x.dtype, axis='items')

A similar reduction-type operation:

.. ipython:: python

panel.apply(lambda x: x.sum(), axis='major_axis')

This is equivalent to:

.. ipython:: python

panel.sum('major_axis')

A transformation that returns a ``Panel``, computing the z-score of each
``Series`` along the ``major_axis``:

.. ipython:: python

result = panel.apply(lambda x: (x-x.mean())/x.std(), axis='major_axis')
result
result['ItemA']

- ``Panel.apply`` can operate on cross-sectional slabs. (:issue:`1148`)

.. ipython:: python

f = lambda x: ((x.T - x.mean(1)) / x.std(1)).T

result = panel.apply(f, axis=['items', 'major_axis'])
result
result.loc[:,:,'ItemA']

This is equivalent to the following:

.. ipython:: python

result = Panel(dict([(ax, f(panel.loc[:, :, ax])) for ax in panel.minor_axis]))
result
result.loc[:,:,'ItemA']

Experimental
~~~~~~~~~~~~
