Fix pipe docs #29368

Merged
merged 4 commits into from
Nov 6, 2019
45 changes: 34 additions & 11 deletions doc/source/getting_started/basics.rst
@@ -753,28 +753,51 @@ on an entire ``DataFrame`` or ``Series``, row- or column-wise, or elementwise.
 Tablewise function application
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-``DataFrames`` and ``Series`` can of course just be passed into functions.
+``DataFrames`` and ``Series`` can be passed into functions.
 However, if the function needs to be called in a chain, consider using the :meth:`~DataFrame.pipe` method.
-Compare the following
 
-.. code-block:: python
+First some setup:
+
+.. ipython:: python
 
-   # f, g, and h are functions taking and returning ``DataFrames``
-   >>> f(g(h(df), arg1=1), arg2=2, arg3=3)
+   def extract_city_name(df):
+       """
+       Chicago, IL -> Chicago for city_name column
+       """
+       df['city_name'] = df['city_and_code'].str.split(",").str.get(0)
+       return df
 
-with the equivalent
+   def add_country_name(df, country_name=None):
+       """
+       Chicago -> Chicago-US for city_name column
+       """
+       col = 'city_name'
+       df['city_and_country'] = df[col] + country_name
+       return df
 
-.. code-block:: python
+   df_p = pd.DataFrame({'city_and_code': ['Chicago, IL']})
+
+
+``extract_city_name`` and ``add_country_name`` are functions taking and returning ``DataFrames``.
+
+Now compare the following:
+
+.. ipython:: python
+
+   add_country_name(extract_city_name(df_p), country_name='US')
+
+Is equivalent to:
+
+.. ipython:: python
 
-   >>> (df.pipe(h)
-   ...    .pipe(g, arg1=1)
-   ...    .pipe(f, arg2=2, arg3=3))
+   (df_p.pipe(extract_city_name)
+        .pipe(add_country_name, country_name="US"))
 
 Pandas encourages the second style, which is known as method chaining.
 ``pipe`` makes it easy to use your own or another library's functions
 in method chains, alongside pandas' methods.
 
-In the example above, the functions ``f``, ``g``, and ``h`` each expected the ``DataFrame`` as the first positional argument.
+In the example above, the functions ``extract_city_name`` and ``add_country_name`` each expected a ``DataFrame`` as the first positional argument.
 What if the function you wish to apply takes its data as, say, the second argument?
 In this case, provide ``pipe`` with a tuple of ``(callable, data_keyword)``.
 ``.pipe`` will route the ``DataFrame`` to the argument specified in the tuple.
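
The ``(callable, data_keyword)`` form described in the last lines of this hunk can be illustrated with a minimal sketch. The helper ``append_suffix`` and the frame ``cities`` are hypothetical names invented for this illustration; they are not part of pandas or of this PR. The sketch assumes a function whose data argument is its second parameter, named ``df``.

```python
import pandas as pd


# Hypothetical helper (not from pandas or this PR): the data is the
# *second* parameter, so a plain ``.pipe(append_suffix, ...)`` would
# pass the DataFrame in the wrong position.
def append_suffix(suffix, df, column="city_name"):
    """Return a copy of ``df`` with ``suffix`` appended to ``column``."""
    out = df.copy()  # copy so the caller's DataFrame is left untouched
    out[column] = out[column] + suffix
    return out


cities = pd.DataFrame({"city_name": ["Chicago", "Boston"]})

# The tuple tells ``pipe`` to pass the DataFrame as the keyword ``df``;
# the remaining arguments ("-US" here) are forwarded unchanged.
result = cities.pipe((append_suffix, "df"), "-US")
print(result)
```

Unlike the functions in the diff, this helper works on a copy, so ``cities`` is not modified by the call. That is a choice made for this sketch, not something ``pipe`` requires.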
1 change: 0 additions & 1 deletion setup.cfg
@@ -47,7 +47,6 @@ ignore = E402,  # module level import not at top of file
          E711,  # comparison to none should be 'if cond is none:'
 
 exclude =
-    doc/source/getting_started/basics.rst
     doc/source/development/contributing_docstring.rst

