Commit 37f8dc4

DOC: add notes to the groupby.rst docs
1 parent ad49095 commit 37f8dc4

1 file changed

+40
-2
lines changed

doc/source/groupby.rst

Lines changed: 40 additions & 2 deletions
@@ -344,8 +344,9 @@ Aggregation
344344
-----------
345345

346346
Once the GroupBy object has been created, several methods are available to
347-
perform a computation on the grouped data. An obvious one is aggregation via
348-
the ``aggregate`` or equivalently ``agg`` method:
347+
perform a computation on the grouped data.
348+
349+
An obvious one is aggregation via the ``aggregate`` method or, equivalently, ``agg``:
349350

350351
.. ipython:: python
351352
@@ -382,6 +383,17 @@ index are the group names and whose values are the sizes of each group.
382383
383384
grouped.size()
384385
386+
.. ipython:: python
387+
388+
grouped.describe()
389+
390+
.. note::
391+
392+
Aggregation functions will **not** return the groups that you are aggregating over
393+
if they are named *columns*. The grouped columns will be the **indices** of the returned object.
394+
Aggregating functions are ones that reduce the dimension of the returned objects,
395+
for example: ``mean, sum, size, count, std, var, describe, first, last, min, max``. This is
396+
very much like performing a reducing operation on a ``DataFrame`` and getting a ``Series`` back.
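As a minimal sketch of the point in this note (the frame and the column names ``A`` and ``B`` are made up for illustration), the following shows a grouping column ending up as the index of an aggregated result, and one way to turn it back into a regular column:

```python
import pandas as pd

# Hypothetical data; "A" is the column we group by.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "B": [1.0, 2.0, 3.0, 4.0],
})

# Aggregating reduces each group to a single row.
result = df.groupby("A").sum()

# The group key "A" is now the index of the result, not a column.
index_name = result.index.name
value_columns = list(result.columns)

# reset_index() moves the group key back into an ordinary column.
flat = result.reset_index()
```

Here ``result`` has one row per group, with the group labels as its index.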
385397

386398
.. _groupby.aggregate.multifunc:
387399

@@ -537,6 +549,15 @@ and that the transformed data contains no NAs.
537549
grouped_trans.count() # counts after transformation
538550
grouped_trans.size() # Verify non-NA count equals group size
539551
552+
.. note::
553+
554+
Some functions when applied to a groupby object will automatically transform the input, returning
555+
an object of the same shape as the original. For example: ``fillna, ffill, bfill, shift``.
556+
557+
.. ipython:: python
558+
559+
grouped.ffill()
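To illustrate the "same shape as the original" behaviour, here is a small sketch with made-up data (the names ``key`` and ``val`` are assumptions, not taken from the docs). Forward-filling within each group returns a result aligned to the original rows:

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values scattered across two groups.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b", "a"],
    "val": [1.0, np.nan, np.nan, 4.0, np.nan],
})

# Forward-fill within each group; values never leak across groups.
filled = df.groupby("key")["val"].ffill()

# The result has one entry per original row and the same index.
same_length = len(filled) == len(df)
same_index = filled.index.equals(df.index)
```

Note that the first row of group ``b`` stays missing, because there is no earlier value in that group to carry forward.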
560+
540561
.. _groupby.filter:
541562

542563
Filtration
@@ -579,6 +600,17 @@ For dataframes with multiple columns, filters should explicitly specify a column
579600
dff['C'] = np.arange(8)
580601
dff.groupby('B').filter(lambda x: len(x['C']) > 2)
581602
603+
.. note::
604+
605+
Some functions when applied to a groupby object will act as a **filter** on the input, returning
606+
a reduced shape of the original (and potentially eliminating groups), but with the index unchanged.
607+
For example: ``head, tail, nth``.
608+
609+
.. ipython:: python
610+
611+
dff.groupby('B').head(2)
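The filter-like behaviour can be sketched with a small, made-up frame (the data below is hypothetical and separate from ``dff``). ``head`` keeps a subset of the original rows and leaves their index labels as they were:

```python
import pandas as pd

# Hypothetical data: group "x" has three rows, group "y" has one.
df = pd.DataFrame({
    "B": ["x", "x", "x", "y"],
    "C": [0, 1, 2, 3],
})

# Keep at most the first two rows of each group.
top = df.groupby("B").head(2)

# Rows are selected, not aggregated: the original index labels survive
# and the grouping column is still an ordinary column.
kept_labels = list(top.index)
kept_columns = list(top.columns)
```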
612+
613+
582614
.. _groupby.dispatch:
583615

584616
Dispatching to instance methods
@@ -664,6 +696,12 @@ The dimension of the returned result can also change:
664696
s.apply(f)
665697
666698
699+
.. note::
700+
701+
``apply`` can act as a reducer, transformer, *or* filter function, depending on exactly what is passed to apply.
702+
Depending on the code path taken and on exactly what you are grouping, the grouped column(s) may be included in
703+
the output as well as being set as the index.
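The three roles of ``apply`` can be sketched as follows, using made-up data and hypothetical lambdas. The exact index of the transformer-like and filter-like results depends on the pandas version and the ``group_keys`` setting, so the sketch only relies on result lengths:

```python
import pandas as pd

# Hypothetical data: group "a" has two rows, group "b" has one.
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})
g = df.groupby("key")["val"]

# Reducer: the function returns a scalar, so there is one value per group.
reduced = g.apply(lambda s: s.sum())

# Transformer: the function returns a Series as long as the group.
transformed = g.apply(lambda s: s - s.mean())

# Filter: the function returns a subset of each group's rows.
filtered = g.apply(lambda s: s[s > 1])
```

In this sketch ``reduced`` is indexed by the group key, while the other two results carry whatever index the installed pandas version assigns to them.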
704+
667705
.. warning::
668706

669707
In the current implementation apply calls func twice on the
