DOC: improve groupby reference docs

An overview of the groupby reference docs is available at http://pandas.pydata.org/pandas-docs/dev/api.html#groupby (in addition to the extensive user guide at http://pandas.pydata.org/pandas-docs/dev/groupby.html).

Several things could be improved:
- [x] add some missing functions to the overview in api.rst (#8231)
  - `GroupBy.filter`
  - `first`/`last`/`nth`
  - `count`, `cumcount`, ..
  - `name`: not sure what the purpose of this is
- [ ] add the `GroupBy` object itself to the api docs (and so automatically all its methods) (#19302)
- [x] put all relevant docstrings in the `GroupBy` class, and not only in the subclasses `DataFrameGroupBy` and `SeriesGroupBy` (e.g. the `aggregate` and `transform` docstrings of `GroupBy` are currently empty, while those in the subclasses are more elaborate) (#8231)
- [ ] general clean-up of all the docstrings
  - the `apply` docstring in particular is not very clear to me
- [ ] expand DataFrame/Series.groupby() docstring:
  - clearly list all possibilities for the `by` arg (and provide some short examples in the 'Examples' section)
- [x] document the whitelisted methods: (#8231)
  - this could e.g. be done by injecting the list into the docstring automatically, based on `_apply_whitelist`
  - or alternatively by ensuring they appear in the methods list of the `GroupBy` class (currently they only appear on instantiated objects, not on the class) (see also the discussion in #2644)
- [x] More clearly document the DataFrameGroupBy and SeriesGroupBy classes (#8231)
  - at least mention them in the docs
  - one idea is to have a `DataFrameGroupBy` API page that just redirects to the general `GroupBy` page
- [x] Add docstrings to the wrapped whitelisted functions. E.g. at present `g = df.groupby(...); g.count?` returns `<no docstring>` (see https://github.com/pydata/pandas/issues/4500#issuecomment-41220138 for an explanation of how to do this)
- [ ] Make a clear distinction about what to expect from the return value of a grouped operation: `head`/`tail`/`nth` are basically `filter`-type functions, `fillna`/`shift` are transformers, and almost everything else is a reducer (e.g. `sum`/`mean`/`describe`), while `apply`/`agg` can be any of the above. This may need a separate section. (And of course `as_index` complicates this further.)
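
To make the last item concrete, here is a minimal sketch of the three return shapes on a toy frame. The frame and the variable names (`df`, `g`, `filtered`, `shifted`, `reduced`, `flat`) are made up for illustration, and the exact behaviour may differ between pandas versions.

```python
import pandas as pd

# Toy data, purely illustrative.
df = pd.DataFrame(
    {"key": ["a", "a", "b", "b", "b"], "val": [1.0, 2.0, 3.0, 4.0, 5.0]}
)
g = df.groupby("key")

# Filter-like: returns a subset of the original rows, keeping the original index.
filtered = g.head(1)

# Transformer: returns an object with the same length and index as the input.
shifted = g["val"].shift()

# Reducer: returns one value per group, indexed by the group keys.
reduced = g["val"].sum()

# With as_index=False the group keys come back as a regular column instead.
flat = df.groupby("key", as_index=False)["val"].sum()

print(filtered.index.tolist())
print(shifted.index.equals(df.index))
print(reduced.index.tolist())
print(list(flat.columns))
```

A section in the docs along these lines could show one such example per category, so that readers can predict the index of the result before calling a method.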

If someone wants to tackle this (or parts of this), go ahead!
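
As a starting point for the 'Examples' section of the `groupby()` docstring mentioned above, the following sketch shows some of the forms the `by` argument can take. The data and names (`df`, `by_label`, `by_labels`, `by_func`, `by_dict`, `by_series`) are invented for illustration, and the list is not claimed to be exhaustive.

```python
import pandas as pd

# Toy data, purely illustrative.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Rome", "Rome"],
        "year": [2020, 2021, 2020, 2021],
        "sales": [10, 20, 30, 40],
    },
    index=["w", "x", "y", "z"],
)

# A single column label.
by_label = df.groupby("city")["sales"].sum()

# A list of column labels (gives a MultiIndex on the result).
by_labels = df.groupby(["city", "year"])["sales"].sum()

# A function, called on each value of the index.
by_func = df.groupby(lambda label: label in ("w", "x"))["sales"].sum()

# A dict mapping index values to group names.
by_dict = df.groupby(
    {"w": "first", "x": "first", "y": "second", "z": "second"}
)["sales"].sum()

# A Series aligned with the frame's index.
by_series = df.groupby(df["year"] > 2020)["sales"].sum()

print(by_label)
print(by_labels)
```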


DOC: improve groupby reference docs #6944
