DOC: Groupby transform should mention that parameter can be a string #50029

Merged
Changes from 24 commits
Commits
27 commits
f21dc32
feat: adding to the function docs that transform function parameter c…
seanjedi Dec 1, 2022
ef6c095
docs: updating docs for groupby
seanjedi Dec 2, 2022
6068105
Merge branch 'main' into SEANJEDI-49961_Docs_for_groupby_transform_sh…
seanjedi Dec 3, 2022
3b84281
doc: updating docs to fix PR comments
seanjedi Dec 3, 2022
ea09518
Merge branch 'SEANJEDI-49961_Docs_for_groupby_transform_should_mentio…
seanjedi Dec 3, 2022
3d04504
Merge branch 'main' into SEANJEDI-49961_Docs_for_groupby_transform_sh…
seanjedi Dec 3, 2022
8f4b8c7
fix: fixing failing doc check
seanjedi Dec 4, 2022
b06287e
docs: updating docs according to PR comment
seanjedi Dec 4, 2022
f6c24f7
docs: fixing PR comment
seanjedi Dec 4, 2022
207abd6
docs: updating docs to better suit dataframes and series
seanjedi Dec 4, 2022
aa0c6eb
Merge branch 'main' into SEANJEDI-49961_Docs_for_groupby_transform_sh…
seanjedi Dec 4, 2022
90cae51
docs: resolving PR comments
seanjedi Dec 5, 2022
36f31f1
docs: fixing PR check failure
seanjedi Dec 6, 2022
9fa6aec
Merge branch 'main' into SEANJEDI-49961_Docs_for_groupby_transform_sh…
seanjedi Dec 6, 2022
1b6a971
Merge branch 'SEANJEDI-49961_Docs_for_groupby_transform_should_mentio…
seanjedi Dec 6, 2022
9223b59
docs: resolving PR comments
seanjedi Dec 6, 2022
0651cdc
docs: fixing issue with docstring validation
seanjedi Dec 6, 2022
d7636fa
Merge branch 'main' into SEANJEDI-49961_Docs_for_groupby_transform_sh…
seanjedi Dec 6, 2022
a5f84df
docs: fixing doctest failures
seanjedi Dec 6, 2022
3306057
docs: fixing some issues in the docstrings checks
seanjedi Dec 6, 2022
e7681ec
docs: adding in some missing docs for docstest check
seanjedi Dec 6, 2022
4b5067d
docs: fixing doctest check failure
seanjedi Dec 6, 2022
0ed0d36
docs: fixing docstring validation check failure
seanjedi Dec 6, 2022
a42438c
Merge branch 'main' into SEANJEDI-49961_Docs_for_groupby_transform_sh…
seanjedi Dec 7, 2022
04d48f2
docs: updating docs according to PR comment
seanjedi Dec 8, 2022
8a32a1c
Merge branch 'main' into SEANJEDI-49961_Docs_for_groupby_transform_sh…
seanjedi Dec 8, 2022
6c0264d
Merge branch 'main' into SEANJEDI-49961_Docs_for_groupby_transform_sh…
seanjedi Dec 12, 2022
102 changes: 100 additions & 2 deletions pandas/core/groupby/generic.py
@@ -427,7 +427,51 @@ def _aggregate_named(self, func, *args, **kwargs):

return result

@Substitution(klass="Series")
__examples_series_doc = dedent(
"""
>>> ser = pd.Series(
... [390.0, 350.0, 30.0, 20.0],
... index=["Falcon", "Falcon", "Parrot", "Parrot"],
... name="Max Speed")
>>> grouped = ser.groupby([1, 1, 2, 2])
>>> grouped.transform(lambda x: (x - x.mean()) / x.std())
Falcon 0.707107
Falcon -0.707107
Parrot 0.707107
Parrot -0.707107
Name: Max Speed, dtype: float64

Broadcast result of the transformation

>>> grouped.transform(lambda x: x.max() - x.min())
Falcon 40.0
Falcon 40.0
Parrot 10.0
Parrot 10.0
Name: Max Speed, dtype: float64

>>> grouped.transform("mean")
Falcon 370.0
Falcon 370.0
Parrot 25.0
Parrot 25.0
Name: Max Speed, dtype: float64

.. versionchanged:: 1.3.0

The resulting dtype will reflect the return value of the passed ``func``,
for example:

>>> grouped.transform(lambda x: x.astype(int).max())
Falcon 390
Falcon 390
Parrot 30
Parrot 30
Name: Max Speed, dtype: int64
"""
)

@Substitution(klass="Series", example=__examples_series_doc)
@Appender(_transform_template)
def transform(self, func, *args, engine=None, engine_kwargs=None, **kwargs):
return self._transform(
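
The Series examples in this hunk can be checked with a short standalone script. This is a minimal sketch, assuming a reasonably recent pandas is installed; the names `by_name` and `by_callable` are illustrative and do not appear in the PR.

```python
import pandas as pd

ser = pd.Series(
    [390.0, 350.0, 30.0, 20.0],
    index=["Falcon", "Falcon", "Parrot", "Parrot"],
    name="Max Speed",
)
grouped = ser.groupby([1, 1, 2, 2])

# Pass the name of a groupby method as a string ...
by_name = grouped.transform("mean")

# ... or an equivalent Python callable that reduces each group.
by_callable = grouped.transform(lambda x: x.mean())

# In both cases the per-group result is broadcast back to the rows of
# the group, so the output keeps the original index.
print(by_name)
print(by_name.tolist() == by_callable.tolist())
```

The string form is the one this PR documents; the lambda is shown only for comparison.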
@@ -1405,7 +1449,61 @@ def _transform_general(self, func, *args, **kwargs):
concatenated = concatenated.reindex(concat_index, axis=other_axis, copy=False)
return self._set_result_index_ordered(concatenated)

@Substitution(klass="DataFrame")
__examples_dataframe_doc = dedent(
"""
>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
... 'foo', 'bar'],
... 'B' : ['one', 'one', 'two', 'three',
... 'two', 'two'],
... 'C' : [1, 5, 5, 2, 5, 5],
... 'D' : [2.0, 5., 8., 1., 2., 9.]})
>>> grouped = df.groupby('A')[['C', 'D']]
>>> grouped.transform(lambda x: (x - x.mean()) / x.std())
C D
0 -1.154701 -0.577350
1 0.577350 0.000000
2 0.577350 1.154701
3 -1.154701 -1.000000
4 0.577350 -0.577350
5 0.577350 1.000000

Broadcast result of the transformation

>>> grouped.transform(lambda x: x.max() - x.min())
C D
0 4.0 6.0
1 3.0 8.0
2 4.0 6.0
3 3.0 8.0
4 4.0 6.0
5 3.0 8.0

>>> grouped.transform("mean")
C D
0 3.666667 4.0
1 4.000000 5.0
2 3.666667 4.0
3 4.000000 5.0
4 3.666667 4.0
5 4.000000 5.0

.. versionchanged:: 1.3.0

The resulting dtype will reflect the return value of the passed ``func``,
for example:

>>> grouped.transform(lambda x: x.astype(int).max())
C D
0 5 8
1 5 9
2 5 8
3 5 9
4 5 8
5 5 9
"""
)

@Substitution(klass="DataFrame", example=__examples_dataframe_doc)
@Appender(_transform_template)
def transform(self, func, *args, engine=None, engine_kwargs=None, **kwargs):
return self._transform(
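
The "Broadcast result of the transformation" note in the DataFrame examples is easier to see next to a plain aggregation. The sketch below is an illustration under the assumption that pandas is installed; `reduced`, `broadcast`, and `centered` are hypothetical names, and column `B` from the docstring example is omitted because it is not selected.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["foo", "bar", "foo", "bar", "foo", "bar"],
        "C": [1, 5, 5, 2, 5, 5],
        "D": [2.0, 5.0, 8.0, 1.0, 2.0, 9.0],
    }
)
grouped = df.groupby("A")[["C", "D"]]

# agg reduces: one row per group, indexed by the group key.
reduced = grouped.agg("mean")

# transform broadcasts: one row per input row, indexed like df.
broadcast = grouped.transform("mean")

print(reduced.shape, broadcast.shape)

# Because the result lines up with df, it can be combined with the
# original columns directly, for example to center each group.
centered = df[["C", "D"]] - broadcast
print(centered)
```

Keeping the input's shape is what lets the result be assigned back as a new column without a merge on the group key.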
54 changes: 10 additions & 44 deletions pandas/core/groupby/groupby.py
@@ -402,15 +402,22 @@ class providing the base-class of operations.
f : function, str
Function to apply to each group. See the Notes section below for requirements.

Can also accept a Numba JIT function with
``engine='numba'`` specified.
Accepted inputs are:

- String function name
- Python function
- Numba JIT function with ``engine='numba'`` specified.

Only passing a single function is supported with this engine.
If the ``'numba'`` engine is chosen, the function must be
a user defined function with ``values`` and ``index`` as the
first and second arguments respectively in the function signature.
Each group's index will be passed to the user defined function
and optionally available for use.

If a string is chosen, then it needs to be the name
of the groupby method you want to use.

.. versionchanged:: 1.1.0
*args
Positional arguments to pass to func.
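
The accepted inputs listed in this hunk can be sketched as follows. This is an illustration rather than part of the PR: it assumes pandas is installed, the helper `scale` is hypothetical, and the Numba path is not exercised because it needs the optional `numba` dependency.

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "bar", "foo", "bar"], "C": [1, 5, 5, 2]})
grouped = df.groupby("A")["C"]

# String input: the name of a reducing groupby method. The single
# value per group is broadcast to every row of that group.
group_max = grouped.transform("max")

# String input: the name of a method that already returns one value
# per row, such as a cumulative sum within each group.
running_total = grouped.transform("cumsum")


def scale(group, factor):
    """Hypothetical user function: multiply every value in a group."""
    return group * factor


# Python function input: extra positional arguments after the function
# are forwarded to it.
scaled = grouped.transform(scale, 10)

print(group_max.tolist())
print(running_total.tolist())
print(scaled.tolist())
```

For the two string cases, the result should match calling the method of the same name on the groupby object, which is what the added docstring sentence describes.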
@@ -480,48 +487,7 @@ class providing the base-class of operations.

Examples
--------

>>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
... 'foo', 'bar'],
... 'B' : ['one', 'one', 'two', 'three',
... 'two', 'two'],
... 'C' : [1, 5, 5, 2, 5, 5],
... 'D' : [2.0, 5., 8., 1., 2., 9.]})
>>> grouped = df.groupby('A')[['C', 'D']]
>>> grouped.transform(lambda x: (x - x.mean()) / x.std())
C D
0 -1.154701 -0.577350
1 0.577350 0.000000
2 0.577350 1.154701
3 -1.154701 -1.000000
4 0.577350 -0.577350
5 0.577350 1.000000

Broadcast result of the transformation

>>> grouped.transform(lambda x: x.max() - x.min())
C D
0 4.0 6.0
1 3.0 8.0
2 4.0 6.0
3 3.0 8.0
4 4.0 6.0
5 3.0 8.0

.. versionchanged:: 1.3.0

The resulting dtype will reflect the return value of the passed ``func``,
for example:

>>> grouped.transform(lambda x: x.astype(int).max())
C D
0 5 8
1 5 9
2 5 8
3 5 9
4 5 8
5 5 9
"""
%(example)s"""

_agg_template = """
Aggregate using one or more operations over the specified axis.
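
The `%(example)s` placeholder added to the shared template works through ordinary `%`-style string formatting: each `transform` supplies its own `klass` and `example` values. The sketch below imitates that mechanism with standard-library Python only. The `substitute` decorator and the abridged `_template` are hypothetical stand-ins, not the pandas `Substitution` and `Appender` decorators, which are applied as two separate steps in the diff above.

```python
from textwrap import dedent

# Abridged, hypothetical stand-in for a shared docstring template.
_template = """
Call function producing a same-indexed %(klass)s on each group.

Examples
--------
%(example)s"""

# Class-specific examples, kept next to the class that uses them.
_series_example = dedent(
    """
    >>> grouped.transform("mean")
    """
)


def substitute(**params):
    """Hypothetical decorator: fill the template and set it as __doc__."""

    def decorator(func):
        func.__doc__ = _template % params
        return func

    return decorator


@substitute(klass="Series", example=_series_example)
def transform(func):
    """Placeholder; replaced by the filled-in template."""


print(transform.__doc__)
```

Moving the examples out of the template and into per-class strings is what allows the Series and DataFrame docstrings to show outputs of the matching type.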