DOC: Minor improvements groupby user guide #56465

rhshadrach · 2023-12-12T02:23:21Z

closes #xxxx (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

rhshadrach · 2023-12-12T02:24:30Z

doc/source/user_guide/groupby.rst

@@ -602,7 +601,7 @@ Any reduction method that pandas implements can be passed as a string to
   grouped.agg("sum")

 The result of the aggregation will have the group names as the
-new index along the grouped axis. In the case of multiple keys, the result is a
+new index. In the case of multiple keys, the result is a


Removed "along the grouped axis" since axis=1 is deprecated in groupby.
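For illustration only (this is not part of the PR diff), a minimal sketch of the behavior the reworded sentence describes. The frame and the column names A, B and C are hypothetical.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "A": ["x", "x", "y"],
        "B": ["one", "two", "one"],
        "C": [1, 2, 3],
    }
)

# Single key: the group names become the index of the result.
single = df.groupby("A")["C"].agg("sum")

# Multiple keys: the result is indexed by a MultiIndex by default.
multi = df.groupby(["A", "B"])["C"].agg("sum")

print(single.index)
print(multi.index)
```

With axis=1 out of the picture, the group names always land on the row index, so "along the grouped axis" no longer adds anything.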

rhshadrach · 2023-12-12T02:24:51Z

doc/source/user_guide/groupby.rst

@@ -846,15 +845,14 @@ The following methods on GroupBy act as transformations. Of these methods, only
        :meth:`~.DataFrameGroupBy.cumsum`;Compute the cumulative sum within each group
        :meth:`~.DataFrameGroupBy.diff`;Compute the difference between adjacent values within each group
        :meth:`~.DataFrameGroupBy.ffill`;Forward fill NA values within each group
-        :meth:`~.DataFrameGroupBy.fillna`;Fill NA values within each group


Removed since DataFrameGroupBy.fillna is deprecated.
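For readers who relied on the removed entry, a minimal sketch of filling NA values within each group using ffill, which remains in the table. The data and the column names key and val are hypothetical.

```python
import pandas as pd

# Hypothetical data; "key" and "val" are illustrative names.
df = pd.DataFrame(
    {
        "key": ["a", "a", "b", "b"],
        "val": [1.0, float("nan"), float("nan"), 4.0],
    }
)

# Forward fill happens within each group, so the NA at the start of
# group "b" is not filled from group "a".
filled = df.groupby("key")["val"].ffill()
print(filled)
```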

rhshadrach · 2023-12-12T02:25:45Z

doc/source/user_guide/groupby.rst

-
-    # Use .agg function to aggregate over standard and "nuisance" data types
-    # at the same time
-    df_dec.groupby(["id"]).agg({"int_column": "sum", "dec_column": "sum"})


You used to need to do this to include nuisance columns; not anymore. Now they are included by default.
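A minimal sketch of the current default, assuming pandas 2.0 or later, where groupby reductions no longer silently drop non-numeric columns. The frame mirrors the df_dec example from the user guide, but the values here are illustrative.

```python
from decimal import Decimal

import pandas as pd

# Illustrative values; assumes pandas >= 2.0.
df_dec = pd.DataFrame(
    {
        "id": [1, 2, 1, 2],
        "int_column": [1, 2, 3, 4],
        "dec_column": [
            Decimal("0.50"),
            Decimal("0.15"),
            Decimal("0.25"),
            Decimal("0.40"),
        ],
    }
)

# No .agg mapping is needed: the object-dtype Decimal column is
# summed alongside the integer column.
result = df_dec.groupby("id").sum()
print(result)
```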

rhshadrach · 2023-12-12T02:26:35Z

doc/source/user_guide/groupby.rst

@@ -1350,35 +1339,53 @@ The returned dtype of the grouped will *always* include *all* of the categories

   s = (
       pd.Series([1, 1, 1])
-       .groupby(pd.Categorical(["a", "a", "a"], categories=["a", "b"]), observed=False)
+       .groupby(pd.Categorical(["a", "a", "a"], categories=["a", "b"]), observed=True)


This demonstrates that unobserved categories are kept in the dtype, so I think it makes more sense to show it with observed=True.
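A minimal sketch of the point, completing the snippet from the diff with a count reduction (the choice of reduction is an assumption, used only to have something to inspect):

```python
import pandas as pd

cat = pd.Categorical(["a", "a", "a"], categories=["a", "b"])

# observed=True: only the observed group "a" appears in the result...
s = pd.Series([1, 1, 1]).groupby(cat, observed=True).count()

# ...but the dtype of the result's index still lists every category.
print(s.index.dtype)
```

The unobserved category "b" produces no row, yet it survives in the index dtype, which is the behavior the section is describing.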

mroeschke · 2023-12-12T02:58:09Z

Just noting: feel free to liberally move examples and behavior notes to docstrings, or to remove sections that are too niche or outdated. IMO that information is best served in docstrings rather than user guides, given how outdated the guides are.

mroeschke · 2023-12-12T15:44:44Z

Thanks @rhshadrach

DOC: Minor improvements groupby user guide

e313494

rhshadrach added Docs Groupby labels Dec 12, 2023

rhshadrach commented Dec 12, 2023

breakup long line

4138e3a

mroeschke approved these changes Dec 12, 2023

mroeschke added this to the 2.2 milestone Dec 12, 2023

mroeschke merged commit acc395a into pandas-dev:main Dec 12, 2023

rhshadrach deleted the doc_gb_user_guide branch December 19, 2023 11:36
