@@ -497,28 +497,6 @@ index are the group names and whose values are the sizes of each group.

 ``nth`` can act as a reducer *or* a filter, see :ref:`here <groupby.nth>`

-Decimal columns are "nuisance" columns that .agg automatically excludes in groupby.
-
-If you do wish to aggregate them you must do so explicitly:
-
-.. ipython:: python
-
-   from decimal import Decimal
-   dec = pd.DataFrame(
-       {'name': ['foo', 'bar', 'foo', 'bar'],
-        'title': ['boo', 'far', 'boo', 'far'],
-        'id': [123, 456, 123, 456],
-        'int_column': [1, 2, 3, 4],
-        'dec_column1': [Decimal('0.50'), Decimal('0.15'), Decimal('0.25'), Decimal('0.40')],
-        'dec_column2': [Decimal('0.20'), Decimal('0.30'), Decimal('0.55'), Decimal('0.60')]
-       },
-       columns=['name','title','id','int_column','dec_column1','dec_column2']
-   )
-
-   dec.groupby(['name', 'title', 'id'], as_index=False).sum()
-
-   dec.groupby(['name', 'title', 'id'], as_index=False).agg({'dec_column1': 'sum', 'dec_column2': 'sum'})
-
 .. _groupby.aggregate.multifunc:

 Applying multiple functions at once
@@ -977,6 +955,42 @@ will be (silently) dropped. Thus, this does not pose any problems:

    df.groupby('A').std()

+.. note::
+   Decimal columns are also "nuisance" columns. They are automatically excluded from aggregate functions in groupby.
+
+   If you do wish to include decimal columns in the aggregation, you must do so explicitly:
+
+.. ipython:: python
+
+   from decimal import Decimal
+   dec = pd.DataFrame(
+       {'name': ['foo', 'bar', 'foo', 'bar'],
+        'title': ['boo', 'far', 'boo', 'far'],
+        'id': [123, 456, 123, 456],
+        'int_column': [1, 2, 3, 4],
+        'dec_column1': [Decimal('0.50'), Decimal('0.15'), Decimal('0.25'), Decimal('0.40')],
+        'dec_column2': [Decimal('0.20'), Decimal('0.30'), Decimal('0.55'), Decimal('0.60')]
+       },
+       columns=['name', 'title', 'id', 'int_column', 'dec_column1', 'dec_column2']
+   )
+
+   dec.head()
+
+   dec.dtypes
+
+   # Decimal columns are excluded from sum by default
+   dec.groupby(['name', 'title', 'id'], as_index=False).sum()
+
+   # Decimal columns can be summed explicitly by themselves...
+   dec.groupby(['name', 'title', 'id'], as_index=False)[['dec_column1', 'dec_column2']].sum()
+
+   # ...but cannot be combined with standard data types or they will be excluded
+   dec.groupby(['name', 'title', 'id'], as_index=False)[['int_column', 'dec_column1', 'dec_column2']].sum()
+
+   # Use the .agg function to aggregate over standard and "nuisance" data types at the same time
+   dec.groupby(['name', 'title', 'id'], as_index=False).agg({'int_column': 'sum', 'dec_column1': 'sum', 'dec_column2': 'sum'})
+
+
 .. _groupby.missing:

 NA and NaT group handling
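The explicit `.agg` pattern that the added note recommends can be tried outside the docs build with a standalone script. This is only a minimal sketch: it uses a shortened version of the `dec` frame from the diff, the name `totals` is illustrative and not part of the change, and it assumes a pandas release in which summing an object column adds the stored Python objects together.

```python
from decimal import Decimal

import pandas as pd

# Shortened version of the frame in the diff (illustrative, not the full example).
dec = pd.DataFrame(
    {
        "name": ["foo", "bar", "foo", "bar"],
        "int_column": [1, 2, 3, 4],
        "dec_column1": [
            Decimal("0.50"),
            Decimal("0.15"),
            Decimal("0.25"),
            Decimal("0.40"),
        ],
    }
)

# Decimal values are held in an object-dtype column.
print(dec.dtypes)

# Name every column to aggregate so the Decimal column is included explicitly.
totals = dec.groupby("name").agg({"int_column": "sum", "dec_column1": "sum"})
print(totals)
```

Naming each column in the mapping passed to `.agg` makes the intent unambiguous regardless of how a given pandas version treats object columns in a bare `.sum()`.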