@@ -843,10 +843,11 @@ Aggregation API
 .. versionadded:: 0.20.0

 The aggregation API allows one to express possibly multiple aggregation operations in a single concise way.
-This API is similar across pandas objects, :ref:`groupby aggregates <groupby.aggregate>`,
-:ref:`window functions <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
+This API is similar across pandas objects; see the :ref:`groupby API <groupby.aggregate>`, the
+:ref:`window functions API <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
+The entry point for aggregation is the method :meth:`~DataFrame.aggregate`, or the alias :meth:`~DataFrame.agg`.

-We will use a similar starting frame from above.
+We will use a similar starting frame from above:

 .. ipython:: python

@@ -855,8 +856,8 @@ We will use a similar starting frame from above.
    tsdf.iloc[3:7] = np.nan
    tsdf

-Using a single function is equivalent to ``.apply``; You can also pass named methods as strings.
-This will return a Series of the output.
+Using a single function is equivalent to :meth:`~DataFrame.apply`; you can also pass named methods as strings.
+These will return a ``Series`` of the aggregated output:

 .. ipython:: python

@@ -867,72 +868,68 @@ This will return a Series of the output.
    # these are equivalent to a ``.sum()`` because we are aggregating on a single function
    tsdf.sum()

-On a Series this will result in a scalar value
+A single aggregation on a ``Series`` will return a scalar value:

 .. ipython:: python

    tsdf.A.agg('sum')


-Aggregating multiple functions at once
-++++++++++++++++++++++++++++++++++++++
+Aggregating with multiple functions
++++++++++++++++++++++++++++++++++++

-You can pass arguments as a list. The results of each of the passed functions will be a row in the resultant DataFrame.
+You can pass multiple aggregation arguments as a list.
+The results of each of the passed functions will be a row in the resultant ``DataFrame``.
 These are naturally named from the aggregation function.

 .. ipython:: python

    tsdf.agg(['sum'])

-Multiple functions yield multiple rows.
+Multiple functions yield multiple rows:

 .. ipython:: python

    tsdf.agg(['sum', 'mean'])

-On a Series, multiple functions return a Series, indexed by the function names.
+On a ``Series``, multiple functions return a ``Series``, indexed by the function names:

 .. ipython:: python

    tsdf.A.agg(['sum', 'mean'])

-
-Aggregating with a dict of functions
-++++++++++++++++++++++++++++++++++++
-
-Passing a dictionary of column name to function or list of functions, to ``DataFame.agg``
-allows you to customize which functions are applied to which columns.
+Passing a ``lambda`` function will yield a ``<lambda>`` named row:

 .. ipython:: python

-   tsdf.agg({'A': 'mean', 'B': 'sum'})
+   tsdf.A.agg(['sum', lambda x: x.mean()])

-Passing a list-like will generate a DataFrame output. You will get a matrix-like output
-of all of the aggregators; some may be missing values.
+Passing a named function will yield that name for the row:

 .. ipython:: python

-   tsdf.agg({'A': ['mean', 'min'], 'B': 'sum'})
-
-For a Series, you can pass a dict. You will get back a MultiIndex Series; The outer level will
-be the keys, the inner the name of the functions.
+   def mymean(x):
+       return x.mean()

-.. ipython:: python
+   tsdf.A.agg(['sum', mymean])

-   tsdf.A.agg({'foo': ['sum', 'mean']})
+Aggregating with a dict
++++++++++++++++++++++++

-Alternatively, using multiple dictionaries, you can have renamed elements with the aggregation
+Passing a dictionary mapping column names to a function (or function name) or a list of them to ``DataFrame.agg``
+allows you to customize which functions are applied to which columns.

 .. ipython:: python

-   tsdf.A.agg({'foo': 'sum', 'bar': 'mean'})
+   tsdf.agg({'A': 'mean', 'B': 'sum'})

-Multiple keys will yield a MultiIndex Series. The outer level will be the keys, the inner
-the names of the functions.
+Passing a list-like will generate a ``DataFrame`` output. You will get a matrix-like output
+of all of the aggregators. The output will consist of all unique functions. Those that are
+not specified for a particular column will be ``NaN``:

 .. ipython:: python

-   tsdf.A.agg({'foo': ['sum', 'mean'], 'bar': ['min', 'max', lambda x: x.sum() + 1]})
+   tsdf.agg({'A': ['mean', 'min'], 'B': 'sum'})

 .. _basics.aggregation.mixed_dtypes:

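Reviewer note on the hunk above: the call forms it documents (a single function name, a list, a named function, and a dict) can be checked side by side. The following is a minimal sketch, assuming a small hypothetical frame ``df`` in place of ``tsdf`` and a hypothetical helper ``value_range``; neither name comes from the docs being edited.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for ``tsdf``; small values make the results easy to check.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0], "B": [10.0, 20.0, 30.0, 40.0]})

# A single function name returns a Series with one entry per column.
single = df.agg("sum")

# A list of functions returns a DataFrame with one row per function.
multi = df.agg(["sum", "mean"])


# Hypothetical named function; its name becomes the row label.
def value_range(x):
    return x.max() - x.min()


named = df["A"].agg(["sum", value_range])

# A dict selects functions per column; combinations that were not
# requested for a column come back as NaN.
per_column = df.agg({"A": ["mean", "min"], "B": "sum"})

print(single, multi, named, per_column, sep="\n\n")
```

The dict form is the only one that can leave holes in the result, which is why the ``NaN`` sentence in the new text is worth keeping.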
@@ -980,7 +977,7 @@ Transform API

 .. versionadded:: 0.20.0

-The ``transform`` method returns an object that is indexed the same (same size)
+The :meth:`~DataFrame.transform` method returns an object that is indexed the same (same size)
 as the original. This API allows you to provide *multiple* operations at the same
 time rather than one-by-one. Its API is quite similar to the ``.agg`` API.

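Reviewer note: the "indexed the same (same size)" claim in this hunk is easy to demonstrate. This is a minimal sketch, assuming a hypothetical three-row frame ``df`` with string labels; it is not the ``tsdf`` frame used in the docs.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with an explicit index so the alignment is visible.
df = pd.DataFrame(
    {"A": [-1.0, 2.0, -3.0], "B": [4.0, -5.0, 6.0]},
    index=["x", "y", "z"],
)

# A single function keeps both the index and the shape of the input.
abs_df = df.transform(np.abs)

# A list of functions keeps the index and adds a column level holding
# the function names, so each input column fans out into two.
multi = df.transform([np.abs, lambda x: x + 1])

print(abs_df)
print(multi)
```

Unlike ``.agg``, nothing is reduced here: every output row lines up with an input row.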
@@ -1034,8 +1031,8 @@ resulting column names will be the transforming functions.
    tsdf.A.transform([np.abs, lambda x: x + 1])


-Transforming with a dict of functions
-+++++++++++++++++++++++++++++++++++++
+Transforming with a dict
++++++++++++++++++++++++++


 Passing a dict of functions will allow selective transforming per column.
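Reviewer note: selective transforming per column can be sketched as follows, assuming a hypothetical frame ``df`` whose ``B`` column holds perfect squares so the results are exact.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; "B" holds perfect squares so the square roots are exact.
df = pd.DataFrame({"A": [-1.0, 2.0, -3.0], "B": [4.0, 9.0, 16.0]})

# Each key names a column and each value is the function (or function
# name) applied to that column only.
selective = df.transform({"A": np.abs, "B": "sqrt"})

print(selective)
```

With one function per key, the result keeps flat column names and the original index.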
@@ -1051,14 +1048,6 @@ selective transforms.

    tsdf.transform({'A': np.abs, 'B': [lambda x: x + 1, 'sqrt']})

-On a Series, passing a dict allows renaming as in ``.agg()``
-
-.. ipython:: python
-
-   tsdf.A.transform({'foo': np.abs})
-   tsdf.A.transform({'foo': np.abs, 'bar': [lambda x: x + 1, 'sqrt']})
-
-
 .. _basics.elementwise:

 Applying Elementwise Functions