@@ -702,7 +702,8 @@ on an entire ``DataFrame`` or ``Series``, row- or column-wise, or elementwise.

1. `Tablewise Function Application`_: :meth:`~DataFrame.pipe`
2. `Row or Column-wise Function Application`_: :meth:`~DataFrame.apply`
- 3. Elementwise_ function application: :meth:`~DataFrame.applymap`
+ 3. `Aggregation API`_: :meth:`~DataFrame.agg` and :meth:`~DataFrame.transform`
+ 4. `Applying Elementwise Functions`_: :meth:`~DataFrame.applymap`

.. _basics.pipe:

@@ -778,6 +779,13 @@ statistics methods, take an optional ``axis`` argument:
   df.apply(np.cumsum)
   df.apply(np.exp)

+ ``.apply()`` will also dispatch on a string method name.
+
+ .. ipython:: python
+
+    df.apply('mean')
+    df.apply('mean', axis=1)
+
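As a minimal sketch of the string dispatch described above, the snippet below compares ``.apply('mean')`` with calling the method directly. The small frame is a hypothetical stand-in, not the ``df`` defined earlier in the document.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame(np.arange(12.0).reshape(4, 3), columns=["A", "B", "C"])

# A string is resolved to the DataFrame method of the same name.
col_means = df.apply("mean")          # column-wise, like df.mean()
row_means = df.apply("mean", axis=1)  # row-wise, like df.mean(axis=1)

print(col_means.equals(df.mean()))
print(row_means.equals(df.mean(axis=1)))
```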
Depending on the return type of the function passed to :meth:`~DataFrame.apply`,
the result will either be of lower dimension or the same dimension.

@@ -827,16 +835,212 @@ set to True, the passed function will instead receive an ndarray object, which
has positive performance implications if you do not need the indexing
functionality.

- .. seealso::
+ .. _basics.aggregate:
+
+ Aggregation API
+ ~~~~~~~~~~~~~~~
+
+ .. versionadded:: 0.20.0
+
+ The aggregation API lets you express one or more aggregation operations in a single, concise call.
+ This API is similar across pandas objects; see :ref:`groupby aggregates <groupby.aggregate>`,
+ :ref:`window functions <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
+
+ We will use a frame similar to the one above.
+
+ .. ipython:: python
+
+    tsdf = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],
+                        index=pd.date_range('1/1/2000', periods=10))
+    tsdf.iloc[3:7] = np.nan
+    tsdf
+
+ Using a single function is equivalent to ``.apply``. You can also pass named methods as strings.
+ Either form returns a Series of the aggregated output.
+
+ .. ipython:: python
+
+    tsdf.agg(np.sum)
+
+    tsdf.agg('sum')
+
+    # these are equivalent to a ``.sum()`` because we are aggregating on a single function
+    tsdf.sum()
+
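A small sketch of the single-function case, using a deterministic stand-in for ``tsdf`` (the values are made up for illustration). It checks that aggregating by method name matches the plain reduction, and that the same call on one column gives a scalar.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for ``tsdf``; values are illustrative only.
tsdf = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

by_name = tsdf.agg("sum")        # Series indexed by column name
direct = tsdf.sum()              # plain reduction; NaN is skipped by default
scalar = tsdf["A"].agg("sum")    # on a Series the result is a scalar

print(by_name.equals(direct))
print(scalar)
```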
+ On a Series this will result in a scalar value.
+
+ .. ipython:: python
+
+    tsdf.A.agg('sum')
+
+
+ Aggregating multiple functions at once
+ ++++++++++++++++++++++++++++++++++++++
+
+ You can pass multiple aggregation functions as a list. The result of each function will be a row in the resulting DataFrame.
+ The rows are named after the aggregation functions.
+
+ .. ipython:: python
+
+    tsdf.agg(['sum'])
+
+ Multiple functions yield multiple rows.
+
+ .. ipython:: python
+
+    tsdf.agg(['sum', 'mean'])
+
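To make the row-per-function layout concrete, here is a short sketch on a made-up frame (not the random ``tsdf`` used in the document): the result is indexed by function name, with the original columns preserved.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for ``tsdf``.
tsdf = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

result = tsdf.agg(["sum", "mean"])

# Rows are the aggregation names, columns are the original columns.
print(result.index.tolist())
print(result.columns.tolist())
```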
+ On a Series, multiple functions return a Series.
+
+ .. ipython:: python
+
+    tsdf.A.agg(['sum', 'mean'])
+
+
+ Aggregating with a dict of functions
+ ++++++++++++++++++++++++++++++++++++
+
+ Passing ``DataFrame.agg`` a dictionary that maps each column name to a function or list of functions
+ allows you to customize which functions are applied to which columns.
+
+ .. ipython:: python
+
+    tsdf.agg({'A': 'mean', 'B': 'sum'})
+
+ Passing a list-like will generate a DataFrame output. You will get a matrix-like output
+ of all of the aggregators; entries for aggregators not applied to a column will be missing values.
+
+ .. ipython:: python
+
+    tsdf.agg({'A': ['mean', 'min'], 'B': 'sum'})
+
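The following sketch illustrates both dict forms on a made-up frame. It uses lists for every value in the second call as a simplifying assumption; the point is that any (function, column) pair that was not requested shows up as ``NaN``.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for ``tsdf``.
tsdf = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

# Dict of single functions: one value per selected column.
per_column = tsdf.agg({"A": "mean", "B": "sum"})

# Dict of lists: a DataFrame with one row per aggregator.
table = tsdf.agg({"A": ["mean", "min"], "B": ["sum"]})

print(per_column)
print(table)
```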
+ For a Series, you can pass a dict; the keys will set the name of the column.
+
+ .. ipython:: python
+
+    tsdf.A.agg({'foo': ['sum', 'mean']})
+
+ Alternatively, using a dict with multiple keys, you can rename the elements of the aggregation.
+
+ .. ipython:: python
+
+    tsdf.A.agg({'foo': 'sum', 'bar': 'mean'})
+
+ Multiple keys will yield multiple columns.
+
+ .. ipython:: python
+
+    tsdf.A.agg({'foo': ['sum', 'mean'], 'bar': ['min', 'max', lambda x: x.sum()+1]})
+
+ .. _basics.custom_describe:
+
+ Custom describe
+ +++++++++++++++
+
+ With ``.agg()`` it is possible to easily create a custom describe function, similar
+ to the built-in :ref:`describe function <basics.describe>`.
+
+ .. ipython:: python
+
+    from functools import partial
+
+    q_25 = partial(pd.Series.quantile, q=0.25)
+    q_25.__name__ = '25%'
+    q_75 = partial(pd.Series.quantile, q=0.75)
+    q_75.__name__ = '75%'
+
+    tsdf.agg(['count', 'mean', 'std', 'min', q_25, 'median', q_75, 'max'])
+
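A variation on the same idea, sketched with plain named functions instead of ``partial`` objects. The helpers ``q25`` and ``q75`` are hypothetical names introduced here; the row labels are assumed to come from each function's ``__name__``.

```python
import pandas as pd

def q25(x):
    """Hypothetical helper: lower quartile of a Series."""
    return x.quantile(0.25)

def q75(x):
    """Hypothetical helper: upper quartile of a Series."""
    return x.quantile(0.75)

# Illustrative frame without missing values.
frame = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0],
                      "B": [10.0, 20.0, 30.0, 40.0, 50.0]})

summary = frame.agg(["count", "mean", q25, "median", q75])
print(summary)
```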
+ .. _basics.transform:
+
+ Transform API
+ ~~~~~~~~~~~~~
+
+ .. versionadded:: 0.20.0
+
+ The ``transform`` method returns an object that is indexed the same (same size)
+ as the original. This API allows you to provide *multiple* operations at the same
+ time rather than one-by-one. Its API is quite similar to the ``.agg`` API.
+
+ We use a frame similar to the one in the sections above.

- The section on :ref:`GroupBy <groupby>` demonstrates related, flexible
- functionality for grouping by some criterion, applying, and combining the
- results into a Series, DataFrame, etc.
+ .. ipython:: python
+
+    tsdf = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],
+                        index=pd.date_range('1/1/2000', periods=10))
+    tsdf.iloc[3:7] = np.nan
+    tsdf
+
+ Transform the entire frame. ``.transform()`` accepts a NumPy function, a string
+ function name, or a user-defined function.
+
+ .. ipython:: python

- .. _Elementwise:
+    tsdf.transform(np.abs)
+    tsdf.transform('abs')
+    tsdf.transform(lambda x: x.abs())
+
+ Since this is a single function, this is equivalent to a ufunc application.
+
+ .. ipython:: python
+
+    np.abs(tsdf)
+
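As a rough check of the equivalence claimed above, this sketch runs the three spellings on a small made-up frame and compares them with the direct ufunc call. The result keeps the shape and index of the input.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for ``tsdf``.
tsdf = pd.DataFrame({"A": [-1.0, np.nan, 3.0], "B": [4.0, -5.0, np.nan]})

by_ufunc = tsdf.transform(np.abs)
by_name = tsdf.transform("abs")
by_lambda = tsdf.transform(lambda x: x.abs())

# All three agree with applying the ufunc to the frame directly.
print(by_name.equals(np.abs(tsdf)))
```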
+ Passing a single function to ``.transform()`` with a Series will yield a single Series in return.
+
+ .. ipython:: python

- Applying elementwise Python functions
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    tsdf.A.transform(np.abs)
+
+
+ Transform with multiple functions
+ +++++++++++++++++++++++++++++++++
+
+ Passing multiple functions will yield a column multi-indexed DataFrame.
+ The first level will be the original frame column names; the second level
+ will be the names of the transforming functions.
+
+ .. ipython:: python
+
+    tsdf.transform([np.abs, lambda x: x+1])
+
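A hedged sketch of the two-level column layout, using a string name and a named helper (``plus_one`` is a hypothetical function introduced here so the second level is easier to read than ``<lambda>``).

```python
import numpy as np
import pandas as pd

def plus_one(x):
    """Hypothetical helper: add one to every value."""
    return x + 1

# Illustrative stand-in for ``tsdf``.
tsdf = pd.DataFrame({"A": [-1.0, np.nan, 3.0], "B": [4.0, -5.0, np.nan]})

result = tsdf.transform(["abs", plus_one])

# Level 0: original columns; level 1: names of the transforming functions.
print(result.columns.tolist())
```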
+ Passing multiple functions to a Series will yield a DataFrame. The
+ resulting column names will be the names of the transforming functions.
+
+ .. ipython:: python
+
+    tsdf.A.transform([np.abs, lambda x: x+1])
+
+
+ Transforming with a dict of functions
+ +++++++++++++++++++++++++++++++++++++
+
+
+ Passing a dict of functions will allow selective transforming per column.
+
+ .. ipython:: python
+
+    tsdf.transform({'A': np.abs, 'B': lambda x: x+1})
+
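To illustrate the per-column selection, this sketch adds a third column ``C`` (an assumption for illustration) that is not named in the dict; only the columns listed as keys appear in the result.

```python
import numpy as np
import pandas as pd

# Illustrative frame; column ``C`` is deliberately left out of the dict below.
tsdf = pd.DataFrame({"A": [-1.0, np.nan, 3.0],
                     "B": [4.0, -5.0, np.nan],
                     "C": [7.0, 8.0, 9.0]})

result = tsdf.transform({"A": "abs", "B": lambda x: x + 1})

# Same number of rows as the input, but only the selected columns.
print(result.columns.tolist())
```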
+ Passing a dict of lists will generate a multi-indexed DataFrame with these
+ selective transforms.
+
+ .. ipython:: python
+
+    tsdf.transform({'A': np.abs, 'B': [lambda x: x+1, 'sqrt']})
+
+ On a Series, passing a dict allows renaming as in ``.agg()``.
+
+ .. ipython:: python
+
+    tsdf.A.transform({'foo': np.abs})
+    tsdf.A.transform({'foo': np.abs, 'bar': [lambda x: x+1, 'sqrt']})
+
+
+ .. _basics.elementwise:
+
+ Applying Elementwise Functions
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since not all functions can be vectorized (accept NumPy arrays and return
another array or value), the methods :meth:`~DataFrame.applymap` on DataFrame