@@ -700,7 +700,8 @@ on an entire ``DataFrame`` or ``Series``, row- or column-wise, or elementwise.
1. `Tablewise Function Application`_: :meth:`~DataFrame.pipe`
2. `Row or Column-wise Function Application`_: :meth:`~DataFrame.apply`
- 3. Elementwise_ function application: :meth:`~DataFrame.applymap`
+ 3. `Aggregation API`_: :meth:`~DataFrame.agg` and :meth:`~DataFrame.transform`
+ 4. `Applying Elementwise Functions`_: :meth:`~DataFrame.applymap`

.. _basics.pipe:
@@ -776,6 +777,13 @@ statistics methods, take an optional ``axis`` argument:
   df.apply(np.cumsum)
   df.apply(np.exp)

+ ``.apply()`` will also dispatch on a string method name.
+
+ .. ipython:: python
+
+    df.apply('mean')
+    df.apply('mean', axis=1)
+
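
As a minimal sketch of the string dispatch described above: passing a method name to ``.apply()`` should give the same result as calling that method directly. The frame ``df`` and its column names here are illustrative assumptions, not the frame used elsewhere in this document.

```python
import pandas as pd

# Hypothetical small frame, used only for illustration.
df = pd.DataFrame({"one": [1.0, 2.0, 3.0], "two": [4.0, 5.0, 6.0]})

# The string 'mean' is resolved to the DataFrame.mean method.
col_means = df.apply("mean")          # one value per column
row_means = df.apply("mean", axis=1)  # one value per row

print(col_means)
print(row_means)
```

Passing the name as a string avoids importing a function just to reduce a frame with a built-in method.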
Depending on the return type of the function passed to :meth:`~DataFrame.apply`,
the result will either be of lower dimension or the same dimension.
@@ -825,16 +833,186 @@ set to True, the passed function will instead receive an ndarray object, which
has positive performance implications if you do not need the indexing
functionality.

- .. seealso::
+ .. _basics.aggregate:
+
+ Aggregation API
+ ~~~~~~~~~~~~~~~
+
+ .. versionadded:: 0.20.0
+
+ The aggregation API allows one to express possibly multiple aggregation operations in a single concise way.
+ This API is similar across pandas objects, :ref:`groupby aggregates <groupby.aggregate>`,
+ :ref:`window functions <stats.aggregate>`, and the :ref:`resample API <timeseries.aggregate>`.
+
+ We will use a starting frame similar to the one above.
+
+ .. ipython:: python
+
+    tsdf = pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'],
+                        index=pd.date_range('1/1/2000', periods=10))
+    tsdf.iloc[3:7] = np.nan
+    tsdf
855
+
856
+ Using a single function is equivalent to ``.apply ``; You can also pass named methods as strings.
857
+ This will return a Series of the output.
858
+
859
+ .. ipython :: python
860
+
861
+ tsdf.agg(np.sum)
862
+
863
+ tsdf.agg(' sum' )
864
+
865
+ On a Series this will result in a scalar value
866
+
867
+ .. ipython :: python
868
+
869
+ tsdf.A.agg(' sum' )
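
The return types described above can be checked with a small sketch. The frame ``frame`` is a hypothetical stand-in with a few missing values, not the random ``tsdf`` used in this section.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with missing values, for illustration only.
frame = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

# A single function on a DataFrame reduces each column: the result is a Series.
totals = frame.agg("sum")

# The same call on a Series reduces it to a scalar.
scalar = frame["A"].agg("sum")

print(type(totals).__name__, totals.to_dict())
print(scalar)
```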
870
+
871
+
872
+ Aggregating multiple functions at once
873
+ ++++++++++++++++++++++++++++++++++++++
874
+
875
+ You can pass arguments as a list. The results of each of the passed functions will be a row in the resultant DataFrame.
876
+ These are naturally named from the aggregation function.
877
+
878
+ .. ipython :: python
879
+
880
+ tsdf.agg([' sum' ])
881
+
882
+ Multiple functions yield multiple rows.
883
+
884
+ .. ipython :: python
885
+
886
+ tsdf.agg([' sum' , ' mean' ])
887
+
888
+ On a Series, multiple functions return a Series.
889
+
890
+ .. ipython :: python
891
+
892
+ tsdf.A.agg([' sum' , ' mean' ])
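
A short sketch of the list form, assuming a hypothetical frame ``frame`` rather than ``tsdf``: each function name becomes a row label on a DataFrame, and an index label on a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical frame, for illustration only.
frame = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

# One row per aggregation function, one column per original column.
table = frame.agg(["sum", "mean"])

# On a Series the result is a Series indexed by the function names.
series_result = frame["A"].agg(["sum", "mean"])

print(table)
print(series_result)
```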
893
+
894
+
895
+ Aggregating with a dict of functions
896
+ ++++++++++++++++++++++++++++++++++++
897
+
898
+ Passing a dictionary of column name to function or list of functions, to ``DataFame.agg ``
899
+ allows you to customize which functions are applied to which columns.
900
+
901
+ .. ipython :: python
902
+
903
+ tsdf.agg({' A' : ' mean' , ' B' : ' sum' })
904
+
905
+ Passing a list-like will generate a DataFrame output. You will get a matrix-like output
906
+ of all of the aggregators; some may be missing values.
907
+
908
+ .. ipython :: python
909
+
910
+ tsdf.agg({' A' : [' mean' , ' min' ], ' B' : ' sum' })
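
The "missing values" remark can be made concrete with a sketch. It assumes a hypothetical frame ``frame``; the point is that a function requested for one column only leaves NaN in the other column's cell.

```python
import numpy as np
import pandas as pd

# Hypothetical frame, for illustration only.
frame = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

# Scalar values in the dict: the result is a Series keyed by column name.
per_column = frame.agg({"A": "mean", "B": "sum"})

# At least one list value: the result is a DataFrame whose rows are the
# union of all requested functions. Cells that were not requested are NaN.
mixed = frame.agg({"A": ["mean", "min"], "B": "sum"})

print(per_column)
print(mixed)
```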
911
+
912
+ For a Series, you can pass a dict; the keys will set the name of the column
913
+
914
+ .. ipython :: python
915
+
916
+ tsdf.A.agg({' foo' : [' sum' , ' mean' ]})
917
+
918
+ Multiple keys will yield multiple columns.
919
+
920
+ .. ipython :: python
921
+
922
+ tsdf.A.agg({' foo' : [' sum' , ' mean' ], ' bar' : [' min' , ' max' , lambda x : x.sum()+ 1 ]})
923
+
924
+
925
+ .. _basics.transform :
926
+
927
+ Transform API
928
+ ~~~~~~~~~~~~~
929
+
930
+ .. versionadded :: 0.20.0
931
+
932
+ The ``transform `` method returns an object that is indexed the same (same size)
933
+ as the original. This API allows you to provide *multiple * operations at the same
934
+ time rather than one-by-one. Its api is quite similar to the ``.agg `` API.
935
+
936
+ Use a similar frame to the above sections.
937
+
938
+ .. ipython :: python
939
+
940
+ tsdf = pd.DataFrame(np.random.randn(10 , 3 ), columns = [' A' , ' B' , ' C' ],
941
+ index = pd.date_range(' 1/1/2000' , periods = 10 ))
942
+ tsdf.iloc[3 :7 ] = np.nan
943
+ tsdf
944
+
945
+ Transform the entire frame. Transform allows functions to input as a numpy function, string
946
+ function name and user defined function.
947
+
948
+ .. ipython :: python
949
+
950
+ tsdf.transform(np.abs)
951
+ tsdf.transform(' abs' )
952
+ tsdf.transform(lambda x : x.abs())
953
+
954
+ ``.transform() `` with a single function is equivalent to applying a function across the
955
+ columns.
956
+
957
+ .. ipython :: python
958
+
959
+ tsdf.apply(np.abs, axis = 1 )
960
+
961
+ Passing a single function to ``.transform() `` with a Series will yield a single Series in return.
962
+
963
+ .. ipython :: python
964
+
965
+ tsdf.A.transform(np.abs)
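
A minimal sketch of the three input forms, assuming a hypothetical frame ``frame`` in place of ``tsdf``. It checks that the NumPy function, the string name and the lambda agree, and that the result keeps the original index and shape.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with negative and missing values, for illustration only.
frame = pd.DataFrame({"A": [-1.0, np.nan, 3.0], "B": [4.0, -5.0, np.nan]})

by_func = frame.transform(np.abs)                # NumPy function
by_name = frame.transform("abs")                 # string method name
by_lambda = frame.transform(lambda x: x.abs())   # user-defined function

# All three forms should agree, and the index is unchanged.
print(by_func.equals(by_name), by_name.equals(by_lambda))
print(by_func.index.equals(frame.index))
```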

- The section on :ref:`GroupBy <groupby>` demonstrates related, flexible
- functionality for grouping by some criterion, applying, and combining the
- results into a Series, DataFrame, etc.

- .. _Elementwise:
+ Transform with multiple functions
+ +++++++++++++++++++++++++++++++++

- Applying elementwise Python functions
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ Passing multiple functions will yield a DataFrame with multi-indexed columns.
+ The first level will be the original frame column names; the second level
+ will be the names of the transforming functions.
+
+ .. ipython:: python
+
+    tsdf.transform([np.abs, lambda x: x + 1])
+
+ Passing multiple functions to a Series will yield a DataFrame. The
+ resulting column names will be the transforming functions.
+
+ .. ipython:: python
+
+    tsdf.A.transform([np.abs, lambda x: x + 1])
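
The column layout described above can be inspected with a sketch. The frame ``frame`` is hypothetical, and the exact second-level labels depend on how pandas names each function, so only the structure is examined here.

```python
import numpy as np
import pandas as pd

# Hypothetical frame, for illustration only.
frame = pd.DataFrame({"A": [-1.0, np.nan, 3.0], "B": [4.0, -5.0, np.nan]})

# Two functions per column: the columns become a two-level MultiIndex.
multi = frame.transform(["abs", lambda x: x + 1])

# On a Series, the same list yields a DataFrame with one column per function.
single = frame["A"].transform(["abs", lambda x: x + 1])

print(multi.columns.tolist())
print(single.columns.tolist())
```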
985
+
986
+
987
+ Transforming with a dict of functions
988
+ +++++++++++++++++++++++++++++++++++++
989
+
990
+
991
+ Passing a dict of functions will will allow selective transforming per column.
992
+
993
+ .. ipython :: python
994
+
995
+ tsdf.transform({' A' : np.abs, ' B' : lambda x : x+ 1 })
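
A sketch of selective transforming, assuming a hypothetical frame ``frame``: each column named in the dict is transformed by its own function, and the result can be compared against the per-column computations.

```python
import numpy as np
import pandas as pd

# Hypothetical frame, for illustration only.
frame = pd.DataFrame({"A": [-1.0, np.nan, 3.0], "B": [4.0, -5.0, np.nan]})

# Column 'A' gets absolute values; column 'B' is shifted by one.
selective = frame.transform({"A": np.abs, "B": lambda x: x + 1})

print(selective)
```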
996
+
997
+ Passing a dict of lists will generate a multi-indexed DataFrame with these
998
+ selective transforms.
999
+
1000
+ .. ipython :: python
1001
+
1002
+ tsdf.transform({' A' : np.abs, ' B' : [lambda x : x+ 1 , ' sqrt' ]})
1003
+
1004
+ On a Series, passing a dict allows renaming as in ``.agg() ``
1005
+
1006
+ .. ipython :: python
1007
+
1008
+ tsdf.A.transform({' foo' : np.abs})
1009
+ tsdf.A.transform({' foo' : np.abs, ' bar' : [lambda x : x+ 1 , ' sqrt' ]})
1010
+
1011
+
1012
+ .. _basics.elementwise :
1013
+
1014
+ Applying Elementwise Functions
1015
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
838
1016
839
1017
Since not all functions can be vectorized (accept NumPy arrays and return
840
1018
another array or value), the methods :meth: `~DataFrame.applymap ` on DataFrame