@@ -26,6 +26,70 @@ Other Enhancements
Backwards incompatible API changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+ Fast GroupBy.apply on ``DataFrame`` evaluates first group only once
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ (:issue:`2936`, :issue:`2656`, :issue:`7739`, :issue:`10519`, :issue:`12155`,
+ :issue:`20084`, :issue:`21417`)
+
+ The implementation of ``DataFrame.groupby.apply`` previously evaluated the
+ supplied function twice on the first group to infer whether it is safe to use
+ a fast code path. Particularly for functions with side effects, this was
+ undesired behavior and may have led to surprises.
+
+ The new behavior is that the first group is no longer evaluated twice if the
+ fast path can be used.
+
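The difference between the two behaviors can be modeled without pandas. The following is a minimal pure-Python sketch, not the pandas implementation: ``apply_with_probe``, its ``reuse_probe`` flag, and the ``(name, rows)`` group representation are hypothetical names introduced only for illustration. It assumes the first call acts as a probe whose result is either discarded (old behavior) or kept (new behavior).

```python
def apply_with_probe(groups, func, reuse_probe=True):
    """Toy model of a grouped apply that probes the first group.

    `groups` is a list of (name, rows) pairs. Hypothetical helper for
    illustration only; this is not how pandas is implemented.
    """
    results = {}
    items = iter(groups)
    first = next(items, None)
    if first is None:
        return results

    name, rows = first
    probe = func(name, rows)  # probe call on the first group
    if reuse_probe:
        results[name] = probe  # new behavior: keep the probe's result
    else:
        results[name] = func(name, rows)  # old behavior: evaluate again

    for name, rows in items:
        results[name] = func(name, rows)
    return results


calls = []


def count_rows(name, rows):
    calls.append(name)  # side effect that records each invocation
    return len(rows)


groups = [("x", [1]), ("y", [2])]

old = apply_with_probe(groups, count_rows, reuse_probe=False)
old_calls = list(calls)

calls.clear()
new = apply_with_probe(groups, count_rows, reuse_probe=True)
new_calls = list(calls)
```

In this model both variants return the same results; only the recorded side effects differ, which is the kind of surprise the change is meant to remove.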
+ Previous behavior:
+
+ .. code-block:: ipython
+
+     In [2]: df = pd.DataFrame({"a": ["x", "y"], "b": [1, 2]})
+
+     In [3]: side_effects = []
+
+     In [4]: def func_fast_apply(group):
+        ...:     side_effects.append(group.name)
+        ...:     return len(group)
+        ...:
+
+     In [5]: df.groupby("a").apply(func_fast_apply)
+
+     In [6]: assert side_effects == ["x", "x", "y"]
+
+ New behavior:
+
+ .. ipython:: python
+
+     df = pd.DataFrame({"a": ["x", "y"], "b": [1, 2]})
+
+     side_effects = []
+     def func_fast_apply(group):
+         """
+         This func doesn't modify inplace and returns
+         a scalar which is safe to fast apply
+         """
+         side_effects.append(group.name)
+         return len(group)
+
+     df.groupby("a").apply(func_fast_apply)
+     side_effects
+
+     side_effects.clear()
+     def identity(group):
+         """
+         This triggers the slow path because ``identity(group) is group``.
+         If there is no inplace modification happening,
+         this may be avoided by returning a copy,
+         i.e. return group.copy()
+         """
+         side_effects.append(group.name)
+         return group
+
+     df.groupby("a").apply(identity)
+     side_effects
+
+
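The ``identity(group) is group`` condition mentioned in the docstring above is an object-identity check. As a rough sketch only, assuming a rule of the form "returning the very same object that was passed in is treated as unsafe", it could look like the following. ``probe_fast_path`` and ``copied`` are hypothetical names, and plain lists stand in for group frames; the actual pandas check may involve more conditions.

```python
def probe_fast_path(func, group):
    """Call func once and report whether a fast path looks safe.

    Hypothetical rule for illustration: a result that is the same object
    as the input is treated as unsafe for the fast path.
    """
    result = func(group)
    return result is not group, result


def identity(group):
    return group  # same object comes back


def copied(group):
    return list(group)  # a new list object with the same items


rows = [1, 2, 3]
identity_is_fast, _ = probe_fast_path(identity, rows)
copied_is_fast, copied_result = probe_fast_path(copied, rows)
```

Under this assumed rule, returning a copy passes the identity check while returning the input itself does not, which mirrors the advice in the docstring.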
.. _whatsnew_0250.api.other:

Other API Changes