@@ -188,21 +188,33 @@ class providing the base-class of operations.
     >>> df = pd.DataFrame({'A': 'a a b'.split(),
     ...                    'B': [1,2,3],
     ...                    'C': [4,6,5]})
-    >>> g = df.groupby('A')
+    >>> g1 = df.groupby('A', group_keys=False)
+    >>> g2 = df.groupby('A', group_keys=True)
 
-    Notice that ``g`` has two groups, ``a`` and ``b``.
-    Calling `apply` in various ways, we can get different grouping results:
+    Notice that ``g1`` and ``g2`` have two groups, ``a`` and ``b``, and only
+    differ in their ``group_keys`` argument. Calling `apply` in various ways,
+    we can get different grouping results:
 
     Example 1: below the function passed to `apply` takes a DataFrame as
     its argument and returns a DataFrame. `apply` combines the result for
     each group together into a new DataFrame:
 
-    >>> g[['B', 'C']].apply(lambda x: x / x.sum())
+    >>> g1[['B', 'C']].apply(lambda x: x / x.sum())
               B    C
     0  0.333333  0.4
     1  0.666667  0.6
     2  1.000000  1.0
 
+    In the above, the groups are not part of the index. We can have them included
+    by using ``g2`` where ``group_keys=True``:
+
+    >>> g2[['B', 'C']].apply(lambda x: x / x.sum())
+                B    C
+    A
+    a 0  0.333333  0.4
+      1  0.666667  0.6
+    b 2  1.000000  1.0
+
     Example 2: The function passed to `apply` takes a DataFrame as
     its argument and returns a Series. `apply` combines the result for
     each group together into a new DataFrame.
@@ -211,28 +223,41 @@ class providing the base-class of operations.
 
     The resulting dtype will reflect the return value of the passed ``func``.
 
-    >>> g[['B', 'C']].apply(lambda x: x.astype(float).max() - x.min())
+    >>> g1[['B', 'C']].apply(lambda x: x.astype(float).max() - x.min())
+         B    C
+    A
+    a  1.0  2.0
+    b  0.0  0.0
+
+    >>> g2[['B', 'C']].apply(lambda x: x.astype(float).max() - x.min())
          B    C
     A
     a  1.0  2.0
     b  0.0  0.0
 
+    The ``group_keys`` argument has no effect here because the result is not
+    like-indexed (i.e. :ref:`a transform <groupby.transform>`) when compared
+    to the input.
+
     Example 3: The function passed to `apply` takes a DataFrame as
     its argument and returns a scalar. `apply` combines the result for
     each group together into a Series, including setting the index as
     appropriate:
 
-    >>> g.apply(lambda x: x.C.max() - x.B.min())
+    >>> g1.apply(lambda x: x.C.max() - x.B.min())
     A
     a    5
     b    2
     dtype: int64""",
     "series_examples": """
     >>> s = pd.Series([0, 1, 2], index='a a b'.split())
-    >>> g = s.groupby(s.index)
+    >>> g1 = s.groupby(s.index, group_keys=False)
+    >>> g2 = s.groupby(s.index, group_keys=True)
 
     From ``s`` above we can see that ``g`` has two groups, ``a`` and ``b``.
-    Calling `apply` in various ways, we can get different grouping results:
+    Notice that ``g1`` and ``g2`` have two groups, ``a`` and ``b``, and only
+    differ in their ``group_keys`` argument. Calling `apply` in various ways,
+    we can get different grouping results:
 
     Example 1: The function passed to `apply` takes a Series as
     its argument and returns a Series. `apply` combines the result for
@@ -242,18 +267,36 @@ class providing the base-class of operations.
 
     The resulting dtype will reflect the return value of the passed ``func``.
 
-    >>> g.apply(lambda x: x*2 if x.name == 'a' else x/2)
+    >>> g1.apply(lambda x: x*2 if x.name == 'a' else x/2)
     a    0.0
     a    2.0
     b    1.0
     dtype: float64
 
+    In the above, the groups are not part of the index. We can have them included
+    by using ``g2`` where ``group_keys=True``:
+
+    >>> g2.apply(lambda x: x*2 if x.name == 'a' else x/2)
+    a  a    0.0
+       a    2.0
+    b  b    1.0
+    dtype: float64
+
     Example 2: The function passed to `apply` takes a Series as
     its argument and returns a scalar. `apply` combines the result for
     each group together into a Series, including setting the index as
     appropriate:
 
-    >>> g.apply(lambda x: x.max() - x.min())
+    >>> g1.apply(lambda x: x.max() - x.min())
+    a    1
+    b    0
+    dtype: int64
+
+    The ``group_keys`` argument has no effect here because the result is not
+    like-indexed (i.e. :ref:`a transform <groupby.transform>`) when compared
+    to the input.
+
+    >>> g2.apply(lambda x: x.max() - x.min())
     a    1
     b    0
     dtype: int64""",
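
The behavior described in the updated docstrings can be checked with a short standalone script. This is a minimal sketch, assuming a pandas release that already includes this change (so that an explicit ``group_keys`` is honored for like-indexed results). The helper names `normalize` and `spread` are illustrative only and are not part of the patch.

```python
import pandas as pd

df = pd.DataFrame({"A": "a a b".split(), "B": [1, 2, 3], "C": [4, 6, 5]})


def normalize(x):
    # Like-indexed result: keeps the same index as the input group.
    return x / x.sum()


def spread(x):
    # Reducing result: one Series per group, so not like-indexed.
    return x.astype(float).max() - x.min()


# Like-indexed results: group_keys decides whether the keys enter the index.
without_keys = df.groupby("A", group_keys=False)[["B", "C"]].apply(normalize)
with_keys = df.groupby("A", group_keys=True)[["B", "C"]].apply(normalize)
print(without_keys.index.nlevels, with_keys.index.nlevels)

# Reducing results: group_keys makes no difference.
agg_without = df.groupby("A", group_keys=False)[["B", "C"]].apply(spread)
agg_with = df.groupby("A", group_keys=True)[["B", "C"]].apply(spread)
print(agg_without.equals(agg_with))

# The same comparison for a Series grouped by its own index.
s = pd.Series([0, 1, 2], index="a a b".split())


def rescale(x):
    # x.name holds the key of the group being processed.
    return x * 2 if x.name == "a" else x / 2


s_without_keys = s.groupby(s.index, group_keys=False).apply(rescale)
s_with_keys = s.groupby(s.index, group_keys=True).apply(rescale)
print(s_without_keys.index.nlevels, s_with_keys.index.nlevels)
```

Comparing `index.nlevels` rather than printed frames keeps the check independent of display formatting. On older pandas versions the like-indexed case may behave differently.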