@@ -105,9 +105,9 @@ consider the following DataFrame:
 .. versionadded:: 0.20
 
 A string passed to ``groupby`` may refer to either a column or an index level.
-If a string matches both a column and an index level then a warning is issued
-and the column takes precedence. This will result in an ambiguity error in a
-future version.
+If a string matches both a column name and an index level name then a warning is
+issued and the column takes precedence. This will result in an ambiguity error
+in a future version.
 
 .. ipython:: python
 
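Note on the hunk above: the unambiguous case can be illustrated with a minimal sketch, assuming pandas 0.20 or later. The frame, the column names ``A`` and ``B``, and the index name ``idx`` are hypothetical and do not come from the documentation being changed. The ambiguous case (a string matching both a column and a level) is deliberately left out, because its behaviour depends on the installed pandas version.

```python
import pandas as pd

# Hypothetical frame: one grouping column ("A"), one value column ("B"),
# and a single named index level ("idx").
df = pd.DataFrame(
    {"A": ["x", "y", "x", "y"], "B": [1, 2, 3, 4]},
    index=pd.Index(["p", "p", "q", "q"], name="idx"),
)

# "A" matches only a column, so the string is resolved as a column key.
by_column = df.groupby("A")["B"].sum()

# "idx" matches only an index level name, so it is resolved as a level key.
by_level = df.groupby("idx")["B"].sum()

# Passing the level explicitly should give the same grouping as the string key.
by_level_explicit = df.groupby(level="idx")["B"].sum()

print(by_column)
print(by_level)
```

Because each string matches exactly one of the two namespaces here, no warning is expected for either call.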
@@ -247,17 +247,6 @@ the length of the ``groups`` dict, so it is largely just a convenience:
    gb.aggregate gb.count gb.cumprod gb.dtype gb.first gb.groups gb.hist gb.max gb.min gb.nth gb.prod gb.resample gb.sum gb.var
    gb.apply gb.cummax gb.cumsum gb.fillna gb.gender gb.head gb.indices gb.mean gb.name gb.ohlc gb.quantile gb.size gb.tail gb.weight
 
-
-.. ipython:: python
-   :suppress:
-
-   df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
-                            'foo', 'bar', 'foo', 'foo'],
-                      'B': ['one', 'one', 'two', 'three',
-                            'two', 'two', 'one', 'three'],
-                      'C': np.random.randn(8),
-                      'D': np.random.randn(8)})
-
 .. _groupby.multiindex:
 
 GroupBy with MultiIndex
@@ -299,7 +288,9 @@ chosen level:
 
    s.sum(level='second')
 
-Also as of v0.6, grouping with multiple levels is supported.
+.. versionadded:: 0.6
+
+Grouping with multiple levels is supported.
 
 .. ipython:: python
    :suppress:
@@ -316,15 +307,73 @@ Also as of v0.6, grouping with multiple levels is supported.
    s
    s.groupby(level=['first', 'second']).sum()
 
+.. versionadded:: 0.20
+
+Index level names may be supplied as keys.
+
+.. ipython:: python
+
+   s.groupby(['first', 'second']).sum()
+
 More on the ``sum`` function and aggregation later.
 
+Grouping DataFrame with Index Levels and Columns
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+A DataFrame may be grouped by a combination of columns and index levels by
+specifying the column names as strings and the index levels as ``pd.Grouper``
+objects.
+
+.. ipython:: python
+
+   arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
+             ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
+
+   index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
+
+   df = pd.DataFrame({'A': [1, 1, 1, 1, 2, 2, 3, 3],
+                      'B': np.arange(8)},
+                     index=index)
+
+   df
+
+The following example groups ``df`` by the ``second`` index level and
+the ``A`` column.
+
+.. ipython:: python
+
+   df.groupby([pd.Grouper(level=1), 'A']).sum()
+
+Index levels may also be specified by name.
+
+.. ipython:: python
+
+   df.groupby([pd.Grouper(level='second'), 'A']).sum()
+
+.. versionadded:: 0.20
+
+Index level names may be specified as keys directly to ``groupby``.
+
+.. ipython:: python
+
+   df.groupby(['second', 'A']).sum()
+
 DataFrame column selection in GroupBy
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Once you have created the GroupBy object from a DataFrame, for example, you
 might want to do something different for each of the columns. Thus, using
 ``[]`` similar to getting a column from a DataFrame, you can do:
 
+.. ipython:: python
+   :suppress:
+
+   df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
+                            'foo', 'bar', 'foo', 'foo'],
+                      'B': ['one', 'one', 'two', 'three',
+                            'two', 'two', 'one', 'three'],
+                      'C': np.random.randn(8),
+                      'D': np.random.randn(8)})
+
 .. ipython:: python
 
    grouped = df.groupby(['A'])
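
Note on the "Grouping DataFrame with Index Levels and Columns" section added in this diff: the three spellings it documents (level by position, level by name, and level name as a plain key) are expected to produce the same grouping. A minimal sketch to check this, assuming pandas 0.20 or later and reusing the frame from the added section; the variable names ``by_position``, ``by_name`` and ``by_key`` are illustrative only:

```python
import numpy as np
import pandas as pd

# Same two-level index and frame as in the section added by the diff.
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
df = pd.DataFrame({'A': [1, 1, 1, 1, 2, 2, 3, 3],
                   'B': np.arange(8)},
                  index=index)

# Index level given by position through pd.Grouper, combined with column "A".
by_position = df.groupby([pd.Grouper(level=1), 'A']).sum()

# Index level given by name through pd.Grouper.
by_name = df.groupby([pd.Grouper(level='second'), 'A']).sum()

# Index level name passed directly as a key (the 0.20 addition).
by_key = df.groupby(['second', 'A']).sum()

# All three spellings should yield identical results.
assert by_position.equals(by_name)
assert by_name.equals(by_key)

print(by_key)
```

The direct-key form is the shortest, but it only works when the level name does not collide with a column name; ``pd.Grouper(level=...)`` stays unambiguous in that situation.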