From e6c2d59e8e1d170f1347a51dc104e5bb6975a108 Mon Sep 17 00:00:00 2001 From: LJArendse Date: Mon, 10 Dec 2018 19:05:43 +0200 Subject: [PATCH 01/15] DOC: Remove doc/source/groupby.rst from flake8-rst exclude section in setup.cfg (#24178) --- setup.cfg | 1 - 1 file changed, 1 deletion(-) diff --git a/setup.cfg b/setup.cfg index 11fd07006fda4..0c4d582022c5d 100644 --- a/setup.cfg +++ b/setup.cfg @@ -79,7 +79,6 @@ exclude = doc/source/basics.rst doc/source/contributing_docstring.rst doc/source/enhancingperf.rst - doc/source/groupby.rst doc/source/indexing.rst doc/source/missing_data.rst doc/source/options.rst From 51c129a477b30af08a102427b184355c5e1cb408 Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 13 Dec 2018 20:54:37 +0200 Subject: [PATCH 02/15] DOC: Fix E225 and E999 caused by . with # noqa: E225, E999 --- doc/source/groupby.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index 76481b8cc765a..706f41583c2ec 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -239,7 +239,7 @@ the length of the ``groups`` dict, so it is largely just a convenience: .. ipython:: @verbatim - In [1]: gb. + In [1]: gb. 
# noqa: E225, E999 gb.agg gb.boxplot gb.cummin gb.describe gb.filter gb.get_group gb.height gb.last gb.median gb.ngroups gb.plot gb.rank gb.std gb.transform gb.aggregate gb.count gb.cumprod gb.dtype gb.first gb.groups gb.hist gb.max gb.min gb.nth gb.prod gb.resample gb.sum gb.var gb.apply gb.cummax gb.cumsum gb.fillna gb.gender gb.head gb.indices gb.mean gb.name gb.ohlc gb.quantile gb.size gb.tail gb.weight From 9209528746c2e0379b237badba777cdc7f6fe6ca Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 13 Dec 2018 21:41:04 +0200 Subject: [PATCH 03/15] DOC: Fix doc/source/groupby.rst:1306:42: F821 undefined name 'report_func' (#24178) --- doc/source/groupby.rst | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index 706f41583c2ec..cd31ae9e9e0ab 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -1301,11 +1301,15 @@ Piping can also be expressive when you want to deliver a grouped object to some arbitrary function, for example: .. code-block:: python + def mean(groupby): + return(groupby.mean()) - df.groupby(['Store', 'Product']).pipe(report_func) + df.groupby(['Store', 'Product']).pipe(mean) -where ``report_func`` takes a GroupBy object and creates a report -from that. +where ``mean`` takes a GroupBy object and finds the mean of the Revenue and Quantity +columns respectively for each Store-Product combination. The ``mean`` function can +be any function that takes in a GroupBy object; the ``.pipe`` will pass the GroupBy +object as a parameter into the function you specify. 
Examples -------- From 41a2e4791e9a4835fb53ed1f754dc2ad1f9fcaea Mon Sep 17 00:00:00 2001 From: LJArendse Date: Wed, 19 Dec 2018 22:50:29 +0200 Subject: [PATCH 04/15] DOC: Fix the following 'errors' (#24178): doc/source/groupby.rst:72:18: F821 undefined name 'obj' doc/source/groupby.rst:72:30: F821 undefined name 'key' doc/source/groupby.rst:73:18: F821 undefined name 'obj' doc/source/groupby.rst:73:30: F821 undefined name 'key' doc/source/groupby.rst:74:18: F821 undefined name 'obj' doc/source/groupby.rst:74:31: F821 undefined name 'key1' doc/source/groupby.rst:74:37: F821 undefined name 'key2' --- doc/source/groupby.rst | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index cd31ae9e9e0ab..907283d36029f 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -62,16 +62,30 @@ See the :ref:`cookbook` for some advanced strategies. Splitting an object into groups ------------------------------- -pandas objects can be split on any of their axes. The abstract definition of +Pandas objects can be split on any of their axes. The abstract definition of grouping is to provide a mapping of labels to group names. To create a GroupBy -object (more on what the GroupBy object is later), you may do the following: +object, see below (more on what the GroupBy object is later). A +groupby can be applied in the following ways to a pandas object: + +* grouped = obj.groupby(key) +* grouped = obj.groupby(key, axis='columns') +* grouped = obj.groupby([key1, key2]) .. 
code-block:: python - # default is axis=0 - >>> grouped = obj.groupby(key) - >>> grouped = obj.groupby(key, axis=1) - >>> grouped = obj.groupby([key1, key2]) + df = pd.DataFrame( + [('bird', 'Falconiformes', 389.0), + ('bird', 'Psittaciformes', 24.0), + ('mammal', 'Carnivora', 80.2), + ('mammal', 'Primates', np.nan), + ('mammal', 'Carnivora', 58)], + index=['falcon', 'parrot', 'lion', 'monkey', 'leopard'], + columns=('class', 'order', 'max_speed') + ) + + grouped = df.groupby('class') + grouped = df.groupby('order', axis='columns') + grouped = df.groupby(['class', 'order']) The mapping can be specified many different ways: From dec7e639085bc2eb9a8800c9974d285471c59e58 Mon Sep 17 00:00:00 2001 From: LJArendse Date: Wed, 19 Dec 2018 23:09:47 +0200 Subject: [PATCH 05/15] DOC: Remove groupby from flake8-rst section exclusions (redo after merge with master) --- setup.cfg | 1 - 1 file changed, 1 deletion(-) diff --git a/setup.cfg b/setup.cfg index 380100df774c1..c199567737531 100644 --- a/setup.cfg +++ b/setup.cfg @@ -53,7 +53,6 @@ exclude = doc/source/basics.rst doc/source/contributing_docstring.rst doc/source/enhancingperf.rst - doc/source/groupby.rst [yapf] From e5e05dc895a2d37db8ce120124a6ee2fd669d0e5 Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 10:16:10 +0200 Subject: [PATCH 06/15] Revert "DOC: Remove groupby from flake8-rst section exclusions (redo after merge with master)" This reverts commit dec7e639085bc2eb9a8800c9974d285471c59e58. 
--- setup.cfg | 1 + 1 file changed, 1 insertion(+) diff --git a/setup.cfg b/setup.cfg index c199567737531..380100df774c1 100644 --- a/setup.cfg +++ b/setup.cfg @@ -53,6 +53,7 @@ exclude = doc/source/basics.rst doc/source/contributing_docstring.rst doc/source/enhancingperf.rst + doc/source/groupby.rst [yapf] From b80ff177293437d0a306127e407ba8c671dd6b4f Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 10:17:11 +0200 Subject: [PATCH 07/15] Revert "DOC: Fix the following 'errors' (#24178):" This reverts commit 41a2e4791e9a4835fb53ed1f754dc2ad1f9fcaea. --- doc/source/groupby.rst | 26 ++++++-------------------- 1 file changed, 6 insertions(+), 20 deletions(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index 907283d36029f..cd31ae9e9e0ab 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -62,30 +62,16 @@ See the :ref:`cookbook` for some advanced strategies. Splitting an object into groups ------------------------------- -Pandas objects can be split on any of their axes. The abstract definition of +pandas objects can be split on any of their axes. The abstract definition of grouping is to provide a mapping of labels to group names. To create a GroupBy -object, see below (more on what the GroupBy object is later). A -groupby can be applied in the following ways to a pandas object: - -* grouped = obj.groupby(key) -* grouped = obj.groupby(key, axis='columns') -* grouped = obj.groupby([key1, key2]) +object (more on what the GroupBy object is later), you may do the following: .. 
code-block:: python - df = pd.DataFrame( - [('bird', 'Falconiformes', 389.0), - ('bird', 'Psittaciformes', 24.0), - ('mammal', 'Carnivora', 80.2), - ('mammal', 'Primates', np.nan), - ('mammal', 'Carnivora', 58)], - index=['falcon', 'parrot', 'lion', 'monkey', 'leopard'], - columns=('class', 'order', 'max_speed') - ) - - grouped = df.groupby('class') - grouped = df.groupby('order', axis='columns') - grouped = df.groupby(['class', 'order']) + # default is axis=0 + >>> grouped = obj.groupby(key) + >>> grouped = obj.groupby(key, axis=1) + >>> grouped = obj.groupby([key1, key2]) The mapping can be specified many different ways: From 28eb57f0f731b72e901158161792a0d364c1a661 Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 11:38:47 +0200 Subject: [PATCH 08/15] DOC: Add blank line --- doc/source/groupby.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index cd31ae9e9e0ab..99e57c945b3c2 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -1301,6 +1301,7 @@ Piping can also be expressive when you want to deliver a grouped object to some arbitrary function, for example: .. code-block:: python + def mean(groupby): return(groupby.mean()) From 402b5832b32f8fb26b09e0c1753e2d24fd38dadd Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 14:27:06 +0200 Subject: [PATCH 09/15] DOC: Change code-block to ipython --- doc/source/groupby.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index 99e57c945b3c2..6347a478474c4 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -1300,7 +1300,7 @@ Now, to find prices per store/product, we can simply do: Piping can also be expressive when you want to deliver a grouped object to some arbitrary function, for example: -.. code-block:: python +.. 
ipython:: python def mean(groupby): return(groupby.mean()) From 00b03f0c5f2e1d05f1da98faafb37ab0575f3fcd Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 14:46:07 +0200 Subject: [PATCH 10/15] DOC: Remove return statement brackets --- doc/source/groupby.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index 6347a478474c4..f72db0ff4a279 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -1303,7 +1303,7 @@ arbitrary function, for example: .. ipython:: python def mean(groupby): - return(groupby.mean()) + return groupby.mean() df.groupby(['Store', 'Product']).pipe(mean) From c30f4567b60298836f867515f0f761da8807a3f0 Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 15:30:26 +0200 Subject: [PATCH 11/15] DOC: Reapply code example fix from 41a2e4791e9a4835fb53ed1f754dc2ad1f9fcaea --- doc/source/groupby.rst | 22 +++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index f72db0ff4a279..480df0b15170c 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -66,12 +66,24 @@ pandas objects can be split on any of their axes. The abstract definition of grouping is to provide a mapping of labels to group names. To create a GroupBy object (more on what the GroupBy object is later), you may do the following: -.. code-block:: python +.. 
ipython:: python + + df = pd.DataFrame( + [('bird', 'Falconiformes', 389.0), + ('bird', 'Psittaciformes', 24.0), + ('mammal', 'Carnivora', 80.2), + ('mammal', 'Primates', np.nan), + ('mammal', 'Carnivora', 58)], + index=['falcon', 'parrot', 'lion', 'monkey', 'leopard'], + columns=('class', 'order', 'max_speed') + ) + + df - # default is axis=0 - >>> grouped = obj.groupby(key) - >>> grouped = obj.groupby(key, axis=1) - >>> grouped = obj.groupby([key1, key2]) + # default is axis=0 + grouped = df.groupby('class') + grouped = df.groupby('order', axis='columns') + grouped = df.groupby(['class', 'order']) The mapping can be specified many different ways: From ec76a49b1cd4390c17bae588cefe2d39e9730b49 Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 15:37:17 +0200 Subject: [PATCH 12/15] DOC: Remove doc/source/groupby.rst from setup.cfg --- setup.cfg | 1 - 1 file changed, 1 deletion(-) diff --git a/setup.cfg b/setup.cfg index 380100df774c1..c199567737531 100644 --- a/setup.cfg +++ b/setup.cfg @@ -53,7 +53,6 @@ exclude = doc/source/basics.rst doc/source/contributing_docstring.rst doc/source/enhancingperf.rst - doc/source/groupby.rst [yapf] From 38f2d103fa6a6ef29ad6f4b5ccc0fd97c25b7e76 Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 15:57:56 +0200 Subject: [PATCH 13/15] Revert "DOC: Reapply code example fix from 41a2e4791e9a4835fb53ed1f754dc2ad1f9fcaea" This reverts commit c30f4567b60298836f867515f0f761da8807a3f0. --- doc/source/groupby.rst | 22 +++++----------------- 1 file changed, 5 insertions(+), 17 deletions(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index 480df0b15170c..f72db0ff4a279 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -66,24 +66,12 @@ pandas objects can be split on any of their axes. The abstract definition of grouping is to provide a mapping of labels to group names. To create a GroupBy object (more on what the GroupBy object is later), you may do the following: -.. 
ipython:: python - - df = pd.DataFrame( - [('bird', 'Falconiformes', 389.0), - ('bird', 'Psittaciformes', 24.0), - ('mammal', 'Carnivora', 80.2), - ('mammal', 'Primates', np.nan), - ('mammal', 'Carnivora', 58)], - index=['falcon', 'parrot', 'lion', 'monkey', 'leopard'], - columns=('class', 'order', 'max_speed') - ) - - df +.. code-block:: python - # default is axis=0 - grouped = df.groupby('class') - grouped = df.groupby('order', axis='columns') - grouped = df.groupby(['class', 'order']) + # default is axis=0 + >>> grouped = obj.groupby(key) + >>> grouped = obj.groupby(key, axis=1) + >>> grouped = obj.groupby([key1, key2]) The mapping can be specified many different ways: From 3d50cf716eab91b1ae859c625fb0cfb475d41679 Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 16:08:28 +0200 Subject: [PATCH 14/15] DOC: Fix the following flake8 issues doc/source/groupby.rst:72:18: F821 undefined name 'obj' doc/source/groupby.rst:72:30: F821 undefined name 'key' doc/source/groupby.rst:73:18: F821 undefined name 'obj' doc/source/groupby.rst:73:30: F821 undefined name 'key' doc/source/groupby.rst:74:18: F821 undefined name 'obj' doc/source/groupby.rst:74:31: F821 undefined name 'key1' doc/source/groupby.rst:74:37: F821 undefined name 'key2' --- doc/source/groupby.rst | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index f72db0ff4a279..debbe002f0b4d 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -66,12 +66,23 @@ pandas objects can be split on any of their axes. The abstract definition of grouping is to provide a mapping of labels to group names. To create a GroupBy object (more on what the GroupBy object is later), you may do the following: -.. code-block:: python +.. 
ipython:: python + + df = pd.DataFrame( + [('bird', 'Falconiformes', 389.0), + ('bird', 'Psittaciformes', 24.0), + ('mammal', 'Carnivora', 80.2), + ('mammal', 'Primates', np.nan), + ('mammal', 'Carnivora', 58)], + index=['falcon', 'parrot', 'lion', 'monkey', 'leopard'], + columns=('class', 'order', 'max_speed') + ) + df - # default is axis=0 - >>> grouped = obj.groupby(key) - >>> grouped = obj.groupby(key, axis=1) - >>> grouped = obj.groupby([key1, key2]) + # default is axis=0 + grouped = df.groupby('class') + grouped = df.groupby('order', axis='columns') + grouped = df.groupby(['class', 'order']) The mapping can be specified many different ways: From 84c0110c605c299718a8ae220c2b026a2520b56b Mon Sep 17 00:00:00 2001 From: LJArendse Date: Thu, 27 Dec 2018 17:28:19 +0200 Subject: [PATCH 15/15] DOC: Fix dataframe indentation --- doc/source/groupby.rst | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst index debbe002f0b4d..a37aa2644a805 100644 --- a/doc/source/groupby.rst +++ b/doc/source/groupby.rst @@ -68,15 +68,13 @@ object (more on what the GroupBy object is later), you may do the following: .. ipython:: python - df = pd.DataFrame( - [('bird', 'Falconiformes', 389.0), - ('bird', 'Psittaciformes', 24.0), - ('mammal', 'Carnivora', 80.2), - ('mammal', 'Primates', np.nan), - ('mammal', 'Carnivora', 58)], - index=['falcon', 'parrot', 'lion', 'monkey', 'leopard'], - columns=('class', 'order', 'max_speed') - ) + df = pd.DataFrame([('bird', 'Falconiformes', 389.0), + ('bird', 'Psittaciformes', 24.0), + ('mammal', 'Carnivora', 80.2), + ('mammal', 'Primates', np.nan), + ('mammal', 'Carnivora', 58)], + index=['falcon', 'parrot', 'lion', 'monkey', 'leopard'], + columns=('class', 'order', 'max_speed')) df # default is axis=0