From 460e590fdc6f911c2ec244996b97b7fc3d83d416 Mon Sep 17 00:00:00 2001
From: Wanderson
Date: Mon, 5 Dec 2016 00:32:37 -0200
Subject: [PATCH 1/5] documentation of groupby resample expanding

---
 doc/foo                    |  5 +++
 doc/source/computation.rst |  4 +++
 doc/source/groupby.rst     | 67 ++++++++++++++++++++++++++++++++++++++
 doc/source/timeseries.rst  |  5 +++
 4 files changed, 81 insertions(+)
 create mode 100644 doc/foo

diff --git a/doc/foo b/doc/foo
new file mode 100644
index 0000000000000..b9f4bd5180238
--- /dev/null
+++ b/doc/foo
@@ -0,0 +1,5 @@
+,col_1
+0,1
+1,2
+2,'A'
+3,4.22
diff --git a/doc/source/computation.rst b/doc/source/computation.rst
index 1414d2dd3c8dc..fcb2b296790cd 100644
--- a/doc/source/computation.rst
+++ b/doc/source/computation.rst
@@ -214,6 +214,10 @@ computing common *window* or *rolling* statistics. Among these are count, sum,
 mean, median, correlation, variance, covariance, standard deviation, skewness,
 and kurtosis.
 
+Now the ``rolling()`` and ``expanding()`` functions can be used directly from
+DataFrameGroupBy objects, see :ref:`whatsnew docs ` and :ref:`groupby transformation `
+
+
 .. note::
 
    The API for window statistics is quite similar to the way one works with ``GroupBy`` objects, see the documentation :ref:`here `
diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst
index c5a77770085d6..ca0ebc67e9e11 100644
--- a/doc/source/groupby.rst
+++ b/doc/source/groupby.rst
@@ -614,6 +614,73 @@ and that the transformed data contains no NAs.
 
       grouped.ffill()
 
+
+.. _groupby.transform.window_resample:
+
+New syntax to window and resample operations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. versionadded:: 0.18.1
+
+Working with the resample, expanding or rolling operations on the groupby
+level used to require the application of helper functions. However,
+now it is possible to use ``resample()``, ``expanding()`` and
+``rolling()`` as methods on groupbys. We will find below simple
+examples of the application of each of the refered methods:
+
+
+Example of the ``rolling()`` method applied to groupbys:
+
+.. ipython:: python
+
+   df = pd.DataFrame({'A': [1] * 20 + [2] * 12,
+                      'B': np.arange(32)})
+
+   df.groupby('A').rolling(4).B.mean()
+
+.. note::
+
+   The example above is grouping the values of the column A, creating
+   a rolling window of size 4 on the samples of columns B and applying
+   the average.
+
+
+Example of the ``resample()``:
+
+.. ipython:: python
+
+   df = pd.DataFrame({'date': pd.date_range(start='2016-01-01',
+                                            periods=4,
+                                            freq='W'),
+                      'group': [1, 1, 2, 2],
+                      'val': [5, 6, 7, 8]}).set_index('date')
+
+   df.groupby('group').resample('1D').ffill()
+
+.. note::
+
+   The example above is grouping the values of column ``group``,
+   applying a resample operation to a daily frequency and completing
+   the missing values with the ``ffill()`` method.
+
+
+
+Example of the ``expanding()``:
+
+.. ipython:: python
+
+   df = pd.DataFrame({'group': [1] * 5 + [2] * 5,
+                      'val': [5] * 5 + [1] * 5 })
+
+   df.groupby('group').expanding().sum()
+
+.. note::
+
+   The example above of the ``expanding`` method is grouping the
+   dataframe into the groups available at the columns groups. The
+   expanding operation will accumulate the defined operations (sum()
+   in this example) for all the members of the a particular group.
+
+
 .. _groupby.filter:
 
 Filtration
diff --git a/doc/source/timeseries.rst b/doc/source/timeseries.rst
index 854de443ac5ee..8d50b93752f77 100644
--- a/doc/source/timeseries.rst
+++ b/doc/source/timeseries.rst
@@ -1288,6 +1288,9 @@ limited to, financial applications.
 ``.resample()`` is a time-based groupby, followed by a reduction method on each of its groups.
 See some :ref:`cookbook examples ` for some advanced strategies
 
+Now the ``resample()`` function can be used directly from
+DataFrameGroupBy objects, see :ref:`whatsnew docs ` and :ref:`groupby transformation `
+
 .. note::
 
    ``.resample()`` is similar to using a ``.rolling()`` operation with a time-based offset, see a discussion :ref:`here `
@@ -1353,6 +1356,8 @@ retains the input representation.
 frequency periods.
 
 
+
+
 Up Sampling
 ~~~~~~~~~~~
 
From 6f547675f5f029971697287c73442fe4f09594d3 Mon Sep 17 00:00:00 2001
From: Wanderson
Date: Mon, 5 Dec 2016 00:48:01 -0200
Subject: [PATCH 2/5] removing foo

---
 doc/foo | 5 -----
 1 file changed, 5 deletions(-)
 delete mode 100644 doc/foo

diff --git a/doc/foo b/doc/foo
deleted file mode 100644
index b9f4bd5180238..0000000000000
--- a/doc/foo
+++ /dev/null
@@ -1,5 +0,0 @@
-,col_1
-0,1
-1,2
-2,'A'
-3,4.22
From 869eea58ac319817d3239168374f77bf21652a22 Mon Sep 17 00:00:00 2001
From: Wanderson
Date: Thu, 8 Dec 2016 08:18:00 -0200
Subject: [PATCH 3/5] updating feedbacks

---
 doc/source/computation.rst | 5 +++--
 doc/source/groupby.rst     | 3 +++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/doc/source/computation.rst b/doc/source/computation.rst
index fcb2b296790cd..716ce83926798 100644
--- a/doc/source/computation.rst
+++ b/doc/source/computation.rst
@@ -214,8 +214,9 @@ computing common *window* or *rolling* statistics. Among these are count, sum,
 mean, median, correlation, variance, covariance, standard deviation, skewness,
 and kurtosis.
 
-Now the ``rolling()`` and ``expanding()`` functions can be used directly from
-DataFrameGroupBy objects, see :ref:`whatsnew docs ` and :ref:`groupby transformation `
+Starting in version 0.18.1 the ``rolling()`` and ``expanding()``
+functions can be used directly from DataFrameGroupBy objects,
+see :ref:`whatsnew docs ` and :ref:`groupby transformation `
 
 
 .. note::
diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst
index ca0ebc67e9e11..16eed909ea82b 100644
--- a/doc/source/groupby.rst
+++ b/doc/source/groupby.rst
@@ -635,7 +635,10 @@ Example of the ``rolling()`` method applied to groupbys:
    df = pd.DataFrame({'A': [1] * 20 + [2] * 12,
                       'B': np.arange(32)})
 
+   df
+
    df.groupby('A').rolling(4).B.mean()
+   df
 
 .. note::
 
From b0fa25439669ad801ccc33b7035c80fc1f2b3ba8 Mon Sep 17 00:00:00 2001
From: Wanderson
Date: Thu, 8 Dec 2016 13:28:21 -0200
Subject: [PATCH 4/5] final modifications

---
 doc/source/groupby.rst    | 54 ++++++++++++---------------------------
 doc/source/timeseries.rst |  2 +-
 2 files changed, 17 insertions(+), 39 deletions(-)

diff --git a/doc/source/groupby.rst b/doc/source/groupby.rst
index 16eed909ea82b..ff97775afc2e2 100644
--- a/doc/source/groupby.rst
+++ b/doc/source/groupby.rst
@@ -624,30 +624,32 @@ New syntax to window and resample operations
 Working with the resample, expanding or rolling operations on the groupby
 level used to require the application of helper functions. However,
 now it is possible to use ``resample()``, ``expanding()`` and
-``rolling()`` as methods on groupbys. We will find below simple
-examples of the application of each of the refered methods:
+``rolling()`` as methods on groupbys.
 
-
-Example of the ``rolling()`` method applied to groupbys:
+The example below will apply the ``rolling()`` method on the samples of
+the column B based on the groups of column A.
 
 .. ipython:: python
 
-   df = pd.DataFrame({'A': [1] * 20 + [2] * 12,
-                      'B': np.arange(32)})
-
+   df = pd.DataFrame({'A': [1] * 10 + [5] * 10,
+                      'B': np.arange(20)})
    df
 
    df.groupby('A').rolling(4).B.mean()
-   df
 
-.. note::
 
-   The example above is grouping the values of the column A, creating
-   a rolling window of size 4 on the samples of columns B and applying
-   the average.
+The ``expanding()`` method will accumulate a given operation
+(``sum()`` in the example) for all the members of each particular
+group.
+
+.. ipython:: python
 
+   df.groupby('A').expanding().sum()
 
-Example of the ``resample()``:
+
+Suppose you want to use the ``resample()`` method to get a daily
+frequency in each group of your dataframe and wish to complete the
+missing values with the ``ffill()`` method.
 
 .. ipython:: python
 
@@ -656,34 +658,10 @@ Example of the ``resample()``:
                                             freq='W'),
                       'group': [1, 1, 2, 2],
                       'val': [5, 6, 7, 8]}).set_index('date')
+   df
 
    df.groupby('group').resample('1D').ffill()
 
-.. note::
-
-   The example above is grouping the values of column ``group``,
-   applying a resample operation to a daily frequency and completing
-   the missing values with the ``ffill()`` method.
-
-
-
-Example of the ``expanding()``:
-
-.. ipython:: python
-
-   df = pd.DataFrame({'group': [1] * 5 + [2] * 5,
-                      'val': [5] * 5 + [1] * 5 })
-
-   df.groupby('group').expanding().sum()
-
-.. note::
-
-   The example above of the ``expanding`` method is grouping the
-   dataframe into the groups available at the columns groups. The
-   expanding operation will accumulate the defined operations (sum()
-   in this example) for all the members of the a particular group.
-
-
 .. _groupby.filter:
 
 Filtration
diff --git a/doc/source/timeseries.rst b/doc/source/timeseries.rst
index 8d50b93752f77..cc711453f665b 100644
--- a/doc/source/timeseries.rst
+++ b/doc/source/timeseries.rst
@@ -1288,7 +1288,7 @@ limited to, financial applications.
 ``.resample()`` is a time-based groupby, followed by a reduction method on each of its groups.
 See some :ref:`cookbook examples ` for some advanced strategies
 
-Now the ``resample()`` function can be used directly from
+Starting in version 0.18.1 ``resample()`` function can be used directly from
 DataFrameGroupBy objects, see :ref:`whatsnew docs ` and :ref:`groupby transformation `
 
 .. note::
From bf8ee33b3bbe78a36494761c8581bcbdb2119b9c Mon Sep 17 00:00:00 2001
From: Joris Van den Bossche
Date: Sat, 10 Dec 2016 12:10:19 +0100
Subject: [PATCH 5/5] remove link to whatsnew (same content as link to groupby docs)

---
 doc/source/computation.rst | 4 ++--
 doc/source/timeseries.rst  | 6 ++----
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/doc/source/computation.rst b/doc/source/computation.rst
index 716ce83926798..d727424750be5 100644
--- a/doc/source/computation.rst
+++ b/doc/source/computation.rst
@@ -214,9 +214,9 @@ computing common *window* or *rolling* statistics. Among these are count, sum,
 mean, median, correlation, variance, covariance, standard deviation, skewness,
 and kurtosis.
 
-Starting in version 0.18.1 the ``rolling()`` and ``expanding()``
+Starting in version 0.18.1, the ``rolling()`` and ``expanding()``
 functions can be used directly from DataFrameGroupBy objects,
-see :ref:`whatsnew docs ` and :ref:`groupby transformation `
+see the :ref:`groupby docs `.
 
 
 .. note::
diff --git a/doc/source/timeseries.rst b/doc/source/timeseries.rst
index cc711453f665b..9253124f7e8b2 100644
--- a/doc/source/timeseries.rst
+++ b/doc/source/timeseries.rst
@@ -1288,8 +1288,8 @@ limited to, financial applications.
 ``.resample()`` is a time-based groupby, followed by a reduction method on each of its groups.
 See some :ref:`cookbook examples ` for some advanced strategies
 
-Starting in version 0.18.1 ``resample()`` function can be used directly from
-DataFrameGroupBy objects, see :ref:`whatsnew docs ` and :ref:`groupby transformation `
+Starting in version 0.18.1, the ``resample()`` function can be used directly from
+DataFrameGroupBy objects, see the :ref:`groupby docs `.
 
 .. note::
 
@@ -1356,8 +1356,6 @@ retains the input representation.
 frequency periods.
 
 
-
-
 Up Sampling
 ~~~~~~~~~~~
 
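
Reviewer note: the three groupby-level calls documented by this series can be checked outside the Sphinx build with a short script. The sketch below mirrors the examples in the final version of ``groupby.rst``, with a few assumptions of mine: the names ``ts``, ``rolled``, ``expanded`` and ``resampled`` are not in the patch, and the value column is selected before the window call (``['B']``, ``['val']``) so the result is a plain Series whichever way the installed pandas version treats the grouping column.

```python
import numpy as np
import pandas as pd

# Same frame as the final rolling()/expanding() example in groupby.rst.
df = pd.DataFrame({'A': [1] * 10 + [5] * 10,
                   'B': np.arange(20)})

# Rolling mean of B with a window of 4 rows, restarted for each value of A.
# The first three rows of every group are NaN because the window is not full.
rolled = df.groupby('A')['B'].rolling(4).mean()

# Expanding (cumulative) sum of B within each group of A.
expanded = df.groupby('A')['B'].expanding().sum()

# Same frame as the resample() example; 'ts' is a name chosen for this sketch.
ts = pd.DataFrame({'date': pd.date_range(start='2016-01-01',
                                         periods=4,
                                         freq='W'),
                   'group': [1, 1, 2, 2],
                   'val': [5, 6, 7, 8]}).set_index('date')

# Upsample each group to daily frequency and forward-fill the new rows.
resampled = ts.groupby('group')['val'].resample('1D').ffill()

print(rolled)
print(expanded)
print(resampled)
```

Each result is indexed by the group key first and the original index (or the new daily dates) second, so ``rolled.loc[1]`` returns only the rows for group ``1``.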