BUG: groupby with resample using on parameter errors when selecting column to apply function #19433

discort · 2018-01-28T16:56:33Z

closes BUG: groupby with resample using on parameter errors when selecting column to apply function #17813
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

codecov · 2018-01-29T17:25:06Z

Codecov Report

Merging #19433 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #19433      +/-   ##
==========================================
- Coverage   91.62%   91.59%   -0.03%     
==========================================
  Files         150      150              
  Lines       48714    48720       +6     
==========================================
- Hits        44633    44625       -8     
- Misses       4081     4095      +14

Flag	Coverage Δ
#multiple	`89.96% <100%> (-0.03%)`	⬇️
#single	`41.75% <22.22%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/groupby.py	`92.17% <100%> (+0.02%)`	⬆️
pandas/plotting/_converter.py	`65.22% <0%> (-1.74%)`	⬇️
pandas/util/testing.py	`83.64% <0%> (-0.21%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update de39a15...2f25d40. Read the comment docs.

discort · 2018-01-29T18:03:57Z

@jreback

jreback · 2018-01-30T00:20:14Z

try with something like this. your patch is just working around the issue and not fixing it at the source. its pretty deep code, but try here.

diff --git a/pandas/core/groupby.py b/pandas/core/groupby.py
index 4c03f3bfa..6a2cd69eb 100644
--- a/pandas/core/groupby.py
+++ b/pandas/core/groupby.py
@@ -466,11 +466,15 @@ class Grouper(object):
                 "The Grouper cannot specify both a key and a level!")
 
         # the key must be a valid info item
-        if self.key is not None and not isinstance(obj, Series):
+        if self.key is not None:
             key = self.key
-            if key not in obj._info_axis:
-                raise KeyError("The grouper name {0} is not found".format(key))
-            ax = Index(obj[key], name=key)
+
+            if getattr(self.grouper, 'name', None) == key:
+                ax = self.grouper
+            else:
+                if key not in obj._info_axis:
+                    raise KeyError("The grouper name {0} is not found".format(key))
+                ax = Index(obj[key], name=key)
 
         else:
             ax = obj._get_axis(self.axis)
diff --git a/pandas/core/resample.py b/pandas/core/resample.py
index 03b522abc..43391061d 100644
--- a/pandas/core/resample.py
+++ b/pandas/core/resample.py
@@ -743,11 +743,6 @@ class _GroupByMixin(GroupByMixin):
 
             return x.apply(f, **kwargs)
 
-        # Patch the index in case we're applying after __getitem__. GH 17813
-        if self.groupby.key:
-            key = self.groupby.key
-            self._groupby.obj.index = Index(self.groupby.obj[key], key=key)
-
         result = self._groupby.apply(func)
         return self._wrap_result(result)

discort · 2018-01-30T08:10:14Z

I applied your diff. The result, in this case, would be:

In [11]: result
Out[11]:
id  date
a   2016-01-01    2
    2016-01-03    0
    2016-01-05    0
b   2016-01-01    2
    2016-01-03    1
    2016-01-05    0
Name: data, dtype: int64

instead of:

In [10]: exp
Out[10]:
id  date
a   2016-01-01    2
b   2016-01-03    2
    2016-01-05    1
Name: data, dtype: int64

But I got your idea. The main challenge is that self.grouper is overriding here. Could you take a look at newest solution? @jreback

jreback

change looks good. some comments. ping on green.

jreback · 2018-02-04T15:25:14Z

doc/source/whatsnew/v0.23.0.txt

@@ -548,7 +548,7 @@ Groupby/Resample/Rolling
 - Bug in :func:`DataFrame.resample` which silently ignored unsupported (or mistyped) options for ``label``, ``closed`` and ``convention`` (:issue:`19303`)
 - Bug in :func:`DataFrame.groupby` where tuples were interpreted as lists of keys rather than as keys (:issue:`17979`, :issue:`18249`)
 - Bug in ``transform`` where particular aggregation functions were being incorrectly cast to match the dtype(s) of the grouped data (:issue:`19200`)
-
+- Bug in `DataFrame.groupby` with resample using `on` parameter when selecting column to apply function (:issue:`17813`)


passing the on= kwarg, and subsequently using .apply() (not the double backticks on both)

jreback · 2018-02-04T15:26:09Z

pandas/core/groupby.py

-            if key not in obj._info_axis:
-                raise KeyError("The grouper name {0} is not found".format(key))
-            ax = Index(obj[key], name=key)
+            if getattr(self.grouper, 'name', None) == key and \


can you add a comment here about what this is doing (IOW the on is already defined).

use ABCSeries rather than Series

jreback · 2018-02-04T15:27:37Z

pandas/tests/test_resample.py

@@ -252,6 +252,15 @@ def test_pipe(self):
        result = r.pipe(lambda x: x.max() - x.mean())
        tm.assert_frame_equal(result, expected)

+    def test_groupby_resample_on_api_with_getitem(self):


can you move this a bit lower down in the file (immediately after the other getitem tests)

discort · 2018-02-05T09:13:57Z

@jreback

jreback · 2018-02-05T11:14:50Z

thanks @discort

…olumn to apply function closes pandas-dev#17813 Author: discort <[email protected]> Closes pandas-dev#19433 from discort/fix_17813 and squashes the following commits: 2f25d40 [discort] Fixed bug in df.resample using 'on' parameter

discort force-pushed the fix_17813 branch 3 times, most recently from e4ca463 to 836139e Compare January 29, 2018 14:36

jreback added Bug Resample resample method labels Jan 30, 2018

discort force-pushed the fix_17813 branch from 836139e to 33a1202 Compare January 30, 2018 08:06

discort force-pushed the fix_17813 branch 2 times, most recently from 838c058 to b4dbb59 Compare February 4, 2018 06:56

jreback requested changes Feb 4, 2018

View reviewed changes

Fixed bug in df.resample using 'on' parameter

2f25d40

discort force-pushed the fix_17813 branch from b4dbb59 to 2f25d40 Compare February 5, 2018 08:02

jreback closed this in 5b58a20 Feb 5, 2018

jreback added this to the 0.23.0 milestone Feb 5, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: groupby with resample using on parameter errors when selecting column to apply function #19433

BUG: groupby with resample using on parameter errors when selecting column to apply function #19433

discort commented Jan 28, 2018 •

edited

Loading

codecov bot commented Jan 29, 2018 •

edited

Loading

discort commented Jan 29, 2018

jreback commented Jan 30, 2018

discort commented Jan 30, 2018 •

edited

Loading

jreback left a comment

jreback Feb 4, 2018

jreback Feb 4, 2018

jreback Feb 4, 2018

discort commented Feb 5, 2018

jreback commented Feb 5, 2018

BUG: groupby with resample using on parameter errors when selecting column to apply function #19433

BUG: groupby with resample using on parameter errors when selecting column to apply function #19433

Conversation

discort commented Jan 28, 2018 • edited Loading

codecov bot commented Jan 29, 2018 • edited Loading

Codecov Report

discort commented Jan 29, 2018

jreback commented Jan 30, 2018

discort commented Jan 30, 2018 • edited Loading

jreback left a comment

Choose a reason for hiding this comment

jreback Feb 4, 2018

Choose a reason for hiding this comment

jreback Feb 4, 2018

Choose a reason for hiding this comment

jreback Feb 4, 2018

Choose a reason for hiding this comment

discort commented Feb 5, 2018

jreback commented Feb 5, 2018

discort commented Jan 28, 2018 •

edited

Loading

codecov bot commented Jan 29, 2018 •

edited

Loading

discort commented Jan 30, 2018 •

edited

Loading