Unable to pass additional arguments to resample().apply() #22261

discort · 2018-08-09T11:48:09Z

closes Unable to pass additional arguments to resample().apply() #14615
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

There is flake8 warning ^ but it is not related with my changes.

discort · 2018-08-09T11:50:56Z

I considered this comment at first before started the implementation. The reason why I didn't do like that is a compatibility with Resampler.apply docs.
Docs says us that positional args will be pass into applied function.

TomAugspurger · 2018-08-09T13:45:00Z

pandas/core/resample.py

@@ -852,7 +854,7 @@ def __init__(self, obj, *args, **kwargs):
        self._groupby.grouper.mutated = True
        self.groupby = copy.copy(parent.groupby)

-    def _apply(self, f, **kwargs):
+    def _apply(self, f, *args, **kwargs):


*args is unused now, which seems incorrect. Shouldn't it be passed through?

@TomAugspurger
I added it for a compatibility and TestResamplerGrouper.test_apply failed

Can you add a test that the proper behavior is happening now?

@TomAugspurger

Could you please clarify which test did you mean?

Anyway, at this moment, _GroupByMixin and Resampler have different arguments in _groupby_and_aggregate, so I added args for a compatibility.

This seems to be hit by df.groupby(...).resample.

In [1]: import pandas as pd In [2]: def f(x, arg): ...: assert arg == 0 ...: return arg ...: ...: In [3]: df = pd.DataFrame({"A": 1, "B": 2}, index=pd.date_range('2017', periods=10)) In [4]: df.groupby("A").resample("D").agg(f, 0)

--------------------------------------------------------------------------- TypeError Traceback (most recent call last) ~/sandbox/pandas/pandas/core/groupby/generic.py in aggregate(self, func_or_funcs, *args, **kwargs) 774 try: --> 775 return self._python_agg_general(func_or_funcs, *args, **kwargs) 776 except Exception: ~/sandbox/pandas/pandas/core/groupby/groupby.py in _python_agg_general(self, func, *args, **kwargs) 839 if len(output) == 0: --> 840 return self._python_apply_general(f) 841 ~/sandbox/pandas/pandas/core/groupby/groupby.py in _python_apply_general(self, f) 701 keys, values, mutated = self.grouper.apply(f, self._selected_obj, --> 702 self.axis) 703 ~/sandbox/pandas/pandas/core/groupby/ops.py in apply(self, f, data, axis) 197 group_axes = _get_axes(group) --> 198 res = f(group) 199 if not _is_indexed_like(res, group_axes): ~/sandbox/pandas/pandas/core/groupby/groupby.py in <lambda>(x) 827 func = self._is_builtin_func(func) --> 828 f = lambda x: func(x, *args, **kwargs) 829 TypeError: f() missing 1 required positional argument: 'arg' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) ~/sandbox/pandas/pandas/core/resample.py in _groupby_and_aggregate(self, how, grouper, *args, **kwargs) 325 # Check if the function is reducing or not. --> 326 result = grouped._aggregate_item_by_item(how, *args, **kwargs) 327 else: ~/sandbox/pandas/pandas/core/groupby/generic.py in _aggregate_item_by_item(self, func, *args, **kwargs) 279 if not len(result_columns) and errors is not None: --> 280 raise errors 281 ~/sandbox/pandas/pandas/core/groupby/generic.py in _aggregate_item_by_item(self, func, *args, **kwargs) 264 result[item] = self._try_cast( --> 265 colg.aggregate(func, *args, **kwargs), data) 266 except ValueError: ~/sandbox/pandas/pandas/core/groupby/generic.py in aggregate(self, func_or_funcs, *args, **kwargs) 776 except Exception: --> 777 result = self._aggregate_named(func_or_funcs, *args, **kwargs) 778 ~/sandbox/pandas/pandas/core/groupby/generic.py in _aggregate_named(self, func, *args, **kwargs) 904 group.name = name --> 905 output = func(group, *args, **kwargs) 906 if isinstance(output, (Series, Index, np.ndarray)): TypeError: f() missing 1 required positional argument: 'arg' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) ~/sandbox/pandas/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs) 683 try: --> 684 result = self._python_apply_general(f) 685 except Exception: ~/sandbox/pandas/pandas/core/groupby/groupby.py in _python_apply_general(self, f) 701 keys, values, mutated = self.grouper.apply(f, self._selected_obj, --> 702 self.axis) 703 ~/sandbox/pandas/pandas/core/groupby/ops.py in apply(self, f, data, axis) 197 group_axes = _get_axes(group) --> 198 res = f(group) 199 if not _is_indexed_like(res, group_axes): TypeError: f() missing 1 required positional argument: 'arg' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) ~/sandbox/pandas/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs) 683 try: --> 684 result = self._python_apply_general(f) 685 except Exception: ~/sandbox/pandas/pandas/core/groupby/groupby.py in _python_apply_general(self, f) 701 keys, values, mutated = self.grouper.apply(f, self._selected_obj, --> 702 self.axis) 703 ~/sandbox/pandas/pandas/core/groupby/ops.py in apply(self, f, data, axis) 197 group_axes = _get_axes(group) --> 198 res = f(group) 199 if not _is_indexed_like(res, group_axes): ~/sandbox/pandas/pandas/core/resample.py in func(x) 869 --> 870 return x.apply(f, **kwargs) 871 ~/sandbox/pandas/pandas/core/resample.py in aggregate(self, func, *args, **kwargs) 246 *args, --> 247 **kwargs) 248 ~/sandbox/pandas/pandas/core/resample.py in _groupby_and_aggregate(self, how, grouper, *args, **kwargs) 332 # try to evaluate --> 333 result = grouped.apply(how, *args, **kwargs) 334 ~/sandbox/pandas/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs) 695 with _group_selection_context(self): --> 696 return self._python_apply_general(f) 697 ~/sandbox/pandas/pandas/core/groupby/groupby.py in _python_apply_general(self, f) 701 keys, values, mutated = self.grouper.apply(f, self._selected_obj, --> 702 self.axis) 703 ~/sandbox/pandas/pandas/core/groupby/ops.py in apply(self, f, data, axis) 197 group_axes = _get_axes(group) --> 198 res = f(group) 199 if not _is_indexed_like(res, group_axes): TypeError: f() missing 1 required positional argument: 'arg' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) ~/sandbox/pandas/pandas/core/resample.py in _groupby_and_aggregate(self, how, grouper, *args, **kwargs) 325 # Check if the function is reducing or not. --> 326 result = grouped._aggregate_item_by_item(how, *args, **kwargs) 327 else: ~/sandbox/pandas/pandas/core/groupby/generic.py in _aggregate_item_by_item(self, func, *args, **kwargs) 279 if not len(result_columns) and errors is not None: --> 280 raise errors 281 ~/sandbox/pandas/pandas/core/groupby/generic.py in _aggregate_item_by_item(self, func, *args, **kwargs) 264 result[item] = self._try_cast( --> 265 colg.aggregate(func, *args, **kwargs), data) 266 except ValueError: ~/sandbox/pandas/pandas/core/groupby/generic.py in aggregate(self, func_or_funcs, *args, **kwargs) 776 except Exception: --> 777 result = self._aggregate_named(func_or_funcs, *args, **kwargs) 778 ~/sandbox/pandas/pandas/core/groupby/generic.py in _aggregate_named(self, func, *args, **kwargs) 904 group.name = name --> 905 output = func(group, *args, **kwargs) 906 if isinstance(output, (Series, Index, np.ndarray)): TypeError: f() missing 1 required positional argument: 'arg' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) ~/sandbox/pandas/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs) 683 try: --> 684 result = self._python_apply_general(f) 685 except Exception: ~/sandbox/pandas/pandas/core/groupby/groupby.py in _python_apply_general(self, f) 701 keys, values, mutated = self.grouper.apply(f, self._selected_obj, --> 702 self.axis) 703 ~/sandbox/pandas/pandas/core/groupby/ops.py in apply(self, f, data, axis) 197 group_axes = _get_axes(group) --> 198 res = f(group) 199 if not _is_indexed_like(res, group_axes): TypeError: f() missing 1 required positional argument: 'arg' During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) <ipython-input-4-9cc8965c7cbe> in <module>() ----> 1 df.groupby("A").resample("D").agg(f, 0) ~/sandbox/pandas/pandas/core/resample.py in aggregate(self, func, *args, **kwargs) 245 grouper, 246 *args, --> 247 **kwargs) 248 249 result = self._apply_loffset(result) ~/sandbox/pandas/pandas/core/resample.py in _apply(self, f, *args, **kwargs) 870 return x.apply(f, **kwargs) 871 --> 872 result = self._groupby.apply(func) 873 return self._wrap_result(result) 874 ~/sandbox/pandas/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs) 694 695 with _group_selection_context(self): --> 696 return self._python_apply_general(f) 697 698 return result ~/sandbox/pandas/pandas/core/groupby/groupby.py in _python_apply_general(self, f) 700 def _python_apply_general(self, f): 701 keys, values, mutated = self.grouper.apply(f, self._selected_obj, --> 702 self.axis) 703 704 return self._wrap_applied_output( ~/sandbox/pandas/pandas/core/groupby/ops.py in apply(self, f, data, axis) 196 # group might be modified 197 group_axes = _get_axes(group) --> 198 res = f(group) 199 if not _is_indexed_like(res, group_axes): 200 mutated = True ~/sandbox/pandas/pandas/core/resample.py in func(x) 868 return getattr(x, f)(**kwargs) 869 --> 870 return x.apply(f, **kwargs) 871 872 result = self._groupby.apply(func) ~/sandbox/pandas/pandas/core/resample.py in aggregate(self, func, *args, **kwargs) 245 grouper, 246 *args, --> 247 **kwargs) 248 249 result = self._apply_loffset(result) ~/sandbox/pandas/pandas/core/resample.py in _groupby_and_aggregate(self, how, grouper, *args, **kwargs) 331 # we have a non-reducing function 332 # try to evaluate --> 333 result = grouped.apply(how, *args, **kwargs) 334 335 result = self._apply_loffset(result) ~/sandbox/pandas/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs) 694 695 with _group_selection_context(self): --> 696 return self._python_apply_general(f) 697 698 return result ~/sandbox/pandas/pandas/core/groupby/groupby.py in _python_apply_general(self, f) 700 def _python_apply_general(self, f): 701 keys, values, mutated = self.grouper.apply(f, self._selected_obj, --> 702 self.axis) 703 704 return self._wrap_applied_output( ~/sandbox/pandas/pandas/core/groupby/ops.py in apply(self, f, data, axis) 196 # group might be modified 197 group_axes = _get_axes(group) --> 198 res = f(group) 199 if not _is_indexed_like(res, group_axes): 200 mutated = True TypeError: f() missing 1 required positional argument: 'arg'

So IIUC, we want to be passing through *args here.

TomAugspurger · 2018-08-09T13:45:37Z

pandas/tests/test_resample.py

@@ -2165,6 +2165,14 @@ def test_resample_datetime_values(self):
        res = df['timestamp'].resample('2D').first()
        tm.assert_series_equal(res, exp)

+    def test_resample_apply_with_additional_args(self):
+        def f(data, add_arg):


A better test would be to return add_arg, to ensure that the proper value is passed through.

TomAugspurger · 2018-08-09T13:45:49Z

pandas/tests/test_resample.py

@@ -2165,6 +2165,14 @@ def test_resample_datetime_values(self):
        res = df['timestamp'].resample('2D').first()
        tm.assert_series_equal(res, exp)

+    def test_resample_apply_with_additional_args(self):
+        def f(data, add_arg):


added a comment referencing the github issue.

Which issue?

can you add this issue number as a comment

TomAugspurger · 2018-08-09T13:46:46Z

pandas/core/resample.py

@@ -239,7 +239,9 @@ def aggregate(self, arg, *args, **kwargs):
        self._set_binner()
        result, how = self._aggregate(arg, *args, **kwargs)
        if result is None:
+            grouper = kwargs.get('grouper', None)


What happens if the user passes a function like def f(arr, grouper)? Does this raise? If so, can we provide a good error message?

This is correct. Is it okay to pass None like?:

self._groupby_and_aggregate(arg, None, *args, **kwargs)

codecov · 2018-08-09T14:04:22Z

Codecov Report

Merging #22261 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #22261      +/-   ##
==========================================
- Coverage   92.05%   92.05%   -0.01%     
==========================================
  Files         169      169              
  Lines       50733    50735       +2     
==========================================
+ Hits        46702    46703       +1     
- Misses       4031     4032       +1

Flag	Coverage Δ
#multiple	`90.46% <100%> (-0.01%)`	⬇️
#single	`42.24% <16.66%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/resample.py	`96.09% <100%> (+0.01%)`	⬆️
pandas/util/_depr_module.py	`65.11% <0%> (-2.33%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4f11d1a...547d2a5. Read the comment docs.

gfyoung · 2018-08-09T20:00:59Z

pandas/core/resample.py

@@ -240,6 +240,7 @@ def aggregate(self, arg, *args, **kwargs):
        result, how = self._aggregate(arg, *args, **kwargs)
        if result is None:
            result = self._groupby_and_aggregate(arg,
+                                                 None,


Could you explain why you need this None argument?

@gfyoung

Using test scenario, multiplier arg will be passed to _groupby_and_aggregate as grouper arg if we omit None. So it's better to pass None explicitly.

can you then pass grouper as a kwarg

we may need to rename this arg itself to avoid a user pass ing grouper= (unlikely though)

can you then pass grouper as a kwarg

I can't.

TypeError: _groupby_and_aggregate() got multiple values for argument 'grouper'

The workaround for this will be something like that:

grouper = None result = self._groupby_and_aggregate(arg, grouper, *args, **kwargs)

we may need to rename this arg itself to avoid a user pass ing grouper= (unlikely though)

At this moment, there may be TypeError if a user pass grouper or how args to Resampler.apply or even Series.apply:

TypeError: _groupby_and_aggregate() got multiple values for argument 'how'

I can tackle it within another issue.

@jreback

jreback · 2018-08-09T22:45:35Z

pandas/tests/test_resample.py

+            return np.mean(data) * add_arg
+
+        multiplier = 10
+        result = self.series.resample('D').apply(f, multiplier)


test both as an arg and as a kwargs

jreback · 2018-08-09T22:46:18Z

doc/source/whatsnew/v0.23.5.txt

@@ -34,6 +34,7 @@ Bug Fixes
 **Groupby/Resample/Rolling**

 - Bug in :meth:`DataFrame.resample` when resampling ``NaT`` in ``TimeDeltaIndex`` (:issue:`13223`).
+- Bug in :meth:`Resampler.apply` when passing an additional argument to applied func (:issue:`14615`).


move to 0.24.0

jreback · 2018-08-09T22:46:37Z

pandas/core/resample.py

@@ -240,6 +240,7 @@ def aggregate(self, arg, *args, **kwargs):
        result, how = self._aggregate(arg, *args, **kwargs)
        if result is None:
            result = self._groupby_and_aggregate(arg,
+                                                 None,


can you then pass grouper as a kwarg

jreback · 2018-08-09T22:47:10Z

pandas/core/resample.py

@@ -240,6 +240,7 @@ def aggregate(self, arg, *args, **kwargs):
        result, how = self._aggregate(arg, *args, **kwargs)
        if result is None:
            result = self._groupby_and_aggregate(arg,
+                                                 None,


we may need to rename this arg itself to avoid a user pass ing grouper= (unlikely though)

jreback · 2018-08-10T10:01:33Z

doc/source/whatsnew/v0.24.0.txt

@@ -650,7 +650,7 @@ Groupby/Resample/Rolling
  ``SeriesGroupBy`` when the grouping variable only contains NaNs and numpy version < 1.13 (:issue:`21956`).
 - Multiple bugs in :func:`pandas.core.Rolling.min` with ``closed='left'` and a
  datetime-like index leading to incorrect results and also segfault. (:issue:`21704`)
-
+- Bug in :meth:`Resampler.apply` when passing an additional argument to applied func (:issue:`14615`).


can you clarify that this is postiional args

jreback · 2018-08-10T10:03:42Z

pandas/core/resample.py

            result = self._groupby_and_aggregate(arg,
+                                                 grouper,


Ideally we would pass both how and grouper as kwargs, but you can only do that in PY3, ok
add ad how=arg and pass how to make this more clear then (positionally as the first arg, just where arg is now)

and actually rename arg to func, which is what it is called elsewhere

discort · 2018-08-11T05:18:35Z

@jreback

discort · 2018-08-15T09:45:38Z

@TomAugspurger
@gfyoung

gfyoung · 2018-08-15T16:37:15Z

Ping @jreback @TomAugspurger : Could you have a look?

pep8speaks · 2018-08-20T09:05:47Z

Hello @discort! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on August 20, 2018 at 18:45 Hours UTC

discort · 2018-08-20T10:14:13Z

@TomAugspurger
Fixed within 66550b1

jreback

small comment, ping on green.

jreback · 2018-08-20T10:22:32Z

pandas/tests/test_resample.py

@@ -2165,6 +2165,14 @@ def test_resample_datetime_values(self):
        res = df['timestamp'].resample('2D').first()
        tm.assert_series_equal(res, exp)

+    def test_resample_apply_with_additional_args(self):
+        def f(data, add_arg):


can you add this issue number as a comment

discort · 2018-08-22T11:24:44Z

@jreback

TomAugspurger · 2018-08-22T12:24:32Z

Thanks!

TomAugspurger reviewed Aug 9, 2018

View reviewed changes

discort force-pushed the fix_14615 branch from 31d552d to a06f264 Compare August 9, 2018 14:03

gfyoung added Bug Groupby Resample resample method labels Aug 9, 2018

gfyoung reviewed Aug 9, 2018

View reviewed changes

jreback requested changes Aug 9, 2018

View reviewed changes

discort force-pushed the fix_14615 branch from a06f264 to a8b9d72 Compare August 10, 2018 08:33

jreback requested changes Aug 10, 2018

View reviewed changes

discort force-pushed the fix_14615 branch from a8b9d72 to c6f2318 Compare August 10, 2018 10:39

discort force-pushed the fix_14615 branch from c6f2318 to cbb6b82 Compare August 20, 2018 07:24

discort force-pushed the fix_14615 branch from 33af77c to 66550b1 Compare August 20, 2018 09:07

jreback requested changes Aug 20, 2018

View reviewed changes

jreback added this to the 0.24.0 milestone Aug 20, 2018

discort force-pushed the fix_14615 branch from 66550b1 to 3ce37c7 Compare August 20, 2018 11:01

discort added 2 commits August 20, 2018 21:44

Fixed additional passing args to resample().apply

553a68e

fixed args on df

547d2a5

discort force-pushed the fix_14615 branch from 3ce37c7 to 547d2a5 Compare August 20, 2018 18:45

TomAugspurger merged commit 27e4141 into pandas-dev:master Aug 22, 2018

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018

BUG: Pass through arguments to resample().apply() (pandas-dev#22261)

7d99eda

Uh oh!

Unable to pass additional arguments to resample().apply() #22261

Unable to pass additional arguments to resample().apply() #22261

Uh oh!

Conversation

discort commented Aug 9, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

discort commented Aug 9, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Aug 9, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

discort Aug 10, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

discort commented Aug 11, 2018

Uh oh!

discort commented Aug 15, 2018

Uh oh!

gfyoung commented Aug 15, 2018

Uh oh!

pep8speaks commented Aug 20, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Comment last updated on August 20, 2018 at 18:45 Hours UTC

Uh oh!

discort commented Aug 20, 2018

Uh oh!

jreback left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

discort commented Aug 9, 2018 •

edited

Loading

codecov bot commented Aug 9, 2018 •

edited

Loading

discort Aug 10, 2018 •

edited

Loading

pep8speaks commented Aug 20, 2018 •

edited

Loading