Skip to content

DEPR: Remove unnecessary kwargs in groupby ops #50483

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 16 commits into from
Closed
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.0.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -505,6 +505,7 @@ Other API changes
Deprecations
~~~~~~~~~~~~
- Deprecated argument ``infer_datetime_format`` in :func:`to_datetime` and :func:`read_csv`, as a strict version of it is now the default (:issue:`48621`)
- Deprecate unused argument ``**kwargs`` from :meth:`cummax`, :meth:`cummin`, :meth:`cumsum`, :meth:`cumprod`, :meth:`take` and :meth:`skew` (:issue:`50483`)

.. ---------------------------------------------------------------------------

Expand Down
32 changes: 16 additions & 16 deletions pandas/core/groupby/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@
from pandas.util._decorators import (
Appender,
Substitution,
deprecate_nonkeyword_arguments,
doc,
)

Expand Down Expand Up @@ -893,6 +894,9 @@ def fillna(
return result

@doc(Series.take.__doc__)
@deprecate_nonkeyword_arguments(
version="3.0", allowed_args=["self", "axis", "indices"]
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having another look, looks like deprecate_nonkeyword_arguments doesn't deprecate the argument, but makes it usable only via a keyword argument (e.g. foo(1) would produce the warning, but foo(val=1) wouldn't). If that's the case, doesn't seem like we've got a decorator to simply deprecate an argument. In that case, you'll have to produce it manually.

def take(
self,
indices: TakeIndexer,
Expand All @@ -903,6 +907,9 @@ def take(
return result

@doc(Series.skew.__doc__)
@deprecate_nonkeyword_arguments(
version="3.0", allowed_args=["self", "axis", "skipna", "numeric_only"]
)
def skew(
self,
axis: Axis | lib.NoDefault = lib.no_default,
Expand All @@ -911,11 +918,7 @@ def skew(
**kwargs,
) -> Series:
result = self._op_via_apply(
"skew",
axis=axis,
skipna=skipna,
numeric_only=numeric_only,
**kwargs,
"skew", axis=axis, skipna=skipna, numeric_only=numeric_only, **kwargs
)
return result

Expand Down Expand Up @@ -2272,16 +2275,17 @@ def fillna(
return result

@doc(DataFrame.take.__doc__)
def take(
self,
indices: TakeIndexer,
axis: Axis | None = 0,
**kwargs,
) -> DataFrame:
@deprecate_nonkeyword_arguments(
version="3.0", allowed_args=["self", "indices", "axis"]
)
def take(self, indices: TakeIndexer, axis: Axis | None = 0, **kwargs) -> DataFrame:
result = self._op_via_apply("take", indices=indices, axis=axis, **kwargs)
return result

@doc(DataFrame.skew.__doc__)
@deprecate_nonkeyword_arguments(
version="3.0", allowed_args=["self", "axis", "skipna", "numeric_only"]
)
def skew(
self,
axis: Axis | None | lib.NoDefault = lib.no_default,
Expand All @@ -2290,11 +2294,7 @@ def skew(
**kwargs,
) -> DataFrame:
result = self._op_via_apply(
"skew",
axis=axis,
skipna=skipna,
numeric_only=numeric_only,
**kwargs,
"skew", axis=axis, skipna=skipna, numeric_only=numeric_only, **kwargs
)
return result

Expand Down
9 changes: 9 additions & 0 deletions pandas/core/groupby/groupby.py
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,7 @@ class providing the base-class of operations.
Appender,
Substitution,
cache_readonly,
deprecate_nonkeyword_arguments,
doc,
)

Expand Down Expand Up @@ -3411,6 +3412,7 @@ def rank(
@final
@Substitution(name="groupby")
@Appender(_common_see_also)
@deprecate_nonkeyword_arguments(version="3.0", allowed_args=["self", "axis"])
def cumprod(self, axis: Axis = 0, *args, **kwargs) -> NDFrameT:
"""
Cumulative product for each group.
Expand All @@ -3429,6 +3431,7 @@ def cumprod(self, axis: Axis = 0, *args, **kwargs) -> NDFrameT:
@final
@Substitution(name="groupby")
@Appender(_common_see_also)
@deprecate_nonkeyword_arguments(version="3.0", allowed_args=["self", "axis"])
def cumsum(self, axis: Axis = 0, *args, **kwargs) -> NDFrameT:
"""
Cumulative sum for each group.
Expand All @@ -3447,6 +3450,9 @@ def cumsum(self, axis: Axis = 0, *args, **kwargs) -> NDFrameT:
@final
@Substitution(name="groupby")
@Appender(_common_see_also)
@deprecate_nonkeyword_arguments(
version="3.0", allowed_args=["self", "axis", "numeric_only"]
)
def cummin(
self, axis: AxisInt = 0, numeric_only: bool = False, **kwargs
) -> NDFrameT:
Expand All @@ -3472,6 +3478,9 @@ def cummin(
@final
@Substitution(name="groupby")
@Appender(_common_see_also)
@deprecate_nonkeyword_arguments(
version="3.0", allowed_args=["self", "axis", "numeric_only"]
)
def cummax(
self, axis: AxisInt = 0, numeric_only: bool = False, **kwargs
) -> NDFrameT:
Expand Down
16 changes: 16 additions & 0 deletions pandas/tests/groupby/test_function.py
Original file line number Diff line number Diff line change
Expand Up @@ -1594,3 +1594,19 @@ def test_multiindex_group_all_columns_when_empty(groupby_func):
result = method(*args).index
expected = df.index
tm.assert_index_equal(result, expected)


def test_deprecated_keywords():
msg = (
"In the future version of pandas the arguments args"
"and kwargs will be deprecated for these functions"
)
with tm.assert_produces_warning(FutureWarning, match=msg):
df = DataFrame({"a": [1, 1, 2], "b": [3, 4, 5]})
gb = df.groupby("a")
assert gb.cummax(kwargs=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using kwargs as the sample keyword argument is a bit misleading. Something like gb.cummax(incorrect_argument=)` would probably be clearer.

assert gb.cummin(kwargs=1)
assert gb.cumsum(args=1, kwargs=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've got this wrong. This is not testing args at all, to test if the *args deprecation is working you need to use gb.cumsum(1, incorrect_argument=1). And I'd recommend splitting this into two separate tests, otherwise you will only test that one of them works, not that both work.

I'd recommend that you make sure you understand well what the * and ** do in a function signature like def foo(my_arg, *args, **kwargs):

assert gb.cumprod(args=1, kwargs=1)
assert gb.skew(kwargs=1)
assert gb.take(kwargs=1)
Comment on lines +1607 to +1612
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert is used to assert that a certain condition is true. Do you want to assert something here, or just call these functions?

Also, what will happen with this test if cummax produces the warning, but the other functions don't? Seems like the test will pass, even if the changes don't do what we want them to do. You probably want to use pytest parametrization, with the function name as a parameter, then you can retrieve the function with getattr(gb, func_name).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want to use pytest parametrization, with the function name as a parameter, then you can retrieve the function with getattr(gb, func_name).

i'll try this way