DEPR: Enforce DataFrameGroupBy.__iter__ returning tuples of length 1 #50064

Merged 1 commit on Dec 5, 2022
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.0.0.rst
@@ -596,7 +596,7 @@ Removal of prior version deprecations/changes
 - Enforced deprecation of silently dropping nuisance columns in groupby and resample operations when ``numeric_only=False`` (:issue:`41475`)
 - Changed default of ``numeric_only`` in various :class:`.DataFrameGroupBy` methods; all methods now default to ``numeric_only=False`` (:issue:`46072`)
 - Changed default of ``numeric_only`` to ``False`` in :class:`.Resampler` methods (:issue:`47177`)
--
+- When providing a list of columns of length one to :meth:`DataFrame.groupby`, the keys that are returned by iterating over the resulting :class:`DataFrameGroupBy` object will now be tuples of length one (:issue:`47761`)
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_200.performance:
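As a rough illustration of the user-facing change described in the release note, the sketch below compares grouping by a list of length one with grouping by a bare column label. It assumes pandas 2.0 or later is installed; the frame and the names `list_keys` and `scalar_keys` are made up for this example and do not appear in the PR.

```python
import pandas as pd

# Small illustrative frame; the column names are arbitrary.
df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})

# Grouper is a list of length one: with this change enforced,
# each key yielded by iteration is a tuple of length one.
list_keys = [key for key, _ in df.groupby(["a"])]

# Grouper is a bare column label: keys remain scalars.
scalar_keys = [key for key, _ in df.groupby("a")]

print(list_keys)
print(scalar_keys)
```

Code that previously unpacked scalar keys from `df.groupby(["a"])` would need to either pass `"a"` instead of `["a"]` or unpack the one-element tuple.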
17 changes: 4 additions & 13 deletions pandas/core/groupby/groupby.py
@@ -70,7 +70,6 @@ class providing the base-class of operations.
     cache_readonly,
     doc,
 )
-from pandas.util._exceptions import find_stack_level
 
 from pandas.core.dtypes.cast import ensure_dtype_can_hold_na
 from pandas.core.dtypes.common import (
@@ -832,19 +831,11 @@ def __iter__(self) -> Iterator[tuple[Hashable, NDFrameT]]:
         for each group
         """
         keys = self.keys
+        result = self.grouper.get_iterator(self._selected_obj, axis=self.axis)
         if isinstance(keys, list) and len(keys) == 1:
-            warnings.warn(
-                (
-                    "In a future version of pandas, a length 1 "
-                    "tuple will be returned when iterating over a "
-                    "groupby with a grouper equal to a list of "
-                    "length 1. Don't supply a list with a single grouper "
-                    "to avoid this warning."
-                ),
-                FutureWarning,
-                stacklevel=find_stack_level(),
-            )
-        return self.grouper.get_iterator(self._selected_obj, axis=self.axis)
+            # GH#42795 - when keys is a list, return tuples even when length is 1
+            result = (((key,), group) for key, group in result)
+        return result
 
 
 # To track operations that expand dimensions, like ohlc
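The key-wrapping step in the `__iter__` hunk above can be sketched without pandas. This is only an illustration of the pattern, not the library's code: `iter_groups` and the sample `pairs` are hypothetical stand-ins for `__iter__` and for the `(key, group)` pairs that the grouper's iterator yields.

```python
from __future__ import annotations

from typing import Any, Hashable, Iterable, Iterator


def iter_groups(
    keys: Any, pairs: Iterable[tuple[Hashable, Any]]
) -> Iterator[tuple[Hashable, Any]]:
    """Hypothetical stand-in for the patched __iter__ logic."""
    result = iter(pairs)
    if isinstance(keys, list) and len(keys) == 1:
        # Lazily wrap each scalar key in a 1-tuple; the groups are untouched.
        result = (((key,), group) for key, group in result)
    return result


# Illustrative (key, group) pairs; real groups would be DataFrames.
pairs = [(1, "group-1"), (2, "group-2")]

wrapped = list(iter_groups(["a"], pairs))  # keys passed as a list of length one
unwrapped = list(iter_groups("a", pairs))  # keys passed as a bare label
```

Using a generator expression keeps the iteration lazy, so wrapping the keys does not materialize all groups up front.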
16 changes: 4 additions & 12 deletions pandas/tests/groupby/test_groupby.py
@@ -2743,18 +2743,10 @@ def test_groupby_none_column_name():
 
 def test_single_element_list_grouping():
     # GH 42795
-    df = DataFrame(
-        {"a": [np.nan, 1], "b": [np.nan, 5], "c": [np.nan, 2]}, index=["x", "y"]
-    )
-    msg = (
-        "In a future version of pandas, a length 1 "
-        "tuple will be returned when iterating over "
-        "a groupby with a grouper equal to a list of "
-        "length 1. Don't supply a list with a single grouper "
-        "to avoid this warning."
-    )
-    with tm.assert_produces_warning(FutureWarning, match=msg):
-        values, _ = next(iter(df.groupby(["a"])))
+    df = DataFrame({"a": [1, 2], "b": [np.nan, 5], "c": [np.nan, 2]}, index=["x", "y"])
+    result = [key for key, _ in df.groupby(["a"])]
+    expected = [(1,), (2,)]
+    assert result == expected
 
 
 @pytest.mark.parametrize("func", ["sum", "cumsum", "cumprod", "prod"])