BUG: #57775 Fix groupby apply in case func returns None for all groups #57800

Merged
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.2.2.rst
@@ -21,7 +21,7 @@ Fixed regressions

 Bug fixes
 ~~~~~~~~~
--
+- :meth:`DataFrameGroupBy.apply` was returning a completely empty DataFrame when all return values of ``func`` were ``None`` instead of returning an empty DataFrame with the original columns and dtypes. (:issue:`57775`)
Member

Can you move this to v3.0.0.rst?

Contributor Author

Done


Member

Could you undo this white space change?

Contributor Author

Done. Thanks, I missed that one.

 .. ---------------------------------------------------------------------------
 .. _whatsnew_222.other:
7 changes: 5 additions & 2 deletions pandas/core/groupby/generic.py
@@ -1642,8 +1642,11 @@ def _wrap_applied_output(
         first_not_none = next(com.not_none(*values), None)

         if first_not_none is None:
-            # GH9684 - All values are None, return an empty frame.
-            return self.obj._constructor()
+            # GH9684 - All values are None, return an empty frame
+            # GH57775 - Ensure that columns and dtypes from original frame are kept.
+            result = self.obj._constructor(columns=data.columns)
+            result = result.astype(data.dtypes)
+            return result
         elif isinstance(first_not_none, DataFrame):
             return self._concat_objects(
                 values,
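
The pattern in this hunk can be tried outside pandas internals. The following is a minimal sketch, not the library's code path: `data` is a hypothetical stand-in frame and its column names are made up for illustration. It shows that a bare constructor call drops the schema, while passing `columns=` and then calling `astype` with the original dtypes keeps it.

```python
import pandas as pd

# Hypothetical stand-in for the frame being grouped (names are illustrative).
data = pd.DataFrame({"groups": ["a", "b"], "vars": [1.5, 2.5], "n": [1, 2]})

# Old behaviour: a bare constructor call yields no columns at all.
bare = pd.DataFrame()

# New behaviour: keep the column labels, then restore each column's dtype.
# Without the astype call every column would default to object dtype.
empty = pd.DataFrame(columns=data.columns).astype(data.dtypes)

print(bare.shape)
print(empty.shape)
print(empty.dtypes)
```

Downstream code that selects columns by name or relies on dtypes keeps working on `empty`, whereas it would raise a `KeyError` on `bare`.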
15 changes: 15 additions & 0 deletions pandas/core/groupby/groupby.py
@@ -1572,6 +1572,13 @@ def apply(self, func, *args, include_groups: bool = True, **kwargs) -> NDFrameT:
         behavior or errors and are not supported. See :ref:`gotchas.udf-mutation`
         for more details.

+        Groups for which ``func`` returns ``None`` will be filtered from the result.
+
+        .. versionchanged:: 2.2.2
+
+            In case all groups are filtered from the result, an empty DataFrame
+            with the columns and dtypes of the original dataframe will be returned.
+
Member

Suggested change
-Groups for which ``func`` returns ``None`` will be filtered from the result.
-.. versionchanged:: 2.2.2
-In case all groups are filtered from the result, an empty DataFrame
-with the columns and dtypes of the original dataframe will be returned.

(the example below is sufficient)

Contributor Author

Removed the description, only left the example

         Examples
         --------
         >>> df = pd.DataFrame({"A": "a a b".split(), "B": [1, 2, 3], "C": [4, 6, 5]})
@@ -1636,6 +1643,14 @@ def apply(self, func, *args, include_groups: bool = True, **kwargs) -> NDFrameT:
         a    5
         b    2
         dtype: int64
+
+        Example 4: The function passed to ``apply`` returns ``None`` for one of the
+        group. This group is filtered from the result:
+
+        >>> g1.apply(lambda x: None if x.iloc[0, 0] == 3 else x, include_groups=False)
+           B  C
+        0  1  4
+        1  2  6
         """
         if isinstance(func, str):
             if hasattr(self, func):
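
The filtering described in Example 4 can be reproduced in a small standalone script. This sketch deliberately deviates from the docstring in one respect: it selects the `B` and `C` columns explicitly instead of passing `include_groups=False`, so that it does not depend on that keyword being available. The columns and dtypes of the all-`None` result depend on whether the installed pandas includes this fix, so the script only reports that the result is empty.

```python
import pandas as pd

df = pd.DataFrame({"A": "a a b".split(), "B": [1, 2, 3], "C": [4, 6, 5]})

# Assumption: explicit column selection stands in for include_groups=False.
g1 = df.groupby("A", group_keys=False)[["B", "C"]]

# Group "b" starts with B == 3, so func returns None and that group is dropped.
partial = g1.apply(lambda x: None if x.iloc[0, 0] == 3 else x)
print(partial)

# Every group returns None, so no rows survive.
all_none = g1.apply(lambda x: None)
print(len(all_none))
```

On a pandas build that contains the fix, inspecting `all_none.columns` and `all_none.dtypes` is the quickest way to see the difference this PR makes.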
3 changes: 2 additions & 1 deletion pandas/tests/groupby/test_apply.py
@@ -838,7 +838,8 @@ def test_func(x):
     msg = "DataFrameGroupBy.apply operated on the grouping columns"
     with tm.assert_produces_warning(DeprecationWarning, match=msg):
         result = test_df.groupby("groups").apply(test_func)
-    expected = DataFrame()
+    expected = DataFrame(columns=test_df.columns)
+    expected = expected.astype(test_df.dtypes)
     tm.assert_frame_equal(result, expected)
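
The expectation in this test had to change because the frame-equality helper treats a bare empty frame and an empty frame with columns as different. A small sketch using the public `pandas.testing` module illustrates this; `test_df` here is a hypothetical stand-in, not the fixture from the test module.

```python
import pandas as pd
import pandas.testing as tm

# Hypothetical stand-in for the frame used in the test.
test_df = pd.DataFrame({"groups": [1, 1, 2], "vars": [0.5, 1.5, 2.5]})

old_expected = pd.DataFrame()
new_expected = pd.DataFrame(columns=test_df.columns).astype(test_df.dtypes)

# The two "empty" frames have different shapes, so the comparison fails.
try:
    tm.assert_frame_equal(old_expected, new_expected)
    frames_match = True
except AssertionError:
    frames_match = False

print(frames_match)  # False
```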