Groupby doesn't call aggregation on empty groups #18869

Closed
TomAugspurger opened this issue Dec 20, 2017 · 6 comments
Labels: Bug, Categorical (Categorical Data Type), good first issue, Groupby, Needs Tests (Unit test(s) needed to prevent regressions)

@TomAugspurger
Contributor

This should exit the interpreter.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"A": pd.Categorical(['a', 'a'], categories=['a', 'b']),
   ...:                    "B": [1, 1]})
   ...:

In [3]: import sys

In [4]: def f(x):
   ...:     if len(x) == 0:
   ...:         sys.exit(1)
   ...:     return len(x)
   ...:

In [5]: df.groupby('A').agg(f)
Out[5]:
     B
A
a  2.0
b  NaN

Instead, I think groupby never calls the custom aggfunc on the empty group and assumes its output is NaN.
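
A gentler way to check the same thing, without killing the interpreter, is to record the group sizes the aggfunc is actually called with. This is only a sketch: `seen_lengths` is a hypothetical helper list that is not part of the original report, and `observed=False` is passed explicitly as an assumption so that the unobserved category "b" stays in the result regardless of the version's default.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": pd.Categorical(["a", "a"], categories=["a", "b"]), "B": [1, 1]}
)

# Hypothetical helper state, not part of the original report.
seen_lengths = []


def f(x):
    # Record the size of every group the aggfunc is called with.
    seen_lengths.append(len(x))
    return len(x)


# observed=False (an assumption made here) keeps the unobserved
# category "b" in the output.
result = df.groupby("A", observed=False).agg(f)

print(result)
print("aggfunc was called on an empty group:", 0 in seen_lengths)
```

If the behaviour described in this report is present, the last line should print False, since `f` is never handed the empty "b" group; on a version where it is fixed, it should print True.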

@TomAugspurger added the Categorical (Categorical Data Type), Difficulty Intermediate, Groupby, and Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate) labels Dec 20, 2017
@TomAugspurger added this to the Next Major Release milestone Dec 20, 2017
@mroeschke added the Bug label Jun 28, 2020
@arw2019
Member

arw2019 commented Nov 13, 2020

This is still a problem on 1.2 master.

@rhshadrach
Member

This is fixed on main but could use tests.

@rhshadrach added the good first issue and Needs Tests (Unit test(s) needed to prevent regressions) labels Nov 14, 2023
@HaruguchiKazuto
Contributor

@rhshadrach
Hello, I would like to work on this issue. This will be my first contribution to OSS.
I understand that I can add test cases.
If it's convenient, could you please let me know which file I should add the test cases to? That would be very helpful.

@rhshadrach
Member

Thanks @HaruguchiKazuto - this would go in pandas.tests.groupby.aggregate.test_aggregate.
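
For orientation, a regression test for this could look roughly like the sketch below. It is not the test that was eventually merged; the function names, the error message, and the explicit `observed=False` are assumptions made for illustration. The idea is to have the aggfunc raise on an empty group and check that the error actually surfaces.

```python
import pandas as pd


def aggfunc_sees_empty_group():
    """Return True if groupby.agg calls the UDF on the unobserved category.

    Hypothetical helper, written only for this sketch.
    """

    def func(x):
        # Raise instead of calling sys.exit so the check is safe to run.
        if len(x) == 0:
            raise ValueError("length must not be 0")
        return len(x)

    df = pd.DataFrame(
        {"A": pd.Categorical(["a", "a"], categories=["a", "b"]), "B": [1, 1]}
    )
    try:
        df.groupby("A", observed=False).agg(func)
    except ValueError as err:
        # The UDF's own error propagated, so it was called on the empty group.
        return "length must not be 0" in str(err)
    return False


def test_agg_calls_func_on_empty_group():
    # Expected to pass only on versions where this issue is fixed.
    assert aggfunc_sees_empty_group()
```

In the pandas test suite itself this would more idiomatically use `pytest.raises`; the plain `try`/`except` is used here only to keep the sketch free of extra dependencies.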

@HaruguchiKazuto
Contributor

take

@rhshadrach removed the Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate) label Nov 24, 2023
@rhshadrach added this to the 2.2 milestone Nov 29, 2023
@rhshadrach
Member

Closed by #56145

6 participants