
BUG: GroupBy.apply doesn't add group keys to index whether setting group_keys=True or not. #40720

Closed
2 of 3 tasks
taozuoqiao opened this issue Apr 1, 2021 · 3 comments
Labels
Apply (Apply, Aggregate, Transform, Map), Bug

Comments

taozuoqiao commented Apr 1, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


code

import pandas as pd
print(pd.__version__)
data = { 
    'groupby_col': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', ], 
    'agg_col': [1, 1, 0, 1, 0, 0, 0, 0, 1, 0], 
} 
df = pd.DataFrame(data)

print(df.groupby('groupby_col', group_keys=True).apply(lambda x: x.rolling(4).mean()))
print(df.groupby('groupby_col', group_keys=False).apply(lambda x: x.rolling(4).mean()))

output

1.2.3
   agg_col
0      NaN
1      NaN
2      NaN
3     0.75
4     0.50
5      NaN
6      NaN
7      NaN
8     0.25
9     0.25
   agg_col
0      NaN
1      NaN
2      NaN
3     0.75
4     0.50
5      NaN
6      NaN
7      NaN
8     0.25
9     0.25

The group keys should be added to the index when group_keys=True (the default) is set. With pandas==1.1.5, the output is as expected:

1.1.5
               agg_col
groupby_col           
A           0      NaN
            1      NaN
            2      NaN
            3     0.75
            4     0.50
B           5      NaN
            6      NaN
            7      NaN
            8     0.25
            9     0.25
   agg_col
0      NaN
1      NaN
2      NaN
3     0.75
4     0.50
5      NaN
6      NaN
7      NaN
8     0.25
9     0.25
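
A possible workaround on affected versions, which is not taken from this thread, is to build the keyed index explicitly with pd.concat instead of relying on group_keys. The sketch below is only an illustration: the variable names are made up, it selects agg_col explicitly so the string grouping column is never passed to rolling, and it returns a Series rather than a one-column DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "groupby_col": ["A"] * 5 + ["B"] * 5,
    "agg_col": [1, 1, 0, 1, 0, 0, 0, 0, 1, 0],
})

# Compute the rolling mean per group; the dict keys become the outer index level.
pieces = {
    key: group["agg_col"].rolling(4).mean()
    for key, group in df.groupby("groupby_col")
}

# names labels the outer level; the inner level keeps the original row labels.
result = pd.concat(pieces, names=["groupby_col"])
print(result)
```

Because the index is assembled by hand, this should not depend on how a given pandas version interprets group_keys.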

However, the example in #38787 (comment) does work as expected, even in 1.2.3:

code

df = pd.DataFrame({"key": [1, 1, 1, 2, 2, 2, 3, 3, 3], "value": range(9)})
print(df.groupby("key", group_keys=True).apply(lambda x: x[:2]))
print(df.groupby("key", group_keys=False).apply(lambda x: x[:2]))

output

       key  value
key              
1   0    1      0
    1    1      1
2   3    2      3
    4    2      4
3   6    3      6
    7    3      7
   key  value
0    1      0
1    1      1
3    2      3
4    2      4
6    3      6
7    3      7
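
A likely reason the two examples differ, going by the pandas 1.5.0 release notes rather than anything stated in the report: before 1.5.0, apply treated a function whose result has the same index as its input as a "transformer" and ignored group_keys for it. The hypothetical helper below (is_like_indexed is not a pandas function) just checks that property for the two lambdas used above.

```python
import pandas as pd

def is_like_indexed(func, group):
    """Hypothetical helper: True if func returns a result indexed like its input."""
    return func(group).index.equals(group.index)

# One group's worth of data, standing in for what apply passes to the function.
group = pd.DataFrame({"value": [1, 1, 0, 1, 0]})

# rolling(...).mean() keeps every row, so the index is unchanged.
rolling_like = is_like_indexed(lambda x: x.rolling(4).mean(), group)

# x[:2] keeps only the first two rows, so the index differs from the input.
head_like = is_like_indexed(lambda x: x[:2], group)

print(rolling_like, head_like)  # True False
```

Under that reading, the rolling example is treated as a transformer and loses its group keys, while the slicing example is not and keeps them.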
@taozuoqiao added the Bug and Needs Triage (issue that has not been reviewed by a pandas team member) labels on Apr 1, 2021
@kaisengit

Can confirm that this is still a problem with pandas 1.3.0

@mroeschke added the Apply (Apply, Aggregate, Transform, Map) label and removed the Needs Triage (issue that has not been reviewed by a pandas team member) label on Aug 19, 2021
@rhshadrach
Member

This was fixed in 1.5.0: https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.5.0.html#using-group-keys-with-transformers-in-dataframegroupby-apply-and-seriesgroupby-apply
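
To check the fixed behavior locally, a minimal sketch along these lines may help. It assumes pandas 1.5 or later, uses an illustrative rolling_mean helper, and selects agg_col before apply so that the string grouping column is not fed to rolling (newer pandas versions are stricter about that than the 1.x versions in the report).

```python
import pandas as pd

df = pd.DataFrame({
    "groupby_col": ["A"] * 5 + ["B"] * 5,
    "agg_col": [1, 1, 0, 1, 0, 0, 0, 0, 1, 0],
})

def rolling_mean(s):
    # Like-indexed ("transformer") function: one output row per input row.
    return s.rolling(4).mean()

# Assumption: pandas >= 1.5, where group_keys is honored for transformers.
keyed = df.groupby("groupby_col", group_keys=True)["agg_col"].apply(rolling_mean)
unkeyed = df.groupby("groupby_col", group_keys=False)["agg_col"].apply(rolling_mean)

# Number of index levels in each result.
print(keyed.index.nlevels, unkeyed.index.nlevels)
```

On a fixed version, the first result should carry groupby_col as an extra outer index level and the second should keep the original flat index.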
