df.groupby(group_keys=True) sometimes doesn't do anything #26805

Closed
ghost opened this issue Jun 12, 2019 · 1 comment
Labels
Duplicate Report Duplicate issue or pull request Groupby

Comments

ghost commented Jun 12, 2019

According to the df.groupby docstring:

group_keys : bool, default True
    When calling apply, add group keys to index to identify pieces.

But it seems to work for some cases and not others:

import numpy as np
import pandas as pd

df = pd.DataFrame({'key': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'value': range(9)})

df.groupby('key', group_keys=True).apply(lambda x: x.key)  # index by groups
df.groupby('key', group_keys=True).apply(np.sum)  # index by groups (pd.np is gone in newer pandas)
df.groupby('key', group_keys=True).apply(lambda x: x[:].key)  # index by groups
df.groupby('key', group_keys=True).apply(lambda x: x - x.mean())  # does nothing
df.groupby('key', group_keys=True).apply(lambda x: x)  # does nothing

For example, the following gives the same output regardless of the group_keys value:

import pandas as pd
df = pd.DataFrame(dict(price=[10, 10, 20, 20, 30, 30],
                       color=[10, 10, 20, 20, 30, 30],
                       cost=(100, 200, 300, 400, 500, 600)))
df.groupby(['price'], group_keys=False).apply(lambda x: x)
# result
   price  color  cost
0     10     10   100
1     10     10   200
2     20     20   300
3     20     20   400
4     30     30   500
5     30     30   600

df.groupby(['price'], group_keys=True).apply(lambda x: x)
# same result
   price  color  cost
0     10     10   100
1     10     10   200
2     20     20   300
3     20     20   400
4     30     30   500
5     30     30   600
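
One way to see the pattern in the cases above is to count the index levels of each result. The sketch below is only an illustration: `index_levels`, `first_two` and `demean` are hypothetical names, not pandas API, and the `value` column is selected explicitly so that the grouping column is not passed to the function. It assumes the distinction that matters is whether the applied function returns an object with the same index as the group. The output for the same-indexed case with group_keys=True may differ between pandas versions, so no expected output is shown.

```python
import pandas as pd

# Same frame as the first snippet in the report.
df = pd.DataFrame({"key": [1, 1, 1, 2, 2, 2, 3, 3, 3], "value": range(9)})


def index_levels(func, group_keys):
    """Hypothetical helper: count the index levels of the apply() result."""
    grouped = df.groupby("key", group_keys=group_keys)[["value"]]
    return grouped.apply(func).index.nlevels


def first_two(g):
    # Returns fewer rows than the group, so its index differs from the group's.
    return g.head(2)


def demean(g):
    # Returns a frame with exactly the same index as the group.
    return g - g.mean()


for func in (first_two, demean):
    for flag in (True, False):
        print(func.__name__, "group_keys =", flag, "->",
              index_levels(func, flag), "index level(s)")
```

If the group keys are added, the result has two index levels (the key plus the original row label); if not, it has one.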

xref #22545 for related groupby confusion.

WillAyd (Member) commented Jun 12, 2019

Thanks for the report. This is a duplicate of #22848; PRs and further investigation there would certainly be welcome.
