Change groupby value_counts (from fall through behaviour) #6540

Closed
hayd opened this issue Mar 4, 2014 · 5 comments

@hayd
Contributor

hayd commented Mar 4, 2014

The fall-through value_counts (for Series) is a bit strange. I think the result shown in Out[136] below (with the standard options) would be better.

I can put something together if people think it's a good idea.

In [132]: df = pd.DataFrame([['a_link', 'dofollow'], ['a_link', 'dofollow'], ['a_link', 'nofollow'], ['b_link', 'javascript']], columns=['link', 'type'])

In [133]: g = df.groupby(['link', 'type'])

In [134]: g.value_counts()
AttributeError: 'DataFrameGroupBy' object has no attribute 'value_counts'

In [135]: g.link.value_counts()   # redundant level
Out[135]: 
link    type              
a_link  dofollow    a_link    2
        nofollow    a_link    1
b_link  javascript  b_link    1
dtype: int64


The following would make sense for DataFrameGroupBy too:
In [136]: pd.Series([len(g.groups[i]) for i in g.grouper.result_index], g.grouper.result_index)
Out[136]: 
link    type      
a_link  dofollow      2
        nofollow      1
b_link  javascript    1
dtype: int64

Note: as_index doesn't make sense here, so it would be ignored.

@hayd
Contributor Author

hayd commented Mar 4, 2014

Ahem, that would be size....
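
For reference, a minimal sketch of that equivalence, using the frame from the original post. It assumes only that pandas is installed; the variable name `sizes` is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["a_link", "dofollow"],
        ["a_link", "dofollow"],
        ["a_link", "nofollow"],
        ["b_link", "javascript"],
    ],
    columns=["link", "type"],
)

# size() counts the rows in each (link, type) group and returns a Series
# indexed by the group keys, without the redundant extra level.
sizes = df.groupby(["link", "type"]).size()
print(sizes)
```

The result is indexed by link and type only, which matches the shape proposed in Out[136].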

@hayd hayd closed this as completed Mar 4, 2014
@hayd
Contributor Author

hayd commented Mar 4, 2014

Maybe it should be size with the standard value_counts options?

@hayd hayd reopened this Mar 4, 2014
@jreback
Contributor

jreback commented Mar 4, 2014

Maybe just alias value_counts to size?

This is related to (maybe a dupe of) #6312.

@jreback jreback added this to the 0.14.0 milestone Mar 4, 2014
@hayd
Contributor Author

hayd commented Mar 4, 2014

Yeah I think this makes sense, I don't think most people want to do anything else (trying to think of a case when this wouldn't be desired)... I guess that's when the selected column isn't in the index... darn then it is going to be different. :s
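
A short sketch of the case where the two differ, assuming the same example frame: when the counted column is not one of the grouping keys, value_counts on the selected column breaks each group down further, while size only reports the group totals. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["a_link", "dofollow"],
        ["a_link", "dofollow"],
        ["a_link", "nofollow"],
        ["b_link", "javascript"],
    ],
    columns=["link", "type"],
)

g = df.groupby("link")  # 'type' is NOT a grouping key here

# Counts of each distinct 'type' value within every 'link' group.
per_value = g["type"].value_counts()

# Total number of rows in every 'link' group.
per_group = g.size()

print(per_value)
print(per_group)
```

Here the two results have different shapes, so a plain alias from value_counts to size would change behaviour for this kind of selection.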

@jreback jreback modified the milestones: 0.15.0, 0.14.0 May 5, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015
@datapythonista datapythonista modified the milestones: Contributions Welcome, Someday Jul 8, 2018
@rhshadrach
Member

In the result of a groupby, the groups are the index, not the values. To me, this makes "g.value_counts()" a bit confusing. Since g.size() already gives the desired output, I personally think this should not be implemented/aliased.

@rhshadrach rhshadrach modified the milestones: Someday, No action Jul 28, 2020

6 participants