ENH: allow 'size' in groupby aggregation #6312
Comments
Please note, the current (slow)
so count is non-null values, which goes some way to explain why it is slower.
yep...this is a very easy fix (just alias count to size) as it's already computed by the group indexer
What I mean is, count is a different operation to size: size just cares about the result_index, whilst count cares about whether values are non-null in columns... (same thing with value_counts; sometimes a user may want to count the values in another column).
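To make the distinction in the comment above concrete, here is a minimal sketch. The frame and its column names (`key`, `val`) are made up for illustration and do not come from the issue; the point is only that size counts rows per group while count skips nulls.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups, with missing values in "val".
df = pd.DataFrame(
    {
        "key": ["a", "a", "b", "b", "b"],
        "val": [1.0, np.nan, 3.0, np.nan, np.nan],
    }
)

grouped = df.groupby("key")

# size() only looks at group membership: number of rows per group.
sizes = grouped.size()

# count() inspects the values: number of non-null entries per group.
counts = grouped["val"].count()

print(sizes)
print(counts)
```

With this data the two results differ for every group, which is why aliasing count to size would change what count means.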
@jreback This issue is not really clear to me, as And I think we don't want to "alias count to size", as
no, I think we just need to alias size (like we do for mean). iow, add it to the cython table, I think (this might work now)
@jreback updated top post to clarify the issue
I'll note that we should look at count perf as well (maybe create another issue); it may have been fixed since this issue
this is implemented actually.
Allow the use of 'size' in groupby's aggregate, so you can do:
http://stackoverflow.com/questions/21660686/pandas-groupby-straight-forward-counting-the-number-of-elements-in-each-group-i
- count should directly implement size (enh)
- count/size should be allowed in an aggregation list (the bug)
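As a rough illustration of the request, the sketch below passes 'size' alongside 'count' in an aggregation list. The data and column names are hypothetical, and it assumes a pandas version in which 'size' is accepted by `agg` (the thread notes this was later implemented); behaviour on older releases may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical data, not taken from the issue.
df = pd.DataFrame(
    {
        "key": ["a", "a", "b", "b", "b"],
        "val": [1.0, np.nan, 3.0, np.nan, np.nan],
    }
)

# 'size' and 'count' requested together in one aggregation list.
# The result has one row per group and one column per aggregation name.
result = df.groupby("key")["val"].agg(["size", "count"])

print(result)
```

The resulting frame puts rows per group and non-null values per group side by side, which is the use case raised in the linked Stack Overflow question.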