Discuss: transformation vs. aggregation in agg vs. transform #27389
I would be fine with the first point to ensure agg reduces. I'm not clear on what issue points 2 and 3 are trying to solve.
Agreed. We should deprecate the current behavior with a warning. I also don't understand point 2.
Yes, I find this useful.
Edited OP for clarity. What should
Good. But how does that sit with 25.0 being released and the no-deprecation policy during this release cycle coming up on 1.0? Can we do it?
Discussion on the mailing list: https://mail.python.org/pipermail/pandas-dev/2019-July/001030.html I think the plan was to be light on new deprecations for 1.0. But I don't know how wedded we are to that idea.
I think the more recent discussion of this has moved to #35725, so closing in favor of that issue.
There have been multiple issues regarding `agg` and `transform` (and `apply`).

Why do I get strange aggregation result from DataFrame groupby()? #26960, Behavior of new df.agg, df.transform and df.apply is very inconsistent #18103 in both the DataFrame and Grou

`transform('rank')` and others returning the wrong answer (first issue from 2016): bug when filling missing values with transform? #14274, Shortcut functions in transform are not grouped #19354, Wrong output of GroupBy transform with string input (e.g., transform('rank')) #22509.

I'd like to prepare a PR to fix this, but I need to know what the consensus is first.
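
For context, the distinction these issues turn on can be sketched as follows. This is a minimal illustration on made-up data (the frame `df` and its `key`/`val` columns are assumptions, not taken from any of the linked reports): an aggregation returns one value per group, `transform` with an aggregation name broadcasts that value back onto the original rows, and a transformation such as `rank` returns one value per input row.

```python
import pandas as pd

# Hypothetical data, made up for illustration only.
df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [1, 2, 3, 4]})
g = df.groupby("key")["val"]

reduced = g.agg("sum")          # aggregation: one value per group
broadcast = g.transform("sum")  # aggregation broadcast back onto the original index
ranked = g.rank()               # transformation: one value per input row

print(reduced)
print(broadcast)
print(ranked)
```

The result of `agg` is indexed by group key, while the other two keep the index of `df`. That difference in shape is what the questions below are about.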
1. Should `Groupby/DataFrame/Series.agg` disallow transformations?

#14741 (comment) about `Groupby.agg`

`agg` currently accepts transformations as well. `DataFrame.agg` was merged in #27389 despite repeated objections to mixing transformations and aggregations, #14668 (comment) and #14668 (comment).

2. What should `transform('rank')` return? (updated)
On the one hand, users have been told that transformations don't belong in `transform` (!): #22509 (comment), #14274 (comment). This makes sense if you think of the `transform('name')` form as solely for broadcasting aggregations.

On the other hand, the documentation for `transform`, as well as Wes McKinney's excellent pandas book, portrays `transform` as the dedicated tool for shape-preserving operations, so excluding them from the `transform('name')` case would be a little surprising.

Personally, I'm +0 for deprecating `transform('rank')` with a warning to use `g.rank()`, and likewise for the other transformation ops.
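
To make the `transform('rank')` question concrete, here is a small sketch on made-up data (the frame and column names are assumptions for illustration). It computes the per-group ranks with `g.rank()`, the ungrouped ranks with `Series.rank()`, and the string form under discussion. Whether the string form agrees with `g.rank()` depends on the pandas version: #19354 describes the shortcut functions as not being grouped, so the comparison is printed rather than asserted.

```python
import pandas as pd

# Hypothetical data, chosen so that grouped and ungrouped ranks differ.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [10, 1, 20, 2]})
g = df.groupby("key")["val"]

per_group = g.rank()              # ranks computed within each group
ungrouped = df["val"].rank()      # ranks computed over the whole column
via_string = g.transform("rank")  # the string form this issue asks about

print("g.rank():          ", per_group.tolist())
print("df['val'].rank():  ", ungrouped.tolist())
print("transform('rank'): ", via_string.tolist())
print("string form matches g.rank():", via_string.equals(per_group))
```

On this data the per-group ranks are `[1.0, 1.0, 2.0, 2.0]` and the ungrouped ranks are `[3.0, 1.0, 4.0, 2.0]`, so the last printed line shows which of the two the installed version returns for the string form.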