-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: Clarify difference between agg and apply for Series / DataFrame #49672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the changed tests imply changed behavior of apply or agg? If it does, this needs a whatsnew entry.
|
||
# caller can react | ||
return None | ||
|
||
def agg_udf(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is "udf" here a shorthand for? Can you rename to something more explicit and/or add a doc string & type hints to help better understand this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
User Defined Function; I'd prefer sticking with the name but completely agree it should be expanding upon in a docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Is |
Yes - it very much changes their behavior in various cases. I think one could make the argument that the inconsistency in the current behavior rises to the level of a bug, but because the difference between agg / apply was never made clear before, I think this behavior needs to be deprecated first before changing. This PR was meant just to demonstrate what the implementation of the proposal in #49673 would look like.
Yes - we can guarantee list-likes will return DataFrame and strings / callables will return Series. |
Ok great, do you have a plan how to do the deprecation? Seems a bit difficult to do cleanly. |
Yes - I've thought about this and it is quite difficult. So I wouldn't be surprised if what I have come up with missing some aspect(s). For a single argument:
For multiple arguments, we will need to use
If |
Ok, yes it is indeed difficult. You're saying that "For DataFrame.agg, we need to warn precisely when the result is a DataFrame...". Does this imply that things will keep working the old way, and users will just get a warning? How will users get to use the new (IMO better) way? A flag somewhere? |
I think for all ops, users will have a way of changing their code to retain current behavior that won't break when we make the change and we'll be able to warn them. I think that's sufficient for deprecation from a backwards compatibility point of view. Without a flag, they wouldn't have a way to get the new behavior. We could implement it with a flag - I'm not opposed to this, but also don't see it at necessary (might still be nice to do). |
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
|
||
from pandas import Series | ||
|
||
result = Series(results, index=keys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we be using something._constructor here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could be using selected_obj._constructor_sliced
@rhshadrach is this still draft or ready for review. also some conflicts to resolve. |
@simonjayhawkins - this was just to demonstrate what the changes look like for #49673; will need to deprecate first. I'll close this and put up one for the deprecation. |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.