ENH: new .agg for list-likes #43736
just a comment - I have always tried to avoid names like "new_routine" because at some point it won't be new, and you'll wish you had never called it that in the first place! :)
@attack68 - The plan is to put all usage of these methods behind the option "new_udf_methods"; when the old methods are deprecated and then removed, the new ones will be renamed by dropping the "new_" prefix. If you have any suggestions for different names, please do make them!
no, no suggestions. But, and I kid you not, in my line of work I have seen
Agreed with @attack68. It would be helpful to have slightly clearer naming even during the transition period. Maybe
@mroeschke and @attack68 - is there opposition to the "new_udf_methods" option as well, or just to the method names themselves? I am not a particular fan of "new" myself, but I could not think of anything concise and better. Whatever name is picked, I would like the relationship between the option name and the methods behind that option to be clear.
Edit: we could go with "experimental" and prefix the methods with that?
Edit: "experimental" is a bad choice because there will be a period where the default for this option is True, and at that point it is no longer experimental.
+1 I would hope these are consistent. Maybe
Thanks @mroeschke - I like "future".
…w_udfs_list_agg
Conflicts:
    pandas/tests/resample/test_resample_api.py
@@ -408,6 +414,77 @@ def agg_list_like(self) -> DataFrame | Series:
        )
        return concatenated.reindex(full_ordered_index, copy=False)

    def future_list_single_arg(
same request from the other PR re naming; "future" won't be very helpful for a reader a year from now
The intention is to have "future_" methods alongside the current methods, all with the same prefix so they are easy to identify. Any such method is behind the option "future_udf_behavior", meaning it will only be called when that option is set to True. Assuming we do end up going forward with this new (experimental) behavior, once it is in a good place we deprecate the option and then remove it.
A year from now, we will still have the option "future_udf_behavior", and in my opinion, the "future_" prefix is meaningful and helpful - namely in its connection to this option. It is also the (intended) future behavior of the methods. When the option is removed, the "old" methods are removed and the "future_" methods are renamed by removing the prefix (none are public).
you've clearly given this more thought than i have so im going to stop complaining about this, will instead grumble to myself
The name is still open to improvement and suggestions are most welcome, but I wanted to explain why I felt "future" was appropriate.
…w_udfs_list_agg
Conflicts:
    pandas/core/groupby/generic.py
    pandas/core/groupby/groupby.py
    pandas/tests/apply/test_frame_apply.py
    pandas/tests/groupby/test_groupby.py
* Fix dtypes for read_json
* Address comments
* Add whatsnew entry
* Update doc/source/whatsnew/v1.4.0.rst
Co-authored-by: Matthew Zeitlin <[email protected]>
* Linting
Co-authored-by: Matthew Zeitlin <[email protected]>
I've added docs (built attachment in OP) summarizing the changes here.
…w_udfs_list_agg
Conflicts:
    pandas/core/groupby/generic.py
    pandas/core/groupby/groupby.py
    pandas/tests/apply/test_frame_apply.py
    pandas/tests/groupby/test_groupby.py
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.
Part of #43678
Replaces the list-like implementation for .agg used by Series, DataFrame, SeriesGroupBy, DataFrameGroupBy, Window, and Resample. This will transpose agg results and swap MultiIndex levels, as well as change the order within the levels. Adds the new implementation behind the option "new_udf_methods".
Docs: future_udf_behavior.zip
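To make "transpose" and "swap levels" concrete, the shape changes can be approximated on today's output with existing public pandas calls. This is only a sketch under assumptions: the sample data is made up, the code does not use the option from this PR, and the exact target layout (function as the outer column level, sorted) is a guess at what the description means rather than the PR's actual result.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Current DataFrame.agg with a list: one row per function,
# one column per input column.
current = df.agg(["sum", "min"])

# Transposed layout: one row per input column, one column per function.
transposed = current.T

# Current GroupBy.agg with a list: columns are (column, function) pairs.
grouped = df.assign(key=["x", "x", "y"]).groupby("key").agg(["sum", "min"])

# Swap the two column levels to (function, column) and sort, so results
# are grouped by function instead of by input column.
swapped = grouped.swaplevel(axis=1).sort_index(axis=1)
```

Comparing `grouped.columns` with `swapped.columns` shows both effects named in the description: the levels trade places, and the ordering within the levels changes as a result.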