Skip to content

ENH: Implement Keyword Aggregation for DataFrame.agg and Series.agg #29116

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 148 commits into from
Jul 10, 2020

Conversation

charlesdong1991
Copy link
Member

@charlesdong1991 charlesdong1991 commented Oct 20, 2019

This PR basically deals with three issue:

  1. implement keyword aggregation for DataFrame.agg
  2. implement keyword aggregation for Series.agg

@@ -0,0 +1,193 @@
from collections import OrderedDict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more informative name than "helper"?

func, indexes, order = _normalize_keyword_aggregation(kwargs)
reordered_indexes = [
pair[0] for pair in sorted(zip(indexes, order), key=lambda t: t[1])
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of this is already implemented in the SelectionMixin
you should not have to add much code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emm, i think this implementation is sort of post-processing after aggregation from SelectionMixin? no?
i will try my best to shorten the code anyway! thanks! @jreback

@charlesdong1991 charlesdong1991 changed the title ENH: Implement Keyword Aggregation for DataFrame.agg [WIP] ENH: Implement Keyword Aggregation for DataFrame.agg Oct 20, 2019
@charlesdong1991 charlesdong1991 changed the title [WIP] ENH: Implement Keyword Aggregation for DataFrame.agg ENH: Implement Keyword Aggregation for DataFrame.agg and Series.agg Oct 20, 2019
@charlesdong1991
Copy link
Member Author

ping @jreback @WillAyd for further reviews, thanks in advance!

@charlesdong1991 charlesdong1991 requested a review from WillAyd June 27, 2020 19:38
Union[
Union[Callable, str],
List[Union[Callable, str]],
Dict[Label, Union[Union[Callable, str], List[Union[Callable, str]]]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Dict[Label, Union[Union[Callable, str], List[Union[Callable, str]]]],
Dict[Label, Union[Callable, str], List[Union[Callable, str]]],

If reading correctly I think this is superfluous

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it's indeed superfluous, i changed to Dict[Label, Union[Callable, str, List[Union[Callable, str]]]

@charlesdong1991 charlesdong1991 requested a review from WillAyd July 5, 2020 12:40
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. very nice, just a minor changes. ping on green.

func: Optional[
Union[
Union[Callable, str],
List[Union[Callable, str]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

crazy signatures, can you create an alias to make this easier to read, maybe some of this is Label?

@@ -198,6 +293,79 @@ def maybe_mangle_lambdas(agg_spec: Any) -> Any:
return mangled_aggspec


def _relabel_result(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this public (no _) as we are calling from a related module (its not exposed to users at all, just easier to grok)

@jreback jreback added this to the 1.1 milestone Jul 10, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm @charlesdong1991 typing cleanings can e a followup

@charlesdong1991
Copy link
Member Author

charlesdong1991 commented Jul 10, 2020

oh, sorry, i am working on the annotations, just haven't pushed the commit @jreback

will ping you for review once i have simplified annotations!

@charlesdong1991
Copy link
Member Author

emm, the errors look unrelated to this PR, I will rebase once there is new commits merged to master.

In the meantime, maybe pls take a look at the annotations so that i could make adjustments if needed? @jreback

@charlesdong1991
Copy link
Member Author

ehh?? looks good again somehow? 😅 i thought i saw errors back then

sorry for the noise

import pandas.core.common as com
from pandas.core.indexes.api import Index
from pandas.core.series import FrameOrSeriesUnion, Series

# types of `func` kwarg for DataFrame.aggregate and Series.aggregate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, we may want to think about putting these in _typing, but separate issue / PR (you can just try it if you want), but in a follown.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh, okay! i deliberately moved it from _typing to here in commit 05921af since i thought this might be only used in this file.

I will do a PR for moving it to _typing.

@jreback jreback merged commit 0534b00 into pandas-dev:master Jul 10, 2020
@jreback
Copy link
Contributor

jreback commented Jul 10, 2020

thanks @charlesdong1991 very nice!

simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this pull request Aug 28, 2020
jbrockmendel pushed a commit that referenced this pull request Aug 30, 2020
AlexKirko pushed a commit to AlexKirko/pandas that referenced this pull request Aug 31, 2020
kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request Nov 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow Keyword Aggregation in DataFrame.agg and Series.agg
6 participants