Skip to content

BUG: Remove error raise to allow same input function on the same column in named aggregation #28428

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 36 commits into from
Dec 8, 2019

Conversation

charlesdong1991
Copy link
Member

@charlesdong1991 charlesdong1991 commented Sep 13, 2019

#27921 (comment)

this was kind of mentioned in the PR27921, and the case used in the issue is basically the same as having min or max. Previously it was not raising error is because I deliberately skipped this case since i thought this removal should be treated in different PR. Feel free to review and comment your ideas.

@charlesdong1991 charlesdong1991 changed the title BUG: Remove error raise to allow lambdas on the same column in named aggregation BUG: Remove error raise to allow same input function on the same column in named aggregation Sep 13, 2019
Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm what happens with something like

In [6]: df = pd.DataFrame({"A": [1, 2]})

In [7]: df.groupby([0, 0]).agg(['sum', 'sum'])

In that case, I think we want to raise.

@@ -100,7 +100,7 @@ Other
^^^^^

- Compatibility with Python 3.8 in :meth:`DataFrame.query` (:issue:`27261`)
-
- Remove error raised due to duplicated input functions in named aggregation in :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`28426`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to 1.0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure! done! thanks! @TomAugspurger

@charlesdong1991
Copy link
Member Author

this solution looks tricky though, but i tested on other cases like lambda or min, 'max` stuff, they are all working except the partials. So I made this change (although looks like two different issues) @TomAugspurger

@charlesdong1991
Copy link
Member Author

@jreback @TomAugspurger merged to master and fixed tests

since this PR was quite long ago, so i hereby put little summary to the outcome of this PR, with the PR merged:

df = pd.DataFrame({"A": [1, 2]})
df.groupby([0, 0]).agg(['sum', 'sum'])

will still raise an error, but

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df.groupby("A").agg(a=("B", "sum"), b=("B", "sum"))

will not raise because of users specifying different new column names for it

@charlesdong1991
Copy link
Member Author

emm, not sure what caused it. 🤔

@WillAyd
Copy link
Member

WillAyd commented Nov 7, 2019

@charlesdong1991 we had some CI issues a few weeks back; if you merge master and repush should be resolved

@charlesdong1991
Copy link
Member Author

thanks, @WillAyd merged! feel free to review and see if you think it's worth it to leave this PR open!

@@ -456,6 +456,7 @@ Other
- :meth:`DataFrame.to_csv` and :meth:`Series.to_csv` now support dicts as ``compression`` argument with key ``'method'`` being the compression method and others as additional compression options when the compression method is ``'zip'``. (:issue:`26023`)
- Bug in :meth:`Series.diff` where a boolean series would incorrectly raise a ``TypeError`` (:issue:`17294`)
- :meth:`Series.append` will no longer raise a ``TypeError`` when passed a tuple of ``Series`` (:issue:`28410`)
- Remove error raised due to duplicated input functions in named aggregation in :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`28426`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this to groupby (and the groupby one right below).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your comment! moved! @jreback

@jreback
Copy link
Contributor

jreback commented Nov 20, 2019

can you merge master

@pep8speaks
Copy link

pep8speaks commented Nov 20, 2019

Hello @charlesdong1991! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-12-03 14:45:06 UTC

@charlesdong1991
Copy link
Member Author

merged!
could you pls take a look? @jreback

@jreback jreback added this to the 1.0 milestone Dec 2, 2019
@jreback
Copy link
Contributor

jreback commented Dec 2, 2019

this looks good @charlesdong1991 can you merge master and ping on green.

@charlesdong1991
Copy link
Member Author

ping

@jreback

@@ -699,6 +699,8 @@ Groupby/resample/rolling
- Bug in :meth:`DataFrame.groupby` not offering selection by column name when ``axis=1`` (:issue:`27614`)
- Bug in :meth:`DataFrameGroupby.agg` not able to use lambda function with named aggregation (:issue:`27519`)
- Bug in :meth:`DataFrame.groupby` losing column name information when grouping by a categorical column (:issue:`28787`)
- Remove error raised due to duplicated input functions in named aggregation in :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`28426`)
- :meth:`SeriesGroupBy.value_counts` will be able to handle the case even when the :class:`Grouper` makes empty groups (:issue: 28479)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is now a duplicated whatsnew? (2nd item), can you remove

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, good catch!! I forgot to remove it in the other place.

@@ -699,6 +699,8 @@ Groupby/resample/rolling
- Bug in :meth:`DataFrame.groupby` not offering selection by column name when ``axis=1`` (:issue:`27614`)
- Bug in :meth:`DataFrameGroupby.agg` not able to use lambda function with named aggregation (:issue:`27519`)
- Bug in :meth:`DataFrame.groupby` losing column name information when grouping by a categorical column (:issue:`28787`)
- Remove error raised due to duplicated input functions in named aggregation in :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`28426`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you expand this slightly to help a user understand what is actually changed here (previously raising in valid cases)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, expanded!

@charlesdong1991
Copy link
Member Author

ping @jreback

@jreback jreback merged commit facd756 into pandas-dev:master Dec 8, 2019
@jreback
Copy link
Contributor

jreback commented Dec 8, 2019

thanks @charlesdong1991

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't use same function twice with named aggregation
6 participants