-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Remove error raise to allow same input function on the same column in named aggregation #28428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm what happens with something like
In [6]: df = pd.DataFrame({"A": [1, 2]})
In [7]: df.groupby([0, 0]).agg(['sum', 'sum'])
In that case, I think we want to raise.
doc/source/whatsnew/v0.25.2.rst
Outdated
@@ -100,7 +100,7 @@ Other | |||
^^^^^ | |||
|
|||
- Compatibility with Python 3.8 in :meth:`DataFrame.query` (:issue:`27261`) | |||
- | |||
- Remove error raised due to duplicated input functions in named aggregation in :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`28426`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move to 1.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure! done! thanks! @TomAugspurger
this solution looks tricky though, but i tested on other cases like lambda or |
@jreback @TomAugspurger merged to master and fixed tests since this PR was quite long ago, so i hereby put little summary to the outcome of this PR, with the PR merged: df = pd.DataFrame({"A": [1, 2]})
df.groupby([0, 0]).agg(['sum', 'sum']) will still raise an error, but df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df.groupby("A").agg(a=("B", "sum"), b=("B", "sum")) will not raise because of users specifying different new column names for it |
emm, not sure what caused it. 🤔 |
@charlesdong1991 we had some CI issues a few weeks back; if you merge master and repush should be resolved |
thanks, @WillAyd merged! feel free to review and see if you think it's worth it to leave this PR open! |
doc/source/whatsnew/v1.0.0.rst
Outdated
@@ -456,6 +456,7 @@ Other | |||
- :meth:`DataFrame.to_csv` and :meth:`Series.to_csv` now support dicts as ``compression`` argument with key ``'method'`` being the compression method and others as additional compression options when the compression method is ``'zip'``. (:issue:`26023`) | |||
- Bug in :meth:`Series.diff` where a boolean series would incorrectly raise a ``TypeError`` (:issue:`17294`) | |||
- :meth:`Series.append` will no longer raise a ``TypeError`` when passed a tuple of ``Series`` (:issue:`28410`) | |||
- Remove error raised due to duplicated input functions in named aggregation in :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`28426`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move this to groupby (and the groupby one right below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your comment! moved! @jreback
can you merge master |
Hello @charlesdong1991! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2019-12-03 14:45:06 UTC |
merged! |
this looks good @charlesdong1991 can you merge master and ping on green. |
ping |
@@ -699,6 +699,8 @@ Groupby/resample/rolling | |||
- Bug in :meth:`DataFrame.groupby` not offering selection by column name when ``axis=1`` (:issue:`27614`) | |||
- Bug in :meth:`DataFrameGroupby.agg` not able to use lambda function with named aggregation (:issue:`27519`) | |||
- Bug in :meth:`DataFrame.groupby` losing column name information when grouping by a categorical column (:issue:`28787`) | |||
- Remove error raised due to duplicated input functions in named aggregation in :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`28426`) | |||
- :meth:`SeriesGroupBy.value_counts` will be able to handle the case even when the :class:`Grouper` makes empty groups (:issue: 28479) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is now a duplicated whatsnew? (2nd item), can you remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, good catch!! I forgot to remove it in the other place.
doc/source/whatsnew/v1.0.0.rst
Outdated
@@ -699,6 +699,8 @@ Groupby/resample/rolling | |||
- Bug in :meth:`DataFrame.groupby` not offering selection by column name when ``axis=1`` (:issue:`27614`) | |||
- Bug in :meth:`DataFrameGroupby.agg` not able to use lambda function with named aggregation (:issue:`27519`) | |||
- Bug in :meth:`DataFrame.groupby` losing column name information when grouping by a categorical column (:issue:`28787`) | |||
- Remove error raised due to duplicated input functions in named aggregation in :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`28426`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you expand this slightly to help a user understand what is actually changed here (previously raising in valid cases)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, expanded!
ping @jreback |
thanks @charlesdong1991 |
…mn in named aggregation (pandas-dev#28428)
…mn in named aggregation (pandas-dev#28428)
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
#27921 (comment)
this was kind of mentioned in the PR27921, and the case used in the issue is basically the same as having
min
ormax
. Previously it was not raising error is because I deliberately skipped this case since i thought this removal should be treated in different PR. Feel free to review and comment your ideas.