-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Ensure same index is returned for slow and fast path in groupby.apply #31613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7169acf
to
ffbcbe3
Compare
8838d36
to
ad1ec2f
Compare
pandas/tests/groupby/test_apply.py
Outdated
@@ -851,3 +851,39 @@ def test_apply_function_returns_non_pandas_non_scalar(function, expected_values) | |||
result = df.groupby("groups").apply(function) | |||
expected = pd.Series(expected_values, index=pd.Index(["A", "B"], name="groups")) | |||
tm.assert_series_equal(result, expected) | |||
|
|||
|
|||
def test_apply_fast_slow_identical(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see if we have similar fast/slow tests; place these by them (if not ok).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few tests further up the file wich affect similar code paths, e.g. test_fast_apply
or test_group_apply_once_per_group
which seems a suitable place
are any of the other issues you referenced closed by this? (aside from the ones you listed) |
is this a regression from 0.25.x? |
The issues I referenced in the issue are already closed. I primarily referenced them to highlight this as a hotspot.
I believe this is not a regression and has been around for a while now. I'll check anyhow to confirm.
|
If this is not a regression, let's keep it for 1.1 then, I would say |
Agreed. |
also pls merge master |
I've found that if I disable fast_apply, then there is a test failure in tests.groupby.test_whitelist for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks ok, but have comments
can you merge master and ping on green and we can look |
sorry, was a bit swamped. Will rebase and address the comments |
ad1ec2f
to
ec34b4a
Compare
2187f4d
to
0f6f8a2
Compare
@jreback I rebased but it seems there is a (probably unrelated) test failure on the azure pipelines with |
It is unrelated, but not yet trouble-shot. |
0f6f8a2
to
8503c04
Compare
I rebased on master and kept the changelog for 1.1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, if you can update the whatsnew note a bit (ok to make a bit longer; re-reading this is not that common i think), but still want to signal the intentions to the user a bit more.
doc/source/whatsnew/v1.1.0.rst
Outdated
@@ -827,6 +827,7 @@ Groupby/resample/rolling | |||
- Bug in :meth:`GroupBy.agg`, :meth:`GroupBy.transform`, and :meth:`GroupBy.resample` where subclasses are not preserved (:issue:`28330`) | |||
- Bug in :meth:`GroupBy.rolling.apply` ignores args and kwargs parameters (:issue:`33433`) | |||
- Bug in :meth:`DataFrameGroupby.std` and :meth:`DataFrameGroupby.sem` would modify grouped-by columns when ``as_index=False`` (:issue:`10355`) | |||
- Bug in :meth:`core.groupby.DataFrameGroupBy.apply` where the result shape was incorrect when the output index was not identical to the input index (:issue:`31612`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make this more user informative, e.g. what case does this happen
doc/source/whatsnew/v1.1.0.rst
Outdated
@@ -827,6 +827,7 @@ Groupby/resample/rolling | |||
- Bug in :meth:`GroupBy.agg`, :meth:`GroupBy.transform`, and :meth:`GroupBy.resample` where subclasses are not preserved (:issue:`28330`) | |||
- Bug in :meth:`GroupBy.rolling.apply` ignores args and kwargs parameters (:issue:`33433`) | |||
- Bug in :meth:`DataFrameGroupby.std` and :meth:`DataFrameGroupby.sem` would modify grouped-by columns when ``as_index=False`` (:issue:`10355`) | |||
- Bug in :meth:`core.groupby.DataFrameGroupBy.apply` where the result shape was incorrect when the output index was not identical to the input index (:issue:`31612`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also list the other closed issue here #14927
ping on green. |
Trying to rewrite the changelog. pretty difficult to actually list all tickets affected by this since most users don't go into depths that far. |
what you wrote is ok, the idea is really to just list the affected tickets so that users can click thru if desired. when we close an issue we want it referenced in the whatsnew and a test to confirm :-> |
thanks @fjetter very nice! |
This fixes the internal check to be consistent with the slow apply path
closes API: inconsistent return format of groupby apply #13056
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff