BUG: RollingGroupby ignored as_index=False #40789
mroeschke commented Apr 5, 2021
- closes BUG: groupby.rolling: Originial index is not being preserved when using date_part of DatetimeIndex and as_index key word seems to have no effect #39433
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
Does this fix #31007? I haven't looked closely, but I've seen this in the past.
@phofl unfortunately it doesn't look like it
Pity, thanks for checking
@@ -619,6 +621,8 @@ def _apply(
            )

        result.index = result_index
        if not self._as_index:
            result = result.reset_index(level=list(range(len(groupby_keys))))
What happens here when the groupby is on an explicit list, e.g. in your test use groupby(["A", "A", "B", "B"]) instead? What is groupby_keys in this case?
Here's the result
In [4]: df.groupby(["A", "A", "B", "B"], as_index=False).rolling(window=2, min_periods=1).mean()
> /Users/matthewroeschke/pandas-mroeschke/pandas/core/window/rolling.py(580)_apply()
-> result_index_names = groupby_keys + grouped_index_name
(Pdb) groupby_keys
[None]
(Pdb) c
Out[4]:
           level_0    num
date
2018-01-01       A  100.0
2018-01-02       A  150.0
2018-01-01       B  150.0
2018-01-02       B  200.0
In [5]: df.groupby(["A", "A", "B", "B"], as_index=False).mean()
Out[5]:
     num
0  150.0
1  200.0
Not sure if the normal groupby has the expected result, but it appears that groupby.rolling brings the list into the dataframe as a column (level_0 above, since groupby_keys is [None]).
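To illustrate where the `level_0` column comes from, here is a minimal sketch of the `reset_index` call from the diff, applied to hand-built frames rather than to a real rolling result. The helper name `drop_group_levels`, the frames, and the key name `id` are assumptions made for the example; only the `reset_index(level=list(range(len(groupby_keys))))` expression is taken from the PR.

```python
import pandas as pd


def drop_group_levels(result, groupby_keys):
    """Hypothetical helper: move the leading group-key index levels into columns,
    using the same expression as the line added in this PR."""
    return result.reset_index(level=list(range(len(groupby_keys))))


dates = ["2018-01-01", "2018-01-02", "2018-01-01", "2018-01-02"]
values = {"num": [100.0, 150.0, 150.0, 200.0]}

# Group key level has a name, as when grouping by a named column or Series.
named = pd.DataFrame(
    values,
    index=pd.MultiIndex.from_arrays([list("AABB"), dates], names=["id", "date"]),
)

# Group key level has no name, as when grouping by a plain list (groupby_keys == [None]).
unnamed = pd.DataFrame(
    values,
    index=pd.MultiIndex.from_arrays([list("AABB"), dates], names=[None, "date"]),
)

named_flat = drop_group_levels(named, ["id"])
unnamed_flat = drop_group_levels(unnamed, [None])

print(named_flat)
print(unnamed_flat)
```

With a named level the key becomes a column of that name; with an unnamed level `reset_index` falls back to a positional name, which is why the list-based groupby shows up as `level_0`.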
pandas/tests/window/test_groupby.py
Outdated
@@ -732,6 +732,42 @@ def test_groupby_level(self):
        )
        tm.assert_series_equal(result, expected)

    def test_as_index_false(self):
can you add the multi-key groupby here as well (parameterize if you can)
df_res2 = df.groupby([df.id, df.index.weekday], as_index=False).rolling(window=2, min_periods=1).mean()
df_res3 = df.groupby([df.id]).rolling(window=2, min_periods=1).mean()
df_res4 = df.groupby([df.id], as_index=False).rolling(window=2, min_periods=1).mean()
e.g. 2 & 4 (we likely have 1 & 3 covered, but I wouldn't object to those being included as well)
Hmm, all of these are tested (except the as_index=True cases, which are tested everywhere else). So you just want it parameterized?
if they are tested elsewhere then don't add them (just the cases covered by this PR are fine). parameterize if you can.
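As a rough sketch of what checking both `as_index` values over one fixture could look like, the loop below stands in for `pytest.mark.parametrize` so it runs without pytest. The frame, the column name `id`, and the helper `rolling_mean_by_id` are assumptions for illustration, not the test added in this PR, and the described behavior assumes a pandas version that includes this fix.

```python
import pandas as pd

# Hypothetical fixture: two ids, two dates each.
df = pd.DataFrame(
    {"id": ["A", "A", "B", "B"], "num": [100.0, 200.0, 150.0, 250.0]},
    index=pd.DatetimeIndex(
        ["2018-01-01", "2018-01-02", "2018-01-01", "2018-01-02"], name="date"
    ),
)


def rolling_mean_by_id(frame, as_index):
    """Hypothetical helper: rolling mean of each id group with a window of 2."""
    return (
        frame.groupby("id", as_index=as_index)
        .rolling(window=2, min_periods=1)
        .mean()
    )


# Plain-loop stand-in for parametrizing over as_index.
results = {as_index: rolling_mean_by_id(df, as_index) for as_index in (True, False)}

for as_index, result in results.items():
    print(f"as_index={as_index}")
    print(result)
```

With `as_index=True` the group key stays in the index next to `date`; with `as_index=False` it should come back as a regular column, leaving `date` as the only index level.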
thanks @mroeschke