Skip to content

BUG: Fix BaseWindowGroupby.aggregate where as_index is ignored #54973

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.2.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,7 @@ Performance improvements
Bug fixes
~~~~~~~~~
- Bug in :class:`AbstractHolidayCalendar` where timezone data was not propagated when computing holiday observances (:issue:`54580`)
- Bug in :meth:`BaseWindowGroupby.aggregate` where ``_as_index`` attribute is ignored (:issue:`31007`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you write this in terms of public APIs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this now


Categorical
^^^^^^^^^^^
Expand Down
6 changes: 6 additions & 0 deletions pandas/core/window/ewm.py
Original file line number Diff line number Diff line change
Expand Up @@ -919,6 +919,12 @@ def _get_window_indexer(self) -> GroupbyIndexer:
)
return window_indexer

@doc(_shared_docs["aggregate"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this and below be inherited from the base class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. This was an ugly workaround to a mypy error I was getting but using type: ignore[misc] instead now

def aggregate(self, func, *args, **kwargs):
return super().aggregate(func, *args, **kwargs)

agg = aggregate


class OnlineExponentialMovingWindow(ExponentialMovingWindow):
def __init__(
Expand Down
6 changes: 6 additions & 0 deletions pandas/core/window/expanding.py
Original file line number Diff line number Diff line change
Expand Up @@ -962,3 +962,9 @@ def _get_window_indexer(self) -> GroupbyIndexer:
window_indexer=ExpandingIndexer,
)
return window_indexer

@doc(_shared_docs["aggregate"])
def aggregate(self, func, *args, **kwargs):
return super().aggregate(func, *args, **kwargs)

agg = aggregate
14 changes: 14 additions & 0 deletions pandas/core/window/rolling.py
Original file line number Diff line number Diff line change
Expand Up @@ -875,6 +875,14 @@ def _gotitem(self, key, ndim, subset=None):
subset = self.obj.set_index(self._on)
return super()._gotitem(key, ndim, subset=subset)

def aggregate(self, func, *args, **kwargs):
result = super().aggregate(func, *args, **kwargs)
if not self._as_index:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should live in BaseWindowGroupby so added this aggregate function. After adding, mypy complained about incompatibility of base classes for classes inheriting BaseWindowGroupby. Adding aggregate function to those solves the issue happy to extend their docs if required or implement another workaround if preferred over this

result = result.reset_index(level=list(range(len(self._grouper.names))))
return result

agg = aggregate


class Window(BaseWindow):
"""
Expand Down Expand Up @@ -2911,3 +2919,9 @@ def _validate_datetimelike_monotonic(self):
f"Each group within {on} must be monotonic. "
f"Sort the values in {on} first."
)

@doc(_shared_docs["aggregate"])
def aggregate(self, func, *args, **kwargs):
return super().aggregate(func, *args, **kwargs)

agg = aggregate
22 changes: 16 additions & 6 deletions pandas/tests/window/test_groupby.py
Original file line number Diff line number Diff line change
Expand Up @@ -846,10 +846,20 @@ def test_groupby_level(self):
tm.assert_series_equal(result, expected)

@pytest.mark.parametrize(
"by, expected_data",
"func, by, expected_data",
[
[["id"], {"num": [100.0, 150.0, 150.0, 200.0]}],
[
lambda x: x.agg({"num": "mean"}),
["id"],
{"num": [100.0, 150.0, 150.0, 200.0]},
],
[
lambda x: x.mean(),
["id"],
{"num": [100.0, 150.0, 150.0, 200.0]},
],
[
lambda x: x.mean(),
["id", "index"],
{
"date": [
Expand All @@ -863,8 +873,8 @@ def test_groupby_level(self):
],
],
)
def test_as_index_false(self, by, expected_data):
# GH 39433
def test_as_index_false(self, func, by, expected_data):
# GH 39433, 31007
data = [
["A", "2018-01-01", 100.0],
["A", "2018-01-02", 200.0],
Expand All @@ -876,8 +886,8 @@ def test_as_index_false(self, by, expected_data):
df = df.set_index(["date"])

gp_by = [getattr(df, attr) for attr in by]
result = (
df.groupby(gp_by, as_index=False).rolling(window=2, min_periods=1).mean()
result = func(
df.groupby(gp_by, as_index=False).rolling(window=2, min_periods=1)
)

expected = {"id": ["A", "A", "B", "B"]}
Expand Down