Skip to content

BUG: rolling aggregate() with a list of functions along axis 1 raises ValueError #46132 #46892

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 17 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.5.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -862,6 +862,7 @@ Groupby/resample/rolling
- Bug in :meth:`Rolling.skew` and :meth:`Rolling.kurt` would give NaN with window of same values (:issue:`30993`)
- Bug in :meth:`.Rolling.var` would segfault calculating weighted variance when window size was larger than data size (:issue:`46760`)
- Bug in :meth:`Grouper.__repr__` where ``dropna`` was not included. Now it is (:issue:`46754`)
- Bug in :meth:`Rolling.aggregate` raise ValueError when ``func`` was a list of functions and ``axis=1`` (:issue:`46132`)
- Bug in :meth:`DataFrame.rolling` gives ValueError when center=True, axis=1 and win_type is specified (:issue:`46135`)
- Bug in :meth:`.DataFrameGroupBy.describe` and :meth:`.SeriesGroupBy.describe` produces inconsistent results for empty datasets (:issue:`41575`)
- Bug in :meth:`DataFrame.resample` reduction methods when used with ``on`` would attempt to aggregate the provided column (:issue:`47079`)
Expand Down
21 changes: 20 additions & 1 deletion pandas/core/window/rolling.py
Original file line number Diff line number Diff line change
Expand Up @@ -656,9 +656,28 @@ def _numba_apply(
return self._resolve_output(out, obj)

def aggregate(self, func, *args, **kwargs):
result = ResamplerWindowApply(self, func, args=args, kwargs=kwargs).agg()
# GH46132
# modifying axis and transposing dataframe should not be needed
# once ReamplerWindow supports axis = 1

obj = self.obj
axis = self.axis

self.obj = self.obj.T if self.axis == 1 else self.obj
self.axis = 0

try:
result = ResamplerWindowApply(self, func, args=args, kwargs=kwargs).agg()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just add an axis argument here and thread it thru. this state change is not at all obvious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just would like to make sure if you want me to make this comment more clear? And I notice that in PR #47078, it says rolling window doesn't support axis = 1 anymore and that caused this change didn't pass the test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am not sure what you mean. I don't really like this approach because its hiding the axis state. What I am suggesting is we actually fix this by passing thru the axis argument to do this properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the agg method of ResamplerWindowApply hasn't been implemented, yet, and it will call the agg method of Apply class which can't recognize that we want to agg on axis=1 instead of axis=0. Even though I am not entirely familiar with pandas, I think passing the axis argument could not help.

finally:
self.obj = obj
self.axis = axis

if axis == 1:
result = result.T if result is not None else result

if result is None:
return self.apply(func, raw=False, args=args, kwargs=kwargs)

return result

agg = aggregate
Expand Down
34 changes: 34 additions & 0 deletions pandas/tests/window/test_rolling.py
Original file line number Diff line number Diff line change
Expand Up @@ -1851,6 +1851,22 @@ def test_rolling_var_same_value_count_logic(values, window, min_periods, expecte
tm.assert_series_equal(expected == 0, result_std == 0)


def test_rolling_agg_on_columns():
# GH 46132

df = DataFrame({"a": [1, 3], "b": [2, 4]})
res = df.rolling(window=2, axis=1, min_periods=1).aggregate([np.sum, np.mean])
expected_val = np.array([[1, 3.0], [1, 1.5], [3, 7], [3, 3.5]])

expected_index = MultiIndex.from_tuples(
[(0, "sum"), (0, "mean"), (1, "sum"), (1, "mean")]
)

expected_frame = DataFrame(expected_val, index=expected_index, columns=["a", "b"])

tm.assert_frame_equal(expected_frame, res)


def test_rolling_mean_sum_floating_artifacts():
# GH 42064.

Expand All @@ -1862,6 +1878,24 @@ def test_rolling_mean_sum_floating_artifacts():
assert (result[-3:] == 0).all()


def test_rolling_agg_when_agg_fail():
# GH 46132
df = DataFrame({"a": [1, 3], "b": [2, 4]})
win = df.rolling(window=2, axis=1, min_periods=1)
try:
win.aggregate([np.log, np.sqrt])
except ValueError:
pass
tm.assert_frame_equal(win.obj, df)
# make sure if aggregate fails the attribute of Rolling/Window will not be changed
assert win.axis == 1
# make sure the attribute of Rolling/Window will not be changed
# when aggregate runs successfully
win.aggregate([np.mean, np.sum])
tm.assert_frame_equal(win.obj, df)
assert win.axis == 1


def test_rolling_skew_kurt_floating_artifacts():
# GH 42064 46431

Expand Down