-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Rolling window with step size (GH-15354) #45765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@mroeschke, @jreback: This PR replaces #45735. The current changeset I prepared passes the original unchanged window tests, indicating that the new behavior does not break existing behaviors. After this changeset is worked out in review, the next changeset will add new tests. |
@rtpsw can you merge master and i think a few open comments. |
I reverted to the original signature of |
I added the tests. Please review. |
AFAICS, the current test failures are not related to my changes and possibly got introduced by a merge from master. How to handle them? |
) -> tuple[np.ndarray, np.ndarray]: | ||
|
||
if step is not None: | ||
raise NotImplementedError("step not implemented for variable offset window") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests hit this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and there are multiple instances of this error in this file that get tested. This is done using @step_not_implemented
(defined in pandas/util/_test_decorators.py
) on test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't these check be performed in BaseWindow._validate
? Ideally e.g. df.rolling("2s" , step=5)/df.ewm(step=5)
would raise at the constructor and not necessarily when an aggregation call is made
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i agree should move this check to be more consistent
# this cannot be done in BaseWindow._validate because there _get_window_indexer | ||
# would erroneously create a fixed window given a window argument like "1s" due | ||
# to _win_freq_i8 not being set | ||
indexer = self._get_window_indexer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about just
if self.step is not None:
raise NotImplementedError(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this would be too restrictive and would fail a numeric step that is actually valid with fixed-length window indexers.
pandas/tests/window/test_rolling.py
Outdated
{"value": np.arange(n)}, | ||
index=date_range("2015-12-24", periods=n, freq="D"), | ||
) | ||
f = lambda: df.rolling(window=window, step=step) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks this is better!
Could you test the actual constructors raising though for expanding/groupby rolling/ewm the indexers are not exposed to the user?
i.e. df.expanding(step=3)
, df.ewm(step=3)
, df.groupby(key).rolling(step3)
should raise.
Also are there tests for df.rolling.cov/corr
raising?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've uploaded tests that cover these cases.
@mroeschke: Does the latest commit address your points? |
index=date_range("2015-12-24", periods=10, freq="D"), | ||
) | ||
f = lambda: getattr(df.rolling(window=2, step=step), agg)(df) | ||
if step is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just test if step if not None
since otherwise we have pretty good testing for cov/corr
index=date_range("2015-12-24", periods=10, freq="D"), | ||
) | ||
f = lambda: func(df)(window=window, step=step) | ||
if step is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just test if step if not None since otherwise we have pretty good testing for cov/corr
timedelta(days=3), | ||
Timedelta(days=3), | ||
"3D", | ||
ExpandingIndexer(window_size=3), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExpandingIndexer
, ExponentialMovingWindowIndexer
, and GroupbyIndexer
can be removed since they are tested below
I don't think this has been addressed yet. https://github.com/pandas-dev/pandas/pull/45765/files#r809553567 The pattern in these classes is for constructor argument to be validated either in |
Note that the current test failures appear to be benign (due to |
I see. For unsupported step-argument cases, the current version raises I looked into also raising in the places you noted. AFAICS, the step argument is not available in |
Sorry, yes by constructor I was referring to |
The current version raises in
I'm assuming you want to move the last item to |
For exapanding and ewm you can to step validation in those |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would really like to simply do the validation of step in the consructor and not in the aggregation itself. it feels that you have this NotImplementedError all over the place right now.
) -> tuple[np.ndarray, np.ndarray]: | ||
|
||
if step is not None: | ||
raise NotImplementedError("step not implemented for variable offset window") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i agree should move this check to be more consistent
np.zeros(num_values, dtype=np.int64), | ||
np.arange(1, num_values + 1, dtype=np.int64), | ||
) | ||
if step is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
) -> tuple[np.ndarray, np.ndarray]: | ||
if step is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
@@ -358,6 +384,8 @@ def get_window_bounds( | |||
) | |||
start_arrays.append(window_indices.take(ensure_platform_int(start))) | |||
end_arrays.append(window_indices.take(ensure_platform_int(end))) | |||
if len(start_arrays) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blank line here
) -> tuple[np.ndarray, np.ndarray]: | ||
|
||
return np.array([0], dtype=np.int64), np.array([num_values], dtype=np.int64) | ||
if step is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
@@ -667,6 +698,9 @@ def __init__( | |||
# groupby.<agg_func>, but unexpected to users in | |||
# groupby.rolling.<agg_func> | |||
obj = obj.drop(columns=self._grouper.names, errors="ignore") | |||
# GH 15354 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment
@@ -1580,6 +1636,11 @@ def cov( | |||
ddof: int = 1, | |||
**kwargs, | |||
): | |||
if self.step is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
thanks @rtpsw very nice! let's followup as needed |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.