-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: support corr and cov functions for custom BaseIndexer rolling windows #33804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…olling operations (pandas-dev#33057)" This reverts commit 2bf54a8.
@jreback @mroeschke |
if isinstance(self.window, BaseIndexer): | ||
window = self.window | ||
else: | ||
window = self._get_window(other) if not self.is_freq_type else self.win_freq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix for corr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment here and on cov explaining what e are doing (generally not specific to BaseIndexer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback Done. Please take a look if it's what you had in mind.
if self.is_freq_type: | ||
window = self.win_freq | ||
if isinstance(self.window, BaseIndexer): | ||
window = self.window |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix for cov
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. small comment
if isinstance(self.window, BaseIndexer): | ||
window = self.window | ||
else: | ||
window = self._get_window(other) if not self.is_freq_type else self.win_freq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment here and on cov explaining what e are doing (generally not specific to BaseIndexer)
Looks like you've also committed a |
@mroeschke |
CI is being uncooperative. Probably temporary, will restart later. |
All green, please take a look at the post where I recorded the results of digging into |
thanks @AlexKirko very nice work on this series! keep em coming! |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Scope of PR
This is the final PR needed to solve #32865. It fixes
cov
andcorr
by passing the correct argument torolling
constructors if the functions are called with a customBaseIndexer
subclass. A separate test is added forcov
andcorr
as the previous tests were single-variable and wouldn't work with two columns of data. I also propose we revert #33057.Details
The reason
cov
andcorr
didn't work was that we were passing_get_window
results (usually, window width), instead of theBaseIndexer
subclass torolling
constructors called inside the function. Passingself.window
(when necessary) fixes the issue.The new test asserts only against an explicit array, because shoehorning a comparison against
np.cov
annp.corrcoef
is ugly (I've tried). Ourrolling.apply
isn't really suited to be called with bivariate statistics functions, and the necessary gymnastics didn't look good to me. I don't believe the minor benefit would be worth, but please say if you think coomparing against numpy is necessary.Thanks to @mroeschke for implementing the
NotImplemented
error-raising behavior in #33057 . Now that all the functions are fixed, I propose we revert that commit.Background on the wider issue
We currently don't support several rolling window functions when building a rolling window object using a custom class descended from
pandas.api.indexers.Baseindexer
. The implementations were written with backward-looking windows in mind, and this led to these functions breaking.Currently, using these functions returns a
NotImplemented
error thanks to #33057, but ideally we want to update the implementations, so that they will work without a performance hit. This is what I aim to do over a series of PRs.Perf notes
No changes to algorithms.