Skip to content

BUG: support corr and cov functions for custom BaseIndexer rolling windows #33804

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Apr 27, 2020

Conversation

AlexKirko
Copy link
Member

Scope of PR

This is the final PR needed to solve #32865. It fixes cov and corr by passing the correct argument to rolling constructors if the functions are called with a custom BaseIndexer subclass. A separate test is added for cov and corr as the previous tests were single-variable and wouldn't work with two columns of data. I also propose we revert #33057.

Details

The reason cov and corr didn't work was that we were passing _get_window results (usually, window width), instead of the BaseIndexer subclass to rolling constructors called inside the function. Passing self.window (when necessary) fixes the issue.

The new test asserts only against an explicit array, because shoehorning a comparison against np.cov an np.corrcoef is ugly (I've tried). Our rolling.apply isn't really suited to be called with bivariate statistics functions, and the necessary gymnastics didn't look good to me. I don't believe the minor benefit would be worth, but please say if you think coomparing against numpy is necessary.

Thanks to @mroeschke for implementing the NotImplemented error-raising behavior in #33057 . Now that all the functions are fixed, I propose we revert that commit.

Background on the wider issue

We currently don't support several rolling window functions when building a rolling window object using a custom class descended from pandas.api.indexers.Baseindexer. The implementations were written with backward-looking windows in mind, and this led to these functions breaking.
Currently, using these functions returns a NotImplemented error thanks to #33057, but ideally we want to update the implementations, so that they will work without a performance hit. This is what I aim to do over a series of PRs.

Perf notes

No changes to algorithms.

@AlexKirko
Copy link
Member Author

@jreback @mroeschke
Final PR to solve #32865. All green, details above, please tell me what you think.

@AlexKirko AlexKirko changed the title Rolling corr cov BUG: support corr and cov functions for custom BaseIndexer rolling windows Apr 26, 2020
if isinstance(self.window, BaseIndexer):
window = self.window
else:
window = self._get_window(other) if not self.is_freq_type else self.win_freq
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix for corr.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment here and on cov explaining what e are doing (generally not specific to BaseIndexer)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback Done. Please take a look if it's what you had in mind.

if self.is_freq_type:
window = self.win_freq
if isinstance(self.window, BaseIndexer):
window = self.window
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix for cov.

@jreback jreback added the Window rolling, ewma, expanding label Apr 26, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. small comment

if isinstance(self.window, BaseIndexer):
window = self.window
else:
window = self._get_window(other) if not self.is_freq_type else self.win_freq
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment here and on cov explaining what e are doing (generally not specific to BaseIndexer)

@jreback jreback added this to the 1.1 milestone Apr 26, 2020
@mroeschke
Copy link
Member

Looks like you've also committed a doc/example.feather. Could you remove that?

@AlexKirko
Copy link
Member Author

AlexKirko commented Apr 27, 2020

Looks like you've also committed a doc/example.feather. Could you remove that?

@mroeschke
Thanks for spotting that! Have no idea how it snuck in.

@AlexKirko
Copy link
Member Author

CI is being uncooperative. Probably temporary, will restart later.

@AlexKirko AlexKirko requested review from mroeschke and jreback April 27, 2020 11:53
@AlexKirko
Copy link
Member Author

All green, please take a look at the post where I recorded the results of digging into _get_window.

@jreback jreback merged commit 63651f3 into pandas-dev:master Apr 27, 2020
@jreback
Copy link
Contributor

jreback commented Apr 27, 2020

thanks @AlexKirko

very nice work on this series! keep em coming!

rhshadrach pushed a commit to rhshadrach/pandas that referenced this pull request May 10, 2020
@AlexKirko AlexKirko deleted the rolling-corr-cov branch July 9, 2020 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Window rolling, ewma, expanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: rolling window functions don't support custom indexers
3 participants