Skip to content

BUG: support count function for custom BaseIndexer rolling windows #33605

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 17, 2020

Conversation

AlexKirko
Copy link
Member

Scope of PR

This PR makes sure that when we call count with a BaseIndexer subclass, custom start and end arrays get calculated, and the algorithm in the aggregations.pyx gets called instead of the _Window.create_blocks.

Turns out that we were sending the calculation into aggregations.pyx only for frequency-based windows, and custom BaseIndexer subclasses got ignored. Fixed it and cleaned the code up a bit.

Background on the wider issue

We currently don't support several rolling window functions when building a rolling window object using a custom class descended from pandas.api.indexers.Baseindexer. The implementations were written with backward-looking windows in mind, and this led to these functions breaking.
Currently, using these functions returns a NotImplemented error thanks to #33057, but ideally we want to update the implementations, so that they will work without a performance hit. This is what I aim to do over a series of PRs.

Perf notes

No changes to the algorithms necessary.

if self.is_freq_type:
# GH 32865. Also use custom rolling windows calculation
# when using a BaseIndexer subclass
if self.is_freq_type or isinstance(self.window, BaseIndexer):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the crux of the issue. Because we didn't check for BaseIndexer subclass here, the algorithm went down the default path, and the BaseIndexer implementation got completely ignored as blocks were created using _Window.create_blocks.

@@ -1171,8 +1171,6 @@ class _Rolling_and_Expanding(_Rolling):
)

def count(self):
if isinstance(self.window, BaseIndexer):
validate_baseindexer_support("count")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Custom BaseIndexer implementations should never go into this function. Instead, _get_window_indexer gets called, and there is a validate_baseindexer_support there (checked that it works by temporarily blacklisting count again)

Removing this check as unnecessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u add an assertion here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@AlexKirko
Copy link
Member Author

@mroeschke @jreback
This PR implements count support for custom rolling windows.

All green, details above.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment

@AlexKirko
Copy link
Member Author

@jreback
All green, please take a look if it's what you had in mind.

@AlexKirko AlexKirko requested a review from jreback April 17, 2020 17:08
@jreback jreback added the Window rolling, ewma, expanding label Apr 17, 2020
@jreback jreback added this to the 1.1 milestone Apr 17, 2020
@jreback jreback merged commit 299e27d into pandas-dev:master Apr 17, 2020
@jreback
Copy link
Contributor

jreback commented Apr 17, 2020

thanks @AlexKirko

@AlexKirko AlexKirko deleted the rolling-count branch April 18, 2020 05:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Window rolling, ewma, expanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants