-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: support skew function for custom BaseIndexer rolling windows #33745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Clarify that any method that has Sample before its name needs to be used carefully with windows as this is almost never what the user wants.
Also fix the value passed to calculate_min_periods for custom BaseIndexer implementations.
@jreback @mroeschke Fixed those and edited the docs to be more clear. Details in the description above. |
Did something happen? Why are we running CI only on macOS? Can't find the issue that led to this . . . |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@jreback Plenty horrible stuff gets into software because 'it should have worked'. |
k can you merge master again CI should run |
@jreback |
thanks @AlexKirko |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Scope of PR
This PR does a couple things to fix the performance of
skew
, and this also fixes the behavior of all the functions that haverequire_min_periods
for small values ofmin_periods
:_apply
that made us never go into theBaseIndexer
control flow branch_apply
: passwindow_indexer.window_size
intocalc_min_periods
instead ofmin_periods or 1
. We want the custom indexer window size there, so it's not clear to me why we were passingmin_periods or 1
.Details
The algorithm itself is robust, but it defaults to sample skewness, which is why there was a difference between its output and
numpy
. To prevent misunderstandings, I clarified the docs a bit.We were also passing a wrong value to
calc_min_periods
, and we weren't going into the proper if branch, because we were checking the type ofwindow
instead ofself.window
.Background on the wider issue
We currently don't support several rolling window functions when building a rolling window object using a custom class descended from
pandas.api.indexers.Baseindexer
. The implementations were written with backward-looking windows in mind, and this led to these functions breaking.Currently, using these functions returns a
NotImplemented
error thanks to #33057, but ideally we want to update the implementations, so that they will work without a performance hit. This is what I aim to do over a series of PRs.Perf notes
No changes to algorithms.