-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: provide standard BaseIndexers in pandas.api.indexers #33236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
looks good, if you can rebase and make the change to use the FixedForward indexer. Also I think worthile adding a benchmark. |
@TomAugspurger @jorisvandenbossche @jbrockmendel if any comments. |
6d4bb3f
to
4e5aa02
Compare
Travis failing has nothing to do with this PR. See #33280 |
Could you document this new indexer around here: https://pandas.pydata.org/pandas-docs/stable/user_guide/computation.html#custom-window-rolling |
Found a bug in the benchmark, needs a bit more work. UPDATE: fixed it and tested locally, the new benchmark is working now. |
@mroeschke Added documentation to the custom window rolling subsection. Can introduce a new subsection, but I think it fits in. Please tell me what you think. |
@jreback |
LGTM |
For some problems knowledge of the future is available for analysis. For example, this occurs when | ||
each data point is a full time series read from an experiment, and the task is to extract underlying | ||
conditions. In these cases it can be useful to perform forward-looking rolling window computations. | ||
``FixedForwardWindowIndexer`` class is available for this purpose. This ``BaseIndexer`` subclass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use the ref here, e.g. the one you are using in the whatsnew
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback I've included a ref in whatsnew, pointing to this subsection of the user guide. If you meant something different, please tell me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no I mean a ref on FixedForwardIndexer here that points to the api (makes a clickable link), and one for BaseIndexer
``FixedForwardWindowIndexer`` class is available for this purpose. This ``BaseIndexer`` subclass | ||
implements a closed fixed-width forward-looking rolling window, and we can use it as follows: | ||
|
||
.. code-block:: ipython |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use an ipython block here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback
Sorry, could you clarify? To me, my ipython block looks right, both in raw and in the docs I compiled locally. Do I need to do something differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't use code-blocks on executable code. pls use an ipython block so that the doc-build executes this, that way it will never get out of date.
pandas/core/window/indexers.py
Outdated
|
||
|
||
class FixedForwardWindowIndexer(BaseIndexer): | ||
"""Calculate fixed forward-looking rolling window bounds,""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add Examples section here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
def get_window_bounds( | ||
self, | ||
num_values: int = 0, | ||
min_periods: Optional[int] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should raise if min_periods, center, closed are not None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented it a bit differently, pinged you in the comment on the implementation.
Looks like the doc build fail is being handled in #33309 |
if closed is not None: | ||
raise ValueError( | ||
"Forward-looking windows don't support setting the closed argument" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback
Implemented the error-raising behavior this way. Reasons below.
As far as I've been able to test, the arguments are being set by the rolling
object in a general way. They are passed whether a particular BaseIndexer
implementation needs them or not.
min_periods
is getting set to rolling.min_periods
. Same stuff for center
: in my tests it got set to False
by default. Only closed
was set as None
.
I think we should raise if something that doesn't make sense got passed, but testing for None
is too crude for our purposes.
@jreback |
For some problems knowledge of the future is available for analysis. For example, this occurs when | ||
each data point is a full time series read from an experiment, and the task is to extract underlying | ||
conditions. In these cases it can be useful to perform forward-looking rolling window computations. | ||
``FixedForwardWindowIndexer`` class is available for this purpose. This ``BaseIndexer`` subclass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no I mean a ref on FixedForwardIndexer here that points to the api (makes a clickable link), and one for BaseIndexer
``FixedForwardWindowIndexer`` class is available for this purpose. This ``BaseIndexer`` subclass | ||
implements a closed fixed-width forward-looking rolling window, and we can use it as follows: | ||
|
||
.. code-block:: ipython |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't use code-blocks on executable code. pls use an ipython block so that the doc-build executes this, that way it will never get out of date.
if center: | ||
raise ValueError("Forward-looking windows can't have center=True") | ||
if closed is not None: | ||
raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a test that hits these
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Added to the test.
@jreback
|
doc/source/reference/window.rst
Outdated
@@ -85,3 +85,14 @@ Base class for defining custom window boundaries. | |||
:toctree: api/ | |||
|
|||
api.indexers.BaseIndexer | |||
|
|||
Forward-looking fixed window indexer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to add a whole new section for this. Just place api.indexers.FixedForwardWindowIndexer
right below api.indexers.BaseIndexer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Looks better now, thanks. Sorry, I'm still learning how our docs are built.
thanks @AlexKirko very nice! |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
This PR adds a forward-looking
pandas.api.indexers.BaseIndexer
subclass implementation and exposes it to users as discussed in #33180 .Exposing other indexers
It is also possible to expose a backward-looking indexer simply by exposing
pandas.core.window.indexers.FixedWindowIndexer
through the API. Seems redundant to me as this is basically the same as supplying an integer as the window argument when creating a rolling object but it can be done for consistency. I'd be grateful for a discussion.Note on tests
The class is incorporated into
test_rolling_forward_window
intest_base_indexer.py
and this serves as a test that the new indexer is working correctly.Notes on performance
I've tested the implementation and it shows comparable performance to
pandas.core.window.indexers.FixedWindowIndexer
. In my tests the new indexer was slightly faster, either by luck, or because I streamlined the backward-looking indexer a bit when making the forward-looking one.