Skip to content

ENH: provide standard BaseIndexers in pandas.api.indexers #33236

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 31 commits into from
Apr 8, 2020

Conversation

AlexKirko
Copy link
Member

@AlexKirko AlexKirko commented Apr 2, 2020

This PR adds a forward-looking pandas.api.indexers.BaseIndexer subclass implementation and exposes it to users as discussed in #33180 .

Exposing other indexers

It is also possible to expose a backward-looking indexer simply by exposing pandas.core.window.indexers.FixedWindowIndexer through the API. Seems redundant to me as this is basically the same as supplying an integer as the window argument when creating a rolling object but it can be done for consistency. I'd be grateful for a discussion.

Note on tests

The class is incorporated into test_rolling_forward_window in test_base_indexer.py and this serves as a test that the new indexer is working correctly.

Notes on performance

I've tested the implementation and it shows comparable performance to pandas.core.window.indexers.FixedWindowIndexer. In my tests the new indexer was slightly faster, either by luck, or because I streamlined the backward-looking indexer a bit when making the forward-looking one.

@WillAyd
Copy link
Member

WillAyd commented Apr 2, 2020

/azp run

@azure-pipelines
Copy link
Contributor

Azure Pipelines successfully started running 1 pipeline(s).

@jreback jreback added Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Apr 3, 2020
@jreback jreback added this to the 1.1 milestone Apr 3, 2020
@jreback
Copy link
Contributor

jreback commented Apr 3, 2020

looks good, if you can rebase and make the change to use the FixedForward indexer. Also I think worthile adding a benchmark.

@jreback
Copy link
Contributor

jreback commented Apr 3, 2020

@TomAugspurger @jorisvandenbossche @jbrockmendel if any comments.

@AlexKirko
Copy link
Member Author

Travis failing has nothing to do with this PR. See #33280
Waiting for merge.

@mroeschke
Copy link
Member

Could you document this new indexer around here: https://pandas.pydata.org/pandas-docs/stable/user_guide/computation.html#custom-window-rolling

@AlexKirko
Copy link
Member Author

AlexKirko commented Apr 4, 2020

Found a bug in the benchmark, needs a bit more work.

UPDATE: fixed it and tested locally, the new benchmark is working now.

@AlexKirko
Copy link
Member Author

@mroeschke Added documentation to the custom window rolling subsection. Can introduce a new subsection, but I think it fits in. Please tell me what you think.

@AlexKirko
Copy link
Member Author

@jreback
Rebased, changed the test, added a benchmark to the suite. All green.

@jbrockmendel
Copy link
Member

LGTM

For some problems knowledge of the future is available for analysis. For example, this occurs when
each data point is a full time series read from an experiment, and the task is to extract underlying
conditions. In these cases it can be useful to perform forward-looking rolling window computations.
``FixedForwardWindowIndexer`` class is available for this purpose. This ``BaseIndexer`` subclass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use the ref here, e.g. the one you are using in the whatsnew

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback I've included a ref in whatsnew, pointing to this subsection of the user guide. If you meant something different, please tell me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no I mean a ref on FixedForwardIndexer here that points to the api (makes a clickable link), and one for BaseIndexer

``FixedForwardWindowIndexer`` class is available for this purpose. This ``BaseIndexer`` subclass
implements a closed fixed-width forward-looking rolling window, and we can use it as follows:

.. code-block:: ipython
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use an ipython block here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback
Sorry, could you clarify? To me, my ipython block looks right, both in raw and in the docs I compiled locally. Do I need to do something differently?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't use code-blocks on executable code. pls use an ipython block so that the doc-build executes this, that way it will never get out of date.



class FixedForwardWindowIndexer(BaseIndexer):
"""Calculate fixed forward-looking rolling window bounds,"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add Examples section here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

def get_window_bounds(
self,
num_values: int = 0,
min_periods: Optional[int] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should raise if min_periods, center, closed are not None

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented it a bit differently, pinged you in the comment on the implementation.

@AlexKirko
Copy link
Member Author

Looks like the doc build fail is being handled in #33309

if closed is not None:
raise ValueError(
"Forward-looking windows don't support setting the closed argument"
)
Copy link
Member Author

@AlexKirko AlexKirko Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback
Implemented the error-raising behavior this way. Reasons below.

As far as I've been able to test, the arguments are being set by the rolling object in a general way. They are passed whether a particular BaseIndexer implementation needs them or not.
min_periods is getting set to rolling.min_periods. Same stuff for center: in my tests it got set to False by default. Only closed was set as None.
I think we should raise if something that doesn't make sense got passed, but testing for None is too crude for our purposes.

@AlexKirko
Copy link
Member Author

@jreback
All green, pinged you on relevant changes. I think I've addressed everything, but I'm not sure if I understood you correctly on adding refs and the ipython block. Please tell me if I misunderstood anything.

@AlexKirko AlexKirko requested a review from jreback April 6, 2020 15:39
For some problems knowledge of the future is available for analysis. For example, this occurs when
each data point is a full time series read from an experiment, and the task is to extract underlying
conditions. In these cases it can be useful to perform forward-looking rolling window computations.
``FixedForwardWindowIndexer`` class is available for this purpose. This ``BaseIndexer`` subclass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no I mean a ref on FixedForwardIndexer here that points to the api (makes a clickable link), and one for BaseIndexer

``FixedForwardWindowIndexer`` class is available for this purpose. This ``BaseIndexer`` subclass
implements a closed fixed-width forward-looking rolling window, and we can use it as follows:

.. code-block:: ipython
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't use code-blocks on executable code. pls use an ipython block so that the doc-build executes this, that way it will never get out of date.

if center:
raise ValueError("Forward-looking windows can't have center=True")
if closed is not None:
raise ValueError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a test that hits these

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added to the test.

@AlexKirko
Copy link
Member Author

AlexKirko commented Apr 7, 2020

@jreback
Thanks for the clarification! Sorry for getting confused: first time adding a public class.

  1. Added an entry to the window.rst so that the docstrings get published as part of API reference output.
  2. Made the links in the user guide clickable and tested locally.
  3. Replaced the code-block with an ipython block. Thanks.
  4. Added pytest.raises to test error-raising behavior. I think having it in the same test works, but please tell me if you'd rather have error-raising behavior be a separate test.

@AlexKirko AlexKirko requested a review from jreback April 7, 2020 15:05
@@ -85,3 +85,14 @@ Base class for defining custom window boundaries.
:toctree: api/

api.indexers.BaseIndexer

Forward-looking fixed window indexer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to add a whole new section for this. Just place api.indexers.FixedForwardWindowIndexer right below api.indexers.BaseIndexer

Copy link
Member Author

@AlexKirko AlexKirko Apr 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Looks better now, thanks. Sorry, I'm still learning how our docs are built.

@jreback jreback merged commit 717d805 into pandas-dev:master Apr 8, 2020
@jreback
Copy link
Contributor

jreback commented Apr 8, 2020

thanks @AlexKirko very nice!

@AlexKirko AlexKirko deleted the forward-indexer branch April 9, 2020 06:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: provide standard BaseIndexers in pandas.api.indexers
5 participants