-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: allow exact matches in time-based .rolling() #13968
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# this will raise ValueError on non-fixed freqs | ||
self.window = freq.nanos | ||
if self.left_closed: | ||
self.window += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, the logic adds one extra nanosecond to the search space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU, offset-based rolling windows rely on the conversion to nanoseconds in the line right above, so adding one not only works but is the optimal strategy — it includes the left endpoint and nothing else (since nanosecond is the smallest time resolution available).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test or two with values at finer resolutions than the offset to confirm this? for example
In [11]: df = pd.DataFrame({'B': [1, 1, 1]},
...: index = [pd.Timestamp('20130101 09:00:30'),
...: pd.Timestamp('20130101 09:01:00'),
...: pd.Timestamp('20130101 09:02:00')])
In [12]: df.rolling('1Min').sum()
Out[12]:
B
2013-01-01 09:00:30 1.0
2013-01-01 09:01:00 2.0
2013-01-01 09:02:00 1.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks pretty good. We'll need documentation to explain what |
Thanks!
Yes, that'd be the lion's share of this. I'd like to get the opinion of at least a couple of people about the word |
Yes, some sort of |
It might make sense there (I haven't looked at |
Current coverage is 84.65% (diff: 100%)@@ master #13968 diff @@
==========================================
Files 144 144
Lines 51016 51023 +7
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43184 43191 +7
Misses 7832 7832
Partials 0 0
|
@@ -1044,18 +1055,26 @@ def validate(self): | |||
"for datetimelike and offset " | |||
"based windows") | |||
|
|||
if self.left_closed is not None and not is_bool(self.left_closed): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls add tests for invalid input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls also add tests with |
@@ -5382,11 +5382,12 @@ def _add_series_or_dataframe_operations(cls): | |||
from pandas.core import window as rwindow | |||
|
|||
@Appender(rwindow.rolling.__doc__) | |||
def rolling(self, window, min_periods=None, freq=None, center=False, | |||
win_type=None, on=None, axis=0): | |||
def rolling(self, window, min_periods=None, left_closed=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this after on
as its a new parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
we have the convention in
This makes more sense to me. default would be |
Are you speaking in favor of something different from |
Not sure what you mean. Could you please be more precise? |
@agraboso no, I think we should stick with the spelling: |
@jreback: the problem is that |
only allow right and both then I think changing to a non standard argument is even more confusing |
I proposed |
I'm open to whatever the consensus might be on the name. Github doesn't have any voting system AFAIK, so I'll add comments below for the proposed options and ask people to vote for them. We can, of course, have a discussion afterwards. |
Option: |
Option: |
@@ -377,6 +381,12 @@ class Window(_Window): | |||
|
|||
axis : int or string, default 0 | |||
|
|||
.. versionadded:: 0.19.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> 0.20.0
@@ -455,6 +465,17 @@ class Window(_Window): | |||
2013-01-01 09:00:05 NaN | |||
2013-01-01 09:00:06 4.0 | |||
|
|||
For time-based windows, it is possible to make the resulting window | |||
contain its left edge by setting closed='both'. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you show closed='right', for comparison
lgtm (slight doc correction). pls add a whatsnew for 0.20.0 cc @chrisaycock |
@agraboso pls add a whatsnew for 0.20.0 (other enhancements) |
can you rebase / update |
Looking forward to having this added to 0.20.0. It just needs a whatsnew entry plus a fix to the "0.19.0" versionadded entry. |
yes just needs minor changes. if @agraboso doesn't fix this up in next couple of days I'll fix on merge. |
actually prob need some slight doc updates as well (doc-string, maybe in computation.rst) |
@chrisaycock any other doc updates needed here? (e.g. maybe in merge.rst)? |
It still needs the |
@chrisaycock sorry, yes, meant |
@agraboso if you can update we can get this in. |
I do think the |
@carlosdanielcsantos so we have |
@jreback If by not including the rhs endpoint you mean not including the "right side" endpoint, yes. As in ti <= t < tf. For the sake of completeness, we could add the option to not include any endpoint at all, ti < t < tf, with something like Here is a quick example of what it would look like
|
@carlosdanielcsantos aside from @chrisaycock @agraboso thoughts? |
I could definitely see |
I think then we should directly push closed parameter to cython and change to handle these 4 different cases. Its possible to do shifting on the window (as is done here), and actually shift the data (to make 'left' and 'neither' work), but I think this will be more transparent. |
@jreback sounds good to me. I already started working on it and got the logic right on the code you pointed out. I hope @agraboso doesn't mind if I pick some lines from here to my branch. |
@carlosdanielcsantos ok by me! sure, on non-time based are ok as well. you'll want to add a sub-section to whatsnew showing this (as well as a similar section in the docs themselves & doc-strings). |
@carlosdanielcsantos I don't mind at all. Hack away! |
@carlosdanielcsantos any luck with this? |
@jreback already wrote tests and the code on window.pyx. Now trying to figure out the point to which I need to backpropagate the closed argument in window.py. Some thoughts on this? |
do you have a branch that you can post |
@carlosdanielcsantos sure, why don't you make a PR (easier to comment on). at a quick glance looks reasonable. |
superseded by #15795 |
git diff upstream/master | flake8 --diff
TODO: update docs.