DOC: Add example for pandas.DataFrame.rolling() with on
#50139
It looks like the docstring validation is failing. Could you investigate?
I think one example is sufficient
Ok, I will try to pass the docstring validation and reduce the two examples to one. Thanks for your comments!
When I use
I'm so sorry that my improper operation messed up the PR!
pandas/core/window/rolling.py
Outdated
3 NaN 6.0
4 7.0 8.0

>>> df.rolling(2, on='A').sum()
wouldn't this give the same result even without on='A'? might be more illustrative to have an example in which on= makes a difference
Without on='A', the result will be like this:
>>> df.rolling(2).sum()
A B
0 NaN NaN
1 4.0 6.0
2 8.0 NaN
3 NaN NaN
4 NaN 14.0
Do I need to add this result as comparison?
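The point raised about the integer window can be checked with a small sketch. The frame below is hypothetical (made-up values, not the one from the PR). With an integer window, the window is counted in rows either way, so passing on='A' leaves the sums of 'B' unchanged and only keeps 'A' out of the aggregation:

```python
import pandas as pd

# Hypothetical data, not the frame used in the PR.
df = pd.DataFrame({"A": [1, 3, 5, 7], "B": [2, 4, 6, 8]})

with_on = df.rolling(2, on="A").sum()  # 'A' is passed through, 'B' is summed
without_on = df.rolling(2).sum()       # both 'A' and 'B' are summed

# The sums of 'B' match; only the handling of 'A' differs.
print(with_on)
print(without_on)
```

This is why an integer window makes a weak illustration of on=: the parameter only changes which rows fall into a window when the window is time-based.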
I was thinking more something like
In [62]: df = pd.DataFrame({
...: 'A': pd.to_datetime(['2020-01-01', '2020-01-01', '2020-01-02']),
...: 'B': [1, 2, 3],
...: },
...: index=pd.date_range('2020', periods=3))
in which if you do rolling('D'), then the values of 'B' differ if you do on='A' (instead of the default, which uses the index)
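A minimal sketch of that suggestion, assuming the frame from the comment above and a one-day window: the rolling sum of 'B' is computed once against the index (the default) and once against column 'A'. The variable names index_based and on_based are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": pd.to_datetime(["2020-01-01", "2020-01-01", "2020-01-02"]),
        "B": [1, 2, 3],
    },
    index=pd.date_range("2020", periods=3),
)

# Default: the one-day window is measured along the DatetimeIndex.
# Selecting 'B' first avoids aggregating the datetime column 'A'.
index_based = df["B"].rolling("D").sum()

# With on='A', the window is measured along the timestamps in column 'A'.
on_based = df.rolling("D", on="A").sum()["B"]

print(index_based)
print(on_based)
```

The first two rows share the same timestamp in 'A', so with on='A' the second row's window also covers the first row, which does not happen when the window follows the daily index.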
Oh, I see what that means. How about the following:
>>> df = pd.DataFrame({
... 'A': [pd.to_datetime('2020-01-01'),
... pd.to_datetime('2020-01-01'),
... pd.to_datetime('2020-01-02'),],
... 'B': [1, 2, 3], },
... index=pd.date_range('2020', periods=3))
>>> df
A B
2020-01-01 2020-01-01 1
2020-01-02 2020-01-01 2
2020-01-03 2020-01-02 3
>>> df['B'].rolling('2D').sum() # to avoid warning when sum on 'A'
2020-01-01 1.0
2020-01-02 3.0
2020-01-03 5.0
Freq: D, Name: B, dtype: float64
>>> df.rolling('2D', on='A').sum() # value of 'B' is differ from above
A B
2020-01-01 2020-01-01 1.0
2020-01-02 2020-01-01 3.0
2020-01-03 2020-01-02 6.0
# value of 'B' differs from above
other than that, looks fine
Thanks for the correction, my English skills are a little rusty.
Almost there 💪
pandas/core/window/rolling.py
Outdated
**on**

Rolling sum with a window length of 2 on specific columon.
2 days?
pandas/core/window/rolling.py
Outdated
>>> df['B'].rolling('2D').sum() # to avoid warning when sum on 'A'
2020-01-01 1.0
2020-01-02 3.0
2020-01-03 5.0
Freq: D, Name: B, dtype: float64
Seeing as you start with
Rolling sum with a window length of 2 on specific columon.
we can probably exclude this example. If someone tries running it locally, they can remove on= and see the difference, at which point it should be clear what it's done
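For anyone who wants to try that locally, here is a small sketch that puts both results side by side, using the frame and the '2D' window proposed earlier in the thread. The names by_index, by_column, and comparison are hypothetical; 'B' is selected before the index-based call so that the datetime column 'A' is not aggregated.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": pd.to_datetime(["2020-01-01", "2020-01-01", "2020-01-02"]),
        "B": [1, 2, 3],
    },
    index=pd.date_range("2020", periods=3),
)

# Without on=: the two-day window follows the index.
by_index = df[["B"]].rolling("2D").sum()["B"]

# With on='A': the two-day window follows the timestamps in column 'A'.
by_column = df.rolling("2D", on="A").sum()["B"]

comparison = pd.DataFrame({"window_on_index": by_index, "window_on_A": by_column})
print(comparison)
```

Only the last row differs (5.0 against 6.0), matching the two outputs quoted in the conversation.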
pandas/core/window/rolling.py
Outdated
2020-01-03 5.0
Freq: D, Name: B, dtype: float64

>>> df.rolling('2D', on='A').sum() # value of 'B' differs from above
let's remove the comment
How about this one 😄 Thank you for taking the time to share your thoughts with me. I really appreciate your review.
looks good, thanks @luke396 !
on kwarg of DataFrame.rolling() #50080
I'm a newbie, and welcome any comments!