BUG: Incorrect behavior of window aggregation functions on disjoint windows skipping overflowing elements #45647
Closed
take
rtpsw added a commit to rtpsw/pandas that referenced this issue (Jan 27, 2022)
rtpsw added a commit to rtpsw/pandas that referenced this issue (Jan 27, 2022)
rtpsw added a commit to rtpsw/pandas that referenced this issue (Jan 27, 2022)
rtpsw added a commit to rtpsw/pandas that referenced this issue (Jan 27, 2022)
rtpsw added a commit to rtpsw/pandas that referenced this issue (Jan 27, 2022)
rtpsw added a commit to rtpsw/pandas that referenced this issue (Jan 28, 2022)
jreback pushed a commit that referenced this issue (Jan 28, 2022)
meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this issue (Jan 28, 2022): …er unused elements (pandas-devGH-45647)
jreback pushed a commit that referenced this issue (Jan 28, 2022): …elements (GH-45647) (#45683) Co-authored-by: rtpsw <[email protected]>
phofl pushed a commit to phofl/pandas that referenced this issue (Feb 14, 2022)
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this issue (Jul 13, 2022)
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
The example produces the same output for both sum and mean, with NaN in the last two rows. The correct behavior would output 4.0 instead of NaN in those rows. This shows that at least these rolling window aggregations produce incorrect output on disjoint windows that skip elements whose sum overflows. This incorrect behavior originates in the window aggregation functions in pandas._libs.window.aggregations, which process, rather than skip over, elements outside the disjoint windows; many of these functions have this problem. Here is code that shows this for sum and mean:
Because these window aggregation functions are not exposed to the user and are wrapped by defensive code within Series.rolling that handles np.inf values, the above example is more involved and induces an overflow to expose the behavior.
The need to handle disjoint windows arises, for example, in GH-15354, where the step size is larger than the window size. The current issue is a precursor for handling GH-15354.
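The mechanism described above can be illustrated without pandas. The following is a minimal plain-Python sketch, not the code from pandas._libs.window.aggregations; the function names window_bounds, rolling_sum_no_skip, and rolling_sum_skip are hypothetical and exist only for this illustration. It assumes half-open [start, end) windows generated with a step larger than the window width, so that consecutive windows are disjoint, and it covers only sum (mean would divide the same running total by the window length).

```python
import math


def window_bounds(size, width, step):
    """Half-open [start, end) bounds; windows are disjoint when step > width."""
    ends = list(range(width, size + 1, step))
    starts = [e - width for e in ends]
    return starts, ends


def rolling_sum_no_skip(values, starts, ends):
    """Sliding sum that adds, then later removes, every element it passes,
    including elements that lie in the gaps between disjoint windows."""
    out = []
    total = 0.0
    lo = hi = 0  # the running total covers values[lo:hi]
    for s, e in zip(starts, ends):
        while hi < e:  # add everything up to the new window end
            total += values[hi]
            hi += 1
        while lo < s:  # remove everything before the new window start
            total -= values[lo]
            lo += 1
        out.append(total)
    return out


def rolling_sum_skip(values, starts, ends):
    """Sliding sum that restarts from scratch when a window does not
    overlap the previous one, so gap elements are never touched."""
    out = []
    total = 0.0
    prev_start = prev_end = None
    for s, e in zip(starts, ends):
        if prev_end is None or s >= prev_end:
            # Disjoint from the previous window: recompute directly.
            total = 0.0
            for j in range(s, e):
                total += values[j]
        else:
            # Overlapping: update incrementally.
            for j in range(prev_start, s):
                total -= values[j]
            for j in range(prev_end, e):
                total += values[j]
        out.append(total)
        prev_start, prev_end = s, e
    return out


# Windows [0, 2) and [4, 6); the two huge values sit in the gap between them.
values = [1.0, 3.0, 1e308, 1e308, 1.5, 2.5, 0.0, 0.0]
starts, ends = window_bounds(len(values), width=2, step=4)

no_skip = rolling_sum_no_skip(values, starts, ends)
skip = rolling_sum_skip(values, starts, ends)
print(no_skip, math.isfinite(no_skip[1]))  # [4.0, inf] False
print(skip)                                # [4.0, 4.0]
```

In this sketch the gap elements overflow the running total to inf, and subtracting them afterwards cannot recover the finite value. The second function avoids this by recomputing whenever a window starts at or after the previous window's end; this is one possible way to skip the unused elements, shown here as an assumption rather than as the change made in pandas.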
Expected Behavior
The expected output for both sum and mean is:
Installed Versions
This issue is confirmed on the main branch.