Skip to content

Fix off-by-one in aggregations #55920

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 13, 2023
Merged

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Nov 11, 2023

ref #55713 (comment)

I'm not super familiar with this function so maybe there is a more intelligent way of doing this. But the segfault occurs when self.window_size == 0 because of these lines:

end = np.arange(1 + offset, num_values + 1 + offset, step, dtype="int64")
start = end - self.window_size

In the segfaulting example, num_values is 20 for an array with 20 elements. end ends up becoming 1-20 as does start with the window_size of 0. Later code then tries to index the 20th index in array, hence the address violation

@lithomas1 lithomas1 requested a review from mroeschke November 13, 2023 17:09
@lithomas1 lithomas1 added this to the 2.1.4 milestone Nov 13, 2023
@@ -102,7 +102,7 @@ def get_window_bounds(
closed: str | None = None,
step: int | None = None,
) -> tuple[np.ndarray, np.ndarray]:
if center:
if center or self.window_size == 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this result in the same resulting offset as before? i.e.

In [1]: 0 - 1 // 2
Out[1]: 0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your code 1 // 2 is getting evaluated first. The parentheses matter

>>> (0 - 1) // 2
-1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I verified this fixed the segfault via a local run of #55102 - hopefully can make progress soon to getting that up in CI since this is now 2 releases in a row where we have a segfault when cutting the release.

#55151 is a big step towards getting UBSAN/ASAN set up as the datetime code issues a ton of errors, and by design those tools stop at the first error encountered. If you have any time to help review that and push through would appreciate it

@mroeschke mroeschke added Window rolling, ewma, expanding Segfault Non-Recoverable Error labels Nov 13, 2023
@mroeschke
Copy link
Member

Thanks @WillAyd (for fixing my math mistake 😅 )

@WillAyd WillAyd deleted the fix-off-by-one branch November 13, 2023 19:02
mroeschke pushed a commit that referenced this pull request Nov 13, 2023
…55942)

Backport PR #55920: Fix off-by-one in aggregations

Co-authored-by: William Ayd <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Segfault Non-Recoverable Error Window rolling, ewma, expanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants