CLN/PERF: no need for kahan for int group_cumsum #41874
@@ -253,18 +253,16 @@ def group_cumsum(numeric[:, ::1] out,
 t = accum[lab, j] + y
 compensation[lab, j] = t - accum[lab, j] - y
 accum[lab, j] = t
-out[i, j] = accum[lab, j]
+out[i, j] = t
Doubt this affects the compiled result, but we may as well not depend on a smart compiler to avoid this extra indexing step.
-y = val - compensation[lab, j]
-t = accum[lab, j] + y
-compensation[lab, j] = t - accum[lab, j] - y
+t = val + accum[lab, j]
umm this is affecting all dtypes. do we not have tests for this for small floats?
This is inside the else block of the check `if numeric == float32_t or numeric == float64_t:`, so only non-floats should end up here.
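For readers without the full file open, the branching described here can be sketched in plain Python. This is only an illustration under assumptions: `group_cumsum_sketch` and its signature are hypothetical, NumPy stands in for the Cython fused types, and NaN handling in the float branch is left out entirely.

```python
import numpy as np


def group_cumsum_sketch(values, labels, ngroups):
    """Hypothetical pure-Python sketch of a grouped cumulative sum.

    Float dtypes take the Kahan (compensated) branch; every other dtype
    takes the plain-addition branch, mirroring the if/else discussed above.
    """
    values = np.asarray(values)
    n, k = values.shape
    out = np.zeros_like(values)
    accum = np.zeros((ngroups, k), dtype=values.dtype)
    compensation = np.zeros((ngroups, k), dtype=values.dtype)
    is_float = np.issubdtype(values.dtype, np.floating)

    for i in range(n):
        lab = labels[i]
        if lab < 0:
            # Negative labels mark rows that belong to no group.
            continue
        for j in range(k):
            val = values[i, j]
            if is_float:
                # Kahan summation: carry the rounding error forward.
                y = val - compensation[lab, j]
                t = accum[lab, j] + y
                compensation[lab, j] = t - accum[lab, j] - y
            else:
                # Integer addition is exact, so no compensation is needed.
                t = val + accum[lab, j]
            accum[lab, j] = t
            out[i, j] = t
    return out
```

Calling it with an `int64` column and labels `[0, 1, 0, 1]` accumulates each group independently through the plain-addition branch only.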
ahh ok, that was not clear from the diff
can you add a comment to that effect (maybe just on the float32/64 branch, e.g. "using Kahan summation")?
Have added a comment
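As background for why the compensation term only matters on the float branch, here is a small standalone sketch. The helper names `kahan_sum` and `naive_sum` are made up for this illustration and are not part of pandas.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation with no error compensation."""
    total = 0
    for val in values:
        total = total + val
    return total


def kahan_sum(values):
    """Kahan summation; returns (total, final compensation term)."""
    total = 0
    compensation = 0
    for val in values:
        y = val - compensation
        t = total + y
        # For floats this recovers the low-order bits lost in `total + y`;
        # for integers the addition is exact, so it is always zero.
        compensation = (t - total) - y
        total = t
    return total, compensation


floats = [1.0] + [1e-16] * 10
exact = math.fsum(floats)            # correctly rounded reference
naive = naive_sum(floats)            # each 1e-16 is rounded away
compensated, _ = kahan_sum(floats)   # lost bits are carried forward

ints = [1, 2, 3]
int_total, int_comp = kahan_sum(ints)
```

With the float inputs, each tiny addend is below the rounding threshold of `1.0`, so the naive total never moves while the compensated total tracks the reference closely. With the integer inputs the compensation term stays at zero throughout, which is why the extra operations can be dropped for integer dtypes.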
hmm, seemingly unrelated failures. maybe on master? cc @jbrockmendel
thanks @mzeitlin11
Surprised at lack of impact here - doesn't noticeably affect benchmarks.
Targeting the cython algo specifically shows an improvement (but smaller than I'd expect given the removed operations):