-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
CLN: group_quantile #43489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: group_quantile #43489
Conversation
# with the calculations. | ||
labels_for_lexsort = np.where(labels == -1, labels.max() + 1, labels) | ||
else: | ||
labels_for_lexsort = labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This labels.any()
conditional is strange - special casing the all 0 label case (which seems rare and specific) when we really care about -1 labels. Regardless, since we already check for -1 in loop above, no need to check with any
and where
here
@@ -1420,7 +1424,7 @@ cdef cummin_max(groupby_t[:, ::1] out, | |||
cdef: | |||
Py_ssize_t i, j, N, K | |||
groupby_t val, mval, na_val | |||
uint8_t[:, ::1] seen_na | |||
uint8_t[:, ::1] seen_na = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this because previously seen_na
was treated as potentially uninitialized, leading to an emission of extra code whenever it was accessed, for example
if (unlikely(!__pyx_v_seen_na.memview)) { __Pyx_RaiseUnboundMemoryviewSliceNogil("seen_na"); __PYX_ERR(0, 1462, __pyx_L7_error)
order = (values, labels_for_lexsort) | ||
sort_arr = np.lexsort(order).astype(np.int64, copy=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW the big perf gain comes from moving this call out of cython and not repeating it for each column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, makes sense. If you're planning on doing that, I'll just close this for now and collect the other cleanups elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually suggesting you do that in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sounds good :) I'll take a look at that
No description provided.