REF: uses_mask in group_any_all #52043
# Set the position as masked if `out[lab] != flag_val`, which
# would indicate True/False has not yet been seen for any/all,
# so by Kleene logic the result is currently unknown
if out[lab, j] != flag_val:
    out[lab, j] = -1
    result_mask[lab, j] = 1
For `any`: if NA is encountered as the first value in the group, you set the mask to 1 here, but you don't reset it when a later non-NA value in the group decides the result. You'll have to update `result_mask` when you find such a value.
Oh OK. I missed that we are checking `out[lab, j] != ...` here as opposed to `values[i, j] != ...`. Thanks.
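The masking rule discussed in this thread can be sketched in plain Python for a single column. This is only an illustration under stated assumptions, not the Cython code in `libgroupby`: the function name `group_any_all_sketch` and its list-based signature are hypothetical, and the step that clears `result_mask` once a deciding value is seen is an assumption about how the reset would be done.

```python
def group_any_all_sketch(values, mask, labels, ngroups, val_test, skipna=True):
    """Hypothetical single-column sketch of any/all with a separate result mask."""
    if val_test == "any":
        flag_val = 1  # a single True decides `any`
    elif val_test == "all":
        flag_val = 0  # a single False decides `all`
    else:
        raise ValueError("val_test must be 'any' or 'all'")

    # Start from the "undecided" answer: False for any, True for all.
    out = [1 - flag_val] * ngroups
    result_mask = [0] * ngroups

    for i, lab in enumerate(labels):
        if lab < 0:
            continue  # row does not belong to any group
        if mask[i]:
            if skipna:
                continue
            # NA seen: by Kleene logic the result is unknown unless a
            # deciding value has already been recorded in `out`.
            if out[lab] != flag_val:
                result_mask[lab] = 1
            continue
        if bool(values[i]) == bool(flag_val):
            # A deciding value overrides any earlier NA (assumed reset step).
            out[lab] = flag_val
            result_mask[lab] = 0
    return out, result_mask


# Group 0 holds [NA, True]; group 1 holds [False, NA].
values = [0, 1, 0, 0]
mask = [1, 0, 0, 1]
labels = [0, 0, 1, 1]
any_out, any_mask = group_any_all_sketch(values, mask, labels, 2, "any", skipna=False)
all_out, all_mask = group_any_all_sketch(values, mask, labels, 2, "all", skipna=False)
```

In this sketch the check is made against `out[lab]` rather than the current value, so an NA only masks a group whose result is still undecided.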
pp_kwargs["result_mask"] = result_mask

result = post_processing(result, inferences, **pp_kwargs)
result = post_processing(result, inferences, result_mask=result_mask)
You'll probably have to remove the `is_nullable` flag from the `std` post_processing function (its default is False) and check for `result_mask is not None` instead.
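A minimal sketch of what that suggestion could look like, assuming the post-processing step for `std` takes the square root of the grouped variances. The name `std_post_processing_sketch` and the use of `None` for missing entries are hypothetical; the real function returns array types that are not modeled here.

```python
import math


def std_post_processing_sketch(vals, inferences=None, result_mask=None):
    """Hypothetical post-processing step: no nullable flag, only a mask check."""
    if result_mask is not None:
        # Nullable path: masked positions stay missing and are never computed.
        return [None if m else math.sqrt(v) for v, m in zip(vals, result_mask)]
    # Non-nullable path: plain square root of every variance.
    return [math.sqrt(v) for v in vals]


plain = std_post_processing_sketch([4.0, 9.0])
masked = std_post_processing_sketch([4.0, 9.0], result_mask=[0, 1])
```

With this shape the caller can always pass `result_mask=result_mask`, and the presence of a mask alone selects the nullable path.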
Thanks for tackling this; it was on my todo list as well.
Your suggestion helped, green
thx @jbrockmendel
`any`, `all`, and `std` go through `GroupBy._get_cythonized_result` instead of the more standard `WrappedCythonOp`. I'm trying to refactor `any`/`all` to use the other path, and as a step toward that I'm trying to make `group_any_all` follow the same patterns as the other functions in `libgroupby`.
The implementation here looks to me like it should behave identically to the existing implementation, but it fails a bunch of tests (mostly in tests/groupby/test_any_all.py). Hoping @phofl can explain where I'm going wrong.