BUG: any/all not returning booleans for object type #41102
Conversation
@@ -486,6 +486,12 @@ def nanany(
    False
    """
    values, _, _, _, _ = _get_values(values, skipna, fill_value=False, mask=mask)

    # For object type, any won't necessarily return
can you do this in _get_values?
Right now _get_values is unaware of the nanop it is being used for, so it would need slight refactoring to allow that. While the current solution duplicates the workaround, I like it better than adding it in _get_values, since the workaround only affects any/all. I think keeping this weird condition out of a path that all other nanops hit makes sense.
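For context, a minimal sketch of the behavior being worked around, assuming the object-dtype reduction semantics this PR targets (the coercion shown is a simplification, not the PR's exact code):

```python
import numpy as np

values = np.array([0, "", "x"], dtype=object)

# On the NumPy versions this targets, any() over an object array can hand
# back the reduced element itself ("x") rather than a boolean, because
# logical_or falls back to Python `or` semantics for object dtype.
raw = values.any()
print(raw, type(raw))

# The workaround kept inside nanany/nanall is conceptually just a
# dtype-conditional cast of the result:
result = values.any(axis=None)
if values.dtype == object:
    result = bool(result)  # simplified stand-in for the actual coercion
print(result, type(result))
```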
        assert_bool_op_api(
            opname, bool_frame_with_na, float_string_frame, has_bool_only=True
        )

    @pytest.mark.xfail(reason="GH12863: numpy result won't match for object type")
    @pytest.mark.parametrize("opname", ["any", "all"])
so this is trying to match numpy? do we need this anymore?
I doubt the original purpose was matching numpy, probably more for adding tests without having to hardcode an expected result. For now I modified this slightly to remove the need for the xfail, but probably could be removed (there's coverage, but it's scattered).
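A rough illustration of the mismatch (hypothetical snippet, not taken from the test; exact values and the NumPy behavior shown are assumptions that may vary by version):

```python
import pandas as pd

ser = pd.Series(["a", "b"], dtype=object)

# With this fix, the pandas reduction coerces to a boolean...
pandas_result = ser.any()

# ...while the raw NumPy reduction over object dtype may return an element
# ("a") instead of True, so a direct comparison against numpy can't pass.
numpy_result = ser.to_numpy().any()

print(pandas_result, numpy_result)
```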
@@ -1108,6 +1112,23 @@ def test_any_all_extra(self):
        result = df[["C"]].all(axis=None).item()
        assert result is True

    @pytest.mark.parametrize("axis", [0, 1])
do we have testing of the skipna=True case?
Some scattered testing elsewhere; added skipna parameterization in this test.
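As a sketch, something along these lines (the test name, frame contents, and assertions are illustrative, not the PR's actual test):

```python
import numpy as np
import pandas as pd
import pytest


@pytest.mark.parametrize("axis", [0, 1])
@pytest.mark.parametrize("skipna", [True, False])
@pytest.mark.parametrize("opname", ["any", "all"])
def test_any_all_object_dtype(axis, skipna, opname):
    df = pd.DataFrame(np.ones((3, 3), dtype=object), columns=list("abc"))
    result = getattr(df, opname)(axis=axis, skipna=skipna)
    # The behavior under discussion: object-dtype reductions should come
    # back as booleans for both skipna paths.
    assert result.dtype == bool
    assert result.all()
```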
@@ -270,6 +270,7 @@ def _badobj_wrap(self, value, func, allow_complex=True, **kwargs):
        value = value.astype("f8")
        return func(value, **kwargs)

    @pytest.mark.xfail(reason="GH12863: numpy result won't match for object type")
do we need this? (same reason as above)
Similar response - I think it was more for coverage than for matching numpy. Since the nanop gets indirectly tested through all other any/all reductions, I think it could be removed.
thanks @mzeitlin11
Also allows removing some of the any/all specific boolean conversion logic in frame reduction code, which (hopefully) also fixes part of #41079.
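For reference, a sketch of the user-facing behavior being fixed (the example frame and expected outputs are assumptions based on the issue title, not taken from the PR):

```python
import pandas as pd

df = pd.DataFrame({"a": ["x", "y"], "b": [1, 0]}, dtype=object)

# With the fix, any/all over object columns should return booleans
# rather than the underlying objects.
print(df.any())        # expected: a True, b True
print(df.all())        # expected: a True, b False
print(df.any().dtype)  # expected: bool
```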