API: disallow div/floordiv/pow operators for BooleanArray ? #41165
Comments
@jorisvandenbossche What's the argument for disallowing at the moment? To me it seems more natural / Pythonic to allow this operation since booleans are essentially ints.
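For reference, a minimal sketch of what "booleans are essentially ints" means in plain Python. It only exercises built-in scalar behavior and says nothing about what pandas does or should do; the variable names are illustrative.

```python
# In Python, bool is a subclass of int, so it inherits int arithmetic.
assert issubclass(bool, int)

true_div = True / True    # true division returns a float
floor_div = True // True  # floor division returns an int
power = True ** False     # power returns an int

print(true_div, floor_div, power)

# Dividing by False behaves like dividing by zero.
try:
    True / False
    divided_by_false = True
except ZeroDivisionError:
    divided_by_false = False
```

So on scalars all three operators are defined, with the result leaving the bool type.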
I am not sure what the historic reasons are.
@jorisvandenbossche Yeah, I can see the argument of "why would anyone do this?" but if it's easy to allow and makes for greater consistency with
Conditional on the Series behavior (which I wouldn't object to deprecating), I lean towards having BooleanArray behave like Series, i.e. raising here.
Current thought here: BooleanArray (and IntegerArray and FloatingArray) ops should be wrappers around their core.ops.array_ops counterparts. This will allow for pushing more logic down from BooleanArray/NumericArray into BaseMaskedArray, which in turn will make it easier to extend BaseMaskedArray to wrap arbitrary dtypes.
FWIW, booleans are a canonical identification for GF(2) so I would argue that these operations are all well-defined and have a unique interpretation.
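To make the GF(2) reading concrete, here is a rough sketch of what the operations would look like under that interpretation. The function names are hypothetical and this is not how pandas or numpy define these operators; it assumes False is 0, True is 1, and a non-negative integer exponent.

```python
def gf2_add(a: bool, b: bool) -> bool:
    # Addition modulo 2 is XOR.
    return a != b


def gf2_mul(a: bool, b: bool) -> bool:
    # Multiplication modulo 2 is AND.
    return a and b


def gf2_div(a: bool, b: bool) -> bool:
    # The only nonzero element is 1, which is its own inverse,
    # so dividing by True leaves the numerator unchanged.
    if not b:
        raise ZeroDivisionError("division by zero in GF(2)")
    return a


def gf2_pow(a: bool, n: int) -> bool:
    # Assumes n >= 0: x**0 is 1, and x**n == x for n >= 1 in GF(2).
    if n < 0:
        raise ValueError("this sketch only handles non-negative exponents")
    return True if n == 0 else a


print(gf2_add(True, True), gf2_mul(True, False), gf2_div(False, True), gf2_pow(False, 0))
```

Note that addition here wraps around (True + True is False), which is the modular behavior the field interpretation implies.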
We don't have an official policy on this, but in general pandas is more averse to overflows than numpy, which corresponds to not treating arithmetic as modular.
Currently, for the plain `bool` dtype we explicitly check for some operations and raise an error, while those actually work in numpy.

This is done for the division and power operations (`not_allowed={"/", "//", "**"}`), see pandas/pandas/core/computation/expressions.py, lines 215 to 218 in 934cad6.
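The original code snippets did not survive here, so the following is only a small sketch of the behavior described above, not the original example. It assumes numpy and pandas are installed; whether the pandas operation raises depends on the pandas version, so the sketch records the outcome instead of assuming it. Variable names are illustrative.

```python
import numpy as np
import pandas as pd

left = np.array([True, False, True])
right = np.array([True, True, True])

# numpy accepts bool / bool and returns a float array.
numpy_result = left / right
print(numpy_result)

# For two bool-dtype Series, pandas checks the operator explicitly.
# At the time of this issue that check raised NotImplementedError.
try:
    pd.Series(left) / pd.Series(right)
    pandas_raised = False
except NotImplementedError as err:
    pandas_raised = True
    print(err)

print("pandas raised:", pandas_raised)
```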
For the nullable BooleanArray, for now we simply relied on the operations as defined by the underlying numpy bool array.

That's for the `BooleanArray`, but the check is currently done on the "array_op" level (but because it is done within expressions.py, we don't run that check for EAs, xref #41161).

So questions:
- … (`pd.Series(arr) / 1` does work, it's only disallowed if both operands are boolean)
- … `BooleanArray` level, and not only check it on the DataFrame/Series ops level in `array_ops.py`?