-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Simplify bool ops closures, improve tests #19795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
619d2bb
7cf9a7a
f5b262b
c03cc55
235f969
9aa41ec
e721b64
a591078
8243ec8
f19a596
f9ff20b
0d99c94
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1058,69 +1058,76 @@ def _bool_method_SERIES(cls, op, special): | |
Wrapper function for Series arithmetic operations, to avoid | ||
code duplication. | ||
""" | ||
name = _get_op_name(op, special) | ||
|
||
def na_op(x, y): | ||
try: | ||
result = op(x, y) | ||
except TypeError: | ||
if isinstance(y, list): | ||
y = construct_1d_object_array_from_listlike(y) | ||
|
||
if isinstance(y, (np.ndarray, ABCSeries)): | ||
if (is_bool_dtype(x.dtype) and is_bool_dtype(y.dtype)): | ||
result = op(x, y) # when would this be hit? | ||
else: | ||
x = _ensure_object(x) | ||
y = _ensure_object(y) | ||
result = lib.vec_binop(x, y, op) | ||
assert not isinstance(y, (list, ABCSeries, pd.Index)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is oddly specific. what about a tuple? basicaly you are trying to catch ndarray or non-list-like right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This assertion is codifying existing behavior. ATM the line following this checks for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. remove this and let's instead qualify the below with
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. add tuple here, or assert There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. or maybe just remove the assertion and make the else into an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As mentioned above, I’m not inclined to change any behavior until the docs/tests are fleshed out. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. then let's do that first (in this PR or a pre-cursor) |
||
if isinstance(y, np.ndarray): | ||
# This next assertion is here because there used to be | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is not a useful comment. imagine a new reader coming along with no history There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yah, not sure what to put here. ATM there is a non-reachable branch with a TODO comment specifically asking how it would be reached. This seemed worth documenting somehow since someone at some point found it worth special-casing. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i think the assertion pretty much explains this, though maybe add some text that we are trying to compare non-bool things here |
||
# a branch that specifically caught this case. | ||
assert not (is_bool_dtype(x) and is_bool_dtype(y)) | ||
x = _ensure_object(x) | ||
y = _ensure_object(y) | ||
result = lib.vec_binop(x, y, op) | ||
else: | ||
# let null fall thru | ||
assert is_scalar(y) | ||
if not isna(y): | ||
y = bool(y) | ||
try: | ||
result = lib.scalar_binop(x, y, op) | ||
except: | ||
msg = ("cannot compare a dtyped [{dtype}] array " | ||
"with a scalar of type [{type}]" | ||
).format(dtype=x.dtype, type=type(y).__name__) | ||
raise TypeError(msg) | ||
raise TypeError("cannot compare a dtyped [{dtype}] array " | ||
"with a scalar of type [{typ}]" | ||
.format(dtype=x.dtype, | ||
typ=type(y).__name__)) | ||
|
||
return result | ||
|
||
fill_int = lambda x: x.fillna(0) | ||
fill_bool = lambda x: x.fillna(False).astype(bool) | ||
|
||
def wrapper(self, other): | ||
is_self_int_dtype = is_integer_dtype(self.dtype) | ||
|
||
fill_int = lambda x: x.fillna(0) | ||
fill_bool = lambda x: x.fillna(False).astype(bool) | ||
|
||
self, other = _align_method_SERIES(self, other, align_asobject=True) | ||
|
||
res_name = _get_series_op_result_name(self, other) | ||
|
||
if isinstance(other, ABCDataFrame): | ||
# Defer to DataFrame implementation; fail early | ||
return NotImplemented | ||
|
||
elif isinstance(other, ABCSeries): | ||
name = com._maybe_match_name(self, other) | ||
elif isinstance(other, (ABCSeries, pd.Index)): | ||
is_other_int_dtype = is_integer_dtype(other.dtype) | ||
other = fill_int(other) if is_other_int_dtype else fill_bool(other) | ||
|
||
filler = (fill_int if is_self_int_dtype and is_other_int_dtype | ||
else fill_bool) | ||
|
||
res_values = na_op(self.values, other.values) | ||
unfilled = self._constructor(res_values, | ||
index=self.index, name=name) | ||
return filler(unfilled) | ||
ovalues = other.values | ||
|
||
else: | ||
# scalars, list, tuple, np.array | ||
filler = (fill_int if is_self_int_dtype and | ||
is_integer_dtype(np.asarray(other)) else fill_bool) | ||
is_other_int_dtype = is_integer_dtype(np.asarray(other)) | ||
if isinstance(other, list): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. a list only? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ATM lists are handled inside na_op on 1066-1067. This is just moving the same two lines up the call stack (and outside of an exception-handling block) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the comment also includes tuple and np.array, should not this check for also for tuple? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess it depends on if we want to treat tuples as "an array the user forgot to wrap in an array". For lists we do that pretty consistently (e.g. on line 1034 and 1172 in core.ops), but not so much for tuples. I don't have a strong opinion as to what the ideal behavior is, but would prefer to address it separately from this cleanup/testing PR. (it could go in the same Issue as the TODO comment just below this isinstance check) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this should be is_list_like |
||
# TODO: Can we do this before the is_integer_dtype check? | ||
# could the is_integer_dtype check be checking the wrong | ||
# thing? e.g. other = [[0, 1], [2, 3], [4, 5]]? | ||
other = construct_1d_object_array_from_listlike(other) | ||
ovalues = other | ||
|
||
res_values = na_op(self.values, other) | ||
unfilled = self._constructor(res_values, index=self.index) | ||
return filler(unfilled).__finalize__(self) | ||
filler = (fill_int if is_self_int_dtype and is_other_int_dtype | ||
else fill_bool) | ||
|
||
res_values = na_op(self.values, ovalues) | ||
unfilled = self._constructor(res_values, index=self.index, | ||
name=res_name) | ||
result = filler(unfilled) | ||
if not isinstance(other, ABCSeries): | ||
result = result.__finalize__(self) | ||
return result | ||
|
||
wrapper.__name__ = name | ||
return wrapper | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a bug, rather an API breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will move.