REGR: isin with nullable types with missing values raising #42473
@@ -538,6 +538,24 @@ def has_infs(floating[:] arr) -> bool:
     return ret
 
 
+@cython.wraparound(False)
+@cython.boundscheck(False)
+def has_NA(ndarray arr) -> bool:
+    """
+    Return True if NA present in arr, False otherwise
+    """
+    cdef:
+        Py_ssize_t i
+
+    assert arr.ndim == 1, "'arr' must be 1-D."
+
+    for i in range(len(arr)):
+        if arr[i] is C_NA:
+            return True
+
+    return False
+
+
 def maybe_indices_to_slice(ndarray[intp_t] indices, int max_len):
     cdef:
         Py_ssize_t i, n = len(indices)

This may be overkill, but fits with some other cython methods like

could live in _libs.missing?

have moved
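For readers who do not read Cython, the following is a rough pure-Python sketch of what the `has_NA` helper in the hunk above appears to do. The name `has_na_py` is hypothetical and not part of pandas, and `pd.NA` stands in for the Cython-level `C_NA` on the assumption that both refer to the same singleton.

```python
import numpy as np
import pandas as pd


def has_na_py(arr: np.ndarray) -> bool:
    """Hypothetical pure-Python stand-in for the Cython has_NA in the diff."""
    assert arr.ndim == 1, "'arr' must be 1-D."
    # Identity check, mirroring `arr[i] is C_NA`: only the pd.NA singleton
    # counts, so np.nan and None are not treated as NA here.
    return any(x is pd.NA for x in arr)


print(has_na_py(np.array([1, pd.NA], dtype=object)))
print(has_na_py(np.array([1.0, np.nan])))
```

The identity comparison avoids evaluating `pd.NA == x`, which itself returns `pd.NA` rather than a plain boolean.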
@@ -403,12 +403,18 @@ def isin(self, values) -> BooleanArray: # type: ignore[override]
 
         from pandas.core.arrays import BooleanArray
 
-        result = isin(self._data, values)
+        # algorithms.isin will eventually convert values to an ndarray, so no extra
+        # cost to doing it here first
+        values_arr = np.asarray(values)
+        result = isin(self._data, values_arr)
+
         if self._hasna:
-            if libmissing.NA in values:
-                result += self._mask
-            else:
-                result *= np.invert(self._mask)
+            values_have_NA = lib.has_NA(values_arr)
+
+            # For now, NA does not propagate so set result according to presence of NA,
+            # see https://github.com/pandas-dev/pandas/pull/38379 for some discussion
+            result[self._mask] = values_have_NA
+
         mask = np.zeros_like(self, dtype=bool)
         return BooleanArray(result, mask, copy=False)
 

This line should be equivalent to the deleted if/else AFAICT, but I think makes logic easier to follow
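The claim that the single assignment `result[self._mask] = values_have_NA` matches the deleted if/else can be checked on plain NumPy boolean arrays. This is a minimal sketch: `old_branch` and `new_assignment` are hypothetical helper names, and `result` and `mask` are made-up sample data standing in for the `isin` output and `self._mask`.

```python
import numpy as np


def old_branch(result, mask, values_have_na):
    """Deleted logic: OR in the mask if NA is present, else AND it out."""
    result = result.copy()
    if values_have_na:
        result += mask  # in-place + on boolean arrays is a logical OR
    else:
        result *= np.invert(mask)  # in-place * on boolean arrays is a logical AND
    return result


def new_assignment(result, mask, values_have_na):
    """New logic: masked positions take the value of the NA flag directly."""
    result = result.copy()
    result[mask] = values_have_na
    return result


# Made-up sample data: positions 2 and 3 are masked (missing).
result = np.array([True, False, False, True])
mask = np.array([False, False, True, True])

for flag in (True, False):
    assert np.array_equal(
        old_branch(result, mask, flag), new_assignment(result, mask, flag)
    )
```

In both versions the unmasked positions keep whatever `isin` computed, and only the masked positions are overwritten.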
ndarray[object]?

When I try that (or object[:]), end up getting errors like ValueError: Buffer dtype mismatch, expected 'Python object' but got 'double'

yah, this should only be called on object-dtype, in other cases we already know the result is False

Ah got it thanks, have changed
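The exchange above can be illustrated with a small sketch of the suggested guard: only scan for NA when the converted values have object dtype, because a float or integer ndarray cannot hold `pd.NA`. The function name `values_have_na` is hypothetical, the input is assumed to be a 1-D list-like, and this is not necessarily how the merged code is written.

```python
import numpy as np
import pandas as pd


def values_have_na(values) -> bool:
    """Hypothetical guard: skip the scan unless the array has object dtype."""
    values_arr = np.asarray(values)
    if values_arr.dtype != object:
        # A numeric ndarray cannot contain pd.NA, so the answer is known.
        return False
    return any(x is pd.NA for x in values_arr)


# Plain floats convert to float64, which an object-typed Cython buffer rejects.
print(np.asarray([1.0, 2.0]).dtype)
# A list containing pd.NA falls back to object dtype.
print(np.asarray([1, pd.NA]).dtype)
```

Checking the dtype first keeps the object-typed signature valid, because non-object arrays never reach the scan.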