Commit 75ff6b0

BUG: fix get_indexer_non_unique() with 'object' targets with NaNs (#44482)
numpy.searchsorted() does not handle NaNs in 'object' arrays as expected (numpy/numpy#15499). Because of this, NaNs cannot be located by binary search, so binary search is used only for targets without NaNs.
1 parent 700be61 commit 75ff6b0
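
The failure mode described in the commit message can be reproduced without NumPy. The sketch below uses the standard-library `bisect` module as a stand-in for `searchsorted` on an 'object' array; `bisect_range` and `has_nan` are hypothetical helper names, and the assumption that matches are delimited by a left/right bisection pair is an illustration, not a statement about the pandas internals. Because every ordering comparison against NaN is false, the left search stops at the start and the right search runs to the end, so NaN appears to match every position.

```python
import bisect
import math


def bisect_range(sorted_values, target):
    """Return (start, end); sorted_values[start:end] is assumed to hold the matches."""
    start = bisect.bisect_left(sorted_values, target)
    end = bisect.bisect_right(sorted_values, target)
    return start, end


def has_nan(targets):
    """True if any target is a float NaN (hypothetical guard helper)."""
    return any(isinstance(t, float) and math.isnan(t) for t in targets)


values = [1.0, 2.0]

# An ordinary target yields a tight range around its real position.
print(bisect_range(values, 1))             # (0, 1)

# NaN compares false against everything, so the range spans the whole list.
print(bisect_range(values, float("nan")))  # (0, 2)

# Guard in the spirit of the fix: skip binary search when a NaN is present.
targets = [1, float("nan")]
use_binary_search = len(targets) < 5 and not has_nan(targets)
print(use_binary_search)                   # False
```

With the guard in place, a lookup containing NaN would fall through to whatever non-bisecting path the caller provides.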

3 files changed

+18
-2
lines changed

doc/source/whatsnew/v1.4.0.rst

+3 -1
@@ -578,7 +578,7 @@ Strings
 
 Interval
 ^^^^^^^^
-- Bug in :meth:`IntervalIndex.get_indexer_non_unique` returning boolean mask instead of array of integers for a non unique and non monotonic index (:issue:`44084`)
+-
 -
 
 Indexing
@@ -608,6 +608,8 @@ Indexing
 - Bug in :meth:`Series.__setitem__` with a boolean mask indexer setting a listlike value of length 1 incorrectly broadcasting that value (:issue:`44265`)
 - Bug in :meth:`DataFrame.loc.__setitem__` and :meth:`DataFrame.iloc.__setitem__` with mixed dtypes sometimes failing to operate in-place (:issue:`44345`)
 - Bug in :meth:`DataFrame.loc.__getitem__` incorrectly raising ``KeyError`` when selecting a single column with a boolean key (:issue:`44322`).
+- Bug in :meth:`IntervalIndex.get_indexer_non_unique` returning boolean mask instead of array of integers for a non unique and non monotonic index (:issue:`44084`)
+- Bug in :meth:`IntervalIndex.get_indexer_non_unique` not handling targets of ``dtype`` 'object' with NaNs correctly (:issue:`44482`)
 
 Missing
 ^^^^^^^

pandas/_libs/index.pyx

+6 -1
@@ -338,7 +338,12 @@ cdef class IndexEngine:
         missing = np.empty(n_t, dtype=np.intp)
 
         # map each starget to its position in the index
-        if stargets and len(stargets) < 5 and self.is_monotonic_increasing:
+        if (
+                stargets and
+                len(stargets) < 5 and
+                np.nan not in stargets and
+                self.is_monotonic_increasing
+        ):
             # if there are few enough stargets and the index is monotonically
             # increasing, then use binary search for each starget
             remaining_stargets = set()
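
A note on the new `np.nan not in stargets` condition: Python container membership checks identity before equality, so the test finds the `np.nan` singleton even though `nan == nan` is false. The sketch below illustrates this with plain floats and assumes `stargets` is a Python set; `contains_any_nan` is a hypothetical helper, not part of pandas. It also shows that a distinct NaN object would not be detected by a membership test alone. Whether that case can arise depends on how the targets are constructed, which this diff does not show.

```python
nan = float("nan")        # stands in for the np.nan singleton
other_nan = float("nan")  # a distinct NaN object with the same value

stargets = {1, nan}

# Membership succeeds for the very same object (identity check).
same_object_found = nan in stargets
print(same_object_found)       # True

# A different NaN object is not found, because NaN != NaN.
distinct_object_found = other_nan in stargets
print(distinct_object_found)   # False


def contains_any_nan(items):
    """Value-based NaN check: a float is NaN exactly when it differs from itself."""
    return any(isinstance(x, float) and x != x for x in items)


print(contains_any_nan({1, other_nan}))  # True
```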

pandas/tests/indexes/test_indexing.py

+9
@@ -332,3 +332,12 @@ def test_get_indexer_non_unique_multiple_nans(idx, target, expected):
     axis = Index(idx)
     actual = axis.get_indexer_for(target)
     tm.assert_numpy_array_equal(actual, expected)
+
+
+def test_get_indexer_non_unique_nans_in_object_dtype_target():
+    idx = Index([1.0, 2.0])
+    target = Index([1, np.nan], dtype="object")
+
+    result_idx, result_missing = idx.get_indexer_non_unique(target)
+    tm.assert_numpy_array_equal(result_idx, np.array([0, -1], dtype=np.intp))
+    tm.assert_numpy_array_equal(result_missing, np.array([1], dtype=np.intp))
