Commit 289cd6d

josh-howes authored and jreback committed
BUG: fix str.contains for series containing only nan values
closes pandas-dev#14171

Author: Josh Howes <[email protected]>

Closes pandas-dev#14182 from josh-howes/bugfix/14171-series-str-contains-only-nan-values and squashes the following commits:

c7e9721 [Josh Howes] BUG: fix str.contains for series containing only nan values
1 parent 939a221 commit 289cd6d

4 files changed: +24 -2 lines changed

doc/source/whatsnew/v0.19.0.txt

+1 -1
@@ -1552,7 +1552,7 @@ Bug Fixes
 - Bug in ``.to_excel()`` when DataFrame contains a MultiIndex which contains a label with a NaN value (:issue:`13511`)
 - Bug in invalid frequency offset string like "D1", "-2-3H" may not raise ``ValueError (:issue:`13930`)
 - Bug in ``concat`` and ``groupby`` for hierarchical frames with ``RangeIndex`` levels (:issue:`13542`).
-
+- Bug in ``Series.str.contains()`` for Series containing only ``NaN`` values of ``object`` dtype (:issue:`14171`)
 - Bug in ``agg()`` function on groupby dataframe changes dtype of ``datetime64[ns]`` column to ``float64`` (:issue:`12821`)
 - Bug in using NumPy ufunc with ``PeriodIndex`` to add or subtract integer raise ``IncompatibleFrequency``. Note that using standard operator like ``+`` or ``-`` is recommended, because standard operators use more efficient path (:issue:`13980`)
 - Bug in operations on ``NaT`` returning ``float`` instead of ``datetime64[ns]`` (:issue:`12941`)

doc/source/whatsnew/v0.20.0.txt

+1
@@ -81,3 +81,4 @@ Performance Improvements
 
 Bug Fixes
 ~~~~~~~~~
+

pandas/core/strings.py

+2 -1
@@ -165,7 +165,8 @@ def _map(f, arr, na_mask=False, na_value=np.nan, dtype=object):
     if na_mask:
         mask = isnull(arr)
         try:
-            result = lib.map_infer_mask(arr, f, mask.view(np.uint8))
+            convert = not all(mask)
+            result = lib.map_infer_mask(arr, f, mask.view(np.uint8), convert)
         except (TypeError, AttributeError):
 
             def g(x):
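
The reasoning behind `convert = not all(mask)` can be illustrated with a simplified stand-in. This is only a sketch: `masked_map` is a hypothetical helper, not pandas' `lib.map_infer_mask`, and it assumes that conversion narrows an all-float object array to `float64`. Under that assumption, an all-NaN input loses its object dtype before the `na` value is filled in, so the fill value gets cast to a float.

```python
import numpy as np


def masked_map(values, func, mask, convert=True):
    """Hypothetical stand-in for a masked element-wise map (not pandas' API).

    Applies func where mask is False and leaves masked entries untouched.
    With convert=True, a result holding only floats is narrowed to float64.
    """
    result = np.empty(len(values), dtype=object)
    for i, (value, skip) in enumerate(zip(values, mask)):
        result[i] = value if skip else func(value)
    if convert and all(isinstance(x, float) for x in result):
        return result.astype(np.float64)
    return result


values = np.array([np.nan, np.nan, np.nan], dtype=object)
mask = np.array([True, True, True])  # every entry is missing

# Old behaviour in this sketch: conversion always on, dtype becomes float64.
converted = masked_map(values, lambda x: "foo" in x, mask, convert=True)
# Patched behaviour in this sketch: skip conversion when everything is masked.
kept = masked_map(values, lambda x: "foo" in x, mask, convert=not all(mask))

# Filling the masked slots with na=False afterwards:
filled_converted = converted.copy()
np.putmask(filled_converted, mask, False)  # False is cast to 0.0 in a float array
filled_kept = kept.copy()
np.putmask(filled_kept, mask, False)  # stays a boolean in an object array

print(filled_converted.dtype, filled_kept.dtype)
```

Keeping the object dtype until after the fill leaves a later conversion step free to infer a boolean result from the filled values.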

pandas/tests/test_strings.py

+20
@@ -2439,6 +2439,26 @@ def test_more_contains(self):
                            True, False, False])
         assert_series_equal(result, expected)
 
+    def test_contains_nan(self):
+        # PR #14171
+        s = Series([np.nan, np.nan, np.nan], dtype=np.object_)
+
+        result = s.str.contains('foo', na=False)
+        expected = Series([False, False, False], dtype=np.bool_)
+        assert_series_equal(result, expected)
+
+        result = s.str.contains('foo', na=True)
+        expected = Series([True, True, True], dtype=np.bool_)
+        assert_series_equal(result, expected)
+
+        result = s.str.contains('foo', na="foo")
+        expected = Series(["foo", "foo", "foo"], dtype=np.object_)
+        assert_series_equal(result, expected)
+
+        result = s.str.contains('foo')
+        expected = Series([np.nan, np.nan, np.nan], dtype=np.object_)
+        assert_series_equal(result, expected)
+
     def test_more_replace(self):
         # PR #1179
         s = Series(['A', 'B', 'C', 'Aaba', 'Baca', '', NA, 'CABA',
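
The boolean cases covered by `test_contains_nan` can be reproduced interactively. The snippet below is a minimal sketch that assumes a pandas release containing this fix. The names `filled_false` and `filled_true` are illustrative only and do not appear in the commit.

```python
import numpy as np
import pandas as pd

# An object-dtype Series holding nothing but NaN, as in test_contains_nan.
s = pd.Series([np.nan, np.nan, np.nan], dtype=object)

# With a boolean fill value, every missing entry takes that value.
filled_false = s.str.contains("foo", na=False)
filled_true = s.str.contains("foo", na=True)

print(filled_false.tolist(), filled_false.dtype)
print(filled_true.tolist(), filled_true.dtype)
```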
