BUG: fixed .str.contains(..., na=False) for categorical series #22170
Changes from 31 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -543,10 +543,26 @@ def test_contains(self): | |
assert result.dtype == np.bool_ | ||
tm.assert_numpy_array_equal(result, expected) | ||
|
||
# na | ||
values = Series(['om', 'foo', np.nan]) | ||
res = values.str.contains('foo', na="foo") | ||
assert res.loc[2] == "foo" | ||
# na for category | ||
|
||
values = Series(["a", "b", "c", "a", np.nan], dtype="category") | ||
result = values.str.contains('a', na=True).astype(object) | ||
Review comment: Why the |
||
expected = Series([True, False, False, True, True], dtype=np.object_) | ||
assert isinstance(result, Series) | ||
Review comment: This can be removed. The assert_series_equal checks that. |
||
tm.assert_series_equal(result, expected) | ||
|
||
result = values.str.contains('a', na=False).astype(object) | ||
expected = Series([True, False, False, True, False], dtype=np.object_) | ||
tm.assert_series_equal(result, expected) | ||
|
||
# na for objects | ||
values = Series(["a", "b", "c", "a", np.nan]) | ||
result = values.str.contains('a', na=True).astype(object) | ||
Review comment: Same comment for the astype object. |
||
expected = Series([True, False, False, True, True], dtype=np.object_) | ||
tm.assert_series_equal(result, expected) | ||
|
||
result = values.str.contains('a', na=False).astype(object) | ||
expected = Series([True, False, False, True, False], dtype=np.object_) | ||
tm.assert_series_equal(result, expected) | ||
|
||
def test_startswith(self): | ||
values = Series(['om', NA, 'foo_nom', 'nom', 'bar_foo', NA, 'foo']) | ||
|
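For readers who want to try the behaviour the new tests assert, here is a minimal standalone sketch. It assumes a pandas version that already includes this fix; the variable names are illustrative and not taken from the PR. It compares the results as plain lists, so it makes no claim about the result dtype.

```python
import numpy as np
import pandas as pd

# Same data as the tests above: four strings and one missing value.
data = ["a", "b", "c", "a", np.nan]
cat_values = pd.Series(data, dtype="category")  # categorical Series
obj_values = pd.Series(data)                    # plain Series of strings

# `na` sets the result for missing entries, for both kinds of Series.
cat_na_true = cat_values.str.contains("a", na=True).tolist()
cat_na_false = cat_values.str.contains("a", na=False).tolist()
obj_na_true = obj_values.str.contains("a", na=True).tolist()
obj_na_false = obj_values.str.contains("a", na=False).tolist()

# The categorical results should match the non-categorical ones.
print(cat_na_true == obj_na_true)
print(cat_na_false == obj_na_false)
```

Only the last element should differ between the `na=True` and `na=False` results, since it is the only missing value.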
Can you use double backticks on Series? Also, can you use the actual reference here, e.g.
:func:`Series.str.contains`
@jreback Thanks, I've updated it. Please let me know if there's anything else you'd like to change.
@jreback Can you please review and merge this PR?