-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
str.replace('.','') should replace every character? (fix) #24809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
c984582
67b0870
340ae89
5a4e131
cf8dc79
924ecc8
97bf73a
da16172
93a0715
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1008,6 +1008,21 @@ def test_replace(self): | |
values = klass(data) | ||
pytest.raises(TypeError, values.str.replace, 'a', repl) | ||
|
||
# GH 24804 | ||
def test_replace_single_pattern(self): | ||
values = Series(['abc', '123']) | ||
|
||
result = values.str.replace('.', 'foo', regex=True) | ||
expected = Series(['foofoofoo', 'foofoofoo']) | ||
tm.assert_series_equal(result, expected) | ||
|
||
def test_replace_without_specifying_regex_parameter(self): | ||
values = Series(['a.c']) | ||
|
||
result = values.str.replace('.', 'b') | ||
expected = Series(['abc']) | ||
tm.assert_series_equal(result, expected) | ||
|
||
def test_replace_callable(self): | ||
# GH 15055 | ||
values = Series(['fooBAD__barBAD', NA]) | ||
|
@@ -2924,7 +2939,7 @@ def test_pipe_failures(self): | |
|
||
tm.assert_series_equal(result, exp) | ||
|
||
result = s.str.replace('|', ' ') | ||
result = s.str.replace('|', ' ', regex=False) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm Ok. I think this is correct but arguably an API breaking change so make sure we make note of that in the whatsnew There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. how is this an api change? regex=True is the default There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Was just thinking of this particular instance and any where a user was passing in a single character that may have special meaning with a regex. This previously would directly replace a pipe but now requires Being extra conservative but not tied to the request then if you feel its over communicating. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Exactly, but if you think this documentation is not necessary, let me know and I can change. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm. This is a potentially disruptive change... Can we:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK I agree - that's probably the best go-forward path, save the first point which I don't understand. @alarangeiras can you raise a FutureWarning here instead? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Changing the default If the user passes If the user passes We'll use regex=None to see if the user is explicit or not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not just warn when the length of the pattern is 1 and There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Then I think there would be no way to have In [5]: pd.Series(['a.c']).str.replace('.', 'b')
# Warning: Interpreting '.' as a literal, not a regex... The default will change in the future.
Out[5]:
0 abc
dtype: object # no warning
In [5]: pd.Series(['a.c']).str.replace('.', 'b', regex=True)
Out[5]:
0 bbb
dtype: object unless I"m missing something. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Following this line of reasoning, from what I understand, every bug found should issue a warning of future adjustment? |
||
exp = Series(['A B C']) | ||
|
||
tm.assert_series_equal(result, exp) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not needed, this is a simple bug fix