-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: pd.Series.replace does not preserve the original dtype #33622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
9ef9f9e
8be0127
078aca1
571ab8a
6ae7a09
b227024
a62c896
1a9fa40
b777345
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1005,8 +1005,15 @@ def coerce_to_target_dtype(self, other): | |
we can also safely try to coerce to the same dtype | ||
and will receive the same block | ||
""" | ||
# if we cannot then coerce to object | ||
dtype, _ = infer_dtype_from(other, pandas_dtype=True) | ||
if isinstance(other, str): | ||
dtype = "string" | ||
if is_dtype_equal(self.dtype, dtype): | ||
return self | ||
else: | ||
dtype = np.object_ | ||
else: | ||
# if we cannot then coerce to object | ||
dtype, _ = infer_dtype_from(other, pandas_dtype=True) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tried to move into infer_dtype_from(), but a lot of tests fail. So, I didn't move. |
||
|
||
if is_dtype_equal(self.dtype, dtype): | ||
return self | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -254,7 +254,7 @@ def test_replace2(self): | |
def test_replace_with_dictlike_and_string_dtype(self): | ||
# GH 32621 | ||
s = pd.Series(["one", "two", np.nan], dtype="string") | ||
expected = pd.Series(["1", "2", np.nan]) | ||
expected = pd.Series(["1", "2", np.nan], dtype="string") | ||
result = s.replace({"one": "1", "two": "2"}) | ||
tm.assert_series_equal(expected, result) | ||
|
||
|
@@ -406,3 +406,20 @@ def test_replace_only_one_dictlike_arg(self): | |
msg = "Series.replace cannot use dict-value and non-None to_replace" | ||
with pytest.raises(ValueError, match=msg): | ||
ser.replace(to_replace, value) | ||
|
||
@pytest.mark.parametrize( | ||
"series, to_replace, expected", | ||
[ | ||
( | ||
pd.Series(["one", "two"], dtype="string"), | ||
{"one": "1", "two": "2"}, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you add a bunch more tests which can hit the other EA types, e.g. Intervval, Period, Datetime w/tz There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added test cases. |
||
"string", | ||
), | ||
(pd.Series([1, 2], dtype="int64"), {1: 10, 2: 20}, "int64"), | ||
(pd.Series([True, False], dtype="bool"), {True: False}, "bool"), | ||
], | ||
) | ||
def test_replace_dtype(self, series, to_replace, expected): | ||
kotamatsuoka marked this conversation as resolved.
Show resolved
Hide resolved
|
||
# GH 33484 | ||
result = str(series.replace(to_replace).dtype) | ||
assert expected == result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please don't hard-code things like this, instead you can try to teach infer_dtype_from to do this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry...I removed hard-code.