TST: Inconsistent behavior of .replace() in Int64 series with NA #38693

Merged
merged 13 commits into from
Dec 28, 2020

Conversation

@ftrihardjo (Contributor) commented Dec 25, 2020

@MarcoGorelli self-requested a review December 25, 2020 11:03
@MarcoGorelli (Member) left a comment

Thanks @ftrihardjo

Comment on lines 212 to 213
    def test_replace_int_with_na(self, dtype):
        result = pd.Series([0, None]).astype(dtype)
Member

can you put the issue number as a comment here? see some other tests for an example

    @pytest.mark.parametrize('dtype', ['int8', 'int16', 'int32', 'int64'])
    def test_replace_int_with_na(self, dtype):
        result = pd.Series([0, None]).astype(dtype)
        result.replace(0, pd.NA)
Member

I think you'll want

result = result.replace(0, pd.NA)
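
To illustrate the point of this comment: `Series.replace` returns a new Series by default, so a bare call whose return value is ignored leaves the original unchanged. The snippet below is a minimal sketch, assuming a recent pandas release; the variable names are illustrative and not taken from the PR.

```python
import pandas as pd

original = pd.Series([0, 1], dtype="Int64")

# A bare call discards the returned Series; `original` still contains the 0.
original.replace(0, pd.NA)
print(original.isna().tolist())

# Binding the return value keeps the replaced data.
replaced = original.replace(0, pd.NA)
print(replaced.isna().tolist())
```

This is why the test needs either to rebind `result` or to chain `.replace(...)` onto the constructor call.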

        result = pd.Series([0, None]).astype(dtype)
        result.replace(0, pd.NA)
        expected = pd.Series([0, None]).astype(dtype)
        expected.fillna(0).replace(0, pd.NA)
Member

why is this line needed? Just construct expected directly

@jreback changed the title from "pandas-dev issue #38267" to "BUG: Inconsistent behavior of .replace() in Int64 series with <NA>." Dec 25, 2020
@ftrihardjo (Contributor, Author) commented Dec 26, 2020 via email

    def test_replace_int_with_na(self, dtype):
        # GH 38267
        result = pd.Series([0, None]).astype(dtype).replace(0, pd.NA)
        expected = pd.Series([0, None]).astype(dtype).fillna(0).replace(0, pd.NA)
Member

Is it possible to construct expected directly (e.g. expected = pd.Series([pd.NA, pd.NA]))?

@MarcoGorelli self-requested a review December 26, 2020 09:10
@@ -208,6 +208,13 @@ def test_replace_with_dict_with_bool_keys(self):
        expected = pd.Series(["yes", False, "yes"])
        tm.assert_series_equal(result, expected)

    @pytest.mark.parametrize("dtype", ["int8", "int16", "int32", "int64"])
Member

the types need to be capitalised, as in the issue, e.g. Int64 not int64
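
The distinction matters because the capitalised names refer to pandas' nullable extension dtypes, while the lowercase names are plain NumPy dtypes that cannot represent a missing value. A minimal sketch, assuming a recent pandas release (the variable names are illustrative):

```python
import pandas as pd

values = [0, None]

# "Int64" (capital I) is the nullable extension dtype: None is stored as pd.NA.
nullable = pd.Series(values, dtype="Int64")
print(nullable.dtype, nullable.isna().tolist())

# "int64" (lowercase) is the NumPy dtype; casting a missing value to it fails.
try:
    pd.Series(values).astype("int64")
    lowercase_cast_failed = False
except (ValueError, TypeError) as exc:
    lowercase_cast_failed = True
    print(type(exc).__name__)
```

With the lowercase names the test would fail at the `astype` call, before `.replace()` is ever exercised.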

@@ -208,6 +208,13 @@ def test_replace_with_dict_with_bool_keys(self):
        expected = pd.Series(["yes", False, "yes"])
        tm.assert_series_equal(result, expected)

    @pytest.mark.parametrize("dtype", ["Int8", "Int16", "Int32", "Int64"])
    def test_replace_int_with_na(self, dtype):
Member

should use any_nullable_int_dtype fixture here
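
For readers unfamiliar with the pattern: a parametrized fixture runs a test once per dtype without repeating the list in every `parametrize` decorator. The sketch below is a stand-in written for illustration, not the fixture from pandas' own test configuration. The names `NULLABLE_INT_DTYPES` and `nullable_int_dtype` are hypothetical, and the list of dtype names is an assumption about what such a fixture would cover.

```python
import pandas as pd
import pytest

# Assumed coverage: signed and unsigned nullable integer dtype names.
NULLABLE_INT_DTYPES = [
    "UInt8", "UInt16", "UInt32", "UInt64",
    "Int8", "Int16", "Int32", "Int64",
]


@pytest.fixture(params=NULLABLE_INT_DTYPES)
def nullable_int_dtype(request):
    """Hypothetical stand-in for the any_nullable_int_dtype fixture."""
    return request.param


def test_constructs_with_na(nullable_int_dtype):
    # pytest calls this once per entry in NULLABLE_INT_DTYPES.
    ser = pd.Series([0, None], dtype=nullable_int_dtype)
    assert ser.isna().tolist() == [False, True]
```

A test that requests the fixture by name takes the dtype as an argument and needs no decorator of its own.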

    @pytest.mark.parametrize("dtype", ["Int8", "Int16", "Int32", "Int64"])
    def test_replace_int_with_na(self, dtype):
        # GH 38267
        result = pd.Series([0, None]).astype(dtype).replace(0, pd.NA)
Member

pass dtype directly to the Series constructor
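
As a rough illustration of the suggestion: building the Series without a dtype first infers `float64` (the `None` becomes `NaN`), and `astype` then converts that intermediate. Passing `dtype=` to the constructor skips the intermediate step. This is a sketch assuming a recent pandas release; the variable names are illustrative.

```python
import pandas as pd

# Without a dtype, None becomes NaN and the Series is inferred as float64.
intermediate = pd.Series([0, None])
print(intermediate.dtype)  # float64

# Two routes to the same nullable-integer Series.
via_astype = intermediate.astype("Int64")
direct = pd.Series([0, None], dtype="Int64")

# Both hold [0, <NA>] with dtype Int64.
pd.testing.assert_series_equal(via_astype, direct)
```

The end result is the same here, but the constructor form is shorter and states the intended dtype up front.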

@@ -208,6 +208,13 @@ def test_replace_with_dict_with_bool_keys(self):
        expected = pd.Series(["yes", False, "yes"])
        tm.assert_series_equal(result, expected)

    @pytest.mark.parametrize("dtype", ["Int8", "Int16", "Int32", "Int64"])
    def test_replace_int_with_na(self, dtype):
Member

I'd call it test_replace_nullable_int_with_na or test_replace_Int_with_na

    def test_replace_int_with_na(self, dtype):
        # GH 38267
        result = pd.Series([0, None]).astype(dtype).replace(0, pd.NA)
        expected = pd.Series([pd.NA, pd.NA])
Member

you also need to pass dtype here else (looking at CI output) this casts to object. That's why currently this test doesn't actually pass
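
The dtype mismatch described here can be reproduced in isolation. The sketch below assumes a recent pandas release and uses the public `pandas.testing.assert_series_equal` rather than the internal `tm` alias; the variable names are illustrative.

```python
import pandas as pd

dtype = "Int64"
result = pd.Series([0, None], dtype=dtype).replace(0, pd.NA)

# A Series of pd.NA values with no dtype given is inferred as object.
untyped_expected = pd.Series([pd.NA, pd.NA])
print(untyped_expected.dtype)

# The comparison checks dtypes by default, so Int64 vs. object fails.
try:
    pd.testing.assert_series_equal(result, untyped_expected)
    dtype_mismatch = False
except AssertionError:
    dtype_mismatch = True

# Passing the dtype to `expected` makes the comparison succeed.
typed_expected = pd.Series([pd.NA, pd.NA], dtype=dtype)
pd.testing.assert_series_equal(result, typed_expected)
```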

@MarcoGorelli (Member)

Thanks @ftrihardjo, looks like just #38693 (comment) still needs to be addressed

@jreback added the Missing-data, NA - MaskedArrays, and Testing labels Dec 27, 2020
@jreback added this to the 1.3 milestone Dec 27, 2020
    @pytest.mark.parametrize("dtype", ["Int8", "Int16", "Int32", "Int64"])
    def test_replace_Int_with_na(self, dtype):
        # GH 38267
        result = pd.Series([0, None], dtype=dtype).replace(0, pd.NA)
Contributor

can you add the working example as well e.g. 0,1 as another case.
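
One way to read this request is to check both inputs side by side: the case from the issue, where a value is already missing, and a case with no missing values. The following is a sketch of that idea, assuming a recent pandas release. It is not the test that was merged, and the variable names are illustrative.

```python
import pandas as pd

dtype = "Int64"

# Case from the issue: the Series already contains a missing value.
with_na = pd.Series([0, None], dtype=dtype).replace(0, pd.NA)
expected_with_na = pd.Series([pd.NA, pd.NA], dtype=dtype)
pd.testing.assert_series_equal(with_na, expected_with_na)

# "Working" case: no missing values before the replacement.
without_na = pd.Series([0, 1], dtype=dtype).replace(0, pd.NA)
expected_without_na = pd.Series([pd.NA, 1], dtype=dtype)
pd.testing.assert_series_equal(without_na, expected_without_na)
```

Covering both inputs guards against a fix that handles one path while regressing the other.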

@@ -208,6 +208,13 @@ def test_replace_with_dict_with_bool_keys(self):
        expected = pd.Series(["yes", False, "yes"])
        tm.assert_series_equal(result, expected)

    @pytest.mark.parametrize("dtype", ["Int8", "Int16", "Int32", "Int64"])
Contributor

can you use the fixture: any_nullable_int_dtype instead here.

pep8speaks commented Dec 27, 2020

Hello @ftrihardjo! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-12-28 02:00:00 UTC

@jorisvandenbossche changed the title from "BUG: Inconsistent behavior of .replace() in Int64 series with <NA>." to "TST: Inconsistent behavior of .replace() in Int64 series with NA" Dec 28, 2020
@MarcoGorelli self-requested a review December 28, 2020 10:28
@jreback merged commit 0b7ce00 into pandas-dev:master Dec 28, 2020
@jreback (Contributor) commented Dec 28, 2020

thanks @ftrihardjo

Labels
Missing-data: np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
NA - MaskedArrays: Related to pd.NA and nullable extension arrays
Testing: pandas testing functions or related to the test suite
Development

Successfully merging this pull request may close these issues.

BUG: Inconsistent behavior of .replace() in Int64 series with <NA>.
5 participants