TST: astyp via loc to int64 #36942

Closed

hardikpnsp
Contributor

@hardikpnsp hardikpnsp commented Oct 7, 2020

Adds a test for a special case: casting a column to the Int64 extension dtype with astype and assigning it back via loc when the data contains np.nan or pd.NA.
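For context, here is a minimal sketch of the conversion this test targets, assuming a pandas version that ships the nullable integer dtypes. The helper name `cast_column_to_nullable_int` is hypothetical and not part of the PR, and it uses plain column assignment as a baseline rather than the `.loc` form the test exercises.

```python
import numpy as np
import pandas as pd


def cast_column_to_nullable_int(df, column, dtype="Int64"):
    """Return a copy of df with `column` cast to a nullable integer dtype.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    # Plain column assignment replaces the column, so the new dtype is kept.
    out[column] = out[column].astype(dtype)
    return out


# A column holding only np.nan is inferred as float64 before the cast.
df_nan = pd.DataFrame({"a": ["foo"], "b": [np.nan]})
converted_nan = cast_column_to_nullable_int(df_nan, "b")

# A column holding only pd.NA starts out as object dtype.
df_na = pd.DataFrame({"a": ["foo"], "b": [pd.NA]})
converted_na = cast_column_to_nullable_int(df_na, "b")
```

In both cases the resulting column holds a single missing value in the requested nullable integer dtype, which is the outcome the test in this PR expects from the `.loc` form as well.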

@hardikpnsp hardikpnsp changed the title Test astyp via loc to int64 TST: astyp via loc to int64 Oct 9, 2020
Member

@arw2019 arw2019 left a comment

Thanks @hardikpnsp for the PR!

Looks good! Some small comments

    def test_astype_cast_via_loc_nan_int(self, val, dtype):
        # see GH#31861
        expected = DataFrame({"a": ["foo"], "b": integer_array([val], dtype=dtype)})
        result = DataFrame({"a": ["foo"], "b": [val]})
Member

Nitpick, but I'd call this df. It's not really the result yet, since the point of the test is to change it in the next line and then assert.

Contributor Author

Thanks for the input! I agree, it's better to call it df. I will try to fix these quickly.

    @pytest.mark.parametrize("dtype", ["Int64", "Int32", "Int16"])
    @pytest.mark.parametrize("val", [np.nan, NA])
    def test_astype_cast_via_loc_nan_int(self, val, dtype):
        # GH31861
Contributor

Can you also add the working cases from the OP (e.g. nan plus a value)?

Contributor Author

sure! let me try.

@@ -136,6 +137,15 @@ def test_astype_cast_nan_inf_int(self, val, dtype):
        with pytest.raises(ValueError, match=msg):
            df.astype(dtype)

    @pytest.mark.parametrize("dtype", ["Int64", "Int32", "Int16"])
    @pytest.mark.parametrize("val", [np.nan, NA])
Contributor

Use the nulls_fixture and any_nullable_int_dtype fixtures instead.

Contributor Author

@hardikpnsp hardikpnsp Oct 11, 2020

@jreback, with nulls_fixture the test case fails for pd.NaT because the column is inferred as datetime64. I think this is expected behavior, but I would like your opinion on it.

Code snippet for reference:

>>> df = pd.DataFrame({"a": ["foo"], "b": [pd.NaT]})
>>> df.dtypes
a            object
b    datetime64[ns]
dtype: object
>>> df.loc[:, "b"] = df.loc[:, "b"].astype('Int64')
TypeError: datetime64[ns] cannot be converted to an IntegerDtype
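The inference described above can be checked without attempting the cast. This is a small sketch; `column_dtype_for` is a hypothetical helper introduced for illustration, not pandas API.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def column_dtype_for(value):
    """Return the dtype pandas infers for a one-row column holding `value`."""
    df = pd.DataFrame({"a": ["foo"], "b": [value]})
    return df["b"].dtype


# pd.NaT alone makes the column datetime-like, which is why the cast
# to a nullable integer dtype is rejected in the snippet above.
nat_is_datetime = is_datetime64_any_dtype(column_dtype_for(pd.NaT))

# np.nan and pd.NA do not produce a datetime column.
nan_is_datetime = is_datetime64_any_dtype(column_dtype_for(np.nan))
na_is_datetime = is_datetime64_any_dtype(column_dtype_for(pd.NA))
```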

Contributor Author

@arw2019 @jreback, any updates on this one? I would appreciate some help and direction.

Member

Could use pytest.mark.skipif for the datetime case,
or test for the TypeError somehow, but I'm not sure that's necessary.
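One way the skip suggestion might look, as a hedged sketch rather than the code that was proposed for pandas. The `NULLS` list is a hypothetical stand-in for pandas' internal `nulls_fixture`, the test name is made up, `pd.array` replaces the internal `integer_array` helper, and the skip is done with `pytest.skip` inside the test body because the value to skip arrives as a parameter.

```python
import numpy as np
import pandas as pd
import pytest

# Hypothetical stand-in for nulls_fixture; the real fixture covers more values.
NULLS = [np.nan, pd.NA, pd.NaT]


@pytest.mark.parametrize("dtype", ["Int64", "Int32", "Int16"])
@pytest.mark.parametrize("null", NULLS)
def test_astype_nullable_int_from_null(null, dtype):
    if null is pd.NaT:
        # A column holding only NaT is inferred as datetime64, so the
        # cast to a nullable integer dtype is not expected to work.
        pytest.skip("NaT is inferred as datetime64")

    df = pd.DataFrame({"a": ["foo"], "b": [null]})
    # Plain column assignment is used here; the PR's test assigns via .loc.
    df["b"] = df["b"].astype(dtype)

    expected = pd.DataFrame({"a": ["foo"], "b": pd.array([pd.NA], dtype=dtype)})
    pd.testing.assert_frame_equal(df, expected)
```

Skipping inside the body keeps a single parametrized test; the alternative mentioned above, asserting the TypeError for the NaT case, would instead need a separate `pytest.raises` branch.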

@jreback jreback added the labels Indexing (related to indexing on series/frames, not to indexes themselves), NA - MaskedArrays (related to pd.NA and nullable extension arrays), and Testing (pandas testing functions or related to the test suite) Oct 10, 2020
@jreback jreback added this to the 1.2 milestone Oct 10, 2020
@hardikpnsp hardikpnsp closed this Oct 27, 2020
Member

@arw2019 arw2019 left a comment

Sorry @hardikpnsp for being quiet

Responded to the question below. If you feel like reopening and pushing the changes to the parametrization (use fixtures as suggested), this should be good to go in.

@@ -136,6 +137,15 @@ def test_astype_cast_nan_inf_int(self, val, dtype):
        with pytest.raises(ValueError, match=msg):
            df.astype(dtype)

    @pytest.mark.parametrize("dtype", ["Int64", "Int32", "Int16"])
    @pytest.mark.parametrize("val", [np.nan, NA])
Member

Could use pytest.mark.skipif for the datetime case,
or test for the TypeError somehow, but I'm not sure that's necessary.

Labels
Indexing (related to indexing on series/frames, not to indexes themselves), NA - MaskedArrays (related to pd.NA and nullable extension arrays), Testing (pandas testing functions or related to the test suite)
Development

Successfully merging this pull request may close these issues.

Assigned conversion via loc to Int64 fails under peculiar conditions
3 participants