TST: astyp via loc to int64 #36942
Thanks @hardikpnsp for the PR!
Looks good! Some small comments
def test_astype_cast_via_loc_nan_int(self, val, dtype):
    # see GH#31861
    expected = DataFrame({"a": ["foo"], "b": integer_array([val], dtype=dtype)})
    result = DataFrame({"a": ["foo"], "b": [val]})
Nitpick, but I'd call this `df`. It's not really the result yet, since the point of the test is to change it in the next line and then assert.
Thanks for the input! I agree, it's better to call it `df`. I will try to fix these quickly.
@pytest.mark.parametrize("dtype", ["Int64", "Int32", "Int16"])
@pytest.mark.parametrize("val", [np.nan, NA])
def test_astype_cast_via_loc_nan_int(self, val, dtype):
    # GH31861
can you add the working conditions as well (e.g. nan + a value) from the OP
sure! let me try.
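For reference, the working condition mentioned above (a NaN next to a real value) might look roughly like the sketch below. It is an illustration rather than the test from the PR: the column values and the name `converted` are assumptions, and the write-back through `loc` is left out because how that assignment treats dtypes has differed between pandas releases.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value next to one real value.
df = pd.DataFrame({"a": ["foo", "bar"], "b": [np.nan, 1.0]})

# Select the column through loc and cast it to the nullable integer dtype.
converted = df.loc[:, "b"].astype("Int64")

# The NaN becomes pd.NA and the float 1.0 becomes the integer 1.
expected = pd.Series(pd.array([pd.NA, 1], dtype="Int64"), name="b")
pd.testing.assert_series_equal(converted, expected)
```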
@@ -136,6 +137,15 @@ def test_astype_cast_nan_inf_int(self, val, dtype):
        with pytest.raises(ValueError, match=msg):
            df.astype(dtype)

    @pytest.mark.parametrize("dtype", ["Int64", "Int32", "Int16"])
    @pytest.mark.parametrize("val", [np.nan, NA])
Use the `nulls_fixture` and `any_nullable_int_dtype` fixtures instead.
@jreback, with `nulls_fixture` the test case fails for `pd.NaT` because the column is inferred as `datetime64`. I think this is expected behavior, but I would like your opinion on it.
Code snippet for reference:
>>> df = pd.DataFrame({"a": ["foo"], "b": [pd.NaT]})
>>> df.dtypes
a object
b datetime64[ns]
dtype: object
>>> df.loc[:, "b"] = df.loc[:, "b"].astype('Int64')
TypeError: datetime64[ns] cannot be converted to an IntegerDtype
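To make the difference between the null values concrete, here is a small sketch that checks which of them survive the cast. The helper name `casts_to_int64` is hypothetical, and catching `ValueError` alongside `TypeError` is a defensive assumption in case the exception type varies by pandas version.

```python
import numpy as np
import pandas as pd


def casts_to_int64(null_value):
    """Return True if a column holding only `null_value` casts to Int64."""
    df = pd.DataFrame({"a": ["foo"], "b": [null_value]})
    try:
        df["b"].astype("Int64")
    except (TypeError, ValueError):
        # The snippet above shows a TypeError for the datetime-like column.
        return False
    return True


for null_value in (np.nan, pd.NA, pd.NaT):
    print(repr(null_value), casts_to_int64(null_value))
```

Only `pd.NaT` changes the inferred column dtype to a datetime type, which is why it is the odd one out.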
Could use `pytest.skipif` for the datetime case or test for the `TypeError` somehow, but I'm not sure that's necessary.
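One way the second option could look is sketched below. The lists `NULLS` and `NULLABLE_INT_DTYPES` are stand-ins invented for this example (they are not the contents of the shared pandas fixtures), the test name is hypothetical, and the sketch checks the plain `astype` call rather than the assignment through `loc`.

```python
import numpy as np
import pandas as pd
import pytest

# Stand-ins for the shared fixtures (assumed subsets, not pandas' own lists).
NULLS = [np.nan, pd.NA, pd.NaT]
NULLABLE_INT_DTYPES = ["Int64", "Int32", "Int16"]


@pytest.mark.parametrize("dtype", NULLABLE_INT_DTYPES)
@pytest.mark.parametrize("null", NULLS)
def test_astype_null_to_nullable_int(null, dtype):
    df = pd.DataFrame({"a": ["foo"], "b": [null]})

    if null is pd.NaT:
        # NaT makes the column datetime-like, so the cast is expected to fail.
        with pytest.raises(TypeError):
            df["b"].astype(dtype)
        return

    result = df["b"].astype(dtype)
    expected = pd.Series(pd.array([pd.NA], dtype=dtype), name="b")
    pd.testing.assert_series_equal(result, expected)
```

Asserting the `TypeError` keeps the datetime case visible in the test run, whereas skipping it would hide the behavior entirely.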
Sorry @hardikpnsp for being quiet.
I responded to the question below. If you feel like reopening and pushing the changes to the parametrization (use the fixtures as suggested), this should be good to go in.
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Adds a test for a special case of `astype` via `loc` to the extension type `Int64` when the data contains `np.nan` or `pd.NA`.