BUG: _validate_setitem_value fails to raise for PandasArray #51044 #54575

40 changes: 40 additions & 0 deletions pandas/core/arrays/numpy_.py
@@ -507,6 +507,46 @@ def to_numpy(

        return result

    def _validate_setitem_value(self, value):
        if type(value) == int:
            if (
                self.dtype == NumpyEADtype("int64")
                or self.dtype == NumpyEADtype("float64")
                or self.dtype == NumpyEADtype("uint16")
                or self.dtype == NumpyEADtype("object")
                or self.dtype is None
            ):
                return value
        elif type(value) == float:
            if (
                self.dtype == NumpyEADtype("float64")
                or self.dtype == NumpyEADtype("object")
                or self.dtype is None
            ):
                return value
        elif type(value) == str:
            if (
                self.dtype == NumpyEADtype("str")
                or self.dtype == NumpyEADtype("object")
                or self.dtype is None
                or self.dtype == NumpyEADtype("U32")
            ):
                return value
Member

These checks are correct for a scalar value. I'm not sure whether _validate_setitem_value also needs to handle list-like values.

Contributor Author

The TypeError is still raised when list-like values are passed to _validate_setitem_value, even in cases that are supposed to work, and this causes tests to fail. See test_ufunc_unary in pandas/tests/arrays/numpy_/test_numpy.py.
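One way to reconcile the two comments above is to validate a scalar directly and a list-like element by element. The following is a minimal, dependency-free sketch of that idea, not the pandas implementation: `is_list_like`, `validate_setitem_value`, and the `allowed_types` parameter are hypothetical names introduced only for illustration.

```python
def is_list_like(obj):
    """Rough stand-in: any iterable that is not a str or bytes."""
    return hasattr(obj, "__iter__") and not isinstance(obj, (str, bytes))


def validate_setitem_value(value, allowed_types):
    """Return value unchanged if it, or each of its elements, has an allowed type."""
    # Assumption: list-likes are materialized, so one-shot iterators are not supported.
    items = list(value) if is_list_like(value) else [value]
    for item in items:
        # Exact type match, mirroring the `type(value) == int` style used in the diff.
        if type(item) not in allowed_types:
            raise TypeError(
                f"Invalid value {item!r} of type {type(item).__name__}"
            )
    return value


# An integer-backed array would accept ints, whether scalar or list-like.
validate_setitem_value(3, (int,))
validate_setitem_value([1, 2, 3], (int,))
```

With this shape, a list containing a single incompatible element raises, while a fully compatible list passes through unchanged.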

        elif NumpyEADtype(type(value)) == NumpyEADtype(self.dtype) or NumpyEADtype(
            type(value)
        ) == NumpyEADtype(type(value)):
            return value
        else:
            if self.dtype == NumpyEADtype("object") or self.dtype is None:
                return value
            if NumpyEADtype(type(value)) != self.dtype:

I think you can just add NumpyEADtype(type(value)) == self.dtype as an extra "or" condition on the existing NumpyEADtype(type(value)) == NumpyEADtype(self.dtype) check.

Then you can just raise a TypeError without any further checks.
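The suggestion amounts to a single compatibility check followed by an unconditional raise. Below is a minimal sketch of that control flow, assuming dtypes are modeled as plain strings and using a hypothetical `ACCEPTED` lookup table; neither is pandas API, and the accepted combinations simply mirror the branches in the diff.

```python
# Hypothetical table: exact Python scalar types each dtype name accepts.
ACCEPTED = {
    "int64": (int,),
    "float64": (int, float),
    "str": (str,),
    "object": (object,),  # sentinel meaning "accept anything"
}


def validate_scalar(value, dtype_name):
    """Return value if compatible with dtype_name, otherwise raise TypeError."""
    accepted = ACCEPTED.get(dtype_name, ())
    # One combined condition replaces the chain of per-type branches.
    if object in accepted or type(value) in accepted:
        return value
    # No further checks needed: anything reaching this point is incompatible.
    raise TypeError(
        f"Invalid value {value!r} for dtype {dtype_name}"
    )


validate_scalar(1, "float64")   # accepted: int into a float64 array
validate_scalar("foo", "object")  # accepted: object holds anything
```

Keeping the accept condition in one place means there is exactly one exit that returns and one that raises, so no input can fall through silently.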

                raise TypeError(
                    "value cannot be inserted without changing the dtype. value:"
                    f"{value}, type(value): {type(value)}, NumpyEADtype(type(value)):"
                    f" {NumpyEADtype(type(value))}, self.dtype: {self.dtype}"
                )
return value

# ------------------------------------------------------------------------
# Ops

9 changes: 9 additions & 0 deletions pandas/tests/arrays/numpy_/test_numpy.py
@@ -322,3 +322,12 @@ def test_factorize_unsigned():
    tm.assert_numpy_array_equal(res_codes, exp_codes)

    tm.assert_extension_array_equal(res_unique, NumpyExtensionArray(exp_unique))


def test_array_validate_setitem_value():
    # Issue# 51044
    arr = pd.Series(range(5)).array
    with pytest.raises(TypeError, match="bad"):
        arr._validate_setitem_value("foo")
    with pytest.raises(TypeError, match="bad"):
        arr._validate_setitem_value(1.5)
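
A note on the `match` argument: `pytest.raises` tests the pattern against the string form of the exception using `re.search`. The short sketch below applies the same search to the message prefix taken from the `TypeError` raised in the diff (the trailing `foo` is an illustrative value), showing that the pattern `"bad"` would not match that message while a pattern taken from the message itself would.

```python
import re

# Message prefix copied from the TypeError in the diff; "foo" is an example value.
message = "value cannot be inserted without changing the dtype. value:foo"

# pytest.raises(..., match=pattern) performs re.search(pattern, str(exc)).
matches_bad = re.search("bad", message) is not None
matches_prefix = re.search("value cannot be inserted", message) is not None

print(matches_bad, matches_prefix)
```

If the message in the diff is kept, the test pattern would need to be drawn from that message (or the message changed) for the assertion to pass.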