-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Special-case setting nan into integer series #54527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 7 commits
9be06be
3622ff4
1bfd932
26bd53c
d4e989a
99767c0
5f99673
4f6869f
13c6870
f88bcc1
02cf661
19a7aa1
a2b5144
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -60,7 +60,10 @@ | |
from pandas.core.dtypes.common import ( | ||
ensure_platform_int, | ||
is_1d_only_ea_dtype, | ||
is_float_dtype, | ||
is_integer_dtype, | ||
is_list_like, | ||
is_scalar, | ||
is_string_dtype, | ||
) | ||
from pandas.core.dtypes.dtypes import ( | ||
|
@@ -454,6 +457,32 @@ def coerce_to_target_dtype(self, other, warn_on_upcast: bool = False) -> Block: | |
and will receive the same block | ||
""" | ||
new_dtype = find_result_type(self.values.dtype, other) | ||
|
||
# In a future version of pandas, the default will be that | ||
# setting `nan` into an integer series won't raise. | ||
if is_scalar(other) and is_integer_dtype(self.values.dtype): | ||
try: | ||
is_nan = bool(np.isnan(other)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should check for all NA values since we hopefully end up with NA only? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Setting pd.NA or None also cast to float with NaN nowadays, so agreed that those should be catched here as well. But setting pd.NaT currently casts to object dtype, and that one is maybe not something we want to allow? (i.e. continue to warn for that is fine?) So maybe add a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah I had that earlier, but @phofl brought up that in the future that would also be treated as NA here and so wouldn't upcast There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, but before you were only checking for nan with |
||
except TypeError: | ||
# ufunc 'isnan' not supported for the input types | ||
# boolean value of NA is ambiguous | ||
is_nan = False | ||
try: | ||
is_nat = np.isnat(other) | ||
except TypeError: | ||
# ufunc 'isnat' is only defined for np.datetime64 and np.timedelta64 | ||
is_nat = False | ||
if is_nan and not is_nat: | ||
warn_on_upcast = False | ||
elif ( | ||
isinstance(other, np.ndarray) | ||
and other.ndim == 1 | ||
and is_integer_dtype(self.values.dtype) | ||
and is_float_dtype(other.dtype) | ||
and lib.has_only_ints_or_nan(other) | ||
): | ||
warn_on_upcast = False | ||
|
||
if warn_on_upcast: | ||
warnings.warn( | ||
f"Setting an item of incompatible dtype is deprecated " | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So one thing that this doesn't yet do, is checking for the proper range for lower bitwidth integers.
For example setting [1000.0, np.nan] into a int8 Series should still raise the warning because 1000 is too big, but this helper function will now say it are only integers or NaN.