API: return "correct" missing value scalar from Categorical? #29962
Comments
I think returning the NA value for the dtype (NaT in that case) makes the most sense.
Maybe a meaningless question, but: does this count as values-dependent, or dtype-dependent?
To be fully correct, I think it is "dtype-instance-dependent" behaviour (in the sense that it depends on one of the properties of the dtype instance, not on the dtype class).
I agree with @TomAugspurger here, this is strictly dtype-dependent, independent of the values. We have an almost identical pattern with Interval, where we have a sub-dtype. Maybe we could formalize this a bit, e.g. by making a more explicit ContainerDtype. Not sure it's worth the effort, but IIRC we discussed this before about working with sub-dtypes.
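A small sketch of what "dtype-instance-dependent" and "sub-dtype" can look like in practice, assuming a recent pandas install. The variable names are illustrative only, and `ContainerDtype` is not modelled here since it is only a proposal in this thread:

```python
import pandas as pd

# Two instances of the same dtype class, each wrapping a different sub-dtype.
interval_a = pd.IntervalDtype("int64")
interval_b = pd.IntervalDtype("datetime64[ns]")

cat_a = pd.CategoricalDtype([1, 2, 3])
cat_b = pd.CategoricalDtype(pd.to_datetime(["2019-01-01"]))

# The class is identical; only a property of the instance differs.
print(type(interval_a) is type(interval_b))
print(interval_a.subtype, interval_b.subtype)

print(type(cat_a) is type(cat_b))
print(cat_a.categories.dtype, cat_b.categories.dtype)
```

In both cases the information needed to pick a missing-value scalar lives on the dtype instance (`subtype` or `categories.dtype`), not in the stored values.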
I think that 1) we should do this and 2) a deprecation cycle for this would be a hassle, so maybe do it directly in 2.0?
@jorisvandenbossche I honestly think this is the wrong thing to do. Having instead
Having both
From #27929 (comment). In `__getitem__` or `Categorical.min(..)`, we always return `np.nan` as scalar missing value, regardless of the dtype:

In the above, this could also be `pd.NaT` instead? (A similar issue will come up once we can use the EAs that use the new NA scalar in categoricals.)
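A minimal sketch of the behaviour described above, assuming a Categorical with datetime categories. The variable names are illustrative and not taken from the issue, and the exact scalar returned may depend on the pandas version, so the code only inspects it rather than assuming a value:

```python
import pandas as pd

# A Categorical whose categories are datetimes, with one missing entry.
cat = pd.Categorical(pd.to_datetime(["2019-01-01", None, "2019-01-02"]))

# Scalar lookup of the missing element goes through __getitem__.
missing = cat[1]

print(cat.categories.dtype)    # dtype of the categories (a datetime64 dtype)
print(type(missing), missing)  # the scalar used for the missing value
print(missing is pd.NaT)       # is it the datetime-specific NA scalar?
```

Whichever scalar comes back, `pd.isna(missing)` is true; the question in this issue is only which missing-value scalar it should be.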
However, `CategoricalDtype.na_value` now also returns `np.nan` (which should be consistent with what we return in the cases above):

We can of course let the `CategoricalDtype.na_value` be dependent on the `na_value` of the dtype of the categories. But I am not fully sure we want such values-dependent behaviour?
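One way the proposal could be sketched, purely as an illustration: `categories_na_value` below is a hypothetical helper, not a pandas API, and its fallback rules (NaT for datetime-like categories, the extension dtype's own `na_value` otherwise, else `np.nan`) are assumptions about what "the `na_value` of the dtype of the categories" would mean.

```python
import numpy as np
import pandas as pd


def categories_na_value(dtype: pd.CategoricalDtype):
    """Hypothetical helper: NA scalar implied by the categories' dtype."""
    cats_dtype = dtype.categories.dtype
    # Extension dtypes expose their own na_value.
    if isinstance(cats_dtype, pd.api.extensions.ExtensionDtype):
        return cats_dtype.na_value
    # NumPy datetime64 / timedelta64 categories: assume NaT.
    if cats_dtype.kind in "mM":
        return pd.NaT
    # Everything else: assume the current default.
    return np.nan


dt_dtype = pd.CategoricalDtype(pd.to_datetime(["2019-01-01", "2019-01-02"]))
str_dtype = pd.CategoricalDtype(["a", "b"])

print(dt_dtype.na_value, categories_na_value(dt_dtype))
print(str_dtype.na_value, categories_na_value(str_dtype))
```

The helper only reads `dtype.categories.dtype`, so its result depends on the dtype instance rather than on the values stored in any particular Categorical.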