IntervalDtype inconsistencies and bugs #18980
Labels: Categorical (Categorical Data Type), Compat (pandas objects compatibility with Numpy or Python functions), Dtype Conversions (Unexpected or buggy dtype conversions), Interval (Interval data type)
1. Inconsistent comparisons versus string `'interval'`:

I'd expect all of these to return `True`, like how `CategoricalDtype(*, *) == 'category'` always returns `True`.

2. Inconsistent comparisons versus
`IntervalDtype(None)`:

I'd expect all of these to return `True`, like how `CDT(None, None) == CDT(*, *)` always returns `True`.

3.
`IntervalDtype.name` attribute changes, whereas the `CategoricalDtype.name` attribute is always the same:

I'd expect `IntervalDtype.name` to always return `'interval'`, like how `CDT.name` always returns `'category'`. This makes the code for checking equality against strings (i.e. what I described in 1) simpler. I don't think the behavior of `str(IntervalDtype)` should change; it is currently the same as `IntervalDtype.name`, so I'd still have it return strings specifying the subtype.

4.
(No longer an issue due to #19022) `CategoricalDtype` gets cached incorrectly:

This looks to be caused by the caching being done by string representation, and `str(CDT(*, *))` always returns `'category'`:

pandas/pandas/core/dtypes/dtypes.py, lines 673 to 679 in e1d5a27

Can caching be removed entirely for `IntervalDtype`, or is there some need/advantage that I'm not seeing? Looking at the other dtypes, `CategoricalDtype` appears to have had the caching code removed, but `PeriodDtype` and `DatetimeTZDtype` are using it.
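The behavior requested in points 1 to 3 can be sketched with a toy class. This is only an illustration, not pandas code: `ToyIntervalDtype` and its attributes are hypothetical names, and the subtype is modelled as a plain string. It keeps `name` fixed at `'interval'`, lets `str()` keep the subtype, and treats both the string `'interval'` and a subtype-less instance as equal to any instance.

```python
class ToyIntervalDtype:
    """Hypothetical stand-in for an interval dtype; not the pandas class."""

    name = "interval"  # constant, regardless of subtype (point 3)

    def __init__(self, subtype=None):
        self.subtype = subtype  # e.g. "int64", or None for "unspecified"

    def __str__(self):
        # str() still spells out the subtype when there is one.
        if self.subtype is None:
            return "interval"
        return f"interval[{self.subtype}]"

    def __eq__(self, other):
        if isinstance(other, str):
            # Point 1: the bare name matches any instance.
            return other == self.name
        if isinstance(other, ToyIntervalDtype):
            # Point 2: a subtype-less instance matches any instance.
            if self.subtype is None or other.subtype is None:
                return True
            return self.subtype == other.subtype
        return NotImplemented

    def __hash__(self):
        # Equal objects must hash equally, so hash only the constant name.
        return hash(self.name)


int_dtype = ToyIntervalDtype("int64")
float_dtype = ToyIntervalDtype("float64")
generic = ToyIntervalDtype()

checks = {
    "int == 'interval'": int_dtype == "interval",
    "generic == 'interval'": generic == "interval",
    "generic == int": generic == int_dtype,
    "int == float": int_dtype == float_dtype,
}
print(checks)
print(int_dtype.name, str(int_dtype))
```

One consequence of this design is that equality is not transitive (`int_dtype == generic` and `generic == float_dtype`, but `int_dtype != float_dtype`), which mirrors the wildcard-style comparison described for `CDT(None, None)`.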
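The collision described in point 4 can be reproduced without pandas. The sketch below is a toy model under stated assumptions: `ToySubtype`, `ToyDtype`, and the `_cache` dict are hypothetical and only mimic the pattern of caching instances by string representation; the actual code at the referenced lines is not reproduced here.

```python
class ToySubtype:
    """Hypothetical subtype whose str() hides its parameters."""

    def __init__(self, categories):
        self.categories = categories

    def __str__(self):
        # Always the same string, whatever the categories are.
        return "category"


class ToyDtype:
    """Hypothetical dtype that caches instances by string representation."""

    _cache = {}

    def __new__(cls, subtype):
        key = f"interval[{subtype}]"  # cache key built from str(subtype)
        try:
            # A hit returns the first instance created under this key.
            return cls._cache[key]
        except KeyError:
            obj = object.__new__(cls)
            obj.subtype = subtype
            cls._cache[key] = obj
            return obj


first = ToyDtype(ToySubtype(["a", "b", "c"]))
second = ToyDtype(ToySubtype(["x", "y"]))

# Both calls map to the key "interval[category]", so the second call
# returns the first object, which carries the wrong subtype.
print(second is first)
print(second.subtype.categories)
```

Keying the cache on something that distinguishes the parameters, or dropping the cache altogether as the question above suggests, would avoid the collision in this toy model.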