I think that str(cat.dtype) should be changed back to always being 'category'. I've seen several places where people use str(thing.dtype) == 'category' as a way to check for Categoricals. Even subtle things like this arrow PR would break.
So instead of
In [6]: str(pd.Categorical([1, 2, 3]).dtype)
Out[6]: 'CategoricalDtype(categories=[1, 2, 3], ordered=False)'
it would be
Out[6]: 'category'
We can leave __repr__ to be unambiguous.
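To illustrate the proposed split, here is a minimal sketch using a hypothetical stand-in class (`SketchCategoricalDtype` is invented for this example and is not the pandas implementation). It assumes only that `__str__` returns the short name while `__repr__` keeps the detailed form:

```python
class SketchCategoricalDtype:
    """Hypothetical stand-in for illustration; not the pandas class."""

    name = "category"

    def __init__(self, categories=None, ordered=False):
        self.categories = None if categories is None else list(categories)
        self.ordered = ordered

    def __str__(self):
        # Short, stable name, so str(dtype) == 'category' keeps working.
        return self.name

    def __repr__(self):
        # Detailed, unambiguous form for the console.
        return (
            f"{type(self).__name__}"
            f"(categories={self.categories!r}, ordered={self.ordered!r})"
        )


dtype = SketchCategoricalDtype([1, 2, 3])
print(str(dtype))
print(repr(dtype))
```

With this split, existing checks based on `str(dtype)` would keep passing, while anything that displays the `repr` still shows the categories and the ordered flag.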
Any reason you would do it only for the case where categories is None?
To be clear, in most places like the output in the console, in the Series / DataFrame repr, you'll still have the informative CategoricalDtype(categories=...) repr. It's only when you call str(x.dtype) that you get 'category'.
Another possible option is to change only str back to 'category' and keep repr as it is now. That gives the more informative repr in the console, but doesn't break code that used str(dtype).
But that of course somewhat hides the fact that you can do str(dtype) == 'category' (though maybe that is not a bad thing, as we want people to use pd.api.types.is_categorical?)
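For code that wants to avoid depending on the string form at all, a check against the dtype object is one option. The sketch below assumes pandas is installed and uses `isinstance` with `CategoricalDtype` from `pandas.api.types` rather than the helper mentioned above, since which helpers are available may vary between pandas versions. The function name `has_categorical_dtype` is made up for this example.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype


def has_categorical_dtype(obj):
    """Return True if obj (e.g. a Series or Categorical) has a categorical dtype.

    Hypothetical helper for illustration; it inspects obj.dtype directly
    instead of comparing str(obj.dtype) to 'category'.
    """
    return isinstance(obj.dtype, CategoricalDtype)


cat = pd.Series(["a", "b", "a"], dtype="category")
plain = pd.Series([1, 2, 3])

print(has_categorical_dtype(cat))
print(has_categorical_dtype(plain))

# Comparing the dtype object itself to the string also works,
# without going through str().
print(cat.dtype == "category")
```

Neither check depends on what `str(dtype)` returns, so they should be unaffected by whichever option is chosen here.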