API: CategoricalDtype str, repr #17782

Closed
TomAugspurger opened this issue Oct 4, 2017 · 4 comments
Labels
Categorical Categorical Data Type
Milestone

Comments

@TomAugspurger
Contributor

I think that str(cat.dtype) should be changed back to always being 'category'. I've seen several places where people use str(thing.dtype) == 'category' as a way to check for Categoricals. Even subtle things like this arrow PR would break.

So instead of

In [6]: str(pd.Categorical([1, 2, 3]).dtype)
Out[6]: 'CategoricalDtype(categories=[1, 2, 3], ordered=False)'

it would be

Out[6]: 'category'

We can leave __repr__ to be unambiguous.
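
To illustrate the split being proposed, here is a minimal sketch. `SketchCategoricalDtype` is a hypothetical stand-in written for this example, not the actual pandas class or its implementation; it only shows `__str__` returning the short name while `__repr__` stays detailed.

```python
class SketchCategoricalDtype:
    """Hypothetical stand-in for illustration only; not pandas' CategoricalDtype."""

    def __init__(self, categories=None, ordered=False):
        self.categories = None if categories is None else list(categories)
        self.ordered = ordered

    def __str__(self):
        # Short, stable name, so existing `str(dtype) == 'category'` checks keep working.
        return "category"

    def __repr__(self):
        # Unambiguous form, shown when the object is echoed in the console.
        return "CategoricalDtype(categories={!r}, ordered={!r})".format(
            self.categories, self.ordered
        )


dtype = SketchCategoricalDtype([1, 2, 3])
print(str(dtype))   # uses __str__
print(repr(dtype))  # uses __repr__
```

With this arrangement, string comparisons against `'category'` are unaffected by which categories the dtype carries, while the console still shows the full parametrization.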

@TomAugspurger TomAugspurger added the Categorical Categorical Data Type label Oct 4, 2017
@TomAugspurger TomAugspurger added this to the 0.21.0 milestone Oct 4, 2017
@jreback
Contributor

jreback commented Oct 4, 2017

hmm, I think you would only do that for a CategoricalDtype(None, ordered=False).

@TomAugspurger
Contributor Author

Any reason you would do it just for when categories is None?

To be clear, in most places like the output in the console, in the Series / DataFrame repr, you'll still have the informative CategoricalDtype(categories=...) repr. It's only when you call str(x.dtype) that you get 'category'.

@jorisvandenbossche
Member

A possible option is also to change only str back to 'category' and keep repr as it is now. That gives the more informative repr in the console, but doesn't break code that used str(dtype).
But that of course somewhat hides the fact that you can do str(dtype) == 'category' (though that is maybe not a bad thing, as we want them to use pd.api.types.is_categorical?)
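
For comparison, a rough sketch of the two styles of check. The helper names `is_cat_by_str` and `is_cat_by_type` are made up for this example, and the `isinstance` test against `pandas.api.types.CategoricalDtype` is an assumption swapped in for the `is_categorical` helper mentioned above, since it does not depend on how the dtype is stringified. The string check only returns `True` on pandas versions where `str(dtype)` is `'category'`.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

cat = pd.Series(["a", "b", "a"], dtype="category")
num = pd.Series([1, 2, 3])


def is_cat_by_str(values):
    # The pattern seen in user code: depends on str(dtype) being 'category'.
    return str(values.dtype) == "category"


def is_cat_by_type(values):
    # Type-based check: independent of the dtype's str or repr.
    return isinstance(values.dtype, CategoricalDtype)


print(is_cat_by_str(cat), is_cat_by_type(cat))
print(is_cat_by_str(num), is_cat_by_type(num))
```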

@jorisvandenbossche
Member

jorisvandenbossche commented Oct 5, 2017

Ah, I see that is what you have done in the PR (and stated in the top post)! :-) Should have looked there first and read better.

3 participants