API: Change str for CategoricalDtype to category #17783

Merged
merged 3 commits into from
Oct 5, 2017

Conversation

TomAugspurger
Contributor

Better compatibility with older versions

Closes #17782

@TomAugspurger TomAugspurger added this to the 0.21.0 milestone Oct 4, 2017
@TomAugspurger TomAugspurger added the Categorical (Categorical Data Type) label Oct 4, 2017
@codecov

codecov bot commented Oct 4, 2017

Codecov Report

Merging #17783 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #17783      +/-   ##
==========================================
- Coverage   91.24%   91.22%   -0.02%     
==========================================
  Files         163      163              
  Lines       49916    49910       -6     
==========================================
- Hits        45544    45530      -14     
- Misses       4372     4380       +8

| Flag | Coverage Δ |
|---|---|
| #multiple | 89.02% <ø> (ø) ⬆️ |
| #single | 40.23% <ø> (-0.08%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| pandas/core/dtypes/dtypes.py | 95.14% <ø> (+0.19%) ⬆️ |
| pandas/io/gbq.py | 25% <0%> (-58.34%) ⬇️ |
| pandas/core/frame.py | 97.73% <0%> (-0.1%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 48d0460...c1774b6.

@jreback
Contributor

jreback commented Oct 5, 2017

I am not sure this is a great idea, what is the purpose here?

@TomAugspurger
Contributor Author

There's lots of code out there that expects str(dtype) to be 'category', and there's an established tradition in Python of str(obj) being simple, while repr(obj) is more complicated.

Why are you against it?
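
The str/repr convention referred to above can be sketched with a toy class. This is only an illustration: `ToyCategoricalDtype` is a hypothetical stand-in written for this example, not the pandas implementation.

```python
class ToyCategoricalDtype:
    """Hypothetical stand-in for illustration; not the pandas class."""

    def __init__(self, categories, ordered=False):
        self.categories = list(categories)
        self.ordered = ordered

    def __str__(self):
        # Short, stable name that existing code can compare against.
        return "category"

    def __repr__(self):
        # Full detail, intended for debugging and interactive display.
        return (
            f"ToyCategoricalDtype(categories={self.categories!r}, "
            f"ordered={self.ordered!r})"
        )


dtype = ToyCategoricalDtype(["a", "b", "c"])
print(str(dtype))   # category
print(repr(dtype))  # ToyCategoricalDtype(categories=['a', 'b', 'c'], ordered=False)
```

With this split, code that compares `str(dtype)` to `'category'` keeps working, while the detailed information stays available through `repr`.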

@jreback
Contributor

jreback commented Oct 5, 2017

This loses information, and there's no reason for that. Where does code that compares against str(dtype) live? IOW, can you show me an example from a project?

@TomAugspurger
Contributor Author

TomAugspurger commented Oct 5, 2017 via email

@jreback
Contributor

jreback commented Oct 5, 2017

The if str(dtype) == 'category' check was never a way to do this. To be honest, I am in favor of breaking this.

@TomAugspurger
Contributor Author

Ok, I'm +1 for this change, and so is Joris.

@jorisvandenbossche
Member

Maybe it was never really meant to be used that way, but before is_categorical was consistently publicly exposed, I think it was a sensible thing to do.
Personally, I don't think undoing the loss of information when converting the dtype to a string is worth potentially breaking people's code.
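
For comparison, here is a minimal sketch of the string check discussed in this thread next to a type-based check. The `isinstance` form is an assumption about what a non-string check could look like; it relies on `pd.CategoricalDtype` being importable from the top-level namespace, which holds in recent pandas releases but may not in the version this PR targets.

```python
import pandas as pd

dtype = pd.CategoricalDtype(categories=["a", "b", "c"])

# String comparison: the pattern this PR keeps working.
looks_categorical = str(dtype) == "category"

# Type-based check (an assumed alternative): does not depend on how the
# dtype is rendered as text.
is_categorical = isinstance(dtype, pd.CategoricalDtype)

print(looks_categorical, is_categorical)
```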

@jreback
Contributor

jreback commented Oct 5, 2017

ok then, merge away.

@jreback jreback merged commit 7740a6e into pandas-dev:master Oct 5, 2017
@TomAugspurger TomAugspurger deleted the categoricaldtype-str branch October 5, 2017 18:58
@jorisvandenbossche
Member

Another reason I think this was a good change is that it is used in the 'dtypes' overview. Before this PR you got:

In [19]: df.dtypes
Out[19]:
cats    CategoricalDtype(categories=['a', 'b', 'c'], o...
vals                                                int64
dtype: object

which I think is not ideal (although that might have been solved in another way as well).
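
A small sketch of how to see the short form in a pandas version that includes this change. The frame and its column names mirror the output above; the values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "cats": pd.Categorical(["a", "b", "c"]),
    "vals": [1, 2, 3],
})

# Each entry of df.dtypes is a dtype object; converting it with str()
# gives the short name rather than the full repr.
dtype_names = [str(dtype) for dtype in df.dtypes]
print(dtype_names)
```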

But that did make me think: we could come up with some kind of 'shorter but still informative' repr like datetimetz has (although once you have more categories, it will always be truncated, which is not that nice in a default repr, I think).

ghost pushed a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017
alanbato pushed a commit to alanbato/pandas that referenced this pull request Nov 10, 2017
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017