
API: value_counts on a Categorical Series should have a CategoricalIndex #10704


Closed
TomAugspurger opened this issue Jul 30, 2015 · 4 comments
Labels
API Design Categorical Categorical Data Type
Milestone

Comments

@TomAugspurger
Contributor

Thoughts on this?

In [8]: import pandas.util.testing as tm

In [9]: s = pd.Series(tm.makeCategoricalIndex(k=100))

In [10]: s.value_counts()
Out[10]:
vcKH    40
vXR9    31
Zn8J    29
dtype: int64

In [11]: s.value_counts().index
Out[11]: Index(['vcKH', 'vXR9', 'Zn8J'], dtype='object')

Now that we have CategoricalIndex (thanks Jeff), should that type be preserved so that Out[11] is a CategoricalIndex? My use case (not shown in this example) is when the original categories are ordered: you compute value_counts and then want to sort the index by category order.
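To make the ordered use case concrete, here is a minimal sketch. It assumes a pandas version in which value_counts on a categorical Series returns a CategoricalIndex, which is the behavior proposed in this issue. The category names and data are made up for illustration.

```python
import pandas as pd

# Hypothetical ordered categories; the names are illustrative only.
size_type = pd.CategoricalDtype(categories=["low", "medium", "high"], ordered=True)
s = pd.Series(["high", "low", "high", "medium", "high", "low"], dtype=size_type)

# value_counts orders the result by frequency by default.
counts = s.value_counts()

# If the index keeps the categorical type, sort_index follows the
# category order (low < medium < high) rather than alphabetical order.
by_category = counts.sort_index()
print(type(counts.index).__name__)
print(by_category)
```

With a plain object Index, the same sort_index call would order the labels alphabetically (high, low, medium), which is why preserving the type matters here.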

@TomAugspurger TomAugspurger added API Design Categorical Categorical Data Type labels Jul 30, 2015
@jorisvandenbossche
Member

I think that is the logical consequence of having a categorical type in Series and having a CategoricalIndex, so OK for me!
It is similar to how df.groupby('cat').count() now also preserves this categorical type. So in that regard, it would only be consistent for value_counts to return that as well.
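The groupby behavior mentioned above can be sketched as follows. This is an illustration under assumptions, not code from the thread: the frame and column names are made up, and the explicit `observed=False` argument assumes a pandas version recent enough to accept it.

```python
import pandas as pd

# Hypothetical frame with a categorical key; "c" is declared but never observed.
df = pd.DataFrame({
    "cat": pd.Categorical(["a", "b", "a"], categories=["a", "b", "c"]),
    "x": [1, 2, 3],
})

# Grouping on a categorical column keeps the categorical type in the
# result's index; observed=False also keeps the unobserved category.
result = df.groupby("cat", observed=False).count()
print(type(result.index).__name__)
print(result)
```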

@jorisvandenbossche
Member

There can of course be some incompatibilities, such as with plotting afterwards, but that is similar to, e.g., the groupby case (we had some reports on that, e.g. #10140).

@jreback
Contributor

jreback commented Jul 30, 2015

Yes, I would agree with that logic. Go for it @TomAugspurger

@jreback jreback added this to the Next Major Release milestone Jul 30, 2015
@TomAugspurger
Contributor Author

Cool. I'll shoot for next week.

@jreback jreback modified the milestones: 0.17.0, Next Major Release Aug 3, 2015