API: value_counts on a Categorical Series should have a CategoricalIndex #10704

TomAugspurger · 2015-07-30T14:12:21Z

Thoughts on this?

In [7]: import pandas as pd

In [8]: import pandas.util.testing as tm

In [9]: s = pd.Series(tm.makeCategoricalIndex(k=100))

In [10]: s.value_counts()
Out[10]:
vcKH    40
vXR9    31
Zn8J    29
dtype: int64

In [11]: s.value_counts().index
Out[11]: Index(['vcKH', 'vXR9', 'Zn8J'], dtype='object')

Now that we have CategoricalIndex (thanks Jeff), should that type be preserved so that Out[11] is a CategoricalIndex? My use case (not shown in this example) is when the original categories are ordered: you compute value_counts and then want to sort the index by category order.
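A minimal sketch of that use case, assuming a pandas version that includes the change proposed here (0.17.0 or later, going by the milestone). The data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical data: an ordered categorical with low < medium < high.
s = pd.Series(
    pd.Categorical(
        ["medium", "low", "high", "medium", "low", "medium"],
        categories=["low", "medium", "high"],
        ordered=True,
    )
)

counts = s.value_counts()

# With a CategoricalIndex, sort_index follows the category order.
by_category = counts.sort_index()
print(by_category)

# With a plain object Index, the same sort is alphabetical instead.
plain = s.astype(object).value_counts().sort_index()
print(plain)
```

The first result is ordered low, medium, high; the second is ordered high, low, medium, which is the behavior the proposal is meant to avoid.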


jorisvandenbossche · 2015-07-30T14:18:43Z

I think that is the logical consequence of having a categorical type in Series and having a CategoricalIndex, so OK for me!
It is similar to how df.groupby('cat').count() now also preserves this categorical type. So in that regard, it would only be consistent for value_counts to return that as well.
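A small sketch of the consistency argument, assuming a recent pandas version. The frame and column names are hypothetical, and the `observed` keyword is newer than this thread; it is passed here only to keep current versions from warning.

```python
import pandas as pd

# Hypothetical frame with a categorical key column.
df = pd.DataFrame(
    {
        "cat": pd.Categorical(["a", "b", "a", "c"], categories=["a", "b", "c"]),
        "x": [1, 2, 3, 4],
    }
)

# Grouping on a categorical column yields a CategoricalIndex ...
grouped = df.groupby("cat", observed=False).count()
print(type(grouped.index).__name__)

# ... and, with the change proposed in this issue, so does value_counts.
counts = df["cat"].value_counts()
print(type(counts.index).__name__)
```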

jorisvandenbossche · 2015-07-30T14:21:23Z

There can of course be some incompatibilities, such as with plotting afterwards, but that is similar to, e.g., the groupby case (we had some reports on that, e.g. #10140).

jreback · 2015-07-30T14:37:53Z

Yes, I would agree with that logic. Go for it @TomAugspurger

TomAugspurger · 2015-07-30T14:38:41Z

Cool. I'll shoot for next week.

TomAugspurger added the API Design and Categorical (Categorical Data Type) labels Jul 30, 2015

jreback added this to the Next Major Release milestone Jul 30, 2015

TomAugspurger mentioned this issue Aug 3, 2015

API: CategoricalIndex for value_counts #10729

Merged

jreback modified the milestones: 0.17.0, Next Major Release Aug 3, 2015

TomAugspurger closed this as completed in #10729 Aug 4, 2015
