Categorical columns are slow in groupby operations

#### Code Sample, a copy-pastable example if possible

```python
import pandas as pd
import random

SIZE = 100000
GROUPS = 10000  # The larger, the more extreme the timing differences
CARDINALITY = 10
CAT = pd.CategoricalDtype([i for i in range(CARDINALITY)])
df_int = pd.DataFrame({
    'group': [random.randint(0, GROUPS) for i in range(SIZE)],
    'cat': [random.choice(CAT.categories) for i in range(SIZE)],
})
df_cat = df_int.astype({'cat': CAT})
df_str = df_cat.astype({'cat': 'str'})

%timeit df_int.groupby('group').last()
# 100 loops, best of 3: 5.51 ms per loop
%timeit df_str.groupby('group').last()
# 100 loops, best of 3: 11.4 ms per loop
%timeit df_cat.groupby('group').last()
# 1 loop, best of 3: 4.62 s per loop
```

#### Problem description

We have a dataframe like the above in production and noticed a particular groupby call was drastically slowing down our pipeline. We were able to workaround the issue by converting categorical types to strings before the groupby operation.

The issue occurs when categorical values are in the grouped frame (not when they're included in the groupby index). The problem scales with the number of groups. The larger the GROUPS value in the snippet above, the worse the timing differences are.

We didn't expect using categories to cause these kind of performance issues. Is this expected, or could it be something worth optimizing for?

#### Output of ``pd.show_versions()``

<details>

[paste the output of ``pd.show_versions()`` here below this line]

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.6.9.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.14.137+
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 0.25.3
numpy            : 1.18.2
pytz             : 2018.9
dateutil         : 2.8.1
pip              : 19.3.1
setuptools       : 46.0.0
Cython           : 0.29.15
pytest           : 3.6.4
hypothesis       : None
sphinx           : 1.8.5
blosc            : None
feather          : 0.4.0
xlsxwriter       : None
lxml.etree       : 4.2.6
html5lib         : 1.0.1
pymysql          : None
psycopg2         : 2.7.6.1 (dt dec pq3 ext lo64)
jinja2           : 2.11.1
IPython          : 5.5.0
pandas_datareader: 0.7.4
bs4              : 4.6.3
bottleneck       : 1.3.2
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.2.6
matplotlib       : 3.2.1
numexpr          : 2.7.1
odfpy            : None
openpyxl         : 2.5.9
pandas_gbq       : 0.11.0
pyarrow          : 0.14.1
pytables         : None
s3fs             : 0.4.0
scipy            : 1.4.1
sqlalchemy       : 1.3.15
tables           : 3.4.4
xarray           : 0.15.0
xlrd             : 1.1.0
xlwt             : 1.3.0
xlsxwriter       : None
</details>


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Categorical columns are slow in groupby operations #32976

Code Sample, a copy-pastable example if possible

Problem description

Output of `pd.show_versions()`

INSTALLED VERSIONS

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Categorical columns are slow in groupby operations #32976

Description

Code Sample, a copy-pastable example if possible

Problem description

Output of pd.show_versions()

INSTALLED VERSIONS

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Output of `pd.show_versions()`