Commit 6c0d3fb

lukemanley authored and codamuse committed
PERF: df.groupby(categorical) (pandas-dev#49596)
* Categorical.reorder_categories perf
* whatsnew
1 parent 5cbe002 commit 6c0d3fb

2 files changed, +5 -1 lines changed

doc/source/whatsnew/v2.0.0.rst

+1
@@ -531,6 +531,7 @@ Performance improvements
 - Performance improvements to :func:`read_sas` (:issue:`47403`, :issue:`47405`, :issue:`47656`, :issue:`48502`)
 - Memory improvement in :meth:`RangeIndex.sort_values` (:issue:`48801`)
 - Performance improvement in :class:`DataFrameGroupBy` and :class:`SeriesGroupBy` when ``by`` is a categorical type and ``sort=False`` (:issue:`48976`)
+- Performance improvement in :class:`DataFrameGroupBy` and :class:`SeriesGroupBy` when ``by`` is a categorical type and ``observed=False`` (:issue:`49596`)
 - Performance improvement in :func:`merge` when not merging on the index - the new index will now be :class:`RangeIndex` instead of :class:`Int64Index` (:issue:`49478`)

 .. ---------------------------------------------------------------------------
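For context, the new whatsnew entry concerns grouping by a categorical key with `observed=False`, where every declared category appears in the result even if no row uses it. The following is a minimal sketch of that case; the frame, column names, and values are invented for illustration and are not taken from the commit or its benchmarks.

```python
import pandas as pd

# Hypothetical example data: category "c" is declared but never observed.
key = pd.Categorical(["b", "a", "b"], categories=["a", "b", "c"])
df = pd.DataFrame({"key": key, "val": [1, 2, 3]})

# observed=False keeps all declared categories in the result index.
all_groups = df.groupby("key", observed=False)["val"].sum()

# observed=True keeps only the categories that actually occur.
seen_groups = df.groupby("key", observed=True)["val"].sum()

print(all_groups)
print(seen_groups)
```

With `observed=False` the result has one row per declared category, so the unobserved category still shows up with an empty-group sum.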

pandas/core/arrays/categorical.py

+4-1
@@ -1019,7 +1019,10 @@ def reorder_categories(self, new_categories, ordered=None):
         remove_unused_categories : Remove categories which are not used.
         set_categories : Set the categories to the specified ones.
         """
-        if set(self.dtype.categories) != set(new_categories):
+        if (
+            len(self.categories) != len(new_categories)
+            or not self.categories.difference(new_categories).empty
+        ):
             raise ValueError(
                 "items in new_categories are not the same as in old categories"
             )
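The two validation conditions in this hunk can be compared side by side in a small standalone sketch. The helper names `matches_by_set` and `matches_by_index` are hypothetical and exist only for this illustration; the sketch assumes the old categories are a unique `pd.Index`, as categories of a `Categorical` are.

```python
import pandas as pd


def matches_by_set(old: pd.Index, new) -> bool:
    """Old-style check: build two Python sets and compare them."""
    return set(old) == set(new)


def matches_by_index(old: pd.Index, new) -> bool:
    """New-style check: equal length, and nothing in old is missing from new."""
    return len(old) == len(new) and bool(old.difference(new).empty)


old = pd.Index(["a", "b", "c"])

# A permutation of the old categories passes both checks.
print(matches_by_set(old, ["c", "a", "b"]), matches_by_index(old, ["c", "a", "b"]))

# A different label fails both checks.
print(matches_by_set(old, ["a", "b", "d"]), matches_by_index(old, ["a", "b", "d"]))

# Duplicates in the new list: the set check ignores them, the length check does not.
print(matches_by_set(old, ["a", "b", "c", "a"]), matches_by_index(old, ["a", "b", "c", "a"]))
```

Because the old categories are unique, a new list of the same length that contains every old category must be a permutation of them, so the length test plus `Index.difference` is enough to validate a reordering in this sketch.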
