BUG: Fix groupby sorting on ordered Categoricals (GH25871) (#1)
* BUG: Fix groupby on ordered Categoricals (GH25871)
As documented in pandas-dev#25871, groupby() on an ordered Categorical returns groups in an inconsistent order when 'observed=True' is specified.
Specifically, group labels will be ordered by first occurrence (as for an unordered Categorical), but grouped aggregation results will retain the Categorical's order.
The fix is a modified subset of pandas-dev#25173, which fixes a related case, but has not been merged yet.
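A minimal sketch of the behaviour the fix is meant to guarantee, assuming a pandas version that includes it. The column names and data below are made up for illustration and are not taken from the issue; the category order is deliberately different from both alphabetical order and order of first occurrence.

```python
import pandas as pd

# Hypothetical data: the declared category order ("high" < "mid" < "low")
# differs from the order in which the values first appear.
level = pd.Categorical(
    ["low", "high", "mid", "low"],
    categories=["high", "mid", "low"],
    ordered=True,
)
df = pd.DataFrame({"level": level, "value": [1, 2, 3, 4]})

# Group on the ordered Categorical with observed=True, the case described above.
result = df.groupby("level", observed=True, sort=True)["value"].sum()

# With the fix, labels follow the Categorical's order and each label is
# paired with its own aggregate.
labels = list(result.index)
pairs = dict(zip(result.index, result.to_numpy()))
print(labels)
print(pairs)
```

Checking the label/value pairs, not just the label order, matters here because the bug left the labels and the aggregated values in two different orders.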
* BUG: Fix groupby on ordered Categoricals (GH25871)
* new test
* Fix groupby on ordered Categoricals (GH25871)
Testing all combinations of:
- ordered vs. unordered grouping column
- 'observed' True vs. False
- 'sort' True vs. False
In all cases, result group ordering must be correct.
The test is constructed so that, when group ordering is correct, each result index label equals its aggregation result (except for the one unobserved category).
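The test idea described above could be sketched roughly as follows. This is an illustration, not the test added by the commit: the helper name `labels_match_results`, the data, and the category names are assumptions, and a plain loop stands in for whatever parametrization the real test uses.

```python
from itertools import product

import pandas as pd


def labels_match_results(ordered, observed, sort):
    """Return True if every group label is paired with its own aggregate."""
    values = ["d", "a", "b", "a", "d", "b"]
    # "missing" is a category that never occurs in the data (unobserved).
    label = pd.Categorical(
        values, categories=["a", "b", "missing", "d"], ordered=ordered
    )
    # The value column repeats the labels, so "first" per group should
    # return the group's own label.
    df = pd.DataFrame({"label": label, "val": values})
    result = df.groupby("label", observed=observed, sort=sort)["val"].first()

    for group_label, aggregated in result.items():
        if pd.isna(aggregated):
            # Only the unobserved category may have an empty result.
            if group_label != "missing":
                return False
        elif group_label != aggregated:
            return False
    return True


# All eight combinations of ordered / observed / sort.
outcomes = {
    combo: labels_match_results(*combo)
    for combo in product([True, False], repeat=3)
}
print(outcomes)
```

Because the values mirror the labels, any mismatch between label order and result order shows up as a label that differs from its aggregate, regardless of which ordering a given combination is expected to produce.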
doc/source/whatsnew/v0.25.0.rst (+1 -1)
@@ -347,7 +347,7 @@ Groupby/Resample/Rolling
 - Bug in :func:`pandas.core.groupby.GroupBy.agg` when applying a aggregation function to timezone aware data (:issue:`23683`)
 - Bug in :func:`pandas.core.groupby.GroupBy.first` and :func:`pandas.core.groupby.GroupBy.last` where timezone information would be dropped (:issue:`21603`)
 - Ensured that ordering of outputs in ``groupby`` aggregation functions is consistent across all versions of Python (:issue:`25692`)
-
+- Ensured that result group order is correct when grouping on an ordered Categorical and specifying ``observed=True`` (:issue:`25871`)