Commit f55dd11

Backport PR pandas-dev#31668: REGR: Fixed handling of Categorical in cython ops (pandas-dev#31678)
Co-authored-by: Tom Augspurger <[email protected]>
1 parent db669b0 commit f55dd11

3 files changed: +20 -1 lines changed

doc/source/whatsnew/v1.0.1.rst

+1
@@ -19,6 +19,7 @@ Fixed regressions
 - Fixed regression when indexing a ``Series`` or ``DataFrame`` indexed by ``DatetimeIndex`` with a slice containg a :class:`datetime.date` (:issue:`31501`)
 - Fixed regression in ``DataFrame.__setitem__`` raising an ``AttributeError`` with a :class:`MultiIndex` and a non-monotonic indexer (:issue:`31449`)
 - Fixed regression in :class:`Series` multiplication when multiplying a numeric :class:`Series` with >10000 elements with a timedelta-like scalar (:issue:`31457`)
+- Fixed regression in ``.groupby()`` aggregations with categorical dtype using Cythonized reduction functions (e.g. ``first``) (:issue:`31450`)
 - Fixed regression in :meth:`GroupBy.apply` if called with a function which returned a non-pandas non-scalar object (e.g. a list or numpy array) (:issue:`31441`)
 - Fixed regression in :meth:`DataFrame.groupby` whereby taking the minimum or maximum of a column with period dtype would raise a ``TypeError``. (:issue:`31471`)
 - Fixed regression in :meth:`to_datetime` when parsing non-nanosecond resolution datetimes (:issue:`31491`)

pandas/core/groupby/groupby.py

+3 -1
@@ -1379,7 +1379,9 @@ def f(self, **kwargs):
                 except DataError:
                     pass
                 except NotImplementedError as err:
-                    if "function is not implemented for this dtype" in str(err):
+                    if "function is not implemented for this dtype" in str(
+                        err
+                    ) or "category dtype not supported" in str(err):
                         # raised in _get_cython_function, in some cases can
                         # be trimmed by implementing cython funcs for more dtypes
                         pass
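
The hunk above widens an existing "try the fast path, fall back on a known failure" check so that a second error message is also treated as recoverable. A minimal, library-free sketch of that pattern follows. Every name in it (`FALLBACK_MESSAGES`, `fast_first`, `slow_first`, `first`) is a hypothetical stand-in rather than a pandas API, and re-raising unrecognized errors is an assumption of this sketch, since the diff does not show what follows the `pass`.

```python
# Hypothetical stand-ins; none of these names come from pandas.
FALLBACK_MESSAGES = (
    "function is not implemented for this dtype",
    "category dtype not supported",
)


def fast_first(values, dtype):
    """Stand-in for a compiled fast path that rejects some dtypes."""
    if dtype == "category":
        raise NotImplementedError("category dtype not supported")
    if dtype == "unknown":
        raise NotImplementedError("unexpected failure")
    return values[0]


def slow_first(values):
    """Stand-in for the generic pure-Python fallback."""
    for value in values:
        return value
    raise ValueError("empty group")


def first(values, dtype):
    """Try the fast path; fall back only on recognized error messages."""
    try:
        return fast_first(values, dtype)
    except NotImplementedError as err:
        # Assumption of this sketch: anything unrecognized is re-raised.
        if not any(msg in str(err) for msg in FALLBACK_MESSAGES):
            raise
    return slow_first(values)
```

Matching on message substrings keeps the change small, which suits a backport, but it couples the caller to the exact wording of the error raised by the fast path.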

pandas/tests/groupby/aggregate/test_aggregate.py

+16
@@ -378,6 +378,22 @@ def test_agg_index_has_complex_internals(index):
     tm.assert_frame_equal(result, expected)
 
 
+def test_agg_cython_category_not_implemented_fallback():
+    # https://github.com/pandas-dev/pandas/issues/31450
+    df = pd.DataFrame({"col_num": [1, 1, 2, 3]})
+    df["col_cat"] = df["col_num"].astype("category")
+
+    result = df.groupby("col_num").col_cat.first()
+    expected = pd.Series(
+        [1, 2, 3], index=pd.Index([1, 2, 3], name="col_num"), name="col_cat"
+    )
+    tm.assert_series_equal(result, expected)
+
+    result = df.groupby("col_num").agg({"col_cat": "first"})
+    expected = expected.to_frame()
+    tm.assert_frame_equal(result, expected)
+
+
 class TestNamedAggregationSeries:
     def test_series_named_agg(self):
         df = pd.Series([1, 2, 3, 4])
