Commit b43d79d

BUG: groupby.rank with nullable types (#54460)
1 parent 52036d9 commit b43d79d

3 files changed, +13 -0 lines changed

doc/source/whatsnew/v2.1.0.rst

+1
@@ -789,6 +789,7 @@ Plotting
 Groupby/resample/rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^
 - Bug in :meth:`.DataFrameGroupBy.idxmin`, :meth:`.SeriesGroupBy.idxmin`, :meth:`.DataFrameGroupBy.idxmax`, :meth:`.SeriesGroupBy.idxmax` returns wrong dtype when used on an empty DataFrameGroupBy or SeriesGroupBy (:issue:`51423`)
+- Bug in :meth:`DataFrame.groupby.rank` on nullable datatypes when passing ``na_option="bottom"`` or ``na_option="top"`` (:issue:`54206`)
 - Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` in incorrectly allowing non-fixed ``freq`` when resampling on a :class:`TimedeltaIndex` (:issue:`51896`)
 - Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` losing time zone when resampling empty data (:issue:`53664`)
 - Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` where ``origin`` has no effect in resample when values are outside of axis (:issue:`53662`)

pandas/core/arrays/masked.py

+3
@@ -1487,6 +1487,9 @@ def _groupby_op(
         else:
             result_mask = np.zeros(ngroups, dtype=bool)
 
+        if how == "rank" and kwargs.get("na_option") in ["top", "bottom"]:
+            result_mask[:] = False
+
         res_values = op._cython_op_ndim_compat(
             self._data,
             min_count=min_count,
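
The guard added in this hunk can be illustrated with a small toy model. The sketch below is not pandas' implementation: `masked_min_rank` is a hypothetical helper written in pure Python, assuming a nullable array is a list of values plus a boolean mask marking missing entries, and assuming "min" ranking. It shows that with `na_option="top"` or `na_option="bottom"` every position receives a real rank, so the result mask has to be all `False`; presumably the result would otherwise still report those positions as missing.

```python
def masked_min_rank(values, mask, na_option="keep"):
    """Toy model of ranking a masked (nullable) array; NOT pandas' code.

    values: list of floats; mask[i] is True where values[i] is missing.
    Returns (ranks, result_mask) using "min" ranking.
    """
    if na_option not in ("keep", "top", "bottom"):
        raise ValueError(f"unknown na_option: {na_option!r}")

    n = len(values)
    valid = [v for v, m in zip(values, mask) if not m]
    n_missing = n - len(valid)

    # With "top", missing entries take the lowest rank, so valid ranks shift up.
    offset = n_missing if na_option == "top" else 0

    ranks = []
    for v, m in zip(values, mask):
        if not m:
            # "min" method: 1 + number of strictly smaller valid values.
            ranks.append(float(1 + offset + sum(u < v for u in valid)))
        elif na_option == "top":
            ranks.append(1.0)
        elif na_option == "bottom":
            ranks.append(float(len(valid) + 1))
        else:
            ranks.append(float("nan"))  # placeholder, hidden by the mask

    if na_option in ("top", "bottom"):
        # Same idea as the guard in the hunk: every position now holds a
        # real rank, so nothing may stay masked in the result.
        result_mask = [False] * n
    else:
        result_mask = list(mask)
    return ranks, result_mask


# A single missing value, as in the regression test for GH 54206.
print(masked_min_rank([0.0], [True], na_option="top"))  # ([1.0], [False])
```

With `na_option="keep"` the same call keeps the mask as `[True]`, which is the case the guard deliberately leaves alone.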

pandas/tests/groupby/test_rank.py

+9
@@ -710,3 +710,12 @@ def test_rank_categorical():
 
     expected = df.astype(object).groupby("col1").rank()
     tm.assert_frame_equal(res, expected)
+
+
+@pytest.mark.parametrize("na_option", ["top", "bottom"])
+def test_groupby_op_with_nullables(na_option):
+    # GH 54206
+    df = DataFrame({"x": [None]}, dtype="Float64")
+    result = df.groupby("x", dropna=False)["x"].rank(method="min", na_option=na_option)
+    expected = Series([1.0], dtype="Float64", name=result.name)
+    tm.assert_series_equal(result, expected)
