Commit 277cc2a

JDkuba authored and rhshadrach committed
BUG: None converted to NaN after groupby first and last (pandas-dev#33462)
* BUG: None converted after groupby first and last
* BUG: whatsnew entry
1 parent 4c4dd9e commit 277cc2a

3 files changed, +18 -2 lines changed

doc/source/whatsnew/v1.1.0.rst

+1
@@ -598,6 +598,7 @@ Groupby/resample/rolling
 - Bug in :meth:`DataFrameGroupBy.agg` with dictionary input losing ``ExtensionArray`` dtypes (:issue:`32194`)
 - Bug in :meth:`DataFrame.resample` where an ``AmbiguousTimeError`` would be raised when the resulting timezone aware :class:`DatetimeIndex` had a DST transition at midnight (:issue:`25758`)
 - Bug in :meth:`DataFrame.groupby` where a ``ValueError`` would be raised when grouping by a categorical column with read-only categories and ``sort=False`` (:issue:`33410`)
+- Bug in :meth:`GroupBy.first` and :meth:`GroupBy.last` where None is not preserved in object dtype (:issue:`32800`)
 
 Reshaping
 ^^^^^^^^^

pandas/_libs/groupby.pyx

+6 -2
@@ -893,7 +893,9 @@ def group_last(rank_t[:, :] out,
             for j in range(K):
                 val = values[i, j]
 
-                if not checknull(val):
+                # None should not be treated like other NA-like
+                # so that it won't be converted to nan
+                if not checknull(val) or val is None:
                     # NB: use _treat_as_na here once
                     # conditional-nogil is available.
                     nobs[lab, j] += 1
@@ -986,7 +988,9 @@ def group_nth(rank_t[:, :] out,
             for j in range(K):
                 val = values[i, j]
 
-                if not checknull(val):
+                # None should not be treated like other NA-like
+                # so that it won't be converted to nan
+                if not checknull(val) or val is None:
                     # NB: use _treat_as_na here once
                     # conditional-nogil is available.
                     nobs[lab, j] += 1
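
Reviewer note: the effect of the new condition can be traced with a small pure-Python model. This is only a sketch, not the Cython implementation: `is_na_like` is a hypothetical stand-in for `checknull` (assumed here to flag `None` and float NaN), `group_last_sketch` and its `keep_none` flag are illustrative names, and the NaN fallback for groups with no counted observation is an assumption drawn from the comment in the diff.

```python
import math


def is_na_like(val):
    """Hypothetical stand-in for checknull: True for None and float NaN."""
    return val is None or (isinstance(val, float) and math.isnan(val))


def group_last_sketch(values, labels, ngroups, keep_none=True):
    """Return the last counted value per group, or NaN if nothing was counted.

    keep_none=True mirrors the patched condition
    `not checknull(val) or val is None`; keep_none=False mirrors the old one.
    """
    nobs = [0] * ngroups
    last = [None] * ngroups
    for val, lab in zip(values, labels):
        if lab < 0:
            continue  # negative labels mark rows that belong to no group
        if not is_na_like(val) or (keep_none and val is None):
            nobs[lab] += 1
            last[lab] = val
    # Assumption: groups with no counted observation fall back to NaN.
    return [last[g] if nobs[g] else math.nan for g in range(ngroups)]


values = ["x", None, float("nan"), None]
labels = [0, 0, 1, 2]

patched = group_last_sketch(values, labels, ngroups=3)
legacy = group_last_sketch(values, labels, ngroups=3, keep_none=False)
print(patched)
print(legacy)
```

In this model the patched condition leaves the all-`None` group (label 2) as `None`, where the old condition would yield NaN. It also shows a side effect of the condition as written: a trailing `None` in a group that has other values (label 0) now counts as an observation.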

pandas/tests/groupby/test_nth.py

+11
@@ -94,6 +94,17 @@ def test_nth_with_na_object(index, nulls_fixture):
     tm.assert_frame_equal(result, expected)
 
 
+@pytest.mark.parametrize("method", ["first", "last"])
+def test_first_last_with_None(method):
+    # https://github.com/pandas-dev/pandas/issues/32800
+    # None should be preserved as object dtype
+    df = pd.DataFrame.from_dict({"id": ["a"], "value": [None]})
+    groups = df.groupby("id", as_index=False)
+    result = getattr(groups, method)()
+
+    tm.assert_frame_equal(result, df)
+
+
 def test_first_last_nth_dtypes(df_mixed_floats):
 
     df = df_mixed_floats.copy()
