Commit e036cf9

debnathshoham authored and feefladder committed
BUG: groupby.apply incorrectly dropping nan (pandas-dev#43236)
1 parent 7583668 commit e036cf9

3 files changed: +11 -12 lines changed

doc/source/whatsnew/v1.3.3.rst

+1
@@ -17,6 +17,7 @@ Fixed regressions
 - Fixed regression in :class:`DataFrame` constructor failing to broadcast for defined :class:`Index` and len one list of :class:`Timestamp` (:issue:`42810`)
 - Performance regression in :meth:`core.window.ewm.ExponentialMovingWindow.mean` (:issue:`42333`)
 - Fixed regression in :meth:`.GroupBy.agg` incorrectly raising in some cases (:issue:`42390`)
+- Fixed regression in :meth:`.GroupBy.apply` where ``nan`` values were dropped even with ``dropna=False`` (:issue:`43205`)
 - Fixed regression in :meth:`.GroupBy.quantile` which was failing with ``pandas.NA`` (:issue:`42849`)
 - Fixed regression in :meth:`merge` where ``on`` columns with ``ExtensionDtype`` or ``bool`` data types were cast to ``object`` in ``right`` and ``outer`` merge (:issue:`40073`)
 - Fixed regression in :meth:`RangeIndex.where` and :meth:`RangeIndex.putmask` raising ``AssertionError`` when result did not represent a :class:`RangeIndex` (:issue:`43240`)
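
The behaviour described by the new release note can be checked with a small script. This is a minimal sketch, assuming pandas 1.3.3 or later; the frame mirrors the one in the test but uses a default unique index, and `identity` is a hypothetical helper name. A single column is selected and `group_keys=False` is passed so the outcome does not depend on how a given pandas version treats the grouping column or group keys inside `apply`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"col": [1, 2, 3, 4, 5], "group": ["a", np.nan, np.nan, "b", "b"]}
)


def identity(values):
    # Return each group unchanged, so the result stays indexed like the input.
    return values


# dropna=False: rows whose group key is NaN should survive the apply.
with_nan = df.groupby("group", dropna=False, group_keys=False)["col"].apply(identity)

# dropna=True: rows whose group key is NaN are excluded.
without_nan = df.groupby("group", dropna=True, group_keys=False)["col"].apply(identity)

print(sorted(with_nan.tolist()))
print(sorted(without_nan.tolist()))
```

With the fix in place, the first result should hold all five values and the second only the three values whose key is not missing.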

pandas/core/groupby/groupby.py

+5 -1
@@ -1011,7 +1011,11 @@ def reset_identity(values):
 
         if not not_indexed_same:
             result = concat(values, axis=self.axis)
-            ax = self.filter(lambda x: True).axes[self.axis]
+            ax = (
+                self.filter(lambda x: True).axes[self.axis]
+                if self.dropna
+                else self._selected_obj._get_axis(self.axis)
+            )
 
             # this is a very unfortunate situation
             # we can't use reindex to restore the original order
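
The conditional above picks which axis the concatenated result is aligned to. The idea can be sketched with public API only. This is not the pandas implementation: `restore_original_order` is a hypothetical helper, the per-group pieces are built by hand, and a unique index is assumed so that a plain `reindex` is enough.

```python
import numpy as np
import pandas as pd


def restore_original_order(df, key, pieces, dropna):
    """Hypothetical helper: concatenate per-group pieces, then reorder rows."""
    result = pd.concat(pieces)
    if dropna:
        # Only rows with a non-missing key are expected in the output.
        target = df.index[df[key].notna()]
    else:
        # Every original row is expected, including those keyed by NaN.
        target = df.index
    return result.reindex(target)


df = pd.DataFrame(
    {"col": [1, 2, 3, 4, 5], "group": ["a", np.nan, np.nan, "b", "b"]}
)

# Hand-built groups: "a", "b", then the rows whose key is missing.
pieces = [
    df[df["group"] == "a"],
    df[df["group"] == "b"],
    df[df["group"].isna()],
]

kept = restore_original_order(df, "group", pieces, dropna=False)
dropped = restore_original_order(df, "group", pieces, dropna=True)

print(kept["col"].tolist())
print(dropped["col"].tolist())
```

Aligning to the full original axis when `dropna` is false is what keeps the NaN-keyed rows; aligning to the filtered axis would discard them regardless of the flag.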

pandas/tests/groupby/test_apply.py

+5 -11
@@ -1062,25 +1062,19 @@ def test_apply_by_cols_equals_apply_by_rows_transposed():
     tm.assert_frame_equal(by_cols, df)
 
 
-def test_apply_dropna_with_indexed_same():
+@pytest.mark.parametrize("dropna", [True, False])
+def test_apply_dropna_with_indexed_same(dropna):
     # GH 38227
-
+    # GH#43205
     df = DataFrame(
         {
             "col": [1, 2, 3, 4, 5],
             "group": ["a", np.nan, np.nan, "b", "b"],
         },
         index=list("xxyxz"),
     )
-    result = df.groupby("group").apply(lambda x: x)
-    expected = DataFrame(
-        {
-            "col": [1, 4, 5],
-            "group": ["a", "b", "b"],
-        },
-        index=list("xxz"),
-    )
-
+    result = df.groupby("group", dropna=dropna).apply(lambda x: x)
+    expected = df.dropna() if dropna else df.iloc[[0, 3, 1, 2, 4]]
     tm.assert_frame_equal(result, expected)
 
 
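
The `df.iloc[[0, 3, 1, 2, 4]]` expectation for `dropna=False` is not the original row order, because the test index `xxyxz` contains duplicates. The sketch below shows one way to arrive at that ordering. It rests on two assumptions that are not shown in the hunks above: that groups are concatenated in sorted order with the NaN group last, and that the duplicate-index fallback regroups rows by unique index label. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Index labels of the five rows in the test frame, by row position.
original_labels = list("xxyxz")

# Row positions in assumed concatenation order:
# group "a" -> [0], group "b" -> [3, 4], NaN group -> [1, 2].
concat_positions = np.array([0, 3, 4, 1, 2])
concat_index = pd.Index([original_labels[i] for i in concat_positions])

# Unique labels in first-seen order: x, y, z.
target = pd.Index(original_labels).unique()

# For each unique label, collect every position it occupies in the
# concatenated result; the second return value lists labels not found.
indexer, _ = concat_index.get_indexer_non_unique(target)

# Map back to positions in the original frame.
final_positions = concat_positions[indexer].tolist()
print(final_positions)
```

Under these assumptions the script should reproduce the `[0, 3, 1, 2, 4]` ordering used in the test: all rows labelled `x` first, then `y`, then `z`.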
