Commit fa87040

phofl authored and noatamir committed
BUG: Groupby cumprod nan influences other columns with skipna False (pandas-dev#48064)
1 parent 80a9242 commit fa87040

3 files changed: +15 -1 lines changed

doc/source/whatsnew/v1.5.0.rst

+1
@@ -1077,6 +1077,7 @@ Groupby/resample/rolling
 - Bug in :meth:`.GroupBy.cummin` and :meth:`.GroupBy.cummax` with nullable dtypes incorrectly altering the original data in place (:issue:`46220`)
 - Bug in :meth:`DataFrame.groupby` raising error when ``None`` is in first level of :class:`MultiIndex` (:issue:`47348`)
 - Bug in :meth:`.GroupBy.cummax` with ``int64`` dtype with leading value being the smallest possible int64 (:issue:`46382`)
+- Bug in :meth:`GroupBy.cumprod` ``NaN`` influences calculation in different columns with ``skipna=False`` (:issue:`48064`)
 - Bug in :meth:`.GroupBy.max` with empty groups and ``uint64`` dtype incorrectly raising ``RuntimeError`` (:issue:`46408`)
 - Bug in :meth:`.GroupBy.apply` would fail when ``func`` was a string and args or kwargs were supplied (:issue:`46479`)
 - Bug in :meth:`SeriesGroupBy.apply` would incorrectly name its result when there was a unique group (:issue:`46369`)
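
The whatsnew entry above describes user-facing behavior, which can be illustrated roughly as follows. This is a minimal sketch, assuming a pandas version that includes this fix; the frame mirrors the one in the commit's test, and the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "a": [1, 1, 1],            # single group
        "b": [1.0, np.nan, 2.0],   # contains a NaN
        "c": [1.0, 2.0, 3.0],      # no missing values
    }
)

# skipna=False: the NaN in "b" propagates down "b" only;
# "c" is accumulated independently.
propagated = df.groupby("a").cumprod(skipna=False)

# Default (skipna=True): the NaN row stays NaN in "b",
# but the running product in "b" resumes afterwards.
skipped = df.groupby("a").cumprod()

print(propagated)
print(skipped)
```

Per the test added in this commit, column "c" is expected to come out as 1, 2, 6 regardless of the NaN in column "b".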

pandas/_libs/groupby.pyx

-1
@@ -204,7 +204,6 @@ def group_cumprod_float64(
                     out[i, j] = NaN
                     if not skipna:
                         accum[lab, j] = NaN
-                        break
 
 
 @cython.boundscheck(False)
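
Why the removed `break` mattered can be sketched in plain Python. The helper below is a hypothetical stand-in for the kernel's loop, not the actual Cython code: the name `group_cumprod_sketch`, the `break_on_nan` flag, and the choice to pre-fill `out` with NaN are all assumptions made for illustration. The point it shows is that `break` exits the loop over columns, so every column after the NaN is skipped for that row.

```python
import math


def group_cumprod_sketch(values, labels, ngroups, skipna=True, break_on_nan=False):
    """Hypothetical pure-Python model of a grouped cumulative product.

    values: list of rows, each a list of floats (one entry per column).
    labels: group id per row; negative ids are ignored.
    break_on_nan=True mimics the pre-fix behavior (the removed ``break``).
    """
    n_rows = len(values)
    n_cols = len(values[0]) if n_rows else 0
    accum = [[1.0] * n_cols for _ in range(ngroups)]
    # Assumption: output cells that are never written are shown as NaN.
    out = [[math.nan] * n_cols for _ in range(n_rows)]

    for i in range(n_rows):
        lab = labels[i]
        if lab < 0:
            continue
        for j in range(n_cols):
            val = values[i][j]
            if val == val:  # False only for NaN
                accum[lab][j] *= val
                out[i][j] = accum[lab][j]
            else:
                out[i][j] = math.nan
                if not skipna:
                    accum[lab][j] = math.nan
                    if break_on_nan:
                        break  # leaves the column loop: later columns are skipped
    return out


rows = [[1.0, 1.0], [math.nan, 2.0], [2.0, 3.0]]
groups = [0, 0, 0]

before_fix = group_cumprod_sketch(rows, groups, 1, skipna=False, break_on_nan=True)
after_fix = group_cumprod_sketch(rows, groups, 1, skipna=False)
```

In this model, the second column of `after_fix` accumulates to 1, 2, 6, while in `before_fix` the second row of that column is never updated, so the factor 2 is lost from its running product.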

pandas/tests/groupby/test_function.py

+14
@@ -650,6 +650,20 @@ def test_groupby_cumprod():
     tm.assert_series_equal(actual, expected)
 
 
+def test_groupby_cumprod_nan_influences_other_columns():
+    # GH#48064
+    df = DataFrame(
+        {
+            "a": 1,
+            "b": [1, np.nan, 2],
+            "c": [1, 2, 3.0],
+        }
+    )
+    result = df.groupby("a").cumprod(numeric_only=True, skipna=False)
+    expected = DataFrame({"b": [1, np.nan, np.nan], "c": [1, 2, 6.0]})
+    tm.assert_frame_equal(result, expected)
+
+
 def scipy_sem(*args, **kwargs):
     from scipy.stats import sem
 