
Commit 10a95d5

topper-123 authored and victor committed
BUG: bug in GroupBy.count where arg minlength passed to np.bincount must be None for np<1.13 (pandas-dev#21957)
1 parent 306e794 commit 10a95d5

3 files changed: +14 -4 lines changed


doc/source/whatsnew/v0.24.0.txt

+3 -3
@@ -536,11 +536,11 @@ Groupby/Resample/Rolling
 
 - Bug in :func:`pandas.core.groupby.GroupBy.first` and :func:`pandas.core.groupby.GroupBy.last` with ``as_index=False`` leading to the loss of timezone information (:issue:`15884`)
 - Bug in :meth:`DatetimeIndex.resample` when downsampling across a DST boundary (:issue:`8531`)
--
--
-
+- Bug where ``ValueError`` is wrongly raised when calling :func:`~pandas.core.groupby.SeriesGroupBy.count` method of a
+  ``SeriesGroupBy`` when the grouping variable only contains NaNs and numpy version < 1.13 (:issue:`21956`).
 - Multiple bugs in :func:`pandas.core.Rolling.min` with ``closed='left'` and a
   datetime-like index leading to incorrect results and also segfault. (:issue:`21704`)
+-
 
 Sparse
 ^^^^^^

pandas/core/groupby/generic.py

+1 -1
@@ -1207,7 +1207,7 @@ def count(self):
 
         mask = (ids != -1) & ~isna(val)
         ids = ensure_platform_int(ids)
-        out = np.bincount(ids[mask], minlength=ngroups or 0)
+        out = np.bincount(ids[mask], minlength=ngroups or None)
 
         return Series(out,
                       index=self.grouper.result_index,
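
For readers unfamiliar with the counting step touched by this hunk, the following is a minimal sketch of the same idea outside pandas. The helper name `count_per_group` and the sample data are hypothetical and not part of the commit. The sketch passes a plain integer for `minlength`, which assumes NumPy 1.13 or later; the commit passes `None` instead when `ngroups` is 0 so that older NumPy versions do not raise.

```python
import numpy as np


def count_per_group(ids, values, ngroups):
    """Count non-NaN values per group id (hypothetical helper, not pandas API).

    ids uses -1 for rows whose group key is missing, mirroring the
    ``ids != -1`` check in the hunk above.
    """
    ids = np.asarray(ids, dtype=np.intp)
    values = np.asarray(values, dtype=float)
    # Keep only rows that belong to a group and hold a non-NaN value.
    mask = (ids != -1) & ~np.isnan(values)
    # Assumption: NumPy >= 1.13, where minlength=0 is accepted.
    return np.bincount(ids[mask], minlength=ngroups)


# Two groups: group 0 has two valid values, group 1 only a NaN.
counts = count_per_group([0, 1, 0, -1], [1.0, np.nan, 2.0, 3.0], ngroups=2)

# Every key is missing, so there are no groups and no rows to count.
empty = count_per_group([-1, -1], [1.0, 2.0], ngroups=0)
```

The second call corresponds to the case in the commit title: every grouping key is missing, so `ngroups` is 0 and the array handed to `np.bincount` is empty.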

pandas/tests/groupby/test_counting.py

+10
@@ -212,3 +212,13 @@ def test_count_with_datetimelike(self, datetimelike):
         expected = DataFrame({'y': [2, 1]}, index=['a', 'b'])
         expected.index.name = "x"
         assert_frame_equal(expected, res)
+
+    def test_count_with_only_nans_in_first_group(self):
+        # GH21956
+        df = DataFrame({'A': [np.nan, np.nan], 'B': ['a', 'b'], 'C': [1, 2]})
+        result = df.groupby(['A', 'B']).C.count()
+        mi = MultiIndex(levels=[[], ['a', 'b']],
+                        labels=[[], []],
+                        names=['A', 'B'])
+        expected = Series([], index=mi, dtype=np.int64, name='C')
+        assert_series_equal(result, expected, check_index_type=False)
