Commit b552e28

groupby apply: Ensure same index is returned for slow and fast path
1 parent d106b81 commit b552e28

3 files changed: +21 -1 lines changed
doc/source/whatsnew/v1.1.0.rst

+1
@@ -550,6 +550,7 @@ Groupby/resample/rolling
 - Bug in :meth:`DataFrameGroupBy.agg` with dictionary input losing ``ExtensionArray`` dtypes (:issue:`32194`)
 - Bug in :meth:`DataFrame.resample` where an ``AmbiguousTimeError`` would be raised when the resulting timezone aware :class:`DatetimeIndex` had a DST transition at midnight (:issue:`25758`)
 - Bug in :meth:`DataFrame.groupby` where a ``ValueError`` would be raised when grouping by a categorical column with read-only categories and ``sort=False`` (:issue:`33410`)
+- Bug in :meth:`core.groupby.DataFrameGroupBy.apply` where the result shape was incorrect when the output index was not identical to the input index (:issue:`31612`)
 
 Reshaping
 ^^^^^^^^^

pandas/_libs/reduction.pyx

+1 -1
@@ -502,7 +502,7 @@ def apply_frame_axis0(object frame, object f, object names,
 # Need to infer if low level index slider will cause segfaults
 require_slow_apply = i == 0 and piece is chunk
 try:
-    if piece.index is not chunk.index:
+    if not piece.index.equals(chunk.index):
         mutated = True
 except AttributeError:
     # `piece` might not have an index, could be e.g. an int
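
The one-line change above replaces an identity check (`is not`) with a value comparison (`Index.equals`). The snippet below is a minimal sketch of that difference using public pandas objects only. The variable names are hypothetical and it does not call the Cython `apply_frame_axis0` routine itself; it only shows why two indexes with the same labels can fail an identity check while passing `equals`.

```python
import pandas as pd

# Two separately constructed indexes holding the same labels
# (stand-ins for chunk.index and piece.index; names are illustrative).
input_index = pd.Index([0, 1, 2])
output_index = pd.Index([0, 1, 2])

# An index whose labels genuinely differ in order.
reordered_index = pd.Index([2, 1, 0])

# Identity check: distinct objects, so this is False even though the
# labels match. The old code would have flagged this as "mutated".
same_object = output_index is input_index

# Value-based check: True when the labels match element by element.
same_labels = output_index.equals(input_index)

# A real change to the index is still detected by equals().
reordered_equal = reordered_index.equals(input_index)

print(same_object, same_labels, reordered_equal)
```

With the value-based check, a function that returns an equal but distinct index (for example via `group.copy()`) is no longer treated as having mutated the group, while a reordered or relabelled index still is.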

pandas/tests/groupby/test_apply.py

+19
@@ -901,3 +901,22 @@ def fn(x):
         name="col2",
     )
     tm.assert_series_equal(result, expected)
+
+
+def test_apply_fast_slow_identical():
+    # GH 31613
+
+    df = DataFrame({"A": [0, 0, 1], "b": range(3)})
+
+    # For simple index structures we check for fast/slow apply using
+    # an identity check on in/output
+    def slow(group):
+        return group
+
+    def fast(group):
+        return group.copy()
+
+    fast_df = df.groupby("A").apply(fast)
+    slow_df = df.groupby("A").apply(slow)
+
+    tm.assert_frame_equal(fast_df, slow_df)
