Commit 947f5ae

Backport PR #57323 on branch 2.2.x (REGR: Fix regression when grouping over a Series) (#57339)
Backport PR #57323: REGR: Fix regression when grouping over a Series

Co-authored-by: Patrick Hoefler <[email protected]>
1 parent 10b26fe commit 947f5ae

3 files changed, +14 -3 lines changed

doc/source/whatsnew/v2.2.1.rst

+1
@@ -21,6 +21,7 @@ Fixed regressions
 - Fixed regression in :meth:`.DataFrameGroupBy.idxmin`, :meth:`.DataFrameGroupBy.idxmax`, :meth:`.SeriesGroupBy.idxmin`, :meth:`.SeriesGroupBy.idxmax` ignoring the ``skipna`` argument (:issue:`57040`)
 - Fixed regression in :meth:`.DataFrameGroupBy.idxmin`, :meth:`.DataFrameGroupBy.idxmax`, :meth:`.SeriesGroupBy.idxmin`, :meth:`.SeriesGroupBy.idxmax` where values containing the minimum or maximum value for the dtype could produce incorrect results (:issue:`57040`)
 - Fixed regression in :meth:`CategoricalIndex.difference` raising ``KeyError`` when other contains null values other than NaN (:issue:`57318`)
+- Fixed regression in :meth:`DataFrame.groupby` raising ``ValueError`` when grouping by a :class:`Series` in some cases (:issue:`57276`)
 - Fixed regression in :meth:`DataFrame.loc` raising ``IndexError`` for non-unique, masked dtype indexes where result has more than 10,000 rows (:issue:`57027`)
 - Fixed regression in :meth:`DataFrame.merge` raising ``ValueError`` for certain types of 3rd-party extension arrays (:issue:`57316`)
 - Fixed regression in :meth:`DataFrame.sort_index` not producing a stable sort for a index with duplicates (:issue:`57151`)

pandas/core/internals/managers.py

+2 -3
@@ -12,7 +12,6 @@
     cast,
 )
 import warnings
-import weakref

 import numpy as np

@@ -282,8 +281,8 @@ def references_same_values(self, mgr: BaseBlockManager, blkno: int) -> bool:
         Checks if two blocks from two different block managers reference the
         same underlying values.
         """
-        ref = weakref.ref(self.blocks[blkno])
-        return ref in mgr.blocks[blkno].refs.referenced_blocks
+        blk = self.blocks[blkno]
+        return any(blk is ref() for ref in mgr.blocks[blkno].refs.referenced_blocks)

     def get_dtypes(self) -> npt.NDArray[np.object_]:
         dtypes = np.array([blk.dtype for blk in self.blocks], dtype=object)
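
The hunk above swaps a membership test on weak references (`ref in ...referenced_blocks`) for an explicit identity check on each referent. The commit does not spell out why, but one plausible reading is that comparing two live `weakref.ref` objects with `==` delegates to the referents' own `__eq__`, which for array-like objects may raise or return a non-boolean. The sketch below illustrates that behavior with the standard library only. `Elementwise`, `contains_by_equality`, and `contains_by_identity` are hypothetical names made up for this illustration; they are not pandas code.

```python
import weakref


class Elementwise:
    """Hypothetical stand-in for an array-like object (not a pandas class)."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Mimics element-wise comparison: fails on length mismatch and
        # otherwise returns a list rather than a single bool.
        if len(self.values) != len(other.values):
            raise ValueError("Lengths must match to compare")
        return [a == b for a, b in zip(self.values, other.values)]

    # Defining __eq__ disables hashing by default; restore identity hashing.
    __hash__ = object.__hash__


def contains_by_equality(obj, refs):
    # Old pattern: `in` compares weakrefs with ==, and two live weakrefs
    # compare by calling == on their referents.
    return weakref.ref(obj) in refs


def contains_by_identity(obj, refs):
    # New pattern: dereference each weakref and compare by identity only.
    return any(obj is ref() for ref in refs)


a = Elementwise([1, 2, 3])
b = Elementwise([1, 2])
refs = [weakref.ref(b), weakref.ref(a)]

try:
    contains_by_equality(a, refs)
    equality_failed = False
except ValueError:
    # Comparing the weakref to `a` against the weakref to `b` ran a == b.
    equality_failed = True

identity_result = contains_by_identity(a, refs)
```

In this sketch the identity version never calls `__eq__` on the referents, so it does not depend on how the referenced objects define equality.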

pandas/tests/copy_view/test_methods.py

+11
@@ -280,6 +280,17 @@ def test_reset_index_series_drop(using_copy_on_write, index):
     tm.assert_series_equal(ser, ser_orig)


+def test_groupby_column_index_in_references():
+    df = DataFrame(
+        {"A": ["a", "b", "c", "d"], "B": [1, 2, 3, 4], "C": ["a", "a", "b", "b"]}
+    )
+    df = df.set_index("A")
+    key = df["C"]
+    result = df.groupby(key, observed=True).sum()
+    expected = df.groupby("C", observed=True).sum()
+    tm.assert_frame_equal(result, expected)
+
+
 def test_rename_columns(using_copy_on_write):
     # Case: renaming columns returns a new dataframe
     # + afterwards modifying the result
