Commit 869e15a

Backport PR pandas-dev#36114: REGR: fix consolidation/cache issue with take operation (pandas-dev#36135)
Co-authored-by: Joris Van den Bossche <[email protected]>
1 parent 45bf911 commit 869e15a

3 files changed: +26 -0 lines changed

doc/source/whatsnew/v1.1.2.rst

+1
@@ -17,6 +17,7 @@ Fixed regressions
 - Regression in :meth:`DatetimeIndex.intersection` incorrectly raising ``AssertionError`` when intersecting against a list (:issue:`35876`)
 - Fix regression in updating a column inplace (e.g. using ``df['col'].fillna(.., inplace=True)``) (:issue:`35731`)
 - Performance regression for :meth:`RangeIndex.format` (:issue:`35712`)
+- Fix regression in invalid cache after an indexing operation; this can manifest when setting which does not update the data (:issue:`35521`)
 - Regression in :meth:`DataFrame.replace` where a ``TypeError`` would be raised when attempting to replace elements of type :class:`Interval` (:issue:`35931`)
 - Fix regression in pickle roundtrip of the ``closed`` attribute of :class:`IntervalIndex` (:issue:`35658`)
 - Fixed regression in :meth:`DataFrameGroupBy.agg` where a ``ValueError: buffer source array is read-only`` would be raised when the underlying array is read-only (:issue:`36014`)

pandas/core/generic.py

+2
@@ -3342,6 +3342,8 @@ class max_speed
 
         nv.validate_take(tuple(), kwargs)
 
+        self._consolidate_inplace()
+
         new_data = self._mgr.take(
             indices, axis=self._get_block_manager_axis(axis), verify=True
         )
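
Note on the change above: the test comment for GH#35521 describes the regression as a take operation that consolidated the data without resetting the item cache, so a later `.at` write landed in a stale cached column. The sketch below is a toy model of that failure mode in plain Python, not pandas internals. `ToyFrame`, its methods, and the `consolidate_first` flag are hypothetical names invented for illustration, and the assumption that a scalar write goes through the cached column is a simplification.

```python
class ToyFrame:
    """Hypothetical stand-in for a frame with column blocks and an item cache."""

    def __init__(self):
        self.blocks = []       # one single-column dict per inserted column
        self.item_cache = {}   # column name -> cached list object

    def add_column(self, name, values):
        self.blocks.append({name: list(values)})

    def get_column(self, name):
        # Populate the cache on first access, then always serve from it.
        if name not in self.item_cache:
            for block in self.blocks:
                if name in block:
                    self.item_cache[name] = block[name]
                    break
        return self.item_cache[name]

    def _merge_blocks(self):
        # Copy every column into one new block; old lists become orphans.
        merged = {}
        for block in self.blocks:
            for name, values in block.items():
                merged[name] = list(values)
        self.blocks = [merged]

    def consolidate_inplace(self):
        # Frame-level consolidation: drop the cache if the block layout changed.
        n_before = len(self.blocks)
        self._merge_blocks()
        if len(self.blocks) != n_before:
            self.item_cache.clear()

    def take(self, indices, consolidate_first):
        if consolidate_first:
            self.consolidate_inplace()   # models the patched code path
        else:
            self._merge_blocks()         # models the regression: cache untouched
        out = ToyFrame()
        for name, values in self.blocks[0].items():
            out.add_column(name, [values[i] for i in indices])
        return out

    def set_value(self, name, row, value):
        # Assumption: scalar writes go through the cached column.
        self.get_column(name)[row] = value

    def stored_value(self, name, row):
        for block in self.blocks:
            if name in block:
                return block[name][row]
        raise KeyError(name)


def run(consolidate_first):
    frame = ToyFrame()
    frame.add_column("col1", ["a"])
    frame.add_column("col2", [0])
    frame.get_column("col1")             # access column (fills the item cache)
    frame.take([0], consolidate_first)   # take operation
    frame.set_value("col1", 0, "A")      # now set a value
    return frame.stored_value("col1", 0)


buggy = run(consolidate_first=False)   # write is lost in the stale cached list
fixed = run(consolidate_first=True)    # write reaches the stored data
```

In this model, consolidating at the frame level before the take is what gives the cache a chance to be cleared, which mirrors the intent of calling `self._consolidate_inplace()` ahead of `self._mgr.take(...)`.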

pandas/tests/frame/test_block_internals.py

+23
@@ -640,3 +640,26 @@ def test_update_inplace_sets_valid_block_values():
 
     # smoketest for OP bug from GH#35731
     assert df.isnull().sum().sum() == 0
+
+
+def test_nonconsolidated_item_cache_take():
+    # https://github.com/pandas-dev/pandas/issues/35521
+
+    # create non-consolidated dataframe with object dtype columns
+    df = pd.DataFrame()
+    df["col1"] = pd.Series(["a"], dtype=object)
+    df["col2"] = pd.Series([0], dtype=object)
+
+    # access column (item cache)
+    df["col1"] == "A"
+    # take operation
+    # (regression was that this consolidated but didn't reset item cache,
+    # resulting in an invalid cache and the .at operation not working properly)
+    df[df["col2"] == 0]
+
+    # now setting value should update actual dataframe
+    df.at[0, "col1"] = "A"
+
+    expected = pd.DataFrame({"col1": ["A"], "col2": [0]}, dtype=object)
+    tm.assert_frame_equal(df, expected)
+    assert df.at[0, "col1"] == "A"
