Commit 4434f48

BUG: fix unpickling of some pre-0.14.1 pickles with non-unique items
Series, frames and panels that contain only one block can be unpickled under the assumption that block items correspond to manager items 1-to-1.
1 parent 9eed2e4 commit 4434f48

2 files changed: +18 -3 lines changed

doc/source/v0.15.0.txt

+2
@@ -191,6 +191,8 @@ Bug Fixes
 
 - Bug in pickles contains ``DateOffset`` may raise ``AttributeError`` when ``normalize`` attribute is reffered internally (:issue:`7748`)
 
+- Bug in pickle deserialization that failed for pre-0.14.1 containers with dup items trying to avoid ambiguity
+  when matching block and manager items, when there's only one block there's no ambiguity (:issue:`7794`)
 
 
 - Bug in ``is_superperiod`` and ``is_subperiod`` cannot handle higher frequencies than ``S`` (:issue:`7760`, :issue:`7772`, :issue:`7803`)

pandas/core/internals.py

+16-3
@@ -2271,10 +2271,23 @@ def unpickle_block(values, mgr_locs):
             ax_arrays, bvalues, bitems = state[:3]
 
             self.axes = [_ensure_index(ax) for ax in ax_arrays]
+
+            if len(bitems) == 1 and self.axes[0].equals(bitems[0]):
+                # This is a workaround for pre-0.14.1 pickles that didn't
+                # support unpickling multi-block frames/panels with non-unique
+                # columns/items, because given a manager with items ["a", "b",
+                # "a"] there's no way of knowing which block's "a" is where.
+                #
+                # Single-block case can be supported under the assumption that
+                # block items corresponded to manager items 1-to-1.
+                all_mgr_locs = [slice(0, len(bitems[0]))]
+            else:
+                all_mgr_locs = [self.axes[0].get_indexer(blk_items)
+                                for blk_items in bitems]
+
             self.blocks = tuple(
-                unpickle_block(values,
-                               self.axes[0].get_indexer(items))
-                for values, items in zip(bvalues, bitems))
+                unpickle_block(values, mgr_locs)
+                for values, mgr_locs in zip(bvalues, all_mgr_locs))
 
         self._post_setstate()
 
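
The branching introduced by this change can be sketched outside pandas. The snippet below is only an illustration, not the pandas implementation: `compute_mgr_locs` is a hypothetical helper, plain lists stand in for `Index` objects, and a `ValueError` stands in for whatever error a label lookup on non-unique items would raise. It shows why a single block whose items equal the manager items can be placed positionally, while several blocks with duplicated labels cannot be matched.

```python
def compute_mgr_locs(mgr_items, blocks_items):
    """Return, for each block, where its items sit among the manager items.

    Hypothetical stand-in for the logic in the diff: plain lists replace
    pandas Index objects, and labels are assumed to be hashable.
    """
    mgr_items = list(mgr_items)
    blocks_items = [list(items) for items in blocks_items]

    # Single block whose items equal the manager items: positions map
    # 1-to-1, so duplicated labels cause no ambiguity.
    if len(blocks_items) == 1 and blocks_items[0] == mgr_items:
        return [slice(0, len(mgr_items))]

    # General case: blocks are matched by label, which needs unique labels.
    if len(set(mgr_items)) != len(mgr_items):
        raise ValueError("cannot match block items to non-unique manager items")

    position = {label: i for i, label in enumerate(mgr_items)}
    # Labels missing from the manager map to -1 in this sketch.
    return [[position.get(label, -1) for label in items]
            for items in blocks_items]


# One block, duplicated labels: handled positionally.
print(compute_mgr_locs(["a", "b", "a"], [["a", "b", "a"]]))

# Two blocks, unique labels: handled by label lookup.
print(compute_mgr_locs(["a", "b", "c"], [["a", "c"], ["b"]]))
```

With duplicated labels split across two blocks, for example `compute_mgr_locs(["a", "b", "a"], [["a"], ["b", "a"]])`, the sketch raises, which mirrors the ambiguity described in the code comment above.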
