BUG: type change breaks BlockManager integrity #8853

Merged
merged 1 commit into from
Nov 20, 2014

behzadnouri

closes #8850

on master:

>>> cols = MultiIndex.from_tuples([('1st', 'a'), ('2nd', 'b'), ('3rd', 'c')])
>>> df = DataFrame([[1.0, 2, 3], [4.0, 5, 6]], columns=cols)
>>> df['2nd'] = df['2nd'] * 2.0  # type change in block manager
/usr/lib/python3.4/site-packages/numpy/lib/function_base.py:3612: FutureWarning: in the future negative indices will not be ignored by `numpy.delete`.
  "`numpy.delete`.", FutureWarning)
>>> df.values
...
  File "/usr/lib/python3.4/site-packages/pandas-0.15.1_72_gf504885-py3.4-linux-x86_64.egg/pandas/core/internals.py", line 2392, in _verify_integrity
    tot_items))
AssertionError: Number of manager items must equal union of block items
# manager items: 3, # tot_items: 4
>>> df.blocks
...
  File "/usr/lib/python3.4/site-packages/pandas-0.15.1_72_gf504885-py3.4-linux-x86_64.egg/pandas/core/internals.py", line 2392, in _verify_integrity
    tot_items))
AssertionError: Number of manager items must equal union of block items
# manager items: 3, # tot_items: 4

._data is also broken:

>>> df._data
BlockManager
Items:       
1st  a
2nd  b
3rd  c
Axis 1: Int64Index([0, 1], dtype='int64')
FloatBlock: slice(0, 1, 1), 1 x 2, dtype: float64
IntBlock: slice(1, 3, 1), 2 x 2, dtype: int64
FloatBlock: slice(1, 2, 1), 1 x 2, dtype: float64

The integer block is larger than it should be and overlaps with one of the float blocks.
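The overlap can be checked by hand from the repr above. The following is only a sketch in plain Python, not pandas internals: `placements`, `n_items`, and `claimed_positions` are hypothetical names, and the slices are copied from the `df._data` output.

```python
from collections import Counter

# Block placements copied from the repr above:
# (block type, slice into the manager's items axis).
placements = [
    ("FloatBlock", slice(0, 1, 1)),
    ("IntBlock", slice(1, 3, 1)),
    ("FloatBlock", slice(1, 2, 1)),
]
n_items = 3  # the manager holds three columns


def claimed_positions(placements, n_items):
    """Count how many blocks claim each position on the items axis."""
    counts = Counter()
    for _name, slc in placements:
        counts.update(range(*slc.indices(n_items)))
    return counts


counts = claimed_positions(placements, n_items)
tot_items = sum(counts.values())  # total positions claimed across all blocks
overlapping = sorted(pos for pos, c in counts.items() if c > 1)

print("manager items:", n_items, "tot_items:", tot_items)
print("positions claimed by more than one block:", overlapping)
```

The blocks claim four positions in total while the manager has three items, which matches the counts in the assertion message; position 1 (the '2nd' column) is the one claimed twice.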


jreback commented Nov 19, 2014

cc @immerrr

@jreback jreback added Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Nov 19, 2014
@jreback jreback added this to the 0.15.2 milestone Nov 19, 2014

immerrr commented Nov 20, 2014

Yeah, I guess I didn't expect loc to be a slice in BlockManager.set.

Such corruption would also be a lot easier to prevent if blkloc invalidation (self._blklocs[blk.mgr_locs.indexer] = -1) used np.iinfo(self._blklocs.dtype).min instead of -1, which is a valid indexer most of the time.
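To illustrate the point about the sentinel, here is a minimal NumPy sketch. It is not the BlockManager code: `values`, `blklocs`, and `sentinel` are stand-in names chosen for the example. It shows that a location marked invalid with -1 still indexes silently, while the dtype's minimum value fails loudly.

```python
import numpy as np

values = np.array([10, 20, 30])                 # stand-in for data inside a block
blklocs = np.array([0, 1, 2], dtype=np.int64)   # stand-in for per-item locations

# Invalidate position 1 with -1: indexing still succeeds, because -1
# simply wraps around to the last element.
blklocs[1] = -1
stale = values[blklocs]
print(stale)

# Invalidate with the smallest representable integer instead: any attempt
# to use the stale location raises immediately.
sentinel = np.iinfo(blklocs.dtype).min
blklocs[1] = sentinel
try:
    values[blklocs]
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

With -1 the stale entry quietly reads the last element, so the corruption only surfaces later in an integrity check; with the minimum-value sentinel the first use of the stale location fails.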

jreback added a commit that referenced this pull request Nov 20, 2014
BUG: type change breaks BlockManager integrity
@jreback jreback merged commit 5470f5c into pandas-dev:master Nov 20, 2014

jreback commented Nov 20, 2014

@behzadnouri thanks

@immerrr if you would like to open an issue about this invalidation, go ahead

@behzadnouri behzadnouri deleted the blk-mgr branch November 21, 2014 01:00

Successfully merging this pull request may close these issues.

BUG: Accessing DataFrame multi-index column *seems* to modify its content