REGR: frame/frame op with unaligned blocks + non-slice-like placement failing with assertion error #34367

Closed
jorisvandenbossche opened this issue May 25, 2020 · 2 comments · Fixed by #34421
Labels
Blocker Blocking issue or pull request for an upcoming release Numeric Operations Arithmetic, Comparison, and Logical operations Regression Functionality that used to work in a prior pandas version

@jorisvandenbossche
Member

In [7]: arr = np.random.randint(0, 1000, (100, 10))  

In [8]: df1 = pd.DataFrame(arr)  

In [9]: df2 = df1.copy() 
   ...: df2.iloc[0, [1, 3, 7]] =  np.nan    

In [10]: df1 + df2  
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-10-fa4784095cc3> in <module>
----> 1 df1 + df2

~/scipy/pandas/pandas/core/ops/__init__.py in f(self, other, axis, level, fill_value)
    663         if isinstance(other, ABCDataFrame):
    664             # Another DataFrame
--> 665             new_data = self._combine_frame(other, na_op, fill_value)
    666 
    667         elif isinstance(other, ABCSeries):

~/scipy/pandas/pandas/core/frame.py in _combine_frame(self, other, func, fill_value)
   5734                 return func(left, right)
   5735 
-> 5736         new_data = ops.dispatch_to_series(self, other, _arith_op)
   5737         return new_data
   5738 

~/scipy/pandas/pandas/core/ops/__init__.py in dispatch_to_series(left, right, func, axis)
    283 
    284         array_op = get_array_op(func)
--> 285         bm = left._mgr.operate_blockwise(right._mgr, array_op)
    286         return type(left)(bm)
    287 

~/scipy/pandas/pandas/core/internals/managers.py in operate_blockwise(self, other, array_op)
    358         Apply array_op blockwise with another (aligned) BlockManager.
    359         """
--> 360         return operate_blockwise(self, other, array_op)
    361 
    362     def apply(self: T, f, align_keys=None, **kwargs) -> T:

~/scipy/pandas/pandas/core/internals/ops.py in operate_blockwise(left, right, array_op)
     34             right_ea = not isinstance(rblk.values, np.ndarray)
     35 
---> 36             lvals, rvals = _get_same_shape_values(blk, rblk, left_ea, right_ea)
     37 
     38             res_values = array_op(lvals, rvals)

~/scipy/pandas/pandas/core/internals/ops.py in _get_same_shape_values(lblk, rblk, left_ea, right_ea)
     84 
     85     # Require that the indexing into lvals be slice-like
---> 86     assert rblk.mgr_locs.is_slice_like, rblk.mgr_locs
     87 
     88     # TODO(EA2D): with 2D EAs pnly this first clause would be needed

AssertionError: BlockPlacement([0 2 4 5 6 8 9])

cc @jbrockmendel I suppose this was caused by the frame-frame blockwise PR (#32779), but I haven't looked into it in detail yet.

Just removing the assertion still seems to work (for this case), but I don't know whether the _get_same_shape_values function relies on the is_slice_like property for its implementation to be correct.
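
To make the failing condition concrete, here is a minimal pure-Python sketch of what "slice-like" means for a set of column positions: the positions can be written as a single `start:stop:step` slice. The helper `is_slice_like` below is a hypothetical stand-in written for illustration; it is not the pandas implementation of `BlockPlacement.is_slice_like`, and edge cases (empty or single-element placements, negative steps) may be handled differently there.

```python
def is_slice_like(positions):
    """Hypothetical check: can these positions be expressed as one slice?

    Illustration only; pandas' BlockPlacement.is_slice_like may differ
    in how it treats edge cases.
    """
    positions = list(positions)
    if len(positions) < 2:
        # Assumption: zero or one position is trivially a slice.
        return True
    step = positions[1] - positions[0]
    if step == 0:
        return False
    # Every consecutive gap must equal the first gap.
    return all(b - a == step for a, b in zip(positions, positions[1:]))


# Placement reported in the AssertionError above.
reported = [0, 2, 4, 5, 6, 8, 9]
print(is_slice_like(reported))      # False: gaps are 2, 2, 1, 1, 2, 1
print(is_slice_like([0, 1, 2, 3]))  # True: equivalent to 0:4
print(is_slice_like([1, 3, 7]))     # False: gaps are 2, 4
```

Under this reading, the placement in the error message has irregular gaps because columns 1, 3 and 7 were pulled out of the original integer block, so it cannot be indexed with a plain slice.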

@jorisvandenbossche jorisvandenbossche added Regression Functionality that used to work in a prior pandas version Numeric Operations Arithmetic, Comparison, and Logical operations labels May 25, 2020
@jorisvandenbossche jorisvandenbossche added this to the 1.1 milestone May 25, 2020
@jorisvandenbossche
Member Author

And another assertion error with a slightly different case (left and right not exactly the same):

In [11]: df2 = df1.copy() 
    ...: df2.iloc[0, [1, 3, 7]] =  np.nan 

In [12]: df3 = df1.copy() 
    ...: df3.iloc[0, [5]] =  np.nan

In [13]: df2 + df3  
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-13-f31d5cc84129> in <module>
----> 1 df2 + df3

~/scipy/pandas/pandas/core/ops/__init__.py in f(self, other, axis, level, fill_value)
    663             axis = self._get_axis_number(axis) if axis is not None else 1
    664             new_data = _combine_series_frame(self, other, op, axis=axis)
--> 665         else:
    666             # in this case we always have `np.ndim(other) == 0`
    667             if fill_value is not None:

~/scipy/pandas/pandas/core/frame.py in _combine_frame(self, other, func, fill_value)
   5734         result = self.copy()
   5735 
-> 5736         if axis == 0:
   5737             assert isinstance(result.index, MultiIndex)
   5738             result.index = result.index.reorder_levels(order)

~/scipy/pandas/pandas/core/ops/__init__.py in dispatch_to_series(left, right, func, axis)
    283     elif isinstance(right, ABCSeries) and axis == "columns":
    284         # We only get here if called via _combine_series_frame,
--> 285         # in which case we specifically want to operate row-by-row
    286         assert right.index.equals(left.columns)
    287 

~/scipy/pandas/pandas/core/internals/managers.py in operate_blockwise(self, other, array_op)
    358 
    359             if np.ndim(bres) == 0:
--> 360                 # EA
    361                 assert blk.shape[0] == 1
    362                 new_res = zip(blk.mgr_locs.as_array, [bres])

~/scipy/pandas/pandas/core/internals/ops.py in operate_blockwise(left, right, array_op)
     23         left_ea = not isinstance(blk_vals, np.ndarray)
     24 
---> 25         rblks = right._slice_take_blocks_ax0(locs.indexer, only_slice=True)
     26 
     27         # Assertions are disabled for performance, but should hold:

~/scipy/pandas/pandas/core/internals/managers.py in _slice_take_blocks_ax0(self, slice_or_indexer, fill_value, only_slice)
   1383                     # A non-consolidatable block, it's easy, because there's
   1384                     # only one item and each mgr loc is a copy of that single
-> 1385                     # item.
   1386                     for mgr_loc in mgr_locs:
   1387                         newblk = blk.copy(deep=False)

~/scipy/pandas/pandas/core/internals/blocks.py in getitem_block(self, slicer, new_mgr_locs)
    297             raise ValueError("Only same dim slicing is allowed")
    298 
--> 299         return self.make_block_same_class(new_values, new_mgr_locs)
    300 
    301     @property

~/scipy/pandas/pandas/core/internals/blocks.py in make_block_same_class(self, values, placement, ndim)
    252         if ndim is None:
    253             ndim = self.ndim
--> 254         return type(self)(values, placement=placement, ndim=ndim)
    255 
    256     def __repr__(self) -> str:

~/scipy/pandas/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
    112     def __init__(self, values, placement, ndim=None):
    113         self.ndim = self._check_ndim(values, ndim)
--> 114         self.mgr_locs = placement
    115         self.values = values
    116 

~/scipy/pandas/pandas/core/internals/blocks.py in mgr_locs(self, new_mgr_locs)
    230     def mgr_locs(self, new_mgr_locs):
    231         if not isinstance(new_mgr_locs, libinternals.BlockPlacement):
--> 232             new_mgr_locs = libinternals.BlockPlacement(new_mgr_locs)
    233 
    234         self._mgr_locs = new_mgr_locs

~/scipy/pandas/pandas/_libs/internals.pyx in pandas._libs.internals.BlockPlacement.__init__()

AssertionError: ()
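
To visualize why the two operands in this second case are "unaligned", the following pure-Python sketch models the column layout of `df2` and `df3`. It assumes (this is not confirmed in the thread) that assigning `np.nan` into an integer column upcasts that column to float, and that each frame holds one consolidated block per dtype. The helpers `dtypes_after_nan` and `placements_by_dtype` and the `"int"`/`"float"` labels are hypothetical; no pandas internals are called, and the sketch does not claim to identify the cause of the second assertion.

```python
from collections import defaultdict


def dtypes_after_nan(n_columns, nan_columns):
    """Assumed per-column dtype after NaN is assigned into some int columns."""
    return ["float" if i in nan_columns else "int" for i in range(n_columns)]


def placements_by_dtype(column_dtypes):
    """Group column positions by dtype, mimicking one block per dtype."""
    groups = defaultdict(list)
    for pos, dtype in enumerate(column_dtypes):
        groups[dtype].append(pos)
    return dict(groups)


df2_layout = placements_by_dtype(dtypes_after_nan(10, {1, 3, 7}))
df3_layout = placements_by_dtype(dtypes_after_nan(10, {5}))
print(df2_layout)  # int block at [0, 2, 4, 5, 6, 8, 9], float block at [1, 3, 7]
print(df3_layout)  # int block at [0, 1, 2, 3, 4, 6, 7, 8, 9], float block at [5]

# Columns shared by each (left block, right block) pair.
overlaps = {
    (ldtype, rdtype): sorted(set(llocs) & set(rlocs))
    for ldtype, llocs in df2_layout.items()
    for rdtype, rlocs in df3_layout.items()
}
for pair, shared in overlaps.items():
    print(pair, shared)
```

In this model the left integer block spans columns that live in two different blocks on the right, so a blockwise operation has to split blocks rather than pair them one-to-one.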

@jbrockmendel
Member

I'm having hardware issues and won't be able to troubleshoot this for a few days.
