Setting DataFrame of Python objects and MultiIndex columns with single-element NDFrame inserts list #14592

toobaz · 2016-11-05T11:03:46Z

A small, complete example of the issue

In [2]: t = pd.DataFrame('a', index=range(2),
                              columns=pd.MultiIndex.from_product([range(2), range(2)]))

In [3]: t.loc[0, [(0,1)]] = t.loc[0, [(0,1)]]

In [4]: t
Out[4]: 
   0       1   
   0    1  0  1
0  a  [a]  a  a
1  a    a  a  a

The same happens when providing, rather than a list of indices, a mask with only one True value.
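
For illustration, the mask variant might look like the sketch below. The way the mask is built and the explicit `dtype=object` are assumptions, not taken from the original report. On the affected version the cell at `(0, 1)` would reportedly end up holding a list; on a version where the bug is fixed the frame should be unchanged.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_product([range(2), range(2)])
# dtype=object is passed explicitly (an assumption) so the cells hold Python objects.
t = pd.DataFrame("a", index=range(2), columns=columns, dtype=object)
original = t.copy()

# Boolean mask over the columns with exactly one True entry, at column (0, 1).
mask = np.array([col == (0, 1) for col in t.columns])

# Round-trip assignment: should be a no-op when setting behaves correctly.
t.loc[0, mask] = t.loc[0, mask]

# With the bug, this cell would be a list wrapping 'a' rather than 'a' itself.
cell = t.loc[0, (0, 1)]
cell_is_list = isinstance(cell, list)
```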

The above line is a useless example, but this is a real problem for in-place operations.

I suspect the fix should not be very complicated, given that everything works smoothly if we have numbers rather than Python objects. (The bug does arise, however, both when assigning a cell of a numbers-only DataFrame to a cell of a DataFrame of objects and when doing the opposite.)

Expected Output

Just the original t.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: 7a2bcb6
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.7.0-1-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.utf8
LOCALE: it_IT.UTF-8

pandas: 0.19.0+67.g7a2bcb6.dirty
nose: 1.3.7
pip: 8.1.2
setuptools: 28.0.0
Cython: 0.23.4
numpy: 1.11.2
scipy: 0.18.1
statsmodels: 0.8.0.dev0+f80669e
xarray: None
IPython: 5.1.0.dev
sphinx: 1.4.8
patsy: 0.3.0-dev
dateutil: 2.5.3
pytz: 2015.7
blosc: None
bottleneck: 1.2.0dev
tables: 3.2.2
numexpr: 2.6.0
matplotlib: 1.5.3
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: None
bs4: 4.5.1
html5lib: 0.999
httplib2: 0.9.1
apiclient: 1.5.2
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.40.0
pandas_datareader: 0.2.1

toobaz · 2016-11-05T11:17:17Z

Actually, I think the "numbers only" case has problems too. Take

t = pd.DataFrame(3, index=range(2),columns=pd.MultiIndex.from_product([range(2), range(2)]))

Then the following works fine:

In [3]: t.loc[0, [(0,1), (1,1)]] = [5,6]

while...

In [4]: t.loc[0, [(0,0)]] = [7]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-57ac0089af6c> in <module>()
----> 1 t.loc[0, [(0,0)]] = [7]

/home/nobackup/repo/pandas/pandas/core/indexing.py in __setitem__(self, key, value)
    138             key = com._apply_if_callable(key, self.obj)
    139         indexer = self._get_setitem_indexer(key)
--> 140         self._setitem_with_indexer(indexer, value)
    141 
    142     def _has_valid_type(self, k, axis):

/home/nobackup/repo/pandas/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
    530                 # we have an equal len list/ndarray
    531                 elif can_do_equal_len():
--> 532                     setter(labels[0], value)
    533 
    534                 # per label values

/home/nobackup/repo/pandas/pandas/core/indexing.py in setter(item, v)
    470                     s._consolidate_inplace()
    471                     s = s.copy()
--> 472                     s._data = s._data.setitem(indexer=pi, value=v)
    473                     s._maybe_update_cacher(clear=True)
    474 

/home/nobackup/repo/pandas/pandas/core/internals.py in setitem(self, **kwargs)
   3167 
   3168     def setitem(self, **kwargs):
-> 3169         return self.apply('setitem', **kwargs)
   3170 
   3171     def putmask(self, **kwargs):

/home/nobackup/repo/pandas/pandas/core/internals.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   3055 
   3056             kwargs['mgr'] = self
-> 3057             applied = getattr(b, f)(**kwargs)
   3058             result_blocks = _extend_blocks(applied, result_blocks)
   3059 

/home/nobackup/repo/pandas/pandas/core/internals.py in setitem(self, indexer, value, mgr)
    727             # GH 6043
    728             elif _is_scalar_indexer(indexer):
--> 729                 values[indexer] = value
    730 
    731             # if we are an exact match (ex-broadcasting),

ValueError: setting an array element with a sequence.

mroeschke · 2019-10-21T00:18:08Z

This looks to give the correct result on master. Could use a test.

In [236]: t = pd.DataFrame('a', index=range(2),
     ...:                  columns=pd.MultiIndex.from_product([range(2), range(2)]))

In [237]: t
Out[237]:
   0     1
   0  1  0  1
0  a  a  a  a
1  a  a  a  a

In [238]: t.loc[0, [(0,1)]] = t.loc[0, [(0,1)]]

In [239]: t
Out[239]:
   0     1
   0  1  0  1
0  a  a  a  a
1  a  a  a  a

In [240]: pd.__version__
Out[240]: '0.26.0.dev0+593.g9d45934af'
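
A regression test along these lines might cover the case from the original report. This is only a sketch: the test name is hypothetical, and passing `dtype=object` explicitly is an assumption made so that the frame holds Python objects regardless of how string scalars are inferred.

```python
import pandas as pd


def test_loc_setitem_single_element_multiindex_columns():
    # Hypothetical test name. GH 14592: round-trip assignment through a
    # single-element list of column labels must not wrap the value in a list.
    columns = pd.MultiIndex.from_product([range(2), range(2)])
    t = pd.DataFrame("a", index=range(2), columns=columns, dtype=object)
    expected = t.copy()

    t.loc[0, [(0, 1)]] = t.loc[0, [(0, 1)]]

    # The frame should be unchanged, and no cell should hold a list.
    pd.testing.assert_frame_equal(t, expected)
    assert not any(isinstance(v, list) for v in t.to_numpy().ravel())
```

Comparing against a copy taken before the assignment checks the "Expected Output" stated in the report (just the original `t`) without hard-coding the frame's contents.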

jreback added the Bug, Indexing, and MultiIndex labels Feb 8, 2018

jreback added this to the Next Major Release milestone Feb 8, 2018

jreback mentioned this issue Feb 8, 2018: BUG: loc/iloc insertion inserts array, scalar expected #19590 (Closed)

mroeschke added the good first issue and Needs Tests labels and removed the Bug, Indexing, and MultiIndex labels Oct 21, 2019

mroeschke mentioned this issue Jan 20, 2020: TST: Add regression tests for fixed issues #31161 (Merged)

jreback modified the milestones: Contributions Welcome, 1.1 Jan 20, 2020

mroeschke closed this as completed in #31161 Jan 21, 2020
