DataFrame.loc[n] = dict(..) fails with some type combinations #16309

bmcfee · 2017-05-09T19:47:50Z

Code Sample, a copy-pastable example if possible

This one fails:

In [9]: d = pd.DataFrame(columns=['time', 'value'])                    
In [9]: d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value='foo')
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-9-b557eb950858> in <module>()
----> 1 d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value='foo')

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    177             key = com._apply_if_callable(key, self.obj)
    178         indexer = self._get_setitem_indexer(key)
--> 179         self._setitem_with_indexer(indexer, value)
    180 
    181     def _has_valid_type(self, k, axis):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
    423                                        name=indexer)
    424 
--> 425                     self.obj._data = self.obj.append(value)._data
    426                     self.obj._maybe_update_cacher(clear=True)
    427                     return self.obj

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in append(self, other, ignore_index, verify_integrity)
   4628             other = DataFrame(other.values.reshape((1, len(other))),
   4629                               index=index,
-> 4630                               columns=combined_columns)
   4631             other = other._convert(datetime=True, timedelta=True)
   4632             if not self.columns.equals(combined_columns):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    304             else:
    305                 mgr = self._init_ndarray(data, index, columns, dtype=dtype,
--> 306                                          copy=copy)
    307         elif isinstance(data, (list, types.GeneratorType)):
    308             if isinstance(data, types.GeneratorType):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/frame.py in _init_ndarray(self, values, index, columns, dtype, copy)
    481             values = maybe_infer_to_datetimelike(values)
    482 
--> 483         return create_block_manager_from_blocks([values], [columns, index])
    484 
    485     @property

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in create_block_manager_from_blocks(blocks, axes)
   4294                                      placement=slice(0, len(axes[0])))]
   4295 
-> 4296         mgr = BlockManager(blocks, axes)
   4297         mgr._consolidate_inplace()
   4298         return mgr

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
   2790                     raise AssertionError('Number of Block dimensions (%d) '
   2791                                          'must equal number of axes (%d)' %
-> 2792                                          (block.ndim, self.ndim))
   2793 
   2794         if do_integrity_check:

AssertionError: Number of Block dimensions (1) must equal number of axes (2)

But this one succeeds:

In [11]: d.loc[0] = dict(time=pd.to_timedelta(5, unit='s'), value=5)

In [12]: d
Out[12]: 
      time value
0 00:00:05     5

This one also succeeds:

In [13]: d = pd.DataFrame(columns=['time', 'value'])

In [14]: d.loc[0] = dict(time=3, value='foo')

In [15]: d
Out[15]: 
  time value
0    3   foo

Problem description

The current behavior is a problem because it is inconsistent, and depends on the type of data provided. Mixing timedelta with str fails, but timedelta with int works, as does int with str.

I believe this is related to aggressive type inference previously noted in #13829.
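The three type combinations above can be checked side by side with a small script. This is only a sketch: `try_set_row` is a hypothetical helper written for this illustration, not part of pandas, and the outcome of each case depends on the installed pandas version (the failure was reported on 0.20.1).

```python
import pandas as pd


def try_set_row(row):
    """Assign `row` (a dict) to label 0 of an empty two-column frame.

    Returns "ok" on success, or the exception class name on failure.
    Hypothetical helper for illustration only.
    """
    d = pd.DataFrame(columns=["time", "value"])
    try:
        d.loc[0] = row
    except Exception as exc:  # broad on purpose: we only record the outcome
        return type(exc).__name__
    return "ok"


five_seconds = pd.to_timedelta(5, unit="s")
outcomes = {
    "timedelta + str": try_set_row(dict(time=five_seconds, value="foo")),
    "timedelta + int": try_set_row(dict(time=five_seconds, value=5)),
    "int + str": try_set_row(dict(time=3, value="foo")),
}
for combo, outcome in outcomes.items():
    print(f"{combo}: {outcome}")
```

Each case starts from a fresh empty frame, so one failure cannot affect the next case.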

Expected Output

Not crashing.

Output of `pd.show_versions()`

In [16]: pd.show_versions()
/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/xarray/core/formatting.py:16: FutureWarning: The pandas.tslib module is deprecated and will be removed in a future version.
  from pandas.tslib import OutOfBoundsDatetime

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-77-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: 0.9.5
IPython: 6.0.0
sphinx: 1.5.5
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.6.0
feather: None
matplotlib: 2.0.1
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
jinja2: 2.9.5
s3fs: 0.1.0
pandas_gbq: None
pandas_datareader: None

sinhrks · 2017-05-11T01:25:10Z

Thanks for the report. Yes, this looks like #13829. I once prepared a draft fix and hope to work on it again.

jorisvandenbossche · 2017-06-19T14:23:31Z

@bmcfee I can no longer reproduce this on master or with 0.20.2 (it still fails on 0.20.1), so it seems this somehow got fixed.
Can you confirm that it is fixed?

If so, it would also be nice to add some tests to keep it working.

bmcfee · 2017-06-19T14:58:28Z

> Can you see if you can confirm it is fixed?

Confirmed that the above example now works on 0.20.2 (conda distribution).

However, I now get a different error if I try to update an existing record, even with identical contents:

In [1]: import pandas as pd

In [2]: d = pd.DataFrame(columns=['time', 'value'])

In [3]: d.loc[1] = dict(time=pd.to_timedelta(6, unit='s'), value='foo')

In [4]: d.loc[1] = dict(time=pd.to_timedelta(6, unit='s'), value='foo')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-20345bf5ca35> in <module>()
----> 1 d.loc[1] = dict(time=pd.to_timedelta(6, unit='s'), value='foo')

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    177             key = com._apply_if_callable(key, self.obj)
    178         indexer = self._get_setitem_indexer(key)
--> 179         self._setitem_with_indexer(indexer, value)
    180 
    181     def _has_valid_type(self, k, axis):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
    581 
    582                     for item, v in zip(labels, value):
--> 583                         setter(item, v)
    584             else:
    585 

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/indexing.py in setter(item, v)
    511                     s._consolidate_inplace()
    512                     s = s.copy()
--> 513                     s._data = s._data.setitem(indexer=pi, value=v)
    514                     s._maybe_update_cacher(clear=True)
    515 

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in setitem(self, **kwargs)
   3201 
   3202     def setitem(self, **kwargs):
-> 3203         return self.apply('setitem', **kwargs)
   3204 
   3205     def putmask(self, **kwargs):

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   3089 
   3090             kwargs['mgr'] = self
-> 3091             applied = getattr(b, f)(**kwargs)
   3092             result_blocks = _extend_blocks(applied, result_blocks)
   3093 

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in setitem(self, indexer, value, mgr)
    684 
    685         # coerce args
--> 686         values, _, value, _ = self._try_coerce_args(self.values, value)
    687         arr_value = np.array(value)
    688 

/home/bmcfee/miniconda/envs/py35/lib/python3.5/site-packages/pandas/core/internals.py in _try_coerce_args(self, values, other)
   1754         else:
   1755             # scalar
-> 1756             other = Timedelta(other)
   1757             other_mask = isnull(other)
   1758             other = other.value

pandas/_libs/tslib.pyx in pandas._libs.tslib.Timedelta.__new__ (pandas/_libs/tslib.c:50070)()

pandas/_libs/tslib.pyx in pandas._libs.tslib.parse_timedelta_string (pandas/_libs/tslib.c:60542)()

ValueError: unit abbreviation w/o a number

I suspect this is an entirely different kind of error, so it might make sense to close this one out and start a new issue, but I'll leave that call to you.

jorisvandenbossche · 2017-06-19T16:08:51Z

Ah, yes, I see that as well (when the label already exists). This already raises in 0.19.2, so it is not a new bug.
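One possible reading of the traceback above, offered as an assumption rather than something confirmed in this thread: the per-label loop `for item, v in zip(labels, value)` iterates the dict directly, and iterating a dict yields its keys, so the `time` column would receive the string `'time'` instead of the Timedelta. The sketch below reproduces only that pairing and the resulting coercion failure. The `labels` list is a stand-in for the frame's columns, not the actual internal object.

```python
import pandas as pd

row = dict(time=pd.to_timedelta(6, unit="s"), value="foo")
labels = ["time", "value"]  # stand-in for the frame's column labels

# zip() iterates a dict by key, so each column label is paired with a key
# string rather than with the corresponding value.
pairs = list(zip(labels, row))
print(pairs)

# Coercing such a key string to a Timedelta raises ValueError.
try:
    pd.Timedelta(pairs[0][1])
    coercion_failed = False
except ValueError as exc:
    coercion_failed = True
    print(f"ValueError: {exc}")
```

Pairing labels with `row.values()`, or aligning the dict to the columns by key, would avoid the mismatch in this simplified picture.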

phofl · 2020-11-15T00:20:28Z

This works now, both when setting the row once and when setting it twice.
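A check covering both cases could look roughly like the sketch below. It is not the test that was merged for this issue. It assumes a pandas version in which the behavior is fixed, and `set_row_twice` is a hypothetical name chosen for this illustration.

```python
import pandas as pd


def set_row_twice():
    """Set the same label twice from a dict and return the resulting frame."""
    d = pd.DataFrame(columns=["time", "value"])
    row = dict(time=pd.to_timedelta(6, unit="s"), value="foo")
    d.loc[1] = row  # first assignment adds the row
    d.loc[1] = row  # second assignment overwrites the existing row
    return d


result = set_row_twice()
assert list(result.index) == [1]
assert result.loc[1, "time"] == pd.Timedelta(seconds=6)
assert result.loc[1, "value"] == "foo"
```

The assertions compare individual cells rather than whole-frame dtypes, since the column dtypes produced by row-wise assignment may differ between pandas versions.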

bmcfee mentioned this issue May 9, 2017

pandas 0.20 compatibility marl/jams#147

Closed

sinhrks added the Indexing Related to indexing on series/frames, not to indexes themselves label May 11, 2017

sinhrks added the Dtype Conversions Unexpected or buggy dtype conversions label May 11, 2017

jorisvandenbossche added the Regression Functionality that used to work in a prior pandas version label Jun 19, 2017

jorisvandenbossche added this to the 0.20.3 milestone Jun 19, 2017

jorisvandenbossche added Bug and removed Regression Functionality that used to work in a prior pandas version labels Jun 19, 2017

jorisvandenbossche mentioned this issue Jun 19, 2017

Unexpected result when setting a row by a dict #16724

Closed

jreback modified the milestones: Next Major Release, 0.20.3 Jul 6, 2017

phofl added good first issue Needs Tests Unit test(s) needed to prevent regressions and removed Bug labels Nov 15, 2020

mroeschke mentioned this issue May 21, 2021

TST: Old issues #41607

Merged

10 tasks

jreback modified the milestones: Contributions Welcome, 1.3 May 21, 2021

jreback closed this as completed in #41607 May 24, 2021
