
Effect of assigning list to DataFrame cell depends on unrelated column's type #25806


Open
RauliRuohonen opened this issue Mar 20, 2019 · 1 comment
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves Nested Data Data where the values are collections (lists, sets, dicts, objects, etc.).

Comments

@RauliRuohonen

Code Sample, a copy-pastable example if possible

import pandas as pd

# Case 1: .loc, 'bar' keeps its default object dtype
df = pd.DataFrame([{'foo': None, 'bar': None}], index=['a'])
df.loc['a', 'foo'] = ['123']
print(df)
print()

# Case 2: .loc, unrelated column 'bar' converted to float
df = pd.DataFrame([{'foo': None, 'bar': None}], index=['a'])
df['bar'] = df['bar'].astype(float)
df.loc['a', 'foo'] = ['123']
print(df)
print()

# Case 3: .at, 'bar' keeps its default object dtype
df = pd.DataFrame([{'foo': None, 'bar': None}], index=['a'])
df.at['a', 'foo'] = ['123']
print(df)
print()

# Case 4: .at, 'bar' converted to float and index converted to category
df = pd.DataFrame([{'foo': None, 'bar': None}], index=['a'])
df['bar'] = df['bar'].astype(float)
df.index = df.index.astype('category')
df.at['a', 'foo'] = ['123']
print(df)

Problem description

Output is:

    bar    foo
a  None  [123]

   bar  foo
a  NaN  123

    bar    foo
a  None  [123]

   bar  foo
a  NaN  123

The behavior of df.loc['a', 'foo'] = ['123'] depends on the dtype of the bar column, even though bar is not involved in the operation. If a function relies on such an assignment and a new column of unrelated data is later added to the data frame, the function breaks. The dependency on unrelated columns is also surprising.

If bar has dtype object, the list ['123'] is assigned to the cell. If bar has dtype float, the string '123' is assigned instead. If the list has more than one element, the float case raises ValueError: Must have equal len keys and value when setting with an iterable.
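The dtype dependence described above can be probed with a small helper. This is only a sketch: probe_loc is a hypothetical name, not part of pandas, and it reports whatever the installed pandas version does rather than asserting the 0.24.2 behavior from this report, since results may differ between versions.

```python
import pandas as pd


def probe_loc(value, bar_as_float):
    """Assign `value` to one cell via .loc and report what happened.

    Hypothetical helper for illustration; returns ('stored', cell_value)
    or ('raised', exception_class_name).
    """
    df = pd.DataFrame([{'foo': None, 'bar': None}], index=['a'])
    if bar_as_float:
        # Only the unrelated column changes dtype; 'foo' stays object.
        df['bar'] = df['bar'].astype(float)
    try:
        df.loc['a', 'foo'] = value
    except Exception as exc:  # the report saw ValueError for longer lists
        return ('raised', type(exc).__name__)
    return ('stored', df['foo'].iloc[0])


# Compare a one-element and a two-element list for both dtypes of 'bar'.
for value in (['123'], ['123', '456']):
    for bar_as_float in (False, True):
        label = 'float' if bar_as_float else 'object'
        print(value, 'bar dtype:', label, '->', probe_loc(value, bar_as_float))
```

If the outcome for a given value differs between the two dtypes of bar, the installed version still shows the inconsistency.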

Using at instead of loc works consistently for the first two examples. However, if the index's dtype is changed from object to category, as in the last example, the same inconsistency as with loc appears. This is even more surprising: longer lists raise the same ValueError: Must have equal len keys and value when setting with an iterable, which suggests that at is attempting a multiple-element assignment even though it is supposed to assign a single element.
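The at case can be isolated the same way, toggling only the index dtype while bar stays float. Again a sketch under assumptions: probe_at is a hypothetical helper, and the outcome depends on the installed pandas version, so no expected output is shown.

```python
import pandas as pd


def probe_at(value, categorical_index):
    """Assign `value` to one cell via .at and report what happened.

    Hypothetical helper for illustration; returns ('stored', cell_value)
    or ('raised', exception_class_name).
    """
    df = pd.DataFrame([{'foo': None, 'bar': None}], index=['a'])
    df['bar'] = df['bar'].astype(float)
    if categorical_index:
        # Same label 'a', but the index becomes a CategoricalIndex.
        df.index = df.index.astype('category')
    try:
        df.at['a', 'foo'] = value
    except Exception as exc:  # the report saw ValueError for longer lists
        return ('raised', type(exc).__name__)
    return ('stored', df['foo'].iloc[0])


# A two-element list makes the single-cell vs. multi-cell question visible.
for categorical_index in (False, True):
    label = 'category' if categorical_index else 'object'
    print('index dtype:', label, '->', probe_at(['123', '456'], categorical_index))
```

Since at is meant to set exactly one cell, a 'raised' result for the categorical index alone would match the behavior reported here.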

Expected Output

    bar    foo
a  None  [123]

   bar  foo
a  NaN  [123]

    bar    foo
a  None  [123]

   bar  foo
a  NaN  [123]

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Darwin
OS-release: 17.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: None
pip: 18.1
setuptools: 39.0.1
Cython: 0.28.5
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: 0.10.9
IPython: 6.5.0
sphinx: 1.8.1
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.11
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: 0.1.6
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@WillAyd
Member

WillAyd commented Mar 22, 2019

We don't offer first-class support for nesting containers within a DataFrame. That said, if you investigate and see an easy way to make it consistent, PRs would be welcome.

@mroeschke mroeschke added the Indexing Related to indexing on series/frames, not to indexes themselves label Nov 2, 2019
@jbrockmendel jbrockmendel added the Nested Data Data where the values are collections (lists, sets, dicts, objects, etc.). label Sep 22, 2020
@mroeschke mroeschke added the Bug label Jun 27, 2021