Unexpected result when setting a row by a dict #16724

Yevgnen · 2017-06-19T04:01:16Z

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd


persons = [
    {
        'name': ''.join([
            np.random.choice([chr(x) for x in range(97, 97 + 26)])
            for i in range(5)
        ]),
        'age': np.random.randint(0, 100),
        'sex': np.random.choice(['male', 'female']),
        'job': np.random.choice(['staff', 'cook', 'student']),
        'birthday': np.random.choice(pd.date_range('1990-01-01', '2010-01-01')),
        'hobby': np.random.choice(['cs', 'war3', 'dota'])
    }
    for i in range(10)
]

df = pd.DataFrame(persons)
df.set_index('birthday', inplace=True)
print(df)
df.iloc[0] = {
    'name': 'john',
    'age': int(10),
    'sex': 'male',
    'hobby': 'nohobby',
    'job': 'haha'
}
print(df)

Problem description

             age hobby      job   name     sex
birthday
2007-12-31  name   age      sex  hobby     job
2004-07-31    20  dota  student  uwxhn  female
2001-10-22    34  war3     cook  udknv  female
2002-10-13    91  dota     cook  bofcv  female
1992-05-25    54  war3     cook  tcqew    male
2009-09-02    95  war3    staff  jcolr  female
1998-12-15    61  war3  student  dibkw  female
2004-07-03     4  war3  student  mntqh    male
2000-06-08    88  war3    staff  jknxm  female
2006-10-19    82    cs  student  asrpz    male

I have no idea why the dict keys, rather than the values, are written into the row.

Expected Output

             age    hobby      job   name     sex
birthday
2007-12-31    10  nohobby     haha   john    male
2004-07-31    20     dota  student  uwxhn  female
2001-10-22    34     war3     cook  udknv  female
2002-10-13    91     dota     cook  bofcv  female
1992-05-25    54     war3     cook  tcqew    male
2009-09-02    95     war3    staff  jcolr  female
1998-12-15    61     war3  student  dibkw  female
2004-07-03     4     war3  student  mntqh    male
2000-06-08    88     war3    staff  jknxm  female
2006-10-19    82       cs  student  asrpz    male

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Darwin
OS-release: 16.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: 0.9.5
IPython: 6.0.0
sphinx: 1.5.5
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None


TomAugspurger · 2017-06-19T12:30:58Z

Hmm, all our setting code deals with iterables, which means the .keys() for a dictionary. The only caveat is that Series and DataFrames are aligned before setting.
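
To make the point about iterables concrete: iterating over a dict in plain Python yields its keys, so code that pairs columns with the items of a generic iterable picks up the keys rather than the values. A minimal sketch, where the `columns` list and `row` dict are illustrative stand-ins and not pandas internals:

```python
# Illustrative only: mimics "treat the value as a plain iterable".
columns = ['x', 'y']
row = {'x': 9, 'y': '99'}

# Iterating a dict yields its keys, not its values.
as_iterable = list(row)

# Pairing column names with the iterated items therefore assigns the keys.
paired_by_iteration = dict(zip(columns, row))

# Looking each column up by key gives the intended values.
paired_by_key = {col: row[col] for col in columns}

print(as_iterable)
print(paired_by_iteration)
print(paired_by_key)
```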

I'll let Jeff weigh in, but I don't think we'll want to support this. If you want that behavior, you'll need to pass the dictionary to a pd.Series first.
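
A sketch of the suggested workaround, assuming a small two-dtype frame like the one used elsewhere in this thread. Reindexing the Series to the frame's columns is an extra precaution added here (not something stated in the comment above), so that the result is the same whether the setter matches by label or by position:

```python
import pandas as pd

x = pd.DataFrame({'x': [1, 2, 3], 'y': ['3', '4', '5']})
row = {'y': '99', 'x': 9}  # key order deliberately differs from the columns

# Convert the dict to a Series and order it by the frame's columns before
# assigning, instead of passing the raw dict.
x.iloc[1] = pd.Series(row).reindex(x.columns)

print(x)
```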

Yevgnen · 2017-06-19T12:35:46Z

Thanks for your reply. If this will not be supported, I think the example in the documentation should be removed to reduce confusion. 😅

TomAugspurger · 2017-06-19T12:42:20Z

Your example probably should work then, my mistake. It seems like we don't set properly when there are multiple dtypes?

# two dtypes
In [44]: x = pd.DataFrame({'x': [1, 2, 3], 'y': ['3', '4', '5']})

In [46]: x.iloc[1] = {'x': 9, 'y': '99'}

In [47]: x  # set incorrectly with the keys
Out[47]:
   x  y
0  1  3
1  x  y
2  3  5

# single dtype
In [48]: x = pd.DataFrame({'x': [1, 2, 3], 'y': [3, 4, 5]})

In [49]: x.iloc[1] = {'x': 9, 'y': 99}

In [50]: x  # sets correctly
Out[50]:
   x   y
0  1   3
1  9  99
2  3   5

Yevgnen · 2017-06-19T12:45:23Z

Probably. :-)

jorisvandenbossche · 2017-06-19T16:34:01Z

The ability to do this is certainly deliberate (I mean assigning with a dict, not the fact that it uses the keys instead of the values), although I am not sure I am too happy about that :-) (way too many corner cases come up by allowing this, e.g. #10219)

I thought there had been some related discussion about this recently, but I can only find #16309.

(Side note: it would maybe be worth opening a discussion on whether we want to deprecate this?)

jreback · 2017-06-19T22:30:44Z

I don't think this is inconsistent with setting with a Series, in fact this does work when setting with a Series. I suppose this should work. Special casing things is not generally a good thing.
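
The Series behaviour referred to here can be sketched as follows, assuming label-based `.loc` assignment, which matches a Series value to the frame's columns by label. The frame reuses the two-dtype example from earlier in the thread:

```python
import pandas as pd

x = pd.DataFrame({'x': [1, 2, 3], 'y': ['3', '4', '5']})

# The Series index is given in a different order than the columns: with .loc,
# the values are matched to columns by label, not by position.
x.loc[1] = pd.Series({'y': '99', 'x': 9})

print(x.loc[1].to_dict())
```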

oguzhanogreden · 2019-11-05T21:32:55Z

~~FYI, I'm planning to give this a go soon.~~

oguzhanogreden · 2019-11-17T20:22:42Z

This fails because a dict is considered list-like, and the reported case is handled under the assumption that the value is a list, here.

I find the follow_split_path==True path and the conditionals here quite hard to follow. If we can assume that _can_do_equal_len will return False for the case described here (and variations of it), the solution is simply distinguishing a dict from a list-like in the else case.
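
As a rough illustration of "distinguishing a dict from a list-like", the public helpers `pandas.api.types.is_list_like` and `is_dict_like` show why a dict ends up in the list branch. The `normalize_row_value` function below is a hypothetical helper written for this sketch; it is not the pandas implementation and does not reflect how the indexing code is actually structured:

```python
import pandas as pd
from pandas.api.types import is_dict_like, is_list_like

row = {'x': 9, 'y': '99'}

# A dict passes the generic list-like check, so it can fall into list handling.
print(is_list_like(row))
print(is_dict_like(row))
print(is_dict_like([9, '99']))


def normalize_row_value(value, columns):
    """Hypothetical helper (not pandas internals): map a dict onto columns."""
    if is_dict_like(value) and not isinstance(value, pd.Series):
        # Look values up by column label instead of iterating over the keys.
        return [value[col] for col in columns]
    # Any other list-like is taken positionally.
    return list(value)


print(normalize_row_value(row, ['x', 'y']))
print(normalize_row_value([9, '99'], ['x', 'y']))
```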

I found some discussions here but couldn't crack it yet.

jorisvandenbossche · 2019-11-28T19:09:23Z

Duplicate report: #29917

mroeschke · 2021-06-12T03:23:35Z

This looks to work on master. Could use a test

In [1]: x = pd.DataFrame({'x': [1, 2, 3], 'y': [3, 4, 5]})
   ...: x.iloc[1] = {'x': 9, 'y': 99}

In [2]: x
Out[2]:
   x   y
0  1   3
1  9  99
2  3   5

mroeschke · 2021-10-31T03:00:47Z

Looks like there's a test for this issue: test_iloc_setitem_dictionary_value. Closing
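
For readers who want to check this locally, a regression test along these lines might look like the sketch below. The function name and body are hypothetical, modelled on the single-dtype example posted in this thread; this is not a copy of the test in the pandas repository, and it assumes a pandas version in which dict assignment via `iloc` behaves as reported above.

```python
import pandas as pd


def test_iloc_setitem_dict_single_dtype():
    # Hypothetical test, modelled on the single-dtype example in this thread.
    df = pd.DataFrame({'x': [1, 2, 3], 'y': [3, 4, 5]})
    df.iloc[1] = {'x': 9, 'y': 99}

    # The dict values, not the keys, should end up in the row.
    expected = pd.DataFrame({'x': [1, 9, 3], 'y': [3, 99, 5]})
    pd.testing.assert_frame_equal(df, expected)


test_iloc_setitem_dict_single_dtype()
```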

TomAugspurger added the Indexing (related to indexing on series/frames, not to indexes themselves) label Jun 19, 2017

jreback added Bug Difficulty Intermediate labels Jun 19, 2017

jreback added this to the Next Major Release milestone Jul 6, 2017

chris-b1 marked this as a duplicate of #17072 Jul 25, 2017

chris-b1 mentioned this issue Jul 25, 2017

Adding a new DataFrame row using dict() gives unexpected behaviour #17072

Closed

jbrockmendel removed Effort Medium labels Oct 21, 2019

jorisvandenbossche mentioned this issue Nov 28, 2019

Modifying DataFrame first row produces wrong result #29917

Closed

mroeschke added the good first issue and Needs Tests (unit test(s) needed to prevent regressions) labels and removed the Bug and Indexing labels Jun 12, 2021

mroeschke closed this as completed Oct 31, 2021
