BUG: Regression in pandas 1.2.3 in dataframe.__setitem__ #40204

galipremsagar · 2021-03-03T19:13:49Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

>>> import pandas as pd
>>> pd.__version__
'1.2.3'
>>> df = pd.DataFrame({"a": [1, 2, 3]})
>>> df[[True, False, True]] = pd.DataFrame({"a": [-1, -2]})
>>> df
     a
0 -1.0
1  2.0
2  NaN

Whereas this was fixed in 1.2.2:

>>> import pandas as pd
>>> pd.__version__
'1.2.2'
>>> df = pd.DataFrame({"a": [1, 2, 3]})
>>> df[[True, False, True]] = pd.DataFrame({"a": [-1, -2]})
>>> df
   a
0 -1
1  2
2 -2

Problem description

Setting values through a boolean mask, with a DataFrame on the right-hand side, seems to be broken in 1.2.3. This was previously broken and then fixed in 1.2.2.

Expected Output

>>> import pandas as pd
>>> df = pd.DataFrame({"a": [1, 2, 3]})
>>> df[[True, False, True]] = pd.DataFrame({"a": [-1, -2]})
>>> df
   a
0 -1
1  2
2 -2

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : f2c8480
python : 3.7.3.final.0
python-bits : 64
OS : Darwin
OS-release : 20.3.0
Version : Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_US.UTF-8
pandas : 1.2.3
numpy : 1.19.2
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.3
setuptools : 50.3.0
Cython : None
pytest : None
hypothesis : 5.29.0
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 1.0.1
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None


jreback · 2021-03-04T03:11:30Z

cc @phofl

phofl · 2021-03-04T09:07:21Z

I am not sure this is a regression. It was caused by the fix for #39931.

I think setitem should align the rhs. I would say the regression was in 1.2, when this stopped aligning.

cc @jreback @jbrockmendel

jorisvandenbossche · 2021-03-05T19:54:06Z

The "new" behaviour did indeed happen before as well, e.g. with pandas 1.1.5:

In [3]: df[[True, False, True]] = pd.DataFrame({"a": [-1, -2]})

In [4]: df
Out[4]: 
     a
0 -1.0
1  2.0
2  NaN

In [5]: pd.__version__
Out[5]: '1.1.5'

phofl · 2021-03-05T21:03:12Z

Yes, we go through iloc here, which aligned before 1.2 and stopped aligning with 1.2. That changed the behavior of setitem as well, which I think was wrong. In 1.2.3 we restored the previous behavior for setitem, which I think is more correct.
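
As a rough illustration of what "aligning the rhs" means for the example in this issue, the sketch below reproduces the alignment step by hand with `reindex`. It is not the pandas implementation; the variable names `mask`, `rhs`, `target_labels`, and `aligned` are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})
mask = [True, False, True]
rhs = pd.DataFrame({"a": [-1, -2]})  # default index labels: 0 and 1

# Row labels selected by the boolean mask: 0 and 2.
target_labels = df.loc[mask].index

# Align rhs on those labels. Label 0 exists in rhs, label 2 does not,
# so the row for label 2 is filled with NaN and the column becomes float.
aligned = rhs.reindex(target_labels)
print(aligned)
```

With alignment, the value `-2` (label 1 in `rhs`) has no matching target row, which is why it never appears in the 1.1.5 and 1.2.3 output.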

jbrockmendel · 2021-03-31T22:30:11Z

@phofl @jorisvandenbossche it sounds like the consensus is that the behavior on master (and 1.2.3) is correct and the behavior on 1.2.0 was incorrect?

phofl · 2021-03-31T22:35:18Z

Is correct imho

phofl · 2021-04-09T20:21:19Z

If nobody disagrees I would close this, since the behavior is correct again in 1.2.3

jreback · 2021-04-09T21:38:14Z

As long as we have a test, ok to close.

phofl · 2021-04-09T21:43:22Z

test_setitem_boolean_mask_aligning covers this

jreback · 2021-04-09T21:47:26Z

great
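
For readers who want the positional result shown under Expected Output, one possible approach is to drop the index from the right-hand side before assigning, so there is nothing to align on. This is a minimal sketch, assuming the number of values equals the number of `True` entries in the mask; it is a suggestion, not something proposed in the thread.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})
mask = [True, False, True]
rhs = pd.DataFrame({"a": [-1, -2]})

# A NumPy array carries no index labels, so the two values are written
# in order into the rows where the mask is True.
df.loc[mask, "a"] = rhs["a"].to_numpy()
print(df)
```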

isVoid · 2022-02-02T02:05:07Z

Hi devs, current behavior is inconsistent between series and dataframe. For series:

In [12]: s = pd.Series([1, 2, 3], dtype="int64")

In [13]: s[[True, False, True]] = [88, 99]

In [14]: s
Out[14]: 
0    88
1     2
2    99
dtype: int64

Is this intended behavior?

phofl · 2022-02-02T02:32:37Z

Please clarify; your example is correct.

isVoid · 2022-02-02T04:25:34Z

Actually, no issue here. I noticed that the value in the example I supplied above has no index. When I do `s[[True, False, True]] = pd.Series([88, 99])` I get the expected result. Sorry about the confusion!
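
The difference described in this exchange can be sketched as follows: a plain list has no index and is placed positionally, while a Series on the right-hand side is aligned on its labels first. The second Series uses float64 here as an assumption of this sketch, so that the missing value does not require a dtype change; the names `s_list` and `s_aligned` are illustrative.

```python
import pandas as pd

mask = [True, False, True]

# Plain list: no labels to align on, so values are placed in order.
s_list = pd.Series([1, 2, 3], dtype="int64")
s_list[mask] = [88, 99]

# Series rhs: its labels are 0 and 1, so only label 0 matches a masked row.
# The masked row with label 2 has no match and becomes NaN.
s_aligned = pd.Series([1.0, 2.0, 3.0])
s_aligned[mask] = pd.Series([88.0, 99.0])

print(s_list)
print(s_aligned)
```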

galipremsagar added the Bug and Needs Triage labels Mar 3, 2021

jreback added the Regression and Indexing labels and removed the Needs Triage label Mar 4, 2021

jreback added this to the 1.2.4 milestone Mar 4, 2021

phofl added the Needs Discussion label Mar 4, 2021

phofl added the Closing Candidate label Apr 9, 2021

jreback closed this as completed Apr 9, 2021

phofl modified the milestones: 1.2.4, No action Apr 9, 2021
