BUG: MultiIndex assignment fails to broadcast omitted levels #42557

Closed
2 tasks done
bashtage opened this issue Jul 16, 2021 · 4 comments
Assignees
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex
Milestone

Comments

@bashtage
Contributor

  • I have checked that this issue has not already been reported.
  • (optional) I have confirmed this bug exists on the master branch of pandas.

Note: Not in a released pandas, only master


Code Sample

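import pandas as pd

# Frame indexed by a three-level MultiIndex
mi = pd.MultiIndex.from_product([["a", "b"], [1, 2], ["z", "y"]])
df = pd.DataFrame(index=mi, dtype=float)

# Series indexed by only the first two levels
mi2 = pd.MultiIndex.from_product([["a", "b"], [1, 2]])
s = pd.Series(index=mi2, dtype=float)
s.iloc[:] = 3.14

# Assignment is expected to broadcast s across the omitted third level
df["new"] = s
print(df)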

Problem description

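In 1.3.0, assigning a Series indexed by a MultiIndex with fewer levels than the frame's MultiIndex broadcast its values across the missing levels. As of #42231, this no longer works: the assigned column is all NaN.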

Expected Output

In master, I see

       new
a 1 z  NaN
    y  NaN
  2 z  NaN
    y  NaN
b 1 z  NaN
    y  NaN
  2 z  NaN
    y  NaN

In 1.3.0 or 1.2.5, I see

        new
a 1 z  3.14
    y  3.14
  2 z  3.14
    y  3.14
b 1 z  3.14
    y  3.14
  2 z  3.14
    y  3.14
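
Until the implicit broadcast is restored, the same result can probably be obtained by aligning explicitly. The following is a minimal sketch, not part of the original report: it assumes the omitted level is the last one, drops it from the frame's index to build lookup keys, and assigns the looked-up values by position. The variable name `keys` is introduced here for illustration only.

```python
import pandas as pd

# Same setup as the code sample in the report
mi = pd.MultiIndex.from_product([["a", "b"], [1, 2], ["z", "y"]])
df = pd.DataFrame(index=mi, dtype=float)
mi2 = pd.MultiIndex.from_product([["a", "b"], [1, 2]])
s = pd.Series(3.14, index=mi2)

# Assumption: the level missing from s is the last level of df.index.
# Dropping it yields one two-level key per row of df (with repeats).
keys = df.index.droplevel(-1)

# Look up each key in s, then assign positionally so that no further
# index alignment takes place during the column assignment.
df["new"] = s.reindex(keys).to_numpy()
print(df)
```

Because the values are assigned as a plain array, this does not depend on how `DataFrame.__setitem__` aligns a Series with a partially matching MultiIndex.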

Output of pd.show_versions()

```
INSTALLED VERSIONS
------------------
commit : f3a6753
python : 3.8.10.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19043
machine : AMD64
processor : AMD64 Family 23 Model 113 Stepping 0, AuthenticAMD
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252

pandas : 1.4.0.dev0+246.gf3a6753d2d
numpy : 1.20.2
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.3
setuptools : 52.0.0.post20210125
Cython : 0.29.23
pytest : 6.2.4
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 7.22.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.0
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None
```

@bashtage bashtage added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jul 16, 2021
@rhshadrach rhshadrach added the Indexing Related to indexing on series/frames, not to indexes themselves label Jul 18, 2021
@lithomas1 lithomas1 removed the Needs Triage Issue that has not been reviewed by a pandas team member label Jul 19, 2021
@trevorkask
Contributor

take

@trevorkask
Contributor

I have found two probable causes of this bug. However, I am not sure which one to pursue, because I understand that a patch introduced the bug, and one option may be better in terms of not undoing that patch.

  1. In the method _should_compare, line 5646, the if statement checks for boolean and numeric data, whereas the example given in https://github.com/pandas-dev/pandas/issues/42557#issue-945839860 uses mixed data types (char and numeric). I suggest passing the is_mixed function rather than is_boolean or is_numeric.
  2. In the method get_indexer, line 3481, I could change the 'and' to 'or', so that the should_compare condition is bypassed. This is a bit riskier.

Kindly advise on the best way forward, depending on what the patch altered.
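
For context, the lookup that get_indexer is expected to perform can be illustrated with the indexes from the report. This is only a sketch using the public `Index.get_indexer` method; it does not touch the internal code paths or line numbers mentioned above, and truncating the frame's index with `droplevel` is an assumption made here for illustration.

```python
import pandas as pd

# Indexes from the code sample in the report
mi = pd.MultiIndex.from_product([["a", "b"], [1, 2], ["z", "y"]])
mi2 = pd.MultiIndex.from_product([["a", "b"], [1, 2]])

# Assumption: compare on the two shared levels only, by dropping the
# third level from the frame's index before the lookup.
target = mi.droplevel(-1)

# For each entry of target, get_indexer returns its position in mi2
# (or -1 if the entry is not found).
positions = mi2.get_indexer(target)
print(positions)  # [0 0 1 1 2 2 3 3]
```

Each position appears twice because every two-level key covers two rows of the frame, which is the broadcast the report expects. A result of all -1 would correspond to the all-NaN column seen on master.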

@bashtage
Contributor Author

I think the source of this bug was reverted in 4c9ef1b

@simonjayhawkins
Member

> I think the source of this bug was reverted in 4c9ef1b

#42575

@simonjayhawkins simonjayhawkins added this to the 1.4 milestone Aug 2, 2021