Arithmetic doesn't broadcast when MultiIndex has the same data, different names #32569

Open
gedkins opened this issue Mar 9, 2020 · 1 comment
Labels: Bug, MultiIndex, Numeric Operations (Arithmetic, Comparison, and Logical operations)

gedkins commented Mar 9, 2020

Code Sample

import pandas as pd

index_st = pd.MultiIndex.from_tuples([(1,1),(1,2),(2,1),(2,2)], names=['second','third'])
index_ft1 = pd.MultiIndex.from_tuples([(1,1),(1,2),(2,1),(2,2)], names=['first','third'])
index_ft2 = pd.MultiIndex.from_tuples([(1,1),(2,1),(1,2),(2,2)], names=['first','third'])

st = pd.DataFrame([1,2,3,4], index=index_st, columns=['val'])
ft1 = pd.DataFrame([10,20,30,40], index=index_ft1, columns=['val'])
ft2 = pd.DataFrame([10,30,20,40], index=index_ft2, columns=['val'])

print(st.add(ft1))
print(st.add(ft2))

Problem description

st.add(ft1) doesn't broadcast as expected. Instead, it treats the differently named index levels ('second' and 'first') as identical and performs an element-wise addition.

st.add(ft2) broadcasts as expected.

Getting differently structured result indices for ft1 and ft2 is definitely not expected.

Output That I See

              val
second third
1      1       11
       2       22
2      1       33
       2       44

                    val
third second first
1     1      1       11
             2       31
      2      1       13
             2       33
2     1      1       22
             2       42
      2      1       24
             2       44

Expected Output

The two calculations should produce the same output, possibly with the rows returned in a different order.

                    val
third second first
1     1      1       11
             2       31
      2      1       13
             2       33
2     1      1       22
             2       42
      2      1       24
             2       44

                    val
third second first
1     1      1       11
             2       31
      2      1       13
             2       33
2     1      1       22
             2       42
      2      1       24
             2       44
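
For anyone who needs the broadcast result today, here is a minimal workaround sketch. It is not a fix for the bug: it sidesteps index alignment entirely by merging on the shared level 'third'. The intermediate names (left, right, merged, result) and the suffixes are my own choices for illustration.

```python
import pandas as pd

tuples = [(1, 1), (1, 2), (2, 1), (2, 2)]
index_st = pd.MultiIndex.from_tuples(tuples, names=["second", "third"])
index_ft1 = pd.MultiIndex.from_tuples(tuples, names=["first", "third"])

st = pd.DataFrame([1, 2, 3, 4], index=index_st, columns=["val"])
ft1 = pd.DataFrame([10, 20, 30, 40], index=index_ft1, columns=["val"])

# Turn the index levels into ordinary columns so the join key is explicit.
left = st.reset_index()
right = ft1.reset_index()

# Join on the only level the two frames share; every 'second' value is
# paired with every 'first' value that has the same 'third'.
merged = left.merge(right, on="third", suffixes=("_st", "_ft"))
merged["val"] = merged["val_st"] + merged["val_ft"]

# Rebuild the three-level index used in the expected output.
result = merged.set_index(["third", "second", "first"])[["val"]].sort_index()
print(result)
```

With the sample data this yields the eight rows listed under Expected Output, whether or not the two indices happen to hold the same tuples.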

Where the problem lies

The problem appears to be in MultiIndex.equals.

This is called by NDFrame._indexed_same, which is in turn called by _align_method_FRAME.

MultiIndex.equals doesn't check the level names, so it reports the indices as equal when they're not. As a result, the non-broadcasting code path is taken when the operation should broadcast.
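
The name-blind comparison can be seen in isolation with the short check below. It only uses the public equals and identical methods; the variable names holding the results are mine, and I am assuming identical (which also compares names) is the kind of check the alignment code would need, not that it is the right fix.

```python
import pandas as pd

tuples = [(1, 1), (1, 2), (2, 1), (2, 2)]
index_st = pd.MultiIndex.from_tuples(tuples, names=["second", "third"])
index_ft1 = pd.MultiIndex.from_tuples(tuples, names=["first", "third"])

# equals() compares the values and their order, but not the level names.
values_equal = index_st.equals(index_ft1)

# identical() additionally compares attributes such as the names.
fully_identical = index_st.identical(index_ft1)

# Direct comparison of the level names, for reference.
names_match = list(index_st.names) == list(index_ft1.names)

print(values_equal, fully_identical, names_match)
```

Here equals reports the two indices as equal even though their first levels have different names, which is the condition that sends the addition down the non-broadcasting path.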

Output of pd.show_versions()

INSTALLED VERSIONS

commit : ae79bb2
python : 3.8.0.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.76-linuxkit
Version : #1 SMP Thu Oct 17 19:31:58 UTC 2019
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.1.0.dev0+725.gae79bb23c
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 41.6.0
Cython : 0.29.15
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

See also (I haven't checked if these are related, but they "sound similar")

#20565
#5645
#19606

emilyjia commented Mar 9, 2020

take
