merge fails to add suffixes on multiindex columns #28518

Closed
lhk opened this issue Sep 19, 2019 · 0 comments · Fixed by #28735
Labels
Bug MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode
lhk commented Sep 19, 2019

Code Sample, a copy-pastable example if possible

import pandas as pd
index_tuples=[]

for word_group in ["a", "b", "c", "d"]:
    for correctness in ["1", "2", "3"]:
        index_tuples.append([word_group, correctness])

index = pd.MultiIndex.from_tuples(index_tuples, names=["outer", "inner"])

frame_x = pd.DataFrame(columns = index)
frame_x["id"]=""

frame_y = pd.DataFrame(columns = index)
frame_y["id"]=""

print(frame_x.merge(frame_y, on="id").columns)

Problem description

I'm trying to merge two DataFrames. Both have MultiIndex columns and an "id" column. The merge happens on "id", and the outer level of the MultiIndex should receive suffixes. Depending on the number of indices in the MultiIndex, this doesn't work: only some of the MultiIndex columns receive suffixes, others don't. For the code sample I've set up an empty DataFrame; the behaviour is the same when it is filled with data.

The issue seems non-deterministic. Sometimes it happens, sometimes it doesn't. Here is a video:
https://imgur.com/a/rbSvuSl

I'm using pandas version 0.25.1
This is a conda environment, here is its yml file: https://gist.github.com/lhk/ab3cf1f95be37a23789792fd75beef93

Expected Output

All columns in the multiindex should receive either _x or _y as suffix.
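
The expectation above can be turned into a small check. This is only a sketch: the helper name `unsuffixed_outer_labels` is hypothetical (not part of pandas), and it assumes the default `_x`/`_y` suffixes and the same frame layout as the code sample. It lists every outer-level label, other than the merge key, that did not receive a suffix. On an affected version such as 0.25.1 the list may be non-empty; on a version that includes the fix it should be empty.

```python
import pandas as pd


def unsuffixed_outer_labels(merged, key="id", suffixes=("_x", "_y")):
    """Return outer-level column labels (hypothetical helper) that lack a suffix."""
    outer = merged.columns.get_level_values(0)
    return sorted(
        {label for label in outer if label != key and not label.endswith(suffixes)}
    )


# Same layout as the code sample: 4 outer labels x 3 inner labels, plus "id".
index = pd.MultiIndex.from_product(
    [["a", "b", "c", "d"], ["1", "2", "3"]], names=["outer", "inner"]
)

frame_x = pd.DataFrame(columns=index)
frame_x["id"] = ""
frame_y = pd.DataFrame(columns=index)
frame_y["id"] = ""

merged = frame_x.merge(frame_y, on="id")

# Labels that should have been suffixed but were not.
missing = unsuffixed_outer_labels(merged)
print(missing)
```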

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : None
python : 3.6.9.final.0
python-bits : 64
OS : Linux
OS-release : 5.0.0-21-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 0.25.1
numpy : 1.16.5
pytz : 2019.2
dateutil : 2.8.0
pip : 19.2.2
setuptools : 41.0.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.8.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.1.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.1
sqlalchemy : None
tables : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None

nrebena added a commit to nrebena/pandas that referenced this issue Oct 1, 2019
* test_lexsort_depth verifies that lexsort_depth returns the correct depth
when sortorder is passed to the MultiIndex constructor
* test_raise_invalid_sortorder tests that the MultiIndex constructor
raises when passed an incorrect sortorder
* test_merge_multiindex_columns tests the original issue
nrebena added a commit to nrebena/pandas that referenced this issue Oct 1, 2019
This fixes issue pandas-dev#28518, where the labels of the merged index were invalid
due to the inconsistent lexsort_depth property of the intersection of the
indexes
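
The sortorder check described in these commits can be illustrated roughly as follows. This is a sketch assuming a pandas version that includes the fix (the issue is milestoned for 1.0); the levels and codes are made-up illustration data. A `sortorder` that matches the actual ordering of the codes is accepted, while one that claims more sorted levels than the codes really have is expected to raise `ValueError`.

```python
import pandas as pd

levels = [[0, 1], [0, 1, 2]]

# Codes sorted on both levels: a claimed sortorder of 2 is consistent.
sorted_codes = [[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]]
# Second level is out of order within the last group: only 1 level is sorted.
unsorted_codes = [[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 2, 1]]

ok = pd.MultiIndex(levels=levels, codes=sorted_codes, sortorder=2)

try:
    pd.MultiIndex(levels=levels, codes=unsorted_codes, sortorder=2)
    rejected = False
except ValueError as exc:
    # Assumes the constructor validates sortorder against the real sort depth.
    rejected = True
    print(exc)
```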
@jreback jreback added Bug MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Oct 2, 2019
@jreback jreback added this to the 1.0 milestone Oct 2, 2019