BUG: Joining single-index DataFrame to multiindex DF incorrect for how=left and how=right #10741

Closed
warmlogic opened this issue Aug 3, 2015 · 4 comments · Fixed by #10716
Labels: Bug; Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
Milestone: 0.17.0

Comments

@warmlogic

When joining DataFrames where the calling frame is a multiindex DF and the input frame is a single-index DF, how='left' and how='right' produce results that should be swapped (i.e., 'left' returns what 'right' should return, and vice versa). To give a single example using how='left':

import pandas as pd

# Calling frame: MultiIndex on ('first', 'second')
df1 = pd.DataFrame([['a', 'x', 0.471780], ['a', 'y', 0.774908], ['a', 'z', 0.563634],
                    ['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
                    ['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
# Input frame: single index on 'first', with no entry for 'c'
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value2']).set_index(['first'])

print(df1.join(df2, how='left'))

Expected (for how='left'):

                value1  value2
first second                  
a     x       0.471780      10
      y       0.774908      10
      z       0.563634      10
b     x      -0.353756      20
      y       0.368062      20
      z      -1.721840      20
c     x       1.000000     NaN
      y       2.000000     NaN
      z       3.000000     NaN

Actual (for how='left'):

                value1  value2
first second                  
a     x       0.471780      10
      y       0.774908      10
      z       0.563634      10
b     x      -0.353756      20
      y       0.368062      20
      z      -1.721840      20

However, the correct behavior occurs if the single-index DF is the calling frame and the multiindex DF is the input frame (df2.join(df1, how='left')). Behavior for how='inner' and how='outer' is correct in both situations.
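
The expectation above can be written as a quick check. This is only a sketch: the names `left_joined` and `inner_joined` are illustrative and not part of the original report, and the assumption is that a pandas version containing the fix is installed. On the affected version (0.16.2), the row-count check for how='left' would be expected to fail, since the 'c' rows are dropped.

```python
import pandas as pd

# Same frames as in the report above
df1 = pd.DataFrame(
    [['a', 'x', 0.471780], ['a', 'y', 0.774908], ['a', 'z', 0.563634],
     ['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
     ['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3]],
    columns=['first', 'second', 'value1'],
).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value2']).set_index(['first'])

left_joined = df1.join(df2, how='left')
inner_joined = df1.join(df2, how='inner')

# how='left' should keep every row of the calling frame ...
print(len(left_joined) == len(df1))
# ... and fill value2 with NaN for the 'c' rows, which have no match in df2
print(left_joined['value2'].xs('c', level='first').isna().all())
# how='inner' should keep only the rows whose 'first' key exists in both frames
print(sorted(set(inner_joined.index.get_level_values('first'))))
```

Comparing row counts and the NaN pattern, rather than the printed frame, keeps the check independent of display formatting.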

pandas version 0.16.2
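
Until a fixed version is available, one possible workaround is to move the index levels into columns and merge on the shared key. This is a sketch under assumptions: it was not part of the original report, it has not been verified against 0.16.2 specifically, and the name `workaround` is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    [['a', 'x', 0.471780], ['a', 'y', 0.774908], ['a', 'z', 0.563634],
     ['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
     ['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3]],
    columns=['first', 'second', 'value1'],
).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value2']).set_index(['first'])

# Turn the index levels back into ordinary columns, do a column-based
# left merge on the shared key 'first', then restore the MultiIndex.
workaround = (
    df1.reset_index()
    .merge(df2.reset_index(), on='first', how='left')
    .set_index(['first', 'second'])
)
print(workaround)
```

This sidesteps the index-level join path entirely, at the cost of two extra index round trips.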

@jreback
Contributor

jreback commented Aug 3, 2015

I think this is the same as #10665 (though this is manifesting at a higher level)
PR #10716 should fix this

@sinhrks can you confirm (and add this as an additional closer / test)?

jreback added the Bug and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Aug 3, 2015
jreback added this to the 0.17.0 milestone Aug 3, 2015

@warmlogic
Author

Thanks, those do look the same/related. I searched before posting but didn't find them. Cheers!

@jreback
Contributor

jreback commented Aug 3, 2015

@warmlogic np, lots of issues outstanding!

@sinhrks
Member

sinhrks commented Aug 4, 2015

@warmlogic Thanks for the report. As @jreback said, #10716 can solve the issue. I've added your test case to the PR.
