BUG Merge not behaving correctly when having MultiIndex with a single level #53215

Merged 5 commits on May 16, 2023
Changes from 1 commit
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.1.0.rst
@@ -421,6 +421,7 @@ Groupby/resample/rolling
 Reshaping
 ^^^^^^^^^
 - Bug in :meth:`DataFrame.agg` and :meth:`Series.agg` on non-unique columns would return incorrect type when dist-like argument passed in (:issue:`51099`)
+- Bug in :meth:`DataFrame.merge` not merging correctly when having ``MultiIndex`` with single level (:issue:`52331`)
 - Bug in :meth:`DataFrame.stack` losing extension dtypes when columns is a :class:`MultiIndex` and frame contains mixed dtypes (:issue:`45740`)
 - Bug in :meth:`DataFrame.transpose` inferring dtype for object column (:issue:`51546`)
 - Bug in :meth:`Series.combine_first` converting ``int64`` dtype to ``float64`` and losing precision on very large integers (:issue:`51764`)
17 changes: 4 additions & 13 deletions pandas/core/reshape/merge.py
@@ -2261,23 +2261,14 @@ def _get_no_sort_one_missing_indexer(
 def _left_join_on_index(
     left_ax: Index, right_ax: Index, join_keys, sort: bool = False
 ) -> tuple[Index, npt.NDArray[np.intp] | None, npt.NDArray[np.intp]]:
-    if len(join_keys) > 1:
-        if not (
-            isinstance(right_ax, MultiIndex) and len(join_keys) == right_ax.nlevels
-        ):
-            raise AssertionError(
-                "If more than one join key is given then "
-                "'right_ax' must be a MultiIndex and the "
-                "number of join keys must be the number of levels in right_ax"
-            )
-
+    if isinstance(right_ax, MultiIndex):
         left_indexer, right_indexer = _get_multiindex_indexer(
             join_keys, right_ax, sort=sort
         )
     else:
-        jkey = join_keys[0]
-
-        left_indexer, right_indexer = _get_single_indexer(jkey, right_ax, sort=sort)
+        left_indexer, right_indexer = _get_single_indexer(
+            join_keys[0], right_ax, sort=sort
+        )

     if sort or len(left_ax) != len(left_indexer):
         # if asked to sort or there are 1-to-many matches
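
The hunk above changes which condition selects the indexer helper: previously the number of join keys decided, now the type of `right_ax` does. The following is only a toy model of that branch selection, written to make the affected case visible. The function names `old_branch` and `new_branch` and the `right_nlevels` parameter are hypothetical, and nothing here calls pandas.

```python
from typing import Optional


def old_branch(n_keys: int, right_nlevels: Optional[int]) -> str:
    """Toy model of the branch taken before the change.

    right_nlevels is None for a flat Index, or the number of levels
    for a MultiIndex (an assumption of this sketch, not a pandas API).
    """
    if n_keys > 1:
        # Mirrors the removed assertion: several keys require a MultiIndex
        # whose level count equals the number of keys.
        if right_nlevels is None or n_keys != right_nlevels:
            raise AssertionError("join keys must match the MultiIndex levels")
        return "multiindex"
    return "single"


def new_branch(n_keys: int, right_nlevels: Optional[int]) -> str:
    """Toy model of the branch taken after the change."""
    if right_nlevels is not None:
        return "multiindex"
    return "single"


# One join key against a single-level MultiIndex (the GH 52331 case):
# only here do the two versions pick different helpers.
print(old_branch(1, 1), "->", new_branch(1, 1))
# Several keys against a matching MultiIndex, and one key against a flat
# Index, are routed the same way before and after.
print(old_branch(2, 2), "->", new_branch(2, 2))
print(old_branch(1, None), "->", new_branch(1, None))
```

In this model the single-key, single-level case is the only input on which the two versions disagree, which matches the scope stated in the whatsnew entry.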
14 changes: 14 additions & 0 deletions pandas/tests/reshape/merge/test_merge.py
@@ -2773,3 +2773,17 @@ def test_merge_arrow_and_numpy_dtypes(dtype):
     result = df2.merge(df)
     expected = df2.copy()
     tm.assert_frame_equal(result, expected)
+
+
+def test_merge_multiindex_single_level():
+    # Non-regression test for GH #52331
+    # Merge on MultiIndex with single level
+    df = DataFrame({"col": ["A", "B"]})
+    df2 = DataFrame(
+        data={"b": [100]},
+        index=MultiIndex.from_tuples([("A",), ("C",)], names=["col"]),
+    )
+    expected = DataFrame({"col": ["A", "B"], "b": [100, np.nan]})
+
+    result = df.merge(df2, left_on=["col"], right_index=True, how="left")
+    tm.assert_frame_equal(result, expected)
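
For readers on a pandas version without this fix, one possible workaround is to flatten the single-level `MultiIndex` to a plain `Index` before merging. The sketch below assumes pandas and NumPy are installed. The names `left`, `right`, and `right_flat` are illustrative, and the flattening step is a suggestion of this note, not part of the PR.

```python
import numpy as np
import pandas as pd

left = pd.DataFrame({"col": ["A", "B"]})

# Right frame indexed by a MultiIndex that has exactly one level.
right = pd.DataFrame(
    {"b": [100, 200]},
    index=pd.MultiIndex.from_tuples([("A",), ("C",)], names=["col"]),
)

# Workaround (an assumption of this sketch): replace the single-level
# MultiIndex with the plain Index of its only level.
right_flat = right.copy()
right_flat.index = right.index.get_level_values(0)

flat_result = left.merge(right_flat, left_on="col", right_index=True, how="left")
print(flat_result)

# On versions that include the fix, merging `right` directly is expected
# to give the same values:
# left.merge(right, left_on=["col"], right_index=True, how="left")
```

Row "A" picks up the matching value and row "B" has no match, so its `b` entry is missing.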