Commit 43bff73

ENH: improve error reporting for dup columns in merge_asof (#50150)
1 parent e476938 commit 43bff73

3 files changed: +18 -2 lines changed

doc/source/whatsnew/v2.0.0.rst

+1
@@ -85,6 +85,7 @@ Other enhancements
 - :func:`timedelta_range` now supports a ``unit`` keyword ("s", "ms", "us", or "ns") to specify the desired resolution of the output index (:issue:`49824`)
 - :meth:`DataFrame.to_json` now supports a ``mode`` keyword with supported inputs 'w' and 'a'. Defaulting to 'w', 'a' can be used when lines=True and orient='records' to append record oriented json lines to an existing json file. (:issue:`35849`)
 - Added ``name`` parameter to :meth:`IntervalIndex.from_breaks`, :meth:`IntervalIndex.from_arrays` and :meth:`IntervalIndex.from_tuples` (:issue:`48911`)
+- Improved error message for :func:`merge_asof` when join-columns were duplicated (:issue:`50102`)
 - Added :meth:`Index.infer_objects` analogous to :meth:`Series.infer_objects` (:issue:`50034`)
 - Added ``copy`` parameter to :meth:`Series.infer_objects` and :meth:`DataFrame.infer_objects`, passing ``False`` will avoid making copies for series or columns that are already non-object or where no better dtype can be inferred (:issue:`50096`)
 - :meth:`DataFrame.plot.hist` now recognizes ``xlabel`` and ``ylabel`` arguments (:issue:`49793`)

pandas/core/reshape/merge.py

+2 -2
@@ -1933,7 +1933,7 @@ def _validate_left_right_on(self, left_on, right_on):
                 lo_dtype = left_on_0.dtype
             else:
                 lo_dtype = (
-                    self.left[left_on_0].dtype
+                    self.left._get_label_or_level_values(left_on_0).dtype
                     if left_on_0 in self.left.columns
                     else self.left.index.get_level_values(left_on_0)
                 )
@@ -1946,7 +1946,7 @@ def _validate_left_right_on(self, left_on, right_on):
                 ro_dtype = right_on_0.dtype
             else:
                 ro_dtype = (
-                    self.right[right_on_0].dtype
+                    self.right._get_label_or_level_values(right_on_0).dtype
                     if right_on_0 in self.right.columns
                     else self.right.index.get_level_values(right_on_0)
                 )
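
Note on the change above: the following is a minimal sketch, not part of the commit, of why swapping plain indexing for `_get_label_or_level_values` should improve the error. It assumes a pandas version that still ships this private helper; being internal API, its behavior may differ between releases. The frame mirrors the one used in the new test.

```python
import pandas as pd

# Frame with a duplicated column label, mirroring the commit's test case.
left = pd.DataFrame([[1, 2, "a"]], columns=["a", "a", "left_val"])

# Plain indexing on a duplicated label returns a DataFrame, not a Series,
# so there is no single ``.dtype`` attribute to read from the result.
selected = left["a"]
has_dtype = hasattr(selected, "dtype")

# The private helper used in the diff raises a ValueError that names the
# offending label. It is internal pandas API (an assumption that it is
# present and unchanged in the installed version).
try:
    left._get_label_or_level_values("a")
    message = None
except ValueError as err:
    message = str(err)

print(type(selected).__name__, has_dtype)
print(message)
```

With the old `self.left[left_on_0].dtype` expression, the missing attribute on the intermediate DataFrame would surface as an error unrelated to the real cause, whereas the helper reports the non-unique label directly.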

pandas/tests/reshape/merge/test_merge_asof.py

+15
@@ -1567,3 +1567,18 @@ def test_merge_asof_array_as_on():
         }
     )
     tm.assert_frame_equal(result, expected)
+
+
+def test_merge_asof_raise_for_duplicate_columns():
+    # GH#50102
+    left = pd.DataFrame([[1, 2, "a"]], columns=["a", "a", "left_val"])
+    right = pd.DataFrame([[1, 1, 1]], columns=["a", "a", "right_val"])
+
+    with pytest.raises(ValueError, match="column label 'a'"):
+        merge_asof(left, right, on="a")
+
+    with pytest.raises(ValueError, match="column label 'a'"):
+        merge_asof(left, right, left_on="a", right_on="right_val")
+
+    with pytest.raises(ValueError, match="column label 'a'"):
+        merge_asof(left, right, left_on="left_val", right_on="a")
