     If specified, checks if merge is of specified type.
-    * "one_to_one" or "1:1": check if merge keys are unique in both
-      left and right dataset.
-    * "one_to_many" or "1:m": check if merge keys are unique in left
-      dataset.
-    * "many_to_one" or "m:1": check if merge keys are unique in right
-      dataset.
+
+    * "one_to_one" or "1:1": checks if merge keys are unique in both
+      left and right datasets.
+    * "one_to_many" or "1:m": checks if merge keys are unique in left
+      dataset.
+    * "many_to_one" or "m:1": checks if merge keys are unique in right
+      dataset.
+    * "many_to_many" or "m:m": allowed, but does not result in checks.
+

     .. versionadded:: 0.21.0

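The uniqueness conditions listed in the docstring above can be approximated by hand. The following is a minimal sketch, not the pandas implementation: `keys_unique` is a hypothetical helper introduced here for illustration, and the two frames are the ones used in the documentation example.

```python
import pandas as pd

left = pd.DataFrame({"A": [1, 2], "B": [1, 2]})
right = pd.DataFrame({"A": [4, 5, 6], "B": [2, 2, 2]})


def keys_unique(df, on):
    """Hypothetical helper: True if the key column(s) `on` hold no duplicates."""
    return not df.duplicated(subset=on).any()


left_unique = keys_unique(left, "B")
right_unique = keys_unique(right, "B")

# Which ``validate`` options these frames would be expected to satisfy,
# following the conditions described in the docstring.
checks = {
    "one_to_one": left_unique and right_unique,
    "one_to_many": left_unique,
    "many_to_one": right_unique,
    "many_to_many": True,  # allowed, no check performed
}
print(checks)
```

Because only the right frame repeats a key value, the sketch reports that `one_to_many` holds while `one_to_one` and `many_to_one` do not.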
@@ -740,22 +743,45 @@ Users can use the ``validate`` argument to automatically check whether there are
 In the following example, there are duplicate values of ``B`` in the right DataFrame. As this is not a one-to-one merge -- as specified in the ``validate`` argument -- an exception will be raised.

-.. code-block:: python
+
+.. ipython:: python

    left = pd.DataFrame({'A' : [1,2], 'B' : [1, 2]})
    right = pd.DataFrame({'A' : [4,5,6], 'B': [2, 2, 2]})
-   result = pd.merge(left, right, on='B', how='outer', validate="one_to_one");
-
-   ValueError: Merge keys are not unique in either left or right dataset; not a one-to-one merge
-

-If the user is aware of the duplicates in the right `DataFrame` but wants to ensure there are no duplicates in the left DataFrame, one can use the `one_to_many` argument instead, which will not raise an exception.
+.. code-block:: python

-.. ipython:: python
-   :suppress:
+   result = pd.merge(left, right, on='B', how='outer', validate="one_to_one")
+   /Users/Nick/github/pandas/pandas/core/reshape/merge.py in _validate(self, validate)
+       987 " not a one-to-one merge")
+       988 elif not right_unique:
+   --> 989 raise ValueError("Merge keys are not unique in right dataset;"
+       990 " not a one-to-one merge")
+       991
+
+   ValueError: Merge keys are not unique in right dataset; not a one-to-one merge

-   left = pd.DataFrame({'A' : [1,2], 'B' : [1, 2]})
-   right = pd.DataFrame({'A' : [4,5,6], 'B': [2, 2, 2]})
+If the user is aware of the duplicates in the right `DataFrame` but wants to ensure there are no duplicates in the left DataFrame, one can use the `one_to_many` argument instead, which will not raise an exception.
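The behaviour this documentation change describes can be reproduced with a short, self-contained script. This is a sketch under stated assumptions: it catches `ValueError` because that is the exception shown in the traceback above, and newer pandas releases may raise a more specific subclass. The flag `one_to_one_failed` and the variable `message` are names introduced here for illustration only.

```python
import pandas as pd

left = pd.DataFrame({"A": [1, 2], "B": [1, 2]})
right = pd.DataFrame({"A": [4, 5, 6], "B": [2, 2, 2]})

# The right frame repeats B == 2, so a one-to-one validation should fail.
one_to_one_failed = False
message = None
try:
    pd.merge(left, right, on="B", how="outer", validate="one_to_one")
except ValueError as exc:
    one_to_one_failed = True
    message = str(exc)

print(one_to_one_failed, message)

# Keys in the left frame are unique, so one-to-many validation passes.
result = pd.merge(left, right, on="B", how="outer", validate="one_to_many")
print(result)
```

With `validate="one_to_many"` only the left keys are checked, so the duplicated `B` values in the right frame are accepted and the outer merge proceeds.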