tweak left_join_key and docs

Nick Eubank · Nick Eubank · commit f3e565da04a2 · 2017-05-07T16:46:38.000-07:00
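
Note on the merge.py change: the new check builds a MultiIndex from the join-key arrays and tests `is_unique`, so uniqueness is evaluated on row-wise combinations of key values rather than on each key column separately. The sketch below is a plain-Python analogue of that idea, not pandas' implementation; `keys_are_unique` is a hypothetical helper name, and it assumes all key arrays have the same length.

```python
from typing import Hashable, Sequence


def keys_are_unique(*key_arrays: Sequence[Hashable]) -> bool:
    """Return True if no row-wise combination of key values repeats.

    Hypothetical helper for illustration only. Each positional argument
    is one join-key column; rows are formed by zipping the columns.
    """
    rows = list(zip(*key_arrays))
    return len(set(rows)) == len(rows)


# Single key, mirroring the documentation example: column B.
left_b = [1, 2]
right_b = [2, 2, 2]
left_unique = keys_are_unique(left_b)
right_unique = keys_are_unique(right_b)

# A one-to-one merge needs both sides unique; one-to-many only the left.
one_to_one_ok = left_unique and right_unique
one_to_many_ok = left_unique

# Two keys: each column repeats a value, but the pairs are distinct.
pairs_unique = keys_are_unique([1, 1, 2], [1, 2, 1])
```

With the documentation's data, the one-to-one check fails because the right side repeats `2`, while the one-to-many check passes. The two-key case shows why combining keys matters: a column-by-column test would report duplicates even though every key pair is distinct.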
diff --git a/doc/source/merging.rst b/doc/source/merging.rst
@@ -729,26 +729,37 @@ Here is another example with duplicate join keys in DataFrames:
 
   Joining / merging on duplicate keys can cause a returned frame that is the multiplication of the row dimensions, which may result in memory overflow. It is the user's responsibility to manage duplicate values in keys before joining large DataFrames.
 
+.. _merging.validation:
+
 Checking for duplicate keys
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
+.. versionadded:: 0.21.0
+
 Users can use the ``validate`` argument to automatically check whether there are unexpected duplicates in their merge keys. Key uniqueness is checked before merge operations and so should protect against memory overflows. Checking key uniqueness is also a good way to ensure user data structures are as expected. 
 
 In the following example, there are duplicate values of ``B`` in the right DataFrame. As this is not a one-to-one merge -- as specified in the ``validate`` argument -- an exception will be raised.
 
-.. ipython:: python
-
-   left = pd.DataFrame({'A' : [1,2], 'B' : [1, 2]})
+.. code-block:: python
 
-   right = pd.DataFrame({'A' : [4,5,6], 'B': [2, 2, 2]})
+  left = pd.DataFrame({'A' : [1,2], 'B' : [1, 2]})
+  right = pd.DataFrame({'A' : [4,5,6], 'B': [2, 2, 2]})
+  result = pd.merge(left, right, on='B', how='outer', validate="one_to_one")
+  
+  ValueError: Merge keys are not unique in either left or right dataset; not a one-to-one merge
 
-   result = pd.merge(left, right, on='B', how='outer', validate="one_to_one");
 
 If the user is aware of the duplicates in the right ``DataFrame`` but wants to ensure there are no duplicates in the left DataFrame, one can use the ``validate='one_to_many'`` argument instead, which will not raise an exception. 
 
+.. ipython:: python
+   :suppress:
+
+   left = pd.DataFrame({'A' : [1,2], 'B' : [1, 2]})
+   right = pd.DataFrame({'A' : [4,5,6], 'B': [2, 2, 2]})
+
 .. ipython:: python
 
-   result = pd.merge(left, right, on='B', how='outer', validate="one_to_many")
+   pd.merge(left, right, on='B', how='outer', validate="one_to_many")
 
 
 .. _merging.indicator:
diff --git a/pandas/core/reshape/merge.py b/pandas/core/reshape/merge.py
@@ -977,20 +977,18 @@ def _validate_specification(self):
 
     def _validate(self, validate):
 
-        # Get axes
-        left_key = self.left_on if self.left_on is not None else self.on
-        right_key = self.right_on if self.right_on is not None else self.on
-
         # Check uniqueness of each
         if self.left_index:
             left_unique = not (self.orig_left.index.duplicated()).any()
         else:
-            left_unique = not (self.orig_left[left_key].duplicated()).any()
+            left_unique = MultiIndex.from_arrays(self.left_join_keys
+                                                 ).is_unique
 
         if self.right_index:
             right_unique = not (self.orig_right.index.duplicated()).any()
         else:
-            right_unique = not (self.orig_right[right_key].duplicated()).any()
+            right_unique = MultiIndex.from_arrays(self.right_join_keys
+                                                  ).is_unique
 
         # Check valid arg
         if validate not in ['one_to_one', '1:1',