DOC: merge docs
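
The new ``merging.dtypes`` section notes that ``category`` join keys survive a merge only when both sides carry exactly the same categorical dtype. A minimal sketch of that behavior follows. It assumes a recent pandas where ``pd.CategoricalDtype`` is available, rather than the ``astype('category', categories=...)`` spelling used in the diff. The frame names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Assumption: pd.CategoricalDtype is available (newer pandas than the
# release this commit documents). All names below are illustrative only.
cat = pd.CategoricalDtype(categories=["foo", "bar"])

left = pd.DataFrame({
    "X": pd.Series(["foo", "bar", "foo"], dtype=cat),
    "Y": ["one", "two", "three"],
})
right = pd.DataFrame({
    "X": pd.Series(["foo", "bar"], dtype=cat),
    "Z": [1, 2],
})

# Both frames share the same categorical dtype on the join key X,
# so the merged key column is expected to stay categorical.
same = pd.merge(left, right, how="outer")

# Give the right-hand key a different set of categories.
other = pd.CategoricalDtype(categories=["foo", "bar", "baz"])
right_other = right.assign(X=right["X"].astype(other))

# The categorical dtypes no longer match, so the merged key column
# is expected to fall back to a non-categorical dtype.
mixed = pd.merge(left, right_other, how="outer")

print(same.dtypes)
print(mixed.dtypes)
```

Comparing the two printed ``dtypes`` results shows whether the key column ``X`` kept its categorical dtype in each case.
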

jreback · jreback · commit 16e2fbead359 · 2017-03-10T17:06:14.000-05:00
diff --git a/doc/source/categorical.rst b/doc/source/categorical.rst
@@ -646,6 +646,9 @@ In this case the categories are not the same and so an error is raised:
 
 The same applies to ``df.append(df_different)``.
 
+See also the section on :ref:`merge dtypes<merging.dtypes>` for notes on dtype preservation and performance when merging.
+
+
 .. _categorical.union:
 
 Unioning
diff --git a/doc/source/merging.rst b/doc/source/merging.rst
@@ -746,6 +746,79 @@ The ``indicator`` argument will also accept string arguments, in which case the
    pd.merge(df1, df2, on='col1', how='outer', indicator='indicator_column')
 
 
+.. _merging.dtypes:
+
+Merge Dtypes
+~~~~~~~~~~~~
+
+.. versionadded:: 0.19.0
+
+Merging will preserve the dtype of the join keys.
+
+.. ipython:: python
+
+   df1 = pd.DataFrame({'key': [1], 'v1': [10]})
+   df1
+   df2 = pd.DataFrame({'key': [1, 2], 'v1': [20, 30]})
+   df2
+
+With no ``on`` given, both shared columns act as join keys, and their dtypes are preserved.
+
+.. ipython:: python
+
+   pd.merge(df1, df2, how='outer')
+   pd.merge(df1, df2, how='outer').dtypes
+
+Of course, if the merge introduces missing values, the affected column
+will be upcast.
+
+.. ipython:: python
+
+   pd.merge(df1, df2, how='outer', on='key')
+   pd.merge(df1, df2, how='outer', on='key').dtypes
+
+.. versionadded:: 0.20.0
+
+Merging will preserve ``category`` dtypes of the mergands.
+
+The left frame.
+
+.. ipython:: python
+
+   X = pd.Series(np.random.choice(['foo', 'bar'], size=(10,)))
+   X = X.astype('category', categories=['foo', 'bar'])
+
+   left = pd.DataFrame({'X': X,
+                        'Y': np.random.choice(['one', 'two', 'three'], size=(10,))})
+   left
+   left.dtypes
+
+The right frame.
+
+.. ipython:: python
+
+   right = pd.DataFrame({'X': pd.Series(['foo', 'bar']).astype('category', categories=['foo', 'bar']),
+                         'Z': [1, 2]})
+   right
+   right.dtypes
+
+The merged result.
+
+.. ipython:: python
+
+   result = pd.merge(left, right, how='outer')
+   result
+   result.dtypes
+
+.. note::
+
+   The category dtypes must be *exactly* the same, meaning the same categories and the same ``ordered`` attribute.
+   Otherwise the merged key column will be coerced to ``object`` dtype.
+
+.. note::
+
+   Merging on ``category`` dtypes that are the same can be considerably faster than merging on ``object`` dtype.
+
 .. _merging.join.index:
 
 Joining on index