TomAugspurger
diff --git a/doc/source/advanced.rst b/doc/source/advanced.rst
+1-1
diff --git a/doc/source/categorical.rst b/doc/source/categorical.rst
+70-8
diff --git a/doc/source/merging.rst b/doc/source/merging.rst
+5-3
diff --git a/doc/source/whatsnew/v0.21.0.txt b/doc/source/whatsnew/v0.21.0.txt
+24
diff --git a/pandas/core/api.py b/pandas/core/api.py
+1
@@ -654,7 +654,7 @@ setting the index of a ``DataFrame/Series`` with a ``category`` dtype would conv
 
    df = pd.DataFrame({'A': np.arange(6),
                       'B': list('aabbca')})
-   df['B'] = df['B'].astype('category', categories=list('cab'))
+   df['B'] = df['B'].astype(pd.CategoricalDtype(list('cab')))
    df
    df.dtypes
    df.B.cat.categories
 
@@ -96,12 +96,19 @@ By passing a :class:`pandas.Categorical` object to a `Series` or assigning it to
     df["B"] = raw_cat
     df
 
-You can also specify differently ordered categories or make the resulting data ordered, by passing these arguments to ``astype()``:
+In the examples above where we passed ``dtype='category'``, we used the default behavior:
+
+1. Categories are inferred from the data.
+2. Categories are unordered.
+
+To control those behaviors, instead of passing ``'category'``, use an instance
+of :class:`CategoricalDtype`.
 
 .. ipython:: python
 
-    s = pd.Series(["a","b","c","a"])
-    s_cat = s.astype("category", categories=["b","c","d"], ordered=False)
+    s = pd.Series(["a", "b", "c", "a"])
+    cat_type = pd.CategoricalDtype(categories=["b", "c", "d"], ordered=False)
+    s_cat = s.astype(cat_type)
     s_cat
 
 Categorical data has a specific ``category`` :ref:`dtype <basics.dtypes>`:
@@ -140,6 +147,61 @@ constructor to save the factorize step during normal constructor mode:
     splitter = np.random.choice([0,1], 5, p=[0.5,0.5])
     s = pd.Series(pd.Categorical.from_codes(splitter, categories=["train", "test"]))
 
+CategoricalDtype
+----------------
+
+.. versionchanged:: 0.21.0
+
+A categorical's type is fully described by 1.) its categories (an iterable with
+unique values and no missing values), and 2.) its orderedness (a boolean).
+This information can be stored in a :class:`~pandas.CategoricalDtype`.
+The ``categories`` argument is optional, which implies that the actual categories
+should be inferred from whatever is present in the data when the
+:class:`pandas.Categorical` is created.
+
+.. ipython:: python
+
+   pd.CategoricalDtype(['a', 'b', 'c'])
+   pd.CategoricalDtype(['a', 'b', 'c'], ordered=True)
+   pd.CategoricalDtype()
+
+A :class:`~pandas.CategoricalDtype` can be used in any place pandas expects a
+``dtype``, for example :func:`pandas.read_csv`, :func:`pandas.DataFrame.astype`,
+or the Series constructor.
+
+As a convenience, you can use the string ``'category'`` in place of a
+:class:`pandas.CategoricalDtype` when you want the default behavior of
+the categories being unordered, and equal to the set of values present in the array.
+In other words, ``dtype='category'`` is equivalent to ``dtype=pd.CategoricalDtype()``.
+
+Equality Semantics
+~~~~~~~~~~~~~~~~~~
+
+Two instances of :class:`pandas.CategoricalDtype` compare equal whenever they have
+the same categories and orderedness. When comparing two unordered categoricals, the
+order of the ``categories`` is not considered.
+
+.. ipython:: python
+
+   c1 = pd.CategoricalDtype(['a', 'b', 'c'], ordered=False)
+   # Equal, since order is not considered when ordered=False
+   c1 == pd.CategoricalDtype(['b', 'c', 'a'], ordered=False)
+   # Unequal, since the second CategoricalDtype is ordered
+   c1 == pd.CategoricalDtype(['a',  'b', 'c'], ordered=True)
+
+All instances of ``CategoricalDtype`` compare equal to the string ``'category'``.
+
+.. ipython:: python
+
+   c1 == 'category'
+
+
+.. warning::
+
+   Since ``dtype='category'`` is essentially ``CategoricalDtype(None, False)``,
+   and since all instances of ``CategoricalDtype`` compare equal to ``'category'``,
+   all instances of ``CategoricalDtype`` compare equal to a ``CategoricalDtype(None)``.
+
 Description
 -----------
 
@@ -189,7 +251,7 @@ It's also possible to pass in the categories in a specific order:
 
     .. ipython:: python
 
-         s = pd.Series(list('babc')).astype('category', categories=list('abcd'))
+         s = pd.Series(list('babc')).astype(pd.CategoricalDtype(list('abcd')))
          s
 
          # categories
@@ -306,7 +368,7 @@ meaning and certain operations are possible. If the categorical is unordered, ``
 
     s = pd.Series(pd.Categorical(["a","b","c","a"], ordered=False))
     s.sort_values(inplace=True)
-    s = pd.Series(["a","b","c","a"]).astype('category', ordered=True)
+    s = pd.Series(["a","b","c","a"]).astype(pd.CategoricalDtype(ordered=True))
     s.sort_values(inplace=True)
     s
     s.min(), s.max()
@@ -406,9 +468,9 @@ categories or a categorical with any list-like object, will raise a TypeError.
 
 .. ipython:: python
 
-    cat = pd.Series([1,2,3]).astype("category", categories=[3,2,1], ordered=True)
-    cat_base = pd.Series([2,2,2]).astype("category", categories=[3,2,1], ordered=True)
-    cat_base2 = pd.Series([2,2,2]).astype("category", ordered=True)
+    cat = pd.Series([1,2,3]).astype(pd.CategoricalDtype([3, 2, 1], ordered=True))
+    cat_base = pd.Series([2,2,2]).astype(pd.CategoricalDtype([3, 2, 1], ordered=True))
+    cat_base2 = pd.Series([2,2,2]).astype(pd.CategoricalDtype(ordered=True))
 
     cat
     cat_base
 
@@ -831,7 +831,7 @@ The left frame.
 .. ipython:: python
 
    X = pd.Series(np.random.choice(['foo', 'bar'], size=(10,)))
-   X = X.astype('category', categories=['foo', 'bar'])
+   X = X.astype(pd.CategoricalDtype(categories=['foo', 'bar']))
 
    left = pd.DataFrame({'X': X,
                         'Y': np.random.choice(['one', 'two', 'three'], size=(10,))})
@@ -842,8 +842,10 @@ The right frame.
 
 .. ipython:: python
 
-   right = pd.DataFrame({'X': pd.Series(['foo', 'bar']).astype('category', categories=['foo', 'bar']),
-                         'Z': [1, 2]})
+   right = pd.DataFrame({
+        'X': pd.Series(['foo', 'bar'], dtype=pd.CategoricalDtype(['foo', 'bar'])),
+        'Z': [1, 2]
+   })
    right
    right.dtypes
 
 
@@ -22,6 +22,8 @@ Check the :ref:`API Changes <whatsnew_0210.api_breaking>` and :ref:`deprecations
 New features
 ~~~~~~~~~~~~
 
+- New user-facing :class:`CategoricalDtype` for specifying categoricals independent
+  of the data (:issue:`14711`, :issue:`15078`)
 - Support for `PEP 519 -- Adding a file system path protocol
   <https://www.python.org/dev/peps/pep-0519/>`_ on most readers and writers (:issue:`13823`)
 - Added ``__fspath__`` method to :class:`~pandas.HDFStore`, :class:`~pandas.ExcelFile`,
@@ -106,6 +108,28 @@ This does not permit that column to be accessed as an attribute:
 
 Both of these now raise a ``UserWarning`` about the potential for unexpected behavior. See :ref:`Attribute Access <indexing.attribute_access>`.
 
+.. _whatsnew_0210.enhancements.categorical_dtype:
+
+``CategoricalDtype`` for specifying categoricals
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:class:`CategoricalDtype` has been added to the public API and expanded to
+include the ``categories`` and ``ordered`` attributes. A ``CategoricalDtype``
+can be used to specify the set of categories and orderedness of an array,
+independent of the data themselves. This can be useful, e.g., when converting
+string data to a ``Categorical``:
+
+.. ipython:: python
+
+   s = pd.Series(['a', 'b', 'c', 'a'])  # strings
+   dtype = pd.CategoricalDtype(categories=['a', 'b', 'c', 'd'], ordered=True)
+   s.astype(dtype)
+
+The ``.dtype`` property of a ``Categorical``, ``CategoricalIndex`` or a
+``Series`` with categorical type will now return an instance of ``CategoricalDtype``.
+
+See :ref:`CategoricalDtype <categorical.categoricaldtype>` for more.
+
 .. _whatsnew_0210.enhancements.other:
 
 Other Enhancements
 
@@ -6,6 +6,7 @@
 
 from pandas.core.algorithms import factorize, unique, value_counts
 from pandas.core.dtypes.missing import isna, isnull, notna, notnull
+from pandas.core.dtypes.dtypes import CategoricalDtype
 from pandas.core.categorical import Categorical
 from pandas.core.groupby import Grouper
 from pandas.io.formats.format import set_eng_float_format
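
For anyone reading the diff, here is a minimal sketch of the pattern the doc changes move to: build a `CategoricalDtype` first, then pass it to `astype`. It assumes a pandas version that exposes `pd.CategoricalDtype` at the top level (0.21.0 or later, per the `pandas/core/api.py` change above). The variable names are illustrative only and are not part of the PR. The final state of the warning about `CategoricalDtype(None)` is deliberately not exercised, since that comparison may behave differently across pandas versions.

```python
import pandas as pd

s = pd.Series(["a", "b", "c", "a"])

# Describe the categorical type independently of the data.
cat_type = pd.CategoricalDtype(categories=["b", "c", "d"], ordered=True)
s_cat = s.astype(cat_type)

# "a" is not one of the listed categories, so those entries become missing.
n_missing = int(s_cat.isna().sum())
categories = list(s_cat.cat.categories)

# Equality semantics described in the new docs section.
c1 = pd.CategoricalDtype(["a", "b", "c"], ordered=False)
# Category order is ignored when both dtypes are unordered.
same_unordered = c1 == pd.CategoricalDtype(["b", "c", "a"], ordered=False)
# Orderedness differs, so these do not compare equal.
differs_by_order = c1 == pd.CategoricalDtype(["a", "b", "c"], ordered=True)
# Any CategoricalDtype compares equal to the string 'category'.
matches_string = c1 == "category"

print(s_cat)
print(n_missing, categories, same_unordered, differs_by_order, matches_string)
```

Building the dtype once and reusing it (for example in `astype`, the `Series` constructor, or `read_csv`) keeps the categories and orderedness consistent across columns, which is the use case the whatsnew entry describes.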