pandas-dev
diff --git a/‎doc/source/api.rst
Lines changed: 2 additions & 0 deletions b/‎doc/source/api.rst
diff --git a/‎doc/source/categorical.rst
Lines changed: 29 additions & 19 deletions b/‎doc/source/categorical.rst
diff --git a/‎doc/source/release.rst
Lines changed: 1 addition & 0 deletions b/‎doc/source/release.rst
diff --git a/‎doc/source/whatsnew/v0.16.0.txt
Lines changed: 129 additions & 0 deletions b/‎doc/source/whatsnew/v0.16.0.txt
@@ -585,6 +585,8 @@ following usable methods and properties (all available as ``Series.cat.<method_o
    Categorical.remove_categories
    Categorical.remove_unused_categories
    Categorical.set_categories
+   Categorical.as_ordered
+   Categorical.as_unordered
    Categorical.codes
 
 To create a Series of dtype ``category``, use ``cat = s.astype("category")``.
 
@@ -90,8 +90,6 @@ By using some special functions:
 See :ref:`documentation <reshaping.tile.cut>` for :func:`~pandas.cut`.
 
 By passing a :class:`pandas.Categorical` object to a `Series` or assigning it to a `DataFrame`.
-This is the only possibility to specify differently ordered categories (or no order at all) at
-creation time and the only reason to use :class:`pandas.Categorical` directly:
 
 .. ipython:: python
 
@@ -103,6 +101,14 @@ creation time and the only reason to use :class:`pandas.Categorical` directly:
     df["B"] = raw_cat
     df
 
+You can also specify differently ordered categories, or make the resulting data ordered, by passing the ``categories`` and ``ordered`` arguments to ``astype()``:
+
+.. ipython:: python
+
+    s = Series(["a","b","c","a"])
+    s_cat = s.astype("category", categories=["b","c","d"], ordered=False)
+    s_cat
+
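A minimal sketch of the same conversion, assuming a recent pandas release. In later versions the keyword form of ``astype()`` shown above is, to our knowledge, no longer accepted, and the categories and ordering travel in a ``CategoricalDtype`` object instead. The variable names are illustrative only.

```python
import pandas as pd

s = pd.Series(["a", "b", "c", "a"])

# Assumption: a recent pandas release, where the dtype object carries the
# categories and the ordered flag instead of extra astype() keywords.
dtype = pd.CategoricalDtype(categories=["b", "c", "d"], ordered=False)
s_cat = s.astype(dtype)

# "a" is not among the categories, so those entries become missing values.
print(s_cat.isna().tolist())
print(list(s_cat.cat.categories))
print(s_cat.cat.ordered)
```

Values that are absent from ``categories`` are not an error; they are converted to missing values, which is worth checking for after the conversion.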
 Categorical data has a specific ``category`` :ref:`dtype <basics.dtypes>`:
 
 .. ipython:: python
@@ -176,10 +182,9 @@ It's also possible to pass in the categories in a specific order:
     s.cat.ordered
 
 .. note::
-    New categorical data is automatically ordered if the passed in values are sortable or a
-    `categories` argument is supplied. This is a difference to R's `factors`, which are unordered
-    unless explicitly told to be ordered (``ordered=TRUE``). You can of course overwrite that by
-    passing in an explicit ``ordered=False``.
+
+    New categorical data is NOT automatically ordered. You must explicitly pass ``ordered=True`` to
+    indicate an ordered ``Categorical``.
 
 
 Renaming categories
@@ -270,29 +275,37 @@ Sorting and Order
 
 .. _categorical.sort:
 
+.. warning::
+
+   The default for construction has changed in v0.16.0 to ``ordered=False``, from the prior implicit ``ordered=True``.
+
 If categorical data is ordered (``s.cat.ordered == True``), then the order of the categories has a
-meaning and certain operations are possible. If the categorical is unordered, a `TypeError` is
-raised.
+meaning and certain operations are possible. If the categorical is unordered, ``.min()/.max()`` will raise a `TypeError`.
 
 .. ipython:: python
 
     s = Series(Categorical(["a","b","c","a"], ordered=False))
-    try:
-        s.sort()
-    except TypeError as e:
-        print("TypeError: " + str(e))
-    s = Series(["a","b","c","a"], dtype="category") # ordered per default!
+    s.sort()
+    s = Series(["a","b","c","a"]).astype('category', ordered=True)
     s.sort()
     s
     s.min(), s.max()
 
+You can set categorical data to be ordered by using ``as_ordered()`` or unordered by using ``as_unordered()``. These will by
+default return a *new* object.
+
+.. ipython:: python
+
+    s.cat.as_ordered()
+    s.cat.as_unordered()
+
 Sorting will use the order defined by categories, not any lexical order present on the data type.
 This is even true for strings and numeric data:
 
 .. ipython:: python
 
     s = Series([1,2,3,1], dtype="category")
-    s.cat.categories = [2,3,1]
+    s = s.cat.set_categories([2,3,1], ordered=True)
     s
     s.sort()
     s
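The ordering rules in this hunk can be checked with a short sketch. It assumes a recent pandas release, so it calls ``sort_values()`` rather than the in-place ``Series.sort()`` used in the diff; the variable names are illustrative only.

```python
import pandas as pd

# Unordered is the default, so min() is refused with a TypeError.
unordered = pd.Series(pd.Categorical(["a", "b", "c", "a"], ordered=False))
try:
    unordered.min()
    min_error = None
except TypeError as exc:
    min_error = exc
print(type(min_error).__name__)

# With ordered=True the category order gives min() and max() a meaning.
ordered = pd.Series(pd.Categorical(["a", "b", "c", "a"], ordered=True))
print(ordered.min(), ordered.max())

# Sorting follows the category order [2, 3, 1], not the numeric order.
nums = pd.Series([1, 2, 3, 1], dtype="category")
nums = nums.cat.set_categories([2, 3, 1], ordered=True)
print(nums.sort_values().tolist())
print(nums.min(), nums.max())
```

Note that with categories ``[2, 3, 1]`` the minimum is ``2`` and the maximum is ``1``, because both are defined by position in the category list.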
@@ -310,7 +323,7 @@ necessarily make the sort order the same as the categories order.
 .. ipython:: python
 
     s = Series([1,2,3,1], dtype="category")
-    s = s.cat.reorder_categories([2,3,1])
+    s = s.cat.reorder_categories([2,3,1], ordered=True)
     s
     s.sort()
     s
@@ -339,7 +352,7 @@ The ordering of the categorical is determined by the ``categories`` of that colu
 
 .. ipython:: python
 
-   dfs = DataFrame({'A' : Categorical(list('bbeebbaa'),categories=['e','a','b']),
+   dfs = DataFrame({'A' : Categorical(list('bbeebbaa'),categories=['e','a','b'],ordered=True),
                     'B' : [1,2,1,2,2,1,2,1] })
    dfs.sort(['A','B'])
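
A sketch of the same multi-column sort, assuming a recent pandas release where ``DataFrame.sort_values()`` is the spelling of the ``sort()`` call above. The name ``result`` is illustrative only.

```python
import pandas as pd

dfs = pd.DataFrame({
    "A": pd.Categorical(list("bbeebbaa"),
                        categories=["e", "a", "b"],
                        ordered=True),
    "B": [1, 2, 1, 2, 2, 1, 2, 1],
})

# Column A sorts by category order (e, then a, then b); ties are broken by B.
result = dfs.sort_values(["A", "B"])
print(result)
```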
 
@@ -664,9 +677,6 @@ The following differences to R's factor functions can be observed:
 
 * R's `levels` are named `categories`
 * R's `levels` are always of type string, while `categories` in pandas can be of any dtype.
-* New categorical data is automatically ordered if the passed in values are sortable or a
-  `categories` argument is supplied. This is a difference to R's `factors`, which are unordered
-  unless explicitly told to be ordered (``ordered=TRUE``).
 * It's not possible to specify labels at creation time. Use ``s.cat.rename_categories(new_labels)``
   afterwards.
 * In contrast to R's `factor` function, using categorical data as the sole input to create a
 
@@ -59,6 +59,7 @@ Highlights include:
 - ``Series.to_coo/from_coo`` methods to interact with ``scipy.sparse``, see :ref:`here <whatsnew_0160.enhancements.sparse>`
 - Backwards incompatible change to ``Timedelta`` to conform the ``.seconds`` attribute with ``datetime.timedelta``, see :ref:`here <whatsnew_0160.api_breaking.timedelta>`
 - Changes to the ``.loc`` slicing API to conform with the behavior of ``.ix`` see :ref:`here <whatsnew_0160.api_breaking.indexing>`
+- Changes to the default for ordering in the ``Categorical`` constructor, see :ref:`here <whatsnew_0160.api_breaking.categorical>`
 
 See the :ref:`v0.16.0 Whatsnew <whatsnew_0160>` overview or the issue tracker on GitHub for an extensive list
 of all API changes, enhancements and bugs that have been fixed in 0.16.0.
 
@@ -13,6 +13,7 @@ users upgrade to this version.
   * ``Series.to_coo/from_coo`` methods to interact with ``scipy.sparse``, see :ref:`here <whatsnew_0160.enhancements.sparse>`
   * Backwards incompatible change to ``Timedelta`` to conform the ``.seconds`` attribute with ``datetime.timedelta``, see :ref:`here <whatsnew_0160.api_breaking.timedelta>`
   * Changes to the ``.loc`` slicing API to conform with the behavior of ``.ix`` see :ref:`here <whatsnew_0160.api_breaking.indexing>`
+  * Changes to the default for ordering in the ``Categorical`` constructor, see :ref:`here <whatsnew_0160.api_breaking.categorical>`
 
 - Check the :ref:`API Changes <whatsnew_0160.api>` and :ref:`deprecations <whatsnew_0160.deprecations>` before updating
 
@@ -367,6 +368,134 @@ API Changes
 - ``Series.describe`` for categorical data will now give counts and frequencies of 0, not ``NaN``, for unused categories (:issue:`9443`)
 
 
+Categorical Changes
+~~~~~~~~~~~~~~~~~~~
+
+.. _whatsnew_0160.api_breaking.categorical:
+
+In prior versions, a ``Categorical`` created without an explicit ordering (that is, with no ``ordered`` keyword passed) defaulted to being ordered. Going forward, the ``ordered`` keyword in the ``Categorical`` constructor defaults to ``False``, so ordering must now be explicit.
+
+Furthermore, you could previously change the ``ordered`` attribute of a Categorical by setting it directly, e.g. ``cat.ordered=True``. This is now deprecated; use ``cat.as_ordered()`` or ``cat.as_unordered()`` instead. By default these return a **new** object and do not modify the existing one. (:issue:`9347`, :issue:`9190`)
+
+Previous Behavior
+
+.. code-block:: python
+
+   In [3]: s = Series([0,1,2], dtype='category')
+
+   In [4]: s
+   Out[4]:
+   0    0
+   1    1
+   2    2
+   dtype: category
+   Categories (3, int64): [0 < 1 < 2]
+
+   In [5]: s.cat.ordered
+   Out[5]: True
+
+   In [6]: s.cat.ordered = False
+
+   In [7]: s
+   Out[7]:
+   0    0
+   1    1
+   2    2
+   dtype: category
+   Categories (3, int64): [0, 1, 2]
+
+New Behavior
+
+.. ipython:: python
+
+   s = Series([0,1,2], dtype='category')
+   s
+   s.cat.ordered
+   s = s.cat.as_ordered()
+   s
+   s.cat.ordered
+
+   # you can set in the constructor of the Categorical
+   s = Series(Categorical([0,1,2],ordered=True))
+   s
+   s.cat.ordered
+
+To make it easier to create a Series of categorical data, ``.astype()`` now accepts keywords, which are passed directly to the constructor.
+
+.. ipython:: python
+
+   s = Series(["a","b","c","a"]).astype('category',ordered=True)
+   s
+   s = Series(["a","b","c","a"]).astype('category',categories=list('abcdef'),ordered=False)
+   s
+
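When migrating code that used to assign to ``cat.ordered``, the point to watch is that ``as_ordered()`` and ``as_unordered()`` return new objects. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

s = pd.Series([0, 1, 2], dtype="category")
default_flag = s.cat.ordered  # unordered unless requested explicitly

# Both methods hand back a new Series, so the result must be bound to a name.
s_ordered = s.cat.as_ordered()
s_roundtrip = s_ordered.cat.as_unordered()

print(default_flag, s_ordered.cat.ordered, s_roundtrip.cat.ordered)

# The original Series is left untouched by the calls above.
print(s.cat.ordered)
```

Calling ``s.cat.as_ordered()`` without using the return value therefore has no visible effect on ``s``.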
+Indexing Changes
+~~~~~~~~~~~~~~~~
+
+.. _whatsnew_0160.api_breaking.indexing:
+
+The behavior of a small subset of edge cases for using ``.loc`` has changed (:issue:`8613`). Furthermore, we have improved the content of the error messages that are raised:
+
+- slicing with ``.loc`` where the start and/or stop bound is not found in the index is now allowed; this previously would raise a ``KeyError``. This makes the behavior the same as ``.ix`` in this case. This change is only for slicing, not when indexing with a single label.
+
+.. ipython:: python
+
+     df = DataFrame(np.random.randn(5,4),
+                    columns=list('ABCD'),
+                    index=date_range('20130101',periods=5))
+     df
+     s = Series(range(5),[-2,-1,1,2,3])
+     s
+
+  Previous Behavior
+
+  .. code-block:: python
+
+     In [4]: df.loc['2013-01-02':'2013-01-10']
+     KeyError: 'stop bound [2013-01-10] is not in the [index]'
+
+     In [6]: s.loc[-10:3]
+     KeyError: 'start bound [-10] is not the [index]'
+
+  New Behavior
+
+  .. ipython:: python
+
+     df.loc['2013-01-02':'2013-01-10']
+     s.loc[-10:3]
+
+- allow slicing with float-like values on an integer index for ``.ix``. Previously this was only enabled for ``.loc``:
+
+  Previous Behavior
+
+  .. code-block:: python
+
+     In [8]: s.ix[-1.0:2]
+     TypeError: the slice start value [-1.0] is not a proper indexer for this index type (Int64Index)
+
+  New Behavior
+
+  .. ipython:: python
+
+     s.ix[-1.0:2]
+
+- provide a useful exception for indexing with an invalid type for that index when using ``.loc``. For example, slicing with an integer (or a float) using ``.loc`` on an index of type ``DatetimeIndex``, ``PeriodIndex`` or ``TimedeltaIndex`` now raises a ``TypeError``:
+
+  Previous Behavior
+
+  .. code-block:: python
+
+     In [4]: df.loc[2:3]
+     KeyError: 'start bound [2] is not the [index]'
+
+  New Behavior
+
+  .. code-block:: python
+
+     In [4]: df.loc[2:3]
+     TypeError: Cannot do slice indexing on <class 'pandas.tseries.index.DatetimeIndex'> with <type 'int'> keys
+
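The ``.loc`` cases above can be sketched as follows. This assumes a recent pandas release, where ``.ix`` is no longer available and is therefore left out; deterministic data replaces ``np.random.randn`` so the results are reproducible, and the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(5, 4),
                  columns=list("ABCD"),
                  index=pd.date_range("20130101", periods=5))
s = pd.Series(range(5), index=[-2, -1, 1, 2, 3])

# Slice bounds missing from a sorted index select whatever falls in range.
late_rows = df.loc["2013-01-02":"2013-01-10"]
all_rows = s.loc[-10:3]
print(len(late_rows), all_rows.tolist())

# Integer slice bounds on a DatetimeIndex are rejected with a TypeError.
try:
    df.loc[2:3]
    loc_error = None
except TypeError as exc:
    loc_error = exc
print(type(loc_error).__name__)
```

Both indexes here are sorted; the lenient slicing should not be assumed to carry over to an unsorted index.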
+
 .. _whatsnew_0160.deprecations:
 
 Deprecations