Merge pull request pandas-dev#9058 from jorisvandenbossche/doc-fixup-0152

jorisvandenbossche · jorisvandenbossche · commit 88feb4e3a5a7 · 2014-12-11T14:41:58.000+01:00
DOC: fix-up docs for 0.15.2 release
diff --git a/doc/source/io.rst b/doc/source/io.rst
@@ -3403,7 +3403,7 @@ writes ``data`` to the database in batches of 1000 rows at a time:
     data.to_sql('data_chunked', engine, chunksize=1000)
 
 SQL data types
-""""""""""""""
+++++++++++++++
 
 :func:`~pandas.DataFrame.to_sql` will try to map your data to an appropriate
 SQL data type based on the dtype of the data. When you have columns of dtype
@@ -3801,7 +3801,7 @@ is lost when exporting.
 Labeled data can similarly be imported from *Stata* data files as ``Categorical``
 variables using the keyword argument ``convert_categoricals`` (``True`` by default).
 The keyword argument ``order_categoricals`` (``True`` by default) determines
- whether imported ``Categorical`` variables are ordered.
+whether imported ``Categorical`` variables are ordered.
 
 .. note::
 
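The first io.rst hunk above touches the section on writing in batches with ``chunksize``. A minimal sketch of that call follows, assuming an in-memory SQLite connection from the standard library in place of the SQLAlchemy ``engine`` used in the docs. The column names and row count are hypothetical, chosen only so that ``chunksize=1000`` produces more than one batch.

```python
import sqlite3

import numpy as np
import pandas as pd

# Hypothetical data: 2500 rows, so chunksize=1000 needs three batches.
data = pd.DataFrame({"Col_1": np.arange(2500), "Col_2": np.arange(2500) * 0.5})

# Assumption: a plain sqlite3 connection stands in for the docs' ``engine``.
conn = sqlite3.connect(":memory:")

# Write the frame in batches of at most 1000 rows each.
data.to_sql("data_chunked", conn, chunksize=1000, index=False)

# Confirm that every row arrived, regardless of how it was batched.
row_count = conn.execute("SELECT COUNT(*) FROM data_chunked").fetchone()[0]
print(row_count)
conn.close()
```

Batching only changes how the rows are sent, so the resulting table should be the same as one written in a single call.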
diff --git a/doc/source/release.rst b/doc/source/release.rst
@@ -50,7 +50,9 @@ pandas 0.15.2
 
 **Release date:** (December 12, 2014)
 
-This is a minor release from 0.15.1 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes.
+This is a minor release from 0.15.1 and includes a large number of bug fixes
+along with several new features, enhancements, and performance improvements.
+A small number of API changes were necessary to fix existing bugs.
 
 See the :ref:`v0.15.2 Whatsnew <whatsnew_0152>` overview for an extensive list
 of all API changes, enhancements and bugs that have been fixed in 0.15.2.
diff --git a/doc/source/whatsnew/v0.15.2.txt b/doc/source/whatsnew/v0.15.2.txt
@@ -3,9 +3,10 @@
 v0.15.2 (December 12, 2014)
 ---------------------------
 
-This is a minor release from 0.15.1 and includes a small number of API changes, several new features,
-enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
-users upgrade to this version.
+This is a minor release from 0.15.1 and includes a large number of bug fixes
+along with several new features, enhancements, and performance improvements.
+A small number of API changes were necessary to fix existing bugs.
+We recommend that all users upgrade to this version.
 
 - :ref:`Enhancements <whatsnew_0152.enhancements>`
 - :ref:`API Changes <whatsnew_0152.api>`
@@ -16,6 +17,7 @@ users upgrade to this version.
 
 API changes
 ~~~~~~~~~~~
+
 - Indexing in ``MultiIndex`` beyond lex-sort depth is now supported, though
   a lexically sorted index will have a better performance. (:issue:`2646`)
 
@@ -38,24 +40,30 @@ API changes
     df2.index.lexsort_depth
     df2.loc[(1,'z')]
 
-- Bug in concat of Series with ``category`` dtype which were coercing to ``object``. (:issue:`8641`)
-
 - Bug in unique of Series with ``category`` dtype, which returned all categories regardless
   whether they were "used" or not (see :issue:`8559` for the discussion).
+  Previous behaviour was to return all categories:
 
-- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters. ``Series.all``, ``Series.any``, ``Index.all``, and ``Index.any`` no longer support the ``out`` and ``keepdims`` parameters, which existed for compatibility with ndarray. Various index types no longer support the ``all`` and ``any`` aggregation functions and will now raise ``TypeError``. (:issue:`8302`):
+  .. code-block:: python
 
-  .. ipython:: python
+    In [3]: cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b', 'c'])
 
-     s = pd.Series([False, True, False], index=[0, 0, 1])
-     s.any(level=0)
+    In [4]: cat
+    Out[4]: 
+    [a, b, a]
+    Categories (3, object): [a < b < c]
 
-- ``Panel`` now supports the ``all`` and ``any`` aggregation functions. (:issue:`8302`):
+    In [5]: cat.unique()
+    Out[5]: array(['a', 'b', 'c'], dtype=object)
+
+  Now, only the categories that actually occur in the array are returned:
 
   .. ipython:: python
 
-     p = pd.Panel(np.random.rand(2, 5, 4) > 0.1)
-     p.all()
+    cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b', 'c'])
+    cat.unique()
+
+- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters. ``Series.all``, ``Series.any``, ``Index.all``, and ``Index.any`` no longer support the ``out`` and ``keepdims`` parameters, which existed for compatibility with ndarray. Various index types no longer support the ``all`` and ``any`` aggregation functions and will now raise ``TypeError``. (:issue:`8302`).
 
 - Allow equality comparisons of Series with a categorical dtype and object dtype; previously these would raise ``TypeError`` (:issue:`8938`)
 
@@ -90,25 +98,70 @@ API changes
 
 - ``Timestamp('now')`` is now equivalent to ``Timestamp.now()`` in that it returns the local time rather than UTC. Also, ``Timestamp('today')`` is now equivalent to ``Timestamp.today()`` and both have ``tz`` as a possible argument. (:issue:`9000`)
 
+- Fix negative step support for label-based slices (:issue:`8753`)
+
+  Old behavior:
+
+  .. code-block:: python
+
+     In [1]: s = pd.Series(np.arange(3), ['a', 'b', 'c'])
+     Out[1]:
+     a    0
+     b    1
+     c    2
+     dtype: int64
+
+     In [2]: s.loc['c':'a':-1]
+     Out[2]:
+     c    2
+     dtype: int64
+
+  New behavior:
+
+  .. ipython:: python
+
+     s = pd.Series(np.arange(3), ['a', 'b', 'c'])
+     s.loc['c':'a':-1]
+
+
 .. _whatsnew_0152.enhancements:
 
 Enhancements
 ~~~~~~~~~~~~
 
+``Categorical`` enhancements:
+
+- Added ability to export Categorical data to Stata (:issue:`8633`).  See :ref:`here <io.stata-categorical>` for limitations of categorical variables exported to Stata data files.
+- Added flag ``order_categoricals`` to ``StataReader`` and ``read_stata`` to select whether to order imported categorical data (:issue:`8836`).  See :ref:`here <io.stata-categorical>` for more information on importing categorical variables from Stata data files.
+- Added ability to export Categorical data to/from HDF5 (:issue:`7621`). Queries work the same as if it was an object array. However, the ``category`` dtyped data is stored in a more efficient manner. See :ref:`here <io.hdf5-categorical>` for an example and caveats w.r.t. prior versions of pandas.
+- Added support for ``searchsorted()`` on ``Categorical`` class (:issue:`8420`).
+
+Other enhancements:
+
 - Added the ability to specify the SQL type of columns when writing a DataFrame
   to a database (:issue:`8778`).
   For example, specifying to use the sqlalchemy ``String`` type instead of the
   default ``Text`` type for string columns:
 
-  .. code-block::
+  .. code-block:: python
 
      from sqlalchemy.types import String
      data.to_sql('data_dtype', engine, dtype={'Col_1': String})
 
-- Added ability to export Categorical data to Stata (:issue:`8633`).  See :ref:`here <io.stata-categorical>` for limitations of categorical variables exported to Stata data files.
-- Added flag ``order_categoricals`` to ``StataReader`` and ``read_stata`` to select whether to order imported categorical data (:issue:`8836`).  See :ref:`here <io.stata-categorical>` for more information on importing categorical variables from Stata data files.
-- Added ability to export Categorical data to to/from HDF5 (:issue:`7621`). Queries work the same as if it was an object array. However, the ``category`` dtyped data is stored in a more efficient manner. See :ref:`here <io.hdf5-categorical>` for an example and caveats w.r.t. prior versions of pandas.
-- Added support for ``searchsorted()`` on `Categorical` class (:issue:`8420`).
+- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters (:issue:`8302`):
+
+  .. ipython:: python
+
+     s = pd.Series([False, True, False], index=[0, 0, 1])
+     s.any(level=0)
+
+- ``Panel`` now supports the ``all`` and ``any`` aggregation functions (:issue:`8302`):
+
+  .. ipython:: python
+
+     p = pd.Panel(np.random.rand(2, 5, 4) > 0.1)
+     p.all()
+
 - Added support for ``utcfromtimestamp()``, ``fromtimestamp()``, and ``combine()`` on `Timestamp` class (:issue:`5351`).
 - Added Google Analytics (`pandas.io.ga`) basic documentation (:issue:`8835`). See :ref:`here<remote_data.ga>`.
 - ``Timedelta`` arithmetic returns ``NotImplemented`` in unknown cases, allowing extensions by custom classes (:issue:`8813`).
@@ -122,19 +175,22 @@ Enhancements
 - Added ability to read table footers to read_html (:issue:`8552`)
 - ``to_sql`` now infers datatypes of non-NA values for columns that contain NA values and have dtype ``object`` (:issue:`8778`).
 
+
 .. _whatsnew_0152.performance:
 
 Performance
 ~~~~~~~~~~~
-- Reduce memory usage when skiprows is an integer in read_csv (:issue:`8681`)
 
+- Reduce memory usage when skiprows is an integer in read_csv (:issue:`8681`)
 - Performance boost for ``to_datetime`` conversions with a passed ``format=``, and the ``exact=False`` (:issue:`8904`)
 
+
 .. _whatsnew_0152.bug_fixes:
 
 Bug Fixes
 ~~~~~~~~~
 
+- Bug in concat of Series with ``category`` dtype, which was coercing to ``object`` (:issue:`8641`)
 - Bug in Timestamp-Timestamp not returning a Timedelta type and datelike-datelike ops with timezones (:issue:`8865`)
 - Made consistent a timezone mismatch exception (either tz operated with None or incompatible timezone), will now return ``TypeError`` rather than ``ValueError`` (a couple of edge cases only), (:issue:`8865`)
 - Bug in using a ``pd.Grouper(key=...)`` with no level/axis or level only (:issue:`8795`, :issue:`8866`)
@@ -154,95 +210,32 @@ Bug Fixes
 - Bug in ``merge`` where ``how='left'`` and ``sort=False`` would not preserve left frame order (:issue:`7331`)
 - Bug in ``MultiIndex.reindex`` where reindexing at level would not reorder labels (:issue:`4088`)
 - Bug in certain operations with dateutil timezones, manifesting with dateutil 2.3 (:issue:`8639`)
-
-- Fix negative step support for label-based slices (:issue:`8753`)
-
-  Old behavior:
-
-  .. code-block:: python
-
-     In [1]: s = pd.Series(np.arange(3), ['a', 'b', 'c'])
-     Out[1]:
-     a    0
-     b    1
-     c    2
-     dtype: int64
-
-     In [2]: s.loc['c':'a':-1]
-     Out[2]:
-     c    2
-     dtype: int64
-
-  New behavior:
-
-  .. ipython:: python
-
-     s = pd.Series(np.arange(3), ['a', 'b', 'c'])
-     s.loc['c':'a':-1]
-
 - Regression in DatetimeIndex iteration with a Fixed/Local offset timezone (:issue:`8890`)
 - Bug in ``to_datetime`` when parsing a nanoseconds using the ``%f`` format (:issue:`8989`)
 - ``io.data.Options`` now raises ``RemoteDataError`` when no expiry dates are available from Yahoo and when it receives no data from Yahoo (:issue:`8761`), (:issue:`8783`).
 - Fix: The font size was only set on x axis if vertical or the y axis if horizontal. (:issue:`8765`)
 - Fixed division by 0 when reading big csv files in python 3 (:issue:`8621`)
 - Bug in outputing a Multindex with ``to_html,index=False`` which would add an extra column (:issue:`8452`)
-
-
-
-
-
-
-
 - Imported categorical variables from Stata files retain the ordinal information in the underlying data (:issue:`8836`).
-
-
-
 - Defined ``.size`` attribute across ``NDFrame`` objects to provide compat with numpy >= 1.9.1; buggy with ``np.array_split`` (:issue:`8846`)
-
-
 - Skip testing of histogram plots for matplotlib <= 1.2 (:issue:`8648`).
-
-
-
-
-
-
 - Bug where ``get_data_google`` returned object dtypes (:issue:`3995`)
-
 - Bug in ``DataFrame.stack(..., dropna=False)`` when the DataFrame's ``columns`` is a ``MultiIndex``
   whose ``labels`` do not reference all its ``levels``. (:issue:`8844`)
-
-
 - Bug in that Option context applied on ``__enter__`` (:issue:`8514`)
-
-
 - Bug in resample that causes a ValueError when resampling across multiple days
   and the last offset is not calculated from the start of the range (:issue:`8683`)
-
-
-
 - Bug where ``DataFrame.plot(kind='scatter')`` fails when checking if an np.array is in the DataFrame (:issue:`8852`)
-
-
-
 - Bug in ``pd.infer_freq/DataFrame.inferred_freq`` that prevented proper sub-daily frequency inference when the index contained DST days (:issue:`8772`).
 - Bug where index name was still used when plotting a series with ``use_index=False`` (:issue:`8558`).
 - Bugs when trying to stack multiple columns, when some (or all) of the level names are numbers (:issue:`8584`).
 - Bug in ``MultiIndex`` where ``__contains__`` returns wrong result if index is not lexically sorted or unique (:issue:`7724`)
 - BUG CSV: fix problem with trailing whitespace in skipped rows, (:issue:`8679`), (:issue:`8661`), (:issue:`8983`)
 - Regression in ``Timestamp`` does not parse 'Z' zone designator for UTC (:issue:`8771`)
-
-
-
-
-
-
 - Bug in `StataWriter` the produces writes strings with 244 characters irrespective of actual size (:issue:`8969`)
-
-
 - Fixed ValueError raised by cummin/cummax when datetime64 Series contains NaT. (:issue:`8965`)
 - Bug in Datareader returns object dtype if there are missing values (:issue:`8980`)
 - Bug in plotting if sharex was enabled and index was a timeseries, would show labels on multiple axes (:issue:`3964`).
-
 - Bug where passing a unit to the TimedeltaIndex constructor applied the to nano-second conversion twice. (:issue:`9011`).
 - Bug in plotting of a period-like array (:issue:`9012`)
+
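Two of the behaviour changes this commit moves into the API changes section can be checked with a short script. This is a sketch rather than part of the commit: it reuses the ``s`` and ``cat`` objects from the whatsnew examples, while the names ``reversed_slice``, ``labels`` and ``observed`` are introduced here for illustration. The result of ``unique()`` is wrapped in ``list`` because its return type is assumed to differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Label-based slice with a negative step (issue 8753): after the fix, the
# slice walks from 'c' back to 'a' and includes both endpoints.
s = pd.Series(np.arange(3), ['a', 'b', 'c'])
reversed_slice = s.loc['c':'a':-1]
labels = list(reversed_slice.index)
print(labels)

# unique() on categorical data (issue 8559): only values that actually
# occur are returned, so the unused category 'c' is dropped.
cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b', 'c'])
observed = list(cat.unique())
print(observed)
```

On a version that contains these fixes, the first print should list the labels in reverse order and the second should omit ``'c'``.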