
.. _whatsnew_0182:

v0.18.2 (July ??, 2016)
-----------------------

This is a minor bug-fix release from 0.18.1 and includes a large number of
bug fixes along with several new features, enhancements, and performance improvements.
We recommend that all users upgrade to this version.

Highlights include:


.. contents:: What's new in v0.18.2
    :local:
    :backlinks: none

.. _whatsnew_0182.new_features:

New features
~~~~~~~~~~~~


.. _whatsnew_0182.enhancements.other:

Other enhancements
^^^^^^^^^^^^^^^^^^

- The ``.tz_localize()`` method of ``DatetimeIndex`` and ``Timestamp`` has gained the ``errors`` keyword, so you can potentially coerce nonexistent timestamps to ``NaT``. The default behaviour remains to raise a ``NonExistentTimeError`` (:issue:`13057`)

- ``Index`` now supports ``.str.extractall()`` which returns a ``DataFrame``, see :ref:`Extract all matches in each subject (extractall) <text.extractall>` (:issue:`10008`, :issue:`13156`)

  .. ipython:: python

     idx = pd.Index(["a1a2", "b1", "c1"])
     idx.str.extractall(r"[ab](?P<digit>\d)")

- ``.to_hdf/read_hdf()`` now accept path objects (e.g. ``pathlib.Path``, ``py.path.local``) for the file path (:issue:`11773`)

- ``Timestamp`` can now accept positional and keyword parameters similar to :func:`datetime.datetime` (:issue:`10758`, :issue:`11630`)

  .. ipython:: python

    pd.Timestamp(2012, 1, 1)

    pd.Timestamp(year=2012, month=1, day=1, hour=8, minute=30)

- ``pd.read_csv()`` with ``engine='python'`` has gained support for the ``decimal`` option (:issue:`12933`)

- ``Index.astype()`` now accepts an optional boolean argument ``copy``, which controls whether a copy is made when the requirements on dtype are already satisfied (:issue:`13209`)

- ``Categorical.astype()`` now accepts an optional boolean argument ``copy``, effective when dtype is categorical (:issue:`13209`)
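
The ``decimal`` option for the Python parser engine mentioned above can be sketched as follows. This is a minimal illustration, not an excerpt from the pandas test suite; the sample data and the ``;`` separator are assumptions chosen for the example.

.. code-block:: python

   import io

   import pandas as pd

   # Hypothetical semicolon-separated data that uses a comma as the decimal mark.
   data = u"a;b\n1,5;2\n3,25;4"

   # With decimal=",", the Python engine should parse "1,5" as the float 1.5.
   df = pd.read_csv(io.StringIO(data), sep=";", decimal=",", engine="python")

   print(df["a"].tolist())
   print(df.dtypes)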

.. _whatsnew_0182.api:

API changes
~~~~~~~~~~~


- Non-convertible dates in an Excel date column will be returned without conversion and the column will be ``object`` dtype, rather than raising an exception (:issue:`10001`)
- An ``UnsupportedFunctionCall`` error is now raised if NumPy functions like ``np.mean`` are called on groupby or resample objects (:issue:`12811`)
- Calls to ``.sample()`` will respect the random seed set via ``numpy.random.seed(n)`` (:issue:`13161`)
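
How ``.sample()`` follows the global NumPy seed can be illustrated with a small sketch. The frame contents and the seed value are arbitrary assumptions; the point is only that reseeding before each call should reproduce the same draw when no ``random_state`` is passed.

.. code-block:: python

   import numpy as np
   import pandas as pd

   # Hypothetical data; any frame would do.
   df = pd.DataFrame({"x": range(10)})

   # Seed the global NumPy generator, then sample without random_state.
   np.random.seed(42)
   first = df.sample(n=3)

   # Reseeding with the same value should yield the same rows again.
   np.random.seed(42)
   second = df.sample(n=3)

   print(first.equals(second))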

.. _whatsnew_0182.api.tolist:

``Series.tolist()`` will now return Python types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``Series.tolist()`` will now return Python types in the output, mimicking NumPy ``.tolist()`` behaviour (:issue:`10904`)


.. ipython:: python

   s = pd.Series([1, 2, 3])

Previous Behavior:

.. code-block:: ipython

   In [7]: type(s.tolist()[0])
   Out[7]:
    <class 'numpy.int64'>

New Behavior:

.. ipython:: python

   type(s.tolist()[0])


.. _whatsnew_0182.api.promote:

``Series`` type promotion on assignment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A ``Series`` will now correctly promote its dtype when it is assigned values that are incompatible with its current dtype (:issue:`13234`)


.. ipython:: python

   s = pd.Series()

Previous Behavior:

.. code-block:: ipython

   In [2]: s["a"] = pd.Timestamp("2016-01-01")

   In [3]: s["b"] = 3.0
   TypeError: invalid type promotion

New Behavior:

.. ipython:: python

   s["a"] = pd.Timestamp("2016-01-01")
   s["b"] = 3.0
   s
   s.dtype

.. _whatsnew_0182.api.to_datetime_coerce:

``.to_datetime()`` when coercing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A bug is fixed in ``.to_datetime()`` when passing integers or floats with ``errors='coerce'`` and no ``unit`` (:issue:`13180`).
Previously, if ``.to_datetime()`` encountered a mix of integers/floats and strings, but no datetimes, with ``errors='coerce'``, it would convert all values to ``NaT``.

Previous Behavior:

.. code-block:: ipython

   In [2]: pd.to_datetime([1, 'foo'], errors='coerce')
   Out[2]: DatetimeIndex(['NaT', 'NaT'], dtype='datetime64[ns]', freq=None)

This will now convert integers/floats with the default unit of ``ns``.

.. ipython:: python

   pd.to_datetime([1, 'foo'], errors='coerce')

.. _whatsnew_0182.api.other:

Other API changes
^^^^^^^^^^^^^^^^^

- ``Float64Index.astype(int)`` will now raise ``ValueError`` if ``Float64Index`` contains ``NaN`` values (:issue:`13149`)
- ``TimedeltaIndex.astype(int)`` and ``DatetimeIndex.astype(int)`` will now return ``Int64Index`` instead of ``np.array`` (:issue:`13209`)
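
The ``astype(int)`` changes listed above can be sketched as follows. The index values are made up for illustration, and ``"int64"`` is used in place of ``int`` for the datetime case as an assumption to keep the example platform-independent. The exact error message and the concrete index class returned may differ between pandas versions, so the sketch only relies on the exception type and on getting an index back instead of a NumPy array.

.. code-block:: python

   import numpy as np
   import pandas as pd

   # A float index holding a missing value cannot be represented as integers.
   float_idx = pd.Index([1.0, 2.0, np.nan])
   try:
       float_idx.astype(int)
       raised = False
   except ValueError as exc:
       raised = True
       print("ValueError:", exc)

   # Datetime-like indexes convert to an integer index, not a bare array.
   dt_idx = pd.DatetimeIndex(["2016-01-01", "2016-01-02"])
   as_int = dt_idx.astype("int64")
   print(type(as_int).__name__)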

.. _whatsnew_0182.deprecations:

Deprecations
^^^^^^^^^^^^


.. _whatsnew_0182.performance:

Performance Improvements
~~~~~~~~~~~~~~~~~~~~~~~~

- Improved performance of sparse ``IntIndex.intersect`` (:issue:`13082`)
- Improved performance of sparse arithmetic with ``BlockIndex`` when the number of blocks is large, though using ``IntIndex`` is recommended in such cases (:issue:`13082`)
- Improved performance of ``DataFrame.quantile()`` as it now operates per-block (:issue:`11623`)


- Improved performance of ``DataFrameGroupBy.transform`` (:issue:`12737`)


.. _whatsnew_0182.bug_fixes:

Bug Fixes
~~~~~~~~~

- Bug in ``io.json.json_normalize()``, where non-ascii keys raised an exception (:issue:`13213`)
- Bug in ``SparseSeries`` with ``MultiIndex`` ``[]`` indexing may raise ``IndexError`` (:issue:`13144`)
- Bug in ``SparseSeries`` with ``MultiIndex`` ``[]`` indexing result may have normal ``Index`` (:issue:`13144`)
- Bug in ``SparseDataFrame`` in which ``axis=None`` did not default to ``axis=0`` (:issue:`13048`)
- Bug in ``SparseSeries`` and ``SparseDataFrame`` creation with ``object`` dtype may raise ``TypeError`` (:issue:`11633`)
- Bug when passing a non-default-indexed ``Series`` as ``xerr`` or ``yerr`` in ``.plot()`` (:issue:`11858`)
- Bug in matplotlib ``AutoDateFormatter``; this restores the second scaled formatting and re-adds micro-second scaled formatting (:issue:`13131`)


- Bug in ``.groupby(..).resample(..)`` when the same object is called multiple times (:issue:`13174`)
- Bug in ``.to_records()`` when index name is a unicode string (:issue:`13172`)

- Bug in calling ``.memory_usage()`` on an object which doesn't implement it (:issue:`12924`)

- Regression in ``Series.quantile`` with nans (also shows up in ``.median()`` and ``.describe()``); furthermore now names the ``Series`` with the quantile (:issue:`13098`, :issue:`13146`)

- Bug in ``SeriesGroupBy.transform`` with datetime values and missing groups (:issue:`13191`)

- Bug in ``Series.str.extractall()`` with ``str`` index raises ``ValueError`` (:issue:`13156`)


- Bug in ``PeriodIndex`` and ``Period`` subtraction raises ``AttributeError`` (:issue:`13071`)
- Bug in ``.resample(..)`` with a ``PeriodIndex`` not changing its ``freq`` appropriately when empty (:issue:`13067`)
- Bug in ``PeriodIndex`` construction returning a ``float64`` index in some circumstances (:issue:`13067`)
- Bug in ``.resample(..)`` with a ``PeriodIndex`` not retaining its type or name with an empty ``DataFrame`` (:issue:`13212`)


- Bug in ``MultiIndex`` slicing where extra elements were returned when level is non-unique (:issue:`12896`)


- Bug in ``Series`` arithmetic raises ``TypeError`` if it contains datetime-like as ``object`` dtype (:issue:`13043`)


- Bug in ``NaT`` - ``Period`` raises ``AttributeError`` (:issue:`13071`)
- Bug in ``Period`` addition raises ``TypeError`` if ``Period`` is on right hand side (:issue:`13069`)
- Bug in ``Period`` and ``Series`` or ``Index`` comparison raises ``TypeError`` (:issue:`13200`)
- Bug in ``pd.set_eng_float_format()`` that would prevent ``NaN`` values from being formatted (:issue:`11981`)
- Bug in ``.unstack`` with ``Categorical`` dtype resets ``.ordered`` to ``True`` (:issue:`13249`)


- Bug in ``groupby`` where ``apply`` returns different result depending on whether first result is ``None`` or not (:issue:`12824`)