doc/source/whatsnew/v0.16.0.txt

.. _whatsnew_0160:

v0.16.0 (February ??, 2015)
---------------------------

This is a major release from 0.15.2 and includes a small number of API changes, several new features,
enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
users upgrade to this version.

- Highlights include:

- Check the :ref:`API Changes <whatsnew_0160.api>` and :ref:`deprecations <whatsnew_0160.deprecations>` before updating

- :ref:`Other Enhancements <whatsnew_0160.enhancements>`

- :ref:`Performance Improvements <whatsnew_0160.performance>`

- :ref:`Bug Fixes <whatsnew_0160.bug_fixes>`

New features
~~~~~~~~~~~~

.. _whatsnew_0160.api:

Backwards incompatible API changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. _whatsnew_0160.api_breaking:

- ``Index.duplicated`` now returns `np.array(dtype=bool)` rather than `Index(dtype=object)` containing `bool` values. (:issue:`8875`)
- ``DataFrame.to_json`` now returns accurate type serialisation for each column for frames of mixed dtype (:issue:`9037`)

  Previously data was coerced to a common dtype before serialisation, which for
  example resulted in integers being serialised to floats:

  .. code-block:: python

    In [2]: pd.DataFrame({'i': [1,2], 'f': [3.0, 4.2]}).to_json()
    Out[2]: '{"f":{"0":3.0,"1":4.2},"i":{"0":1.0,"1":2.0}}'

  Now each column is serialised using its correct dtype:

  .. code-block:: python

    In [2]:  pd.DataFrame({'i': [1,2], 'f': [3.0, 4.2]}).to_json()
    Out[2]: '{"f":{"0":3.0,"1":4.2},"i":{"0":1,"1":2}}'

- ``DatetimeIndex``, ``PeriodIndex`` and ``TimedeltaIndex.summary`` now output the same format. (:issue:`9116`)
- ``TimedeltaIndex.freqstr`` now output the same string format as ``DatetimeIndex``. (:issue:`9116`)


Deprecations
~~~~~~~~~~~~

.. _whatsnew_0160.deprecations:


Enhancements
~~~~~~~~~~~~

.. _whatsnew_0160.enhancements:

- Paths beginning with ~ will now be expanded to begin with the user's home directory (:issue:`9066`)
- Added time interval selection in get_data_yahoo (:issue:`9071`)
- Added ``Series.str.slice_replace()``, which previously raised NotImplementedError (:issue:`8888`)
- Added ``Timestamp.to_datetime64()`` to complement ``Timedelta.to_timedelta64()`` (:issue:`9255`)


Performance
~~~~~~~~~~~

.. _whatsnew_0160.performance:

- Fixed a performance regression for ``.loc`` indexing with an array or list-like (:issue:`9126`:).
- ``DataFrame.to_json`` 30x performance improvement for mixed dtype frames. (:issue:`9037`)
- Performance improvements in ``MultiIndex.duplicated`` by working with labels instead of values (:issue:`9125`)
- Improved the speed of `nunique` by calling `unique` instead of `value_counts` (:issue:`9129`, :issue:`7771`)
- Performance improvement of up to 10x in ``DataFrame.count`` and ``DataFrame.dropna`` by taking advantage of homogeneous/heterogeneous dtypes appropriately (:issue:`9136`)
- Performance improvement of up to 20x in ``DataFrame.count`` when using a ``MultiIndex`` and the ``level`` keyword argument  (:issue:`9163`)
- Performance and memory usage improvements in ``merge`` when key space exceeds ``int64`` bounds (:issue:`9151`)

Bug Fixes
~~~~~~~~~

.. _whatsnew_0160.bug_fixes:

- Fixed issue using `read_csv` on s3 with Python 3.
- Fixed compatibility issue in ``DatetimeIndex`` affecting architectures where ``numpy.int_`` defaults to ``numpy.int32`` (:issue:`8943`)
- Bug in Panel indexing with an object-like (:issue:`9140`)
- Bug in the returned ``Series.dt.components`` index was reset to the default index (:issue:`9247`)


- Fixed bug in ``to_sql`` when mapping a Timestamp object column (datetime
  column with timezone info) to the according sqlalchemy type (:issue:`9085`).
- Fixed bug in ``to_sql`` ``dtype`` argument not accepting an instantiated
  SQLAlchemy type  (:issue:`9083`).


- Bug in ``MultiIndex.has_duplicates`` when having many levels causes an indexer overflow (:issue:`9075`, :issue:`5873`)
- Bug in ``pivot`` and `unstack`` where ``nan`` values would break index alignment (:issue:`7466`)
- Bug in left ``join`` on multi-index with ``sort=True`` or null values (:issue:`9210`).


- Fixed character encoding bug in ``read_stata`` and ``StataReader`` when loading data from a URL (:issue:`9231`).


- Bug in DatetimeIndex iteration, related to (:issue:`8890`), fixed in (:issue:`9100`)


- Bug in using grouper functions that need passed thru arguments (e.g. axis), when using wrapped function (e.g. ``fillna``), (:issue:`9221`)

- DataFrame now properly supports simultaneous ``copy`` and ``dtype`` arguments in constructor (:issue:`9099`)
- Bug in read_csv when using skiprows on a file with CR line endings with the c engine. (:issue:`9079`)
- isnull now detects ``NaT`` in PeriodIndex (:issue:`9129`)
- Bug in groupby ``.nth()`` with a multiple column groupby (:issue:`8979`)

- Fixed division by zero error for ``Series.kurt()`` when all values are equal (:issue:`9197`)


- Fixed issue in the ``xlsxwriter`` engine where it added a default 'General' format to cells if no other format wass applied. This prevented other row or column formatting being applied. (:issue:`9167`)
- Fixes issue with ``index_col=False`` when ``usecols`` is also specified in ``read_csv``. (:issue:`9082`)