doc/source/v0.13.1.txt

.. _whatsnew_0131:

v0.13.1 (February ???)
----------------------

This is a minor release from 0.13.0 and includes a number of API changes, several new features and
enhancements along with a large number of bug fixes.

Highlights include:

Several experimental features are added, including:

There are several new or updated docs sections including:

- :ref:`Tutorials<tutorials>`, a guide to community developed pandas tutorials.
- :ref:`Pandas Ecosystem<ecosystem>`, a guide to complementary projects built on top of pandas.

.. warning::

   0.13.1 fixes a bug that was caused by a combination of having numpy < 1.8, and doing
   chained assignent on a string-like array. Please review :ref:`the docs<indexing.view_versus_copy>`,
   chained indexing can have unexpected results and should generally be avoided.

   This would previously segfault:

   .. ipython:: python

      df = DataFrame(dict(A = np.array(['foo','bar','bah','foo','bar'])))
      df['A'].iloc[0] = np.nan
      df

   The recommended way to do this type of assignment is:

   .. ipython:: python

      df = DataFrame(dict(A = np.array(['foo','bar','bah','foo','bar'])))
      df.ix[0,'A'] = np.nan
      df

API changes
~~~~~~~~~~~

- Add ``-NaN`` and ``-nan`` to the default set of NA values (:issue:`5952`).
  See :ref:`NA Values <io.na_values>`.
- Added the ``NDFrame.equals()`` method to compare if two NDFrames are
  equal have equal axes, dtypes, and values. Added the
  ``array_equivalent`` function to compare if two ndarrays are
  equal. NaNs in identical locations are treated as
  equal. (:issue:`5283`) See also :ref:`the docs<basics.equals>` for a motivating example.

  .. ipython:: python

      df = DataFrame({'col':['foo', 0, np.nan]}).sort()
      df2 = DataFrame({'col':[np.nan, 0, 'foo']}, index=[2,1,0])
      df.equals(df)

      import pandas.core.common as com
      com.array_equivalent(np.array([0, np.nan]), np.array([0, np.nan]))
      np.array_equal(np.array([0, np.nan]), np.array([0, np.nan]))

Prior Version Deprecations/Changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These were announced changes in 0.13 or prior that are taking effect as of 0.13.1

Deprecations
~~~~~~~~~~~~

Enhancements
~~~~~~~~~~~~

- pd.to_csv and pd.to_datetime learned a new ``infer_datetime_format`` keyword which greatly
  improves parsing perf in many cases. Thanks to @lexual for suggesting and @danbirken
  for rapidly implementing. (:issue:`5490`, :issue:`6021`)

- The ``ArrayFormatter`` for ``datetime`` and ``timedelta64`` now intelligently
  limit precision based on the values in the array (:issue:`3401`)

  Previously output might look like:

  .. code-block:: python

        age                 today               diff
      0 2001-01-01 00:00:00 2013-04-19 00:00:00 4491 days, 00:00:00
      1 2004-06-01 00:00:00 2013-04-19 00:00:00 3244 days, 00:00:00

  Now the output looks like:

  .. ipython:: python

     df = DataFrame([ Timestamp('20010101'),
                      Timestamp('20040601') ], columns=['age'])
     df['today'] = Timestamp('20130419')
     df['diff'] = df['today']-df['age']
     df

- ``Panel.apply`` will work on non-ufuncs. See :ref:`the docs<basics.apply_panel>`.

  .. ipython:: python

     import pandas.util.testing as tm
     panel = tm.makePanel(5)
     panel
     panel['ItemA']

  Specifying an ``apply`` that operates on a Series (to return a single element)

  .. ipython:: python

     panel.apply(lambda x: x.dtype, axis='items')

  A similar reduction type operation

  .. ipython:: python

     panel.apply(lambda x: x.sum(), axis='major_axis')

  This is equivalent to

  .. ipython:: python

     panel.sum('major_axis')

  A transformation operation that returns a Panel, but is computing
  the z-score across the major_axis

  .. ipython:: python

     result = panel.apply(lambda x: (x-x.mean())/x.std(), axis='major_axis')
     result
     result['ItemA']

- ``Panel.apply`` operating on cross-sectional slabs. (:issue:`1148`)

  .. ipython:: python

     f = lambda x: ((x.T-x.mean(1))/x.std(1)).T

     result = panel.apply(f, axis = ['items','major_axis'])
     result
     result.loc[:,:,'ItemA']

  This is equivalent to the following

  .. ipython:: python

     result = Panel(dict([ (ax,f(panel.loc[:,:,ax])) for ax in panel.minor_axis ]))
     result
     result.loc[:,:,'ItemA']

- Added optional ``infer_datetime_format`` to ``read_csv``, ``Series.from_csv``
  and ``DataFrame.read_csv`` (:issue:`5490`)

  If ``parse_dates`` is enabled and this flag is set, pandas will attempt to
  infer the format of the datetime strings in the columns, and if it can
  be inferred, switch to a faster method of parsing them.  In some cases
  this can increase the parsing speed by ~5-10x.

  .. code-block:: python

      # Try to infer the format for the index column
      df = pd.read_csv('foo.csv', index_col=0, parse_dates=True,
                       infer_datetime_format=True)

Experimental
~~~~~~~~~~~~

Bug Fixes
~~~~~~~~~

See :ref:`V0.13.1 Bug Fixes<release.bug_fixes-0.13.1>` for an extensive list of bugs that have been fixed in 0.13.1.

See the :ref:`full release notes
<release>` or issue tracker
on GitHub for a complete list of all API changes, Enhancements and Bug Fixes.