What's New in 0.25.0 (April XX, 2019)

Warning

Starting with the 0.25.x series of releases, pandas only supports Python 3.5 and higher. See :ref:`install.dropping-27` for more details.

Warning

Panel has been fully removed. For N-D labeled data structures, please use xarray.

{{ header }}

These are the changes in pandas 0.25.0. See :ref:`release` for a full changelog including other versions of pandas.

Other Enhancements

Backwards incompatible API changes

Indexing with date strings with UTC offsets

Indexing a :class:`DataFrame` or :class:`Series` that has a :class:`DatetimeIndex` using a date string with a UTC offset previously ignored the offset. The UTC offset is now respected in indexing. (:issue:`24076`, :issue:`16785`)

Previous Behavior:

In [1]: df = pd.DataFrame([0], index=pd.DatetimeIndex(['2019-01-01'], tz='US/Pacific'))

In [2]: df
Out[2]:
                           0
2019-01-01 00:00:00-08:00  0

In [3]: df['2019-01-01 00:00:00+04:00':'2019-01-01 01:00:00+04:00']
Out[3]:
                           0
2019-01-01 00:00:00-08:00  0

New Behavior:

.. ipython:: python

    df = pd.DataFrame([0], index=pd.DatetimeIndex(['2019-01-01'], tz='US/Pacific'))
    df['2019-01-01 12:00:00+04:00':'2019-01-01 13:00:00+04:00']
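
The offset arithmetic behind the two examples can be checked with the standard library alone. This is a minimal sketch that does not use pandas or reflect its internals; the fixed offsets and the names ``pacific``, ``plus_four``, ``in_new_slice`` and ``in_old_slice`` are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for the zones in the examples above.
# Assumption: US/Pacific is at UTC-08:00 on 2019-01-01 (standard time).
pacific = timezone(timedelta(hours=-8))
plus_four = timezone(timedelta(hours=4))

# The single index value of the example frame.
index_value = datetime(2019, 1, 1, 0, 0, tzinfo=pacific)

# Slice bounds from the "New Behavior" example.
slice_start = datetime(2019, 1, 1, 12, 0, tzinfo=plus_four)
slice_end = datetime(2019, 1, 1, 13, 0, tzinfo=plus_four)

# Aware datetimes compare by instant, so the offset is taken into account.
in_new_slice = slice_start <= index_value <= slice_end

# Slice bounds from the "Previous Behavior" example, with the offset respected.
old_start = datetime(2019, 1, 1, 0, 0, tzinfo=plus_four)
old_end = datetime(2019, 1, 1, 1, 0, tzinfo=plus_four)
in_old_slice = old_start <= index_value <= old_end

print(in_new_slice, in_old_slice)
```

With the offset respected, the row falls inside the 12:00 to 13:00 ``+04:00`` window but outside the 00:00 to 01:00 ``+04:00`` window, which is why the previous example only matched while the offset was being ignored.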

GroupBy.apply on DataFrame evaluates first group only once

The implementation of :meth:`DataFrameGroupBy.apply() <pandas.core.groupby.DataFrameGroupBy.apply>` previously evaluated the supplied function twice on the first group to infer whether it was safe to use a fast code path. For functions with side effects in particular, this behavior was undesirable and could lead to surprises.

(:issue:`2936`, :issue:`2656`, :issue:`7739`, :issue:`10519`, :issue:`12155`, :issue:`20084`, :issue:`21417`)

Now every group is evaluated only a single time.

.. ipython:: python

    df = pd.DataFrame({"a": ["x", "y"], "b": [1, 2]})
    df

    def func(group):
        print(group.name)
        return group

Previous Behavior:

In [3]: df.groupby('a').apply(func)
x
x
y
Out[3]:
   a  b
0  x  1
1  y  2

New Behavior:

.. ipython:: python

    df.groupby("a").apply(func)
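
A side effect that is easier to inspect than printed output is a call log. The sketch below assumes pandas 0.25 or later; the names ``calls`` and ``record`` are hypothetical, and selecting ``[["b"]]`` before ``apply`` is only a choice made here so that the function receives the same column across pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"a": ["x", "y"], "b": [1, 2]})

calls = []  # side effect: one entry is appended per invocation


def record(group):
    # Log the first "b" value of the group the function was called with.
    calls.append(int(group["b"].iloc[0]))
    return group


# Explicit column selection keeps the group contents stable across versions.
df.groupby("a")[["b"]].apply(record)

# One entry per group means each group was evaluated a single time.
print(calls)
```

Under the previous behavior described above, the log would have contained an extra entry for the first group.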


Concatenating Sparse Values

When passed DataFrames whose values are sparse, :func:`concat` will now return a Series or DataFrame with sparse values, rather than a SparseDataFrame (:issue:`25702`).

.. ipython:: python

   df = pd.DataFrame({"A": pd.SparseArray([0, 1])})

Previous Behavior:

In [2]: type(pd.concat([df, df]))
pandas.core.sparse.frame.SparseDataFrame

New Behavior:

.. ipython:: python

   type(pd.concat([df, df]))


This now matches the existing behavior of :func:`concat` on Series with sparse values. :func:`concat` will continue to return a SparseDataFrame when all the values are instances of SparseDataFrame.

This change also affects routines using :func:`concat` internally, like :func:`get_dummies`, which now returns a :class:`DataFrame` in all cases (previously a SparseDataFrame was returned if all the columns were dummy encoded, and a :class:`DataFrame` otherwise).

Providing any SparseSeries or SparseDataFrame to :func:`concat` will cause a SparseSeries or SparseDataFrame to be returned, as before.
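
One way to tell the two cases apart in user code is to check the container type and the column dtype separately. This is a minimal sketch, assuming a pandas version that provides ``pd.arrays.SparseArray`` and ``pd.SparseDtype``; the variable names are illustrative.

```python
import pandas as pd

# pd.arrays.SparseArray is used here instead of the pd.SparseArray spelling
# shown above (assumption: the installed pandas exposes pd.arrays).
df = pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})

result = pd.concat([df, df])

# The container is a regular DataFrame ...
print(type(result).__name__)
# ... while the column keeps a sparse dtype.
print(result["A"].dtype)

# get_dummies uses concat internally; with sparse=True it also returns a
# regular DataFrame holding sparse columns.
dummies = pd.get_dummies(pd.Series(["a", "b", "a"]), sparse=True)
print(type(dummies).__name__)
```

Checking ``isinstance(column.dtype, pd.SparseDtype)`` rather than the container class avoids depending on SparseDataFrame at all.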

Increased minimum versions for dependencies

Due to dropping support for Python 2.7, a number of optional dependencies have updated minimum versions. Independently, some minimum supported versions of dependencies were updated (:issue:`23519`, :issue:`24942`). If installed, we now require:

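+-----------------+-----------------+----------+
| Package         | Minimum Version | Required |
+=================+=================+==========+
| beautifulsoup4  | 4.4.1           |          |
+-----------------+-----------------+----------+
| openpyxl        | 2.4.0           |          |
+-----------------+-----------------+----------+
| pymysql         | 0.7.9           |          |
+-----------------+-----------------+----------+
| pytz            | 2015.4          |          |
+-----------------+-----------------+----------+
| sqlalchemy      | 1.1.4           |          |
+-----------------+-----------------+----------+
| xlsxwriter      | 0.7.7           |          |
+-----------------+-----------------+----------+
| xlwt            | 1.0.0           |          |
+-----------------+-----------------+----------+
| pytest (dev)    | 4.0.2           |          |
+-----------------+-----------------+----------+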
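
An environment can be compared against these minimums with a short script. The sketch below is a rough check, not part of pandas: ``MINIMUMS``, ``version_tuple`` and ``outdated`` are hypothetical names, the comparison only looks at the leading numeric components of each version string, and ``importlib.metadata`` assumes Python 3.8 or later.

```python
import re
from importlib import metadata  # assumption: Python 3.8+

# Minimum versions copied from the table above.
MINIMUMS = {
    "beautifulsoup4": "4.4.1",
    "openpyxl": "2.4.0",
    "pymysql": "0.7.9",
    "pytz": "2015.4",
    "sqlalchemy": "1.1.4",
    "xlsxwriter": "0.7.7",
    "xlwt": "1.0.0",
    "pytest": "4.0.2",
}


def version_tuple(version, width=3):
    """Leading numeric components, zero-padded: '1.1.4' -> (1, 1, 4)."""
    parts = [int(p) for p in re.findall(r"\d+", version)[:width]]
    return tuple(parts + [0] * (width - len(parts)))


def outdated(minimums=MINIMUMS):
    """Return {name: (installed, minimum)} for installed packages below the minimum."""
    found = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # optional dependency not installed: nothing to check
        if version_tuple(installed) < version_tuple(minimum):
            found[name] = (installed, minimum)
    return found


print(outdated())
```

Packages that are not installed are skipped, matching the "if installed" wording above. Pre-release or epoch-qualified version strings are not handled precisely by this comparison.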

Other API Changes

Deprecations

Removal of prior version deprecations/changes

Performance Improvements

Bug Fixes

Categorical

Datetimelike

  • Bug in :func:`to_datetime` which would raise an (incorrect) ValueError when called with a date far into the future and the format argument specified instead of raising OutOfBoundsDatetime (:issue:`23830`)

Timedelta

Timezones

Numeric

Conversion

Strings

Interval

Indexing

Missing

MultiIndex

I/O

Plotting

Groupby/Resample/Rolling

Reshaping

Sparse

  • Significant speedup in SparseArray initialization that benefits most operations, fixing performance regression introduced in v0.20.0 (:issue:`24985`)
  • Bug in :class:`SparseDataFrame` constructor where passing None as the data would cause default_fill_value to be ignored (:issue:`16807`)

Other

Contributors

.. contributors:: v0.24.x..HEAD