These are the changes in pandas 1.6.0. See :ref:`release` for a full changelog including other versions of pandas.
{{ header }}
- :meth:`.GroupBy.quantile` now preserves nullable dtypes instead of casting to numpy dtypes (:issue:`37493`)
- :meth:`Series.add_suffix`, :meth:`DataFrame.add_suffix`, :meth:`Series.add_prefix` and :meth:`DataFrame.add_prefix` support an ``axis`` argument. If ``axis`` is set, the default behaviour of which axis to consider can be overwritten (:issue:`47819`)
- :func:`assert_frame_equal` now shows the first element where the DataFrames differ, analogously to ``pytest``'s output (:issue:`47910`)
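
As a rough illustration of the ``axis`` argument described above, the following sketch assumes a pandas version that already includes this enhancement; the frame and its labels are made up for the example.

```python
import pandas as pd

# Hypothetical frame with string row labels so the effect is easy to see.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["r0", "r1"])

# Without ``axis``, a DataFrame applies the suffix to its column labels.
by_default = df.add_suffix("_x")

# With ``axis=0``, the suffix is applied to the row labels instead.
by_rows = df.add_suffix("_x", axis=0)

print(list(by_default.columns))
print(list(by_rows.index))
```

For a :class:`Series` the default axis is the index, so ``axis`` mainly matters for :class:`DataFrame`.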
These are bug fixes that might have notable behavior changes.
:meth:`.GroupBy.cumsum` and :meth:`.GroupBy.cumprod` overflow instead of lossy casting to float
In previous versions we cast to float when applying ``cumsum`` and ``cumprod``, which led to incorrect results even if the result could be held by the ``int64`` dtype.
Additionally, the aggregation now overflows consistently with numpy when the limit of ``int64`` is reached.
Old Behavior

.. code-block:: ipython

    In [1]: df = pd.DataFrame({"key": ["b"] * 7, "value": 625})
    In [2]: df.groupby("key")["value"].cumprod()[5]
    Out[2]: 5.960464477539062e+16

We return incorrect results with the 6th value.
New Behavior

.. ipython:: python

    df = pd.DataFrame({"key": ["b"] * 7, "value": 625})
    df.groupby("key")["value"].cumprod()

We overflow with the 7th value, but the 6th value is still correct.
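
The arithmetic behind both behaviors can be checked without pandas. The sketch below is only an illustration in plain Python: ``wrap_int64`` is a hypothetical helper that mimics two's-complement wraparound, not a pandas or numpy function.

```python
# Pure-Python illustration of the two failure modes; no pandas required.
INT64_MODULUS = 2**64
INT64_MAX = 2**63 - 1


def wrap_int64(n: int) -> int:
    """Return the value a wrapping signed 64-bit integer would hold for n."""
    n %= INT64_MODULUS
    return n - INT64_MODULUS if n > INT64_MAX else n


# Exact cumulative products of seven 625s, using Python's unbounded ints.
exact = [625**k for k in range(1, 8)]
as_float = [float(v) for v in exact]       # effect of casting to float64
as_int64 = [wrap_int64(v) for v in exact]  # effect of staying in int64

# 6th value: larger than 2**53, so float64 cannot represent it exactly,
# while int64 still holds it without loss.
print(exact[5], int(as_float[5]), as_int64[5])

# 7th value: larger than 2**63 - 1, so int64 wraps around.
print(exact[6], as_int64[6])
```

The first five products fit below ``2**53`` and survive a float cast unchanged, which is why the old behavior only became visibly wrong at the 6th value.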
Some minimum supported versions of dependencies were updated. If installed, we now require:
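
+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
|                 |                 |    X     |    X    |
+-----------------+-----------------+----------+---------+
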
For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.
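
+-----------------+-----------------+---------+
| Package         | Minimum Version | Changed |
+=================+=================+=========+
|                 |                 |    X    |
+-----------------+-----------------+---------+
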
See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.
- Performance improvement in :meth:`.GroupBy.median` and :meth:`.GroupBy.cumprod` for nullable dtypes (:issue:`37493`)
- Performance improvement in :meth:`MultiIndex.argsort` and :meth:`MultiIndex.sort_values` (:issue:`48406`)
- Performance improvement in :meth:`.GroupBy.mean` and :meth:`.GroupBy.var` for extension array dtypes (:issue:`37493`)
- Performance improvement for :meth:`Series.value_counts` with nullable dtype (:issue:`48338`)
- Performance improvement for :class:`Series` constructor passing integer numpy array with nullable dtype (:issue:`48338`)
- Performance improvement for :meth:`MultiIndex.unique` (:issue:`48335`)
- Bug in :meth:`DataFrame.reindex` filling with wrong values when indexing columns and index for ``uint`` dtypes (:issue:`48184`)
- Bug in :meth:`DataFrame.reindex` casting dtype to ``object`` when :class:`DataFrame` has single extension array column when re-indexing ``columns`` and ``index`` (:issue:`48190`)
- Bug in :func:`~DataFrame.describe` when formatting percentiles in the resulting index showed more decimals than needed (:issue:`46362`)
- Bug in :meth:`MultiIndex.unique` losing extension array dtype (:issue:`48335`)
- Bug in :meth:`MultiIndex.union` losing extension array (:issue:`48498`)
- Bug in :meth:`MultiIndex.append` not checking names for equality (:issue:`48288`)
- Bug in :meth:`Period.strftime` and :meth:`PeriodIndex.strftime`, raising ``UnicodeDecodeError`` when a locale-specific directive was passed (:issue:`46319`)
- Bug in :meth:`DataFrameGroupBy.sample` raising ``ValueError`` when the object is empty (:issue:`48459`)
- Bug in :func:`join` when ``left_on`` or ``right_on`` is or includes a :class:`CategoricalIndex` incorrectly raising ``AttributeError`` (:issue:`48464`)
- Bug in :meth:`Series.mean` overflowing unnecessarily with nullable integers (:issue:`48378`)