What's new in 1.6.0 (??)

These are the changes in pandas 1.6.0. See :ref:`release` for a full changelog including other versions of pandas.

Enhancements

enhancement1

enhancement2

Other enhancements

:func:`read_sas` now supports encoding='infer' to detect and use the encoding specified by the SAS file (:issue:`48048`)
:meth:`.DataFrameGroupBy.quantile` and :meth:`.SeriesGroupBy.quantile` now preserve nullable dtypes instead of casting to numpy dtypes (:issue:`37493`)
:meth:`Series.add_suffix`, :meth:`DataFrame.add_suffix`, :meth:`Series.add_prefix` and :meth:`DataFrame.add_prefix` now support an axis argument. If axis is set, it overrides the default choice of which axis to relabel (:issue:`47819`)
:func:`assert_frame_equal` now shows the first element where the DataFrames differ, analogously to pytest's output (:issue:`47910`)
Added index parameter to :meth:`DataFrame.to_dict` (:issue:`46398`)
Added metadata propagation for binary operators on :class:`DataFrame` (:issue:`28283`)
:class:`.CategoricalConversionWarning`, :class:`.InvalidComparison`, :class:`.InvalidVersion`, :class:`.LossySetitemError`, and :class:`.NoBufferPresent` are now exposed in pandas.errors (:issue:`27656`)
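
As a rough illustration of two of the enhancements above, the sketch below applies a suffix to the row labels through the new axis argument and drops the row labels from :meth:`DataFrame.to_dict` through the new index parameter. It assumes a pandas version that includes these changes; the frame and the variable names are made up for the example.

.. code-block:: python

    import pandas as pd

    # Hypothetical example frame with string row labels.
    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

    # Default for a DataFrame: the suffix is applied to the column labels.
    by_columns = df.add_suffix("_col")

    # With axis=0 (or "index") the suffix is applied to the row labels instead.
    by_rows = df.add_suffix("_row", axis=0)

    # index=False omits the row labels from the result; pandas only accepts
    # it together with orient="split" or orient="tight".
    records = df.to_dict(orient="split", index=False)

    print(by_columns.columns.tolist())
    print(by_rows.index.tolist())
    print(records)

For a :class:`Series` the default axis is the index, so the axis argument mainly matters for :class:`DataFrame`.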

Notable bug fixes

These are bug fixes that might have notable behavior changes.

:meth:`.GroupBy.cumsum` and :meth:`.GroupBy.cumprod` overflow instead of lossy casting to float

In previous versions we cast to float when applying cumsum and cumprod, which led to incorrect results even if the result could be held by int64 dtype. Additionally, the aggregation now overflows when the limit of int64 is reached, consistent with numpy and the regular :meth:`DataFrame.cumprod` and :meth:`DataFrame.cumsum` methods (:issue:`37493`).

Old Behavior

.. code-block:: ipython

    In [1]: df = pd.DataFrame({"key": ["b"] * 7, "value": 625})
    In [2]: df.groupby("key")["value"].cumprod()[5]
    Out[2]: 5.960464477539062e+16

The 6th value is returned as a float and is not exact.

New Behavior

.. ipython:: python

    df = pd.DataFrame({"key": ["b"] * 7, "value": 625})
    df.groupby("key")["value"].cumprod()

We overflow with the 7th value, but the 6th value is still correct.
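
The arithmetic behind this can be checked independently of pandas. The following is a minimal sketch using plain Python integers and NumPy, not the groupby code path itself; the variable names are chosen for illustration only.

.. code-block:: python

    import numpy as np

    # Python integers have arbitrary precision, so this is the exact 6th product.
    exact = 625 ** 6

    # float64 has a 53-bit significand, so this odd integer above 2**53
    # cannot be stored exactly and the round trip changes the value.
    as_float = float(exact)
    lossy = int(as_float) != exact

    # int64 holds the 6th product exactly; the 7th exceeds the int64
    # maximum, so the cumulative product wraps around.
    products = np.cumprod(np.full(7, 625, dtype=np.int64))
    sixth_is_exact = int(products[5]) == exact
    seventh_overflows = 625 ** 7 > np.iinfo(np.int64).max

    print(lossy, sixth_is_exact, seventh_overflows)

All three flags are expected to be true: the float path loses the last digit of the 6th product, while the int64 path keeps it and only wraps on the 7th.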

notable_bug_fix2

Backwards incompatible API changes

Increased minimum versions for dependencies

Some minimum supported versions of dependencies were updated. If installed, we now require:

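+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
|                 |                 |    X     |    X    |
+-----------------+-----------------+----------+---------+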

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

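+-----------------+-----------------+---------+
| Package         | Minimum Version | Changed |
+=================+=================+=========+
|                 |                 |    X    |
+-----------------+-----------------+---------+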

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.

Other API changes

:func:`read_csv`: specifying an incorrect number of columns with index_col now raises ParserError instead of IndexError when using the C parser.

Deprecations

Performance improvements

Performance improvement in :meth:`.DataFrameGroupBy.median` and :meth:`.SeriesGroupBy.median` and :meth:`.GroupBy.cumprod` for nullable dtypes (:issue:`37493`)
Performance improvement in :meth:`MultiIndex.argsort` and :meth:`MultiIndex.sort_values` (:issue:`48406`)
Performance improvement in :meth:`MultiIndex.size` (:issue:`48723`)
Performance improvement in :meth:`MultiIndex.union` without missing values and without duplicates (:issue:`48505`)
Performance improvement in :meth:`MultiIndex.difference` (:issue:`48606`)
Performance improvement in :meth:`.DataFrameGroupBy.mean`, :meth:`.SeriesGroupBy.mean`, :meth:`.DataFrameGroupBy.var`, and :meth:`.SeriesGroupBy.var` for extension array dtypes (:issue:`37493`)
Performance improvement in :meth:`MultiIndex.isin` when level=None (:issue:`48622`)
Performance improvement for :meth:`Series.value_counts` with nullable dtype (:issue:`48338`)
Performance improvement for :class:`Series` constructor passing integer numpy array with nullable dtype (:issue:`48338`)
Performance improvement for :class:`DatetimeIndex` constructor passing a list (:issue:`48609`)
Performance improvement in :func:`merge` and :meth:`DataFrame.join` when joining on a sorted :class:`MultiIndex` (:issue:`48504`)
Performance improvement in :meth:`DataFrame.loc` and :meth:`Series.loc` for tuple-based indexing of a :class:`MultiIndex` (:issue:`48384`)
Performance improvement for :meth:`MultiIndex.unique` (:issue:`48335`)
Performance improvement in :meth:`DataFrame.join` when joining on a subset of a :class:`MultiIndex` (:issue:`48611`)
Performance improvement for :meth:`MultiIndex.intersection` (:issue:`48604`)
Performance improvement in var for nullable dtypes (:issue:`48379`)
Performance improvement to :func:`read_sas` with blank_missing=True (:issue:`48502`)
Memory improvement in :meth:`RangeIndex.sort_values` (:issue:`48801`)

Bug fixes

Categorical

Datetimelike

Bug in :func:`pandas.infer_freq`, raising TypeError when inferred on :class:`RangeIndex` (:issue:`47084`)
Bug in :class:`DatetimeIndex` constructor failing to raise when tz=None is explicitly specified in conjunction with timezone-aware dtype or data (:issue:`48659`)
Bug in subtracting a datetime scalar from :class:`DatetimeIndex` failing to retain the original freq attribute (:issue:`48818`)

Timedelta

Timezones

Numeric

Conversion

Bug in constructing :class:`Series` with int64 dtype from a string list raising instead of casting (:issue:`44923`)
Bug in :meth:`DataFrame.eval` incorrectly raising an AttributeError when there are negative values in function call (:issue:`46471`)
Bug in :meth:`Series.convert_dtypes` not converting dtype to nullable dtype when :class:`Series` contains NA and has dtype object (:issue:`48791`)
Bug where any :class:`ExtensionDtype` subclass with kind="M" would be interpreted as a timezone type (:issue:`34986`)

Strings

Interval

Indexing

Bug in :meth:`DataFrame.reindex` filling with wrong values when indexing columns and index for uint dtypes (:issue:`48184`)
Bug in :meth:`DataFrame.reindex` casting dtype to object when :class:`DataFrame` has single extension array column when re-indexing columns and index (:issue:`48190`)
Bug in :meth:`DataFrame.describe` where percentiles in the resulting index were formatted with more decimals than needed (:issue:`46362`)

Missing

Bug in :meth:`Index.equals` raising TypeError when :class:`Index` consists of tuples that contain NA (:issue:`48446`)

MultiIndex

Bug in :meth:`MultiIndex.difference` losing extension array dtype (:issue:`48606`)
Bug in :meth:`MultiIndex.set_levels` raising IndexError when setting empty level (:issue:`48636`)
Bug in :meth:`MultiIndex.unique` losing extension array dtype (:issue:`48335`)
Bug in :meth:`MultiIndex.intersection` losing extension array dtype (:issue:`48604`)
Bug in :meth:`MultiIndex.union` losing extension array dtype (:issue:`48498`, :issue:`48505`)
Bug in :meth:`MultiIndex.append` not checking names for equality (:issue:`48288`)
Bug in :meth:`MultiIndex.symmetric_difference` losing extension array dtype (:issue:`48607`)

I/O

Bug in :func:`read_sas` caused fragmentation of :class:`DataFrame` and raised :class:`.errors.PerformanceWarning` (:issue:`48595`)

Period

Bug in :meth:`Period.strftime` and :meth:`PeriodIndex.strftime`, raising UnicodeDecodeError when a locale-specific directive was passed (:issue:`46319`)

Plotting

Groupby/resample/rolling

Bug in :meth:`.DataFrameGroupBy.sample` raising ValueError when the object is empty (:issue:`48459`)

Reshaping

Bug in :meth:`DataFrame.pivot_table` raising TypeError for nullable dtype and margins=True (:issue:`48681`)
Bug in :meth:`DataFrame.pivot` not respecting None as column name (:issue:`48293`)
Bug in :func:`merge` when left_on or right_on is or includes a :class:`CategoricalIndex` incorrectly raising AttributeError (:issue:`48464`)

Sparse

ExtensionArray

Bug in :meth:`Series.mean` overflowing unnecessarily with nullable integers (:issue:`48378`)
Bug when concatenating an empty :class:`DataFrame` with an ExtensionDtype to another :class:`DataFrame` with the same ExtensionDtype, where the resulting dtype turned into object (:issue:`48510`)
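
The concatenation case can be sketched as follows. This is only an illustration of the scenario described above, assuming a pandas version that contains the fix; it uses the nullable Int64 dtype as the extension dtype, and the frame names are hypothetical.

.. code-block:: python

    import pandas as pd

    # An empty frame and a non-empty frame sharing the same extension dtype.
    empty = pd.DataFrame({"a": pd.array([], dtype="Int64")})
    filled = pd.DataFrame({"a": pd.array([1, 2], dtype="Int64")})

    result = pd.concat([empty, filled])

    # With the fix, the shared extension dtype is kept rather than
    # falling back to object.
    print(result["a"].dtype)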

Styler

Metadata

Fixed metadata propagation in :meth:`DataFrame.corr` and :meth:`DataFrame.cov` (:issue:`28283`)
