What's new in 2.2.0 (Month XX, 2024)

These are the changes in pandas 2.2.0. See :ref:`release` for a full changelog including other versions of pandas.

Enhancements

Calamine engine for :func:`read_excel`

The calamine engine was added to :func:`read_excel`. It uses python-calamine, which provides Python bindings for the Rust library calamine. This engine supports Excel files (.xlsx, .xlsm, .xls, .xlsb) and OpenDocument spreadsheets (.ods) (:issue:`50395`).

There are two advantages of this engine:

- Calamine is often faster than other engines. Some benchmarks show it reading up to 5x faster than 'openpyxl', 20x faster than 'odf', 4x faster than 'pyxlsb', and 1.5x faster than 'xlrd'. However, 'openpyxl' and 'pyxlsb' are faster when reading only a few rows from large files, because they iterate over rows lazily.
- Calamine recognizes datetime values in .xlsb files, unlike 'pyxlsb', which is the only other engine in pandas that can read .xlsb files.

.. code-block:: python

    pd.read_excel("path_to_file.xlsb", engine="calamine")

For more, see :ref:`io.calamine` in the user guide on IO tools.

enhancement2

Other enhancements

- :meth:`DataFrame.apply` now supports numba (via engine="numba") to JIT-compile the passed function, which can speed up execution (:issue:`54666`)

Notable bug fixes

These are bug fixes that might have notable behavior changes.

:func:`merge` and :meth:`DataFrame.join` now consistently follow documented sort behavior

In previous versions of pandas, :func:`merge` and :meth:`DataFrame.join` did not always return a result that followed the documented sort behavior. pandas now follows the documented sort behavior in merge and join operations (:issue:`54611`).

As documented, sort=True sorts the join keys lexicographically in the resulting :class:`DataFrame`. With sort=False, the order of the join keys depends on the join type (how keyword):

- how="left": preserve the order of the left keys
- how="right": preserve the order of the right keys
- how="inner": preserve the order of the left keys
- how="outer": sort keys lexicographically
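
The rules above can be checked with a small sketch. The frames and the column names ``key``, ``lval``, and ``rval`` are made up for illustration, and the example assumes pandas 2.2 or later. It covers only the left, right, and sort=True cases.

```python
import pandas as pd

# Hypothetical frames: the same three keys, in a different order on each side.
left = pd.DataFrame({"key": ["b", "a", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["c", "a", "b"], "rval": [10, 20, 30]})

# sort=False with how="left" keeps the order of the left keys.
left_order = pd.merge(left, right, on="key", how="left", sort=False)["key"].tolist()

# sort=False with how="right" keeps the order of the right keys.
right_order = pd.merge(left, right, on="key", how="right", sort=False)["key"].tolist()

# sort=True sorts the join keys lexicographically.
sorted_order = pd.merge(left, right, on="key", how="left", sort=True)["key"].tolist()

print(left_order)
print(right_order)
print(sorted_order)
```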

One example with changing behavior is inner joins with non-unique left join keys and sort=False:

.. ipython:: python

    left = pd.DataFrame({"a": [1, 2, 1]})
    right = pd.DataFrame({"a": [1, 2]})
    result = pd.merge(left, right, how="inner", on="a", sort=False)

Old Behavior

.. code-block:: ipython

    In [5]: result
    Out[5]:
       a
    0  1
    1  1
    2  2

New Behavior

.. ipython:: python

    result

notable_bug_fix2

Backwards incompatible API changes

Increased minimum versions for dependencies

Some minimum supported versions of dependencies were updated. If installed, we now require:

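+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
|                 |                 |    X     |    X    |
+-----------------+-----------------+----------+---------+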

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

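+-----------------+-----------------+---------+
| Package         | Minimum Version | Changed |
+=================+=================+=========+
|                 |                 |    X    |
+-----------------+-----------------+---------+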

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.

Other API changes

Deprecations

- Changed :meth:`Timedelta.resolution_string` to return min, s, ms, us, and ns instead of T, S, L, U, and N, for compatibility with respective deprecations in frequency aliases (:issue:`52536`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_clipboard` (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_csv` except path_or_buf (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_dict` (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_excel` except excel_writer (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_gbq` except destination_table (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_hdf` except path_or_buf (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_html` except buf (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_json` except path_or_buf (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_latex` except buf (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_markdown` except buf (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_parquet` except path (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_pickle` except path (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_string` except buf (:issue:`54229`)
- Deprecated downcasting behavior in :meth:`Series.where`, :meth:`DataFrame.where`, :meth:`Series.mask`, :meth:`DataFrame.mask`, :meth:`Series.clip`, :meth:`DataFrame.clip`; in a future version these will not infer object-dtype columns to non-object dtype, or all-round floats to integer dtype. Call result.infer_objects(copy=False) on the result for object inference, or explicitly cast floats to ints. To opt in to the future version, use pd.set_option("future.downcasting", True) (:issue:`53656`)
- Deprecated including the groups in computations when using :meth:`DataFrameGroupBy.apply` and :meth:`DataFrameGroupBy.resample`; pass include_groups=False to exclude the groups (:issue:`7155`)
- Deprecated not passing a tuple to :meth:`DataFrameGroupBy.get_group` or :meth:`SeriesGroupBy.get_group` when grouping by a length-1 list-like (:issue:`25971`)
- Deprecated strings S, U, and N denoting units in :func:`to_timedelta` (:issue:`52536`)
- Deprecated strings T, S, L, U, and N denoting frequencies in :class:`Minute`, :class:`Second`, :class:`Milli`, :class:`Micro`, :class:`Nano` (:issue:`52536`)
- Deprecated strings T, S, L, U, and N denoting units in :class:`Timedelta` (:issue:`52536`)
- Deprecated the extension test classes BaseNoReduceTests, BaseBooleanReduceTests, and BaseNumericReduceTests; use BaseReduceTests instead (:issue:`54663`)
- Deprecated the option mode.data_manager and the ArrayManager; only the BlockManager will be available in future versions (:issue:`55043`)

Performance improvements

- Performance improvement in :func:`concat` with axis=1 and objects with unaligned indexes (:issue:`55084`)
- Performance improvement in :meth:`DataFrame.to_dict` when converting a DataFrame to a dictionary (:issue:`50990`)
- Performance improvement in :meth:`DataFrame.sort_index` and :meth:`Series.sort_index` when indexed by a :class:`MultiIndex` (:issue:`54835`)
- Performance improvement in :meth:`Index.difference` (:issue:`55108`)
- Performance improvement when indexing with more than 4 keys (:issue:`54550`)

Bug fixes

- Bug in :class:`AbstractHolidayCalendar` where timezone data was not propagated when computing holiday observances (:issue:`54580`)
- Bug in :class:`pandas.core.window.Rolling` where duplicate datetimelike indexes were treated as consecutive rather than equal with closed='left' and closed='neither' (:issue:`20712`)
- Bug in :meth:`DataFrame.apply` where passing raw=True ignored args passed to the applied function (:issue:`55009`)
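
The ``raw=True`` fix can be exercised with a short sketch. The helper ``total_plus_offset`` and the frame are hypothetical; with the fix in place, the extra positional value given in ``args`` reaches the applied function.

```python
import pandas as pd


def total_plus_offset(values, offset):
    # With raw=True, `values` is a NumPy array holding one column.
    return values.sum() + offset


# Hypothetical frame used only for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# `offset` is supplied through args and forwarded to the function.
result = df.apply(total_plus_offset, raw=True, args=(10,))
print(result)
```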

Categorical

- :meth:`Categorical.isin` raising InvalidIndexError for categorical containing overlapping :class:`Interval` values (:issue:`34974`)

Datetimelike

Timedelta

Timezones

Numeric

- Bug in :func:`read_csv` with engine="pyarrow" causing rounding errors for large integers (:issue:`52505`)

Conversion

Strings

Interval

Indexing

Missing

MultiIndex

I/O

- Bug in :func:`read_csv` where on_bad_lines="warn" would write to stderr instead of raising a Python warning. This now yields a :class:`.errors.ParserWarning` (:issue:`54296`)
- Bug in :func:`read_excel`, with engine="xlrd" (xls files) erroring when the file contains NaNs/Infs (:issue:`54564`)
- Bug in :func:`to_excel`, with OdsWriter (ods files) writing boolean/string value (:issue:`54994`)

Period

Plotting

Groupby/resample/rolling

Reshaping

- Bug in :func:`concat` ignoring sort parameter when passed :class:`DatetimeIndex` indexes (:issue:`54769`)
- Bug in :func:`merge` returning columns in incorrect order when left and/or right is empty (:issue:`51929`)
