What's new in 1.4.0 (??)

These are the changes in pandas 1.4.0. See :ref:`release` for a full changelog including other versions of pandas.

{{ header }}

Enhancements

More flexible numeric dtypes for indexes

Until now, it has only been possible to create numeric indexes with int64/float64/uint64 dtypes. It is now possible to create an index of any numpy int/uint/float dtype using the new :class:`NumericIndex` index type (:issue:`41153`):

.. ipython:: python

    pd.NumericIndex([1, 2, 3], dtype="int8")
    pd.NumericIndex([1, 2, 3], dtype="uint32")
    pd.NumericIndex([1, 2, 3], dtype="float32")

To maintain backwards compatibility, calls to the base :class:`Index` will, in pandas 1.x, continue to return :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index`. For example, the code below returns an Int64Index with dtype int64:

.. code-block:: ipython

    In [1]: pd.Index([1, 2, 3], dtype="int8")
    Int64Index([1, 2, 3], dtype='int64')

For the duration of pandas 1.x, to maintain backwards compatibility, all operations that until now have returned :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index` will continue to do so. This means that to use :class:`NumericIndex`, you have to construct it explicitly. For example, the series below will have an Int64Index:

.. code-block:: ipython

    In [2]: ser = pd.Series([1, 2, 3], index=[1, 2, 3])
    In [3]: ser.index
    Int64Index([1, 2, 3], dtype='int64')

If you want to use a :class:`NumericIndex` instead, create it explicitly and pass it as the index:

.. ipython:: python

    idx = pd.NumericIndex([1, 2, 3], dtype="int8")
    ser = pd.Series([1, 2, 3], index=idx)
    ser.index

In pandas 2.0, :class:`NumericIndex` will become the default numeric index type, and :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index` will be removed.

See :ref:`here <advanced.numericindex>` for more.

Styler

:class:`.Styler` has been further developed in 1.4.0. The following enhancements have been made:

There are also bug fixes and deprecations listed below.

Multithreaded CSV reading with a new CSV Engine based on pyarrow

:func:`pandas.read_csv` now accepts engine="pyarrow" as an argument (requires pyarrow 0.17.0 or later), allowing faster CSV parsing on multicore machines with pyarrow installed. See the :doc:`I/O docs </user_guide/io>` for more info. (:issue:`23697`)
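
As a minimal sketch of how the new engine might be selected, the snippet below reads a small in-memory CSV with engine="pyarrow" and falls back to the default parser when pyarrow is not installed. The helper name ``read_with_pyarrow`` and the sample data are illustrative assumptions, not part of pandas:

.. code-block:: python

    import io

    import pandas as pd

    # Hypothetical sample data, used only for this illustration.
    CSV_BYTES = b"a,b\n1,2\n3,4\n"


    def read_with_pyarrow(data):
        """Parse CSV bytes with the pyarrow engine, if it is available."""
        try:
            return pd.read_csv(io.BytesIO(data), engine="pyarrow")
        except ImportError:
            # pyarrow is an optional dependency; use the default engine instead.
            return pd.read_csv(io.BytesIO(data))


    df = read_with_pyarrow(CSV_BYTES)
    print(df)

The fallback keeps the code usable in environments without pyarrow. Any speed-up from the pyarrow engine should only be expected on larger files and multicore machines.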

Other enhancements

Notable bug fixes

These are bug fixes that might have notable behavior changes.

Inconsistent date string parsing

The dayfirst option of :func:`to_datetime` isn't strict, and this can lead to surprising behaviour:

.. ipython:: python
    :okwarning:

    pd.to_datetime(["31-12-2021"], dayfirst=False)

Now, a warning is raised if a delimited date string (e.g. 31-12-2012) cannot be parsed in accordance with the given dayfirst value.
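
The following sketch shows one way to observe this warning and to avoid the ambiguity altogether by passing an explicit format. The exact warning text and category may differ between pandas versions, so the snippet only records whatever is emitted:

.. code-block:: python

    import warnings

    import pandas as pd

    # Record any warnings emitted while parsing a day-first string
    # with dayfirst=False.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        parsed = pd.to_datetime(["31-12-2021"], dayfirst=False)

    for item in caught:
        print(item.category.__name__, item.message)

    # An explicit format removes the ambiguity, so no guessing is needed.
    explicit = pd.to_datetime(["31-12-2021"], format="%d-%m-%Y")
    print(parsed[0], explicit[0])

Both calls parse the string as 31 December 2021. The difference is that the first one has to override the requested dayfirst value to do so.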

notable_bug_fix2

Backwards incompatible API changes

Increased minimum versions for dependencies

Some minimum supported versions of dependencies were updated. If installed, we now require:

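+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
| numpy           | 1.18.5          |    X     |    X    |
+-----------------+-----------------+----------+---------+
| pytz            | 2020.1          |    X     |    X    |
+-----------------+-----------------+----------+---------+
| python-dateutil | 2.8.1           |    X     |    X    |
+-----------------+-----------------+----------+---------+
| bottleneck      | 1.3.1           |          |    X    |
+-----------------+-----------------+----------+---------+
| numexpr         | 2.7.1           |          |    X    |
+-----------------+-----------------+----------+---------+
| pytest (dev)    | 6.0             |          |         |
+-----------------+-----------------+----------+---------+
| mypy (dev)      | 0.910           |          |    X    |
+-----------------+-----------------+----------+---------+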

For optional libraries, the general recommendation is to use the latest version. The following table lists the lowest version of each library that is currently tested during pandas development. Optional libraries below the lowest tested version may still work, but are not considered supported.

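+-----------------+-----------------+---------+
| Package         | Minimum Version | Changed |
+=================+=================+=========+
| beautifulsoup4  | 4.8.2           |    X    |
+-----------------+-----------------+---------+
| fastparquet     | 0.4.0           |         |
+-----------------+-----------------+---------+
| fsspec          | 0.7.4           |         |
+-----------------+-----------------+---------+
| gcsfs           | 0.6.0           |         |
+-----------------+-----------------+---------+
| lxml            | 4.5.0           |    X    |
+-----------------+-----------------+---------+
| matplotlib      | 3.3.2           |    X    |
+-----------------+-----------------+---------+
| numba           | 0.50.1          |    X    |
+-----------------+-----------------+---------+
| openpyxl        | 3.0.2           |    X    |
+-----------------+-----------------+---------+
| pyarrow         | 0.17.0          |         |
+-----------------+-----------------+---------+
| pymysql         | 0.10.1          |    X    |
+-----------------+-----------------+---------+
| pytables        | 3.6.1           |    X    |
+-----------------+-----------------+---------+
| s3fs            | 0.4.0           |         |
+-----------------+-----------------+---------+
| scipy           | 1.4.1           |    X    |
+-----------------+-----------------+---------+
| sqlalchemy      | 1.3.11          |    X    |
+-----------------+-----------------+---------+
| tabulate        | 0.8.7           |         |
+-----------------+-----------------+---------+
| xarray          | 0.15.1          |    X    |
+-----------------+-----------------+---------+
| xlrd            | 2.0.1           |    X    |
+-----------------+-----------------+---------+
| xlsxwriter      | 1.2.2           |    X    |
+-----------------+-----------------+---------+
| xlwt            | 1.3.0           |         |
+-----------------+-----------------+---------+
| pandas-gbq      | 0.14.0          |    X    |
+-----------------+-----------------+---------+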

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.
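
To check an environment against the required minimums listed above, a rough sketch using only the standard library could look like the following. The helpers ``parse_version`` and ``check_minimums`` are hypothetical names introduced for this illustration, and the simplified version parsing ignores pre-release suffixes:

.. code-block:: python

    import re
    from importlib import metadata

    # Required minimum versions, copied from the table above.
    MINIMUMS = {
        "numpy": "1.18.5",
        "pytz": "2020.1",
        "python-dateutil": "2.8.1",
    }


    def parse_version(text):
        """Turn '1.18.5' into (1, 18, 5), keeping only leading digits."""
        parts = []
        for piece in text.split("."):
            match = re.match(r"\d+", piece)
            if match is None:
                break
            parts.append(int(match.group()))
        return tuple(parts)


    def check_minimums(minimums):
        """Map each package to True/False, or None if it is not installed."""
        report = {}
        for name, minimum in minimums.items():
            try:
                installed = metadata.version(name)
            except metadata.PackageNotFoundError:
                report[name] = None
                continue
            report[name] = parse_version(installed) >= parse_version(minimum)
        return report


    print(check_minimums(MINIMUMS))

For a full report of installed dependency versions, :func:`pandas.show_versions` is the supported tool. The sketch above only illustrates the comparison.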

Other API changes

Deprecations

Performance improvements

Bug fixes

Categorical

Datetimelike

  • Bug in :class:`DataFrame` constructor unnecessarily copying non-datetimelike 2D object arrays (:issue:`39272`)
  • :func:`to_datetime` would silently swap MM/DD/YYYY and DD/MM/YYYY formats if the given dayfirst option could not be respected - now, a warning is raised in the case of delimited date strings (e.g. 31-12-2012) (:issue:`12585`)

Timedelta

Timezones

Numeric

Conversion

Strings

Interval

Indexing

Missing

MultiIndex

I/O

Period

Plotting

Groupby/resample/rolling

Reshaping

Sparse

ExtensionArray

Styler

Other

Contributors