What's new in 1.3.0 (??)

These are the changes in pandas 1.3.0. See :ref:`release` for a full changelog including other versions of pandas.

Enhancements

Custom HTTP(s) headers when reading csv or json files

When reading from a remote URL that is not handled by fsspec (i.e. HTTP and HTTPS), the dictionary passed to storage_options will be used to create the headers included in the request. This can be used to control the User-Agent header or send other custom headers (:issue:`36688`). For example:

.. ipython:: python

    headers = {"User-Agent": "pandas"}
    df = pd.read_csv(
        "https://download.bls.gov/pub/time.series/cu/cu.item",
        sep="\t",
        storage_options=headers
    )

:class:`Rolling` and :class:`Expanding` now support a method argument with a 'table' option that performs the windowing operation over an entire :class:`DataFrame`. See :ref:`window.overview` for performance and functional benefits (:issue:`15095`, :issue:`38995`).

Control of index with `group_keys` in :meth:`DataFrame.resample`

The argument group_keys has been added to the method :meth:`DataFrame.resample`. As with :meth:`DataFrame.groupby`, this argument controls whether each group is added to the index in the resample when :meth:`.Resampler.apply` is used.

.. warning::

   Not specifying the group_keys argument will retain the previous behavior and emit a warning. In a future version of pandas, not specifying group_keys will default to the same behavior as group_keys=False.

.. ipython:: python

    df = pd.DataFrame(
        {'a': range(6)},
        index=pd.date_range("2021-01-01", periods=6, freq="8H")
    )
    df.resample("D", group_keys=True).apply(lambda x: x)
    df.resample("D", group_keys=False).apply(lambda x: x)

Previously, the resulting index would depend upon the values returned by apply, as seen in the following example.

>>> # pandas 1.2
>>> df.resample("D").apply(lambda x: x)
                     a
2021-01-01 00:00:00  0
2021-01-01 08:00:00  1
2021-01-01 16:00:00  2
2021-01-02 00:00:00  3
2021-01-02 08:00:00  4
2021-01-02 16:00:00  5
>>> df.resample("D").apply(lambda x: x.reset_index())
                           index  a
2021-01-01 0 2021-01-01 00:00:00  0
           1 2021-01-01 08:00:00  1
           2 2021-01-01 16:00:00  2
2021-01-02 0 2021-01-02 00:00:00  3
           1 2021-01-02 08:00:00  4
           2 2021-01-02 16:00:00  5

Other enhancements

Added :meth:`MultiIndex.dtypes` (:issue:`37062`)
Added end and end_day options for origin in :meth:`DataFrame.resample` (:issue:`37804`)
Improved error message when usecols and names do not match for :func:`read_csv` and engine="c" (:issue:`29042`)
Improved consistency of error message when passing an invalid win_type argument in :class:`Window` (:issue:`15969`)
:func:`pandas.read_sql_query` now accepts a dtype argument to cast the columnar data from the SQL database based on user input (:issue:`10285`)
Improved integer type mapping from pandas to SQLAlchemy when using :meth:`DataFrame.to_sql` (:issue:`35076`)
:func:`to_numeric` now supports downcasting of nullable ExtensionDtype objects (:issue:`33013`)
Added support for dict-like names in :meth:`MultiIndex.set_names` and :meth:`MultiIndex.rename` (:issue:`20421`)
:func:`pandas.read_excel` can now auto detect .xlsb files (:issue:`35416`)
:meth:`.Rolling.sum`, :meth:`.Expanding.sum`, :meth:`.Rolling.mean`, :meth:`.Expanding.mean`, :meth:`.Rolling.median`, :meth:`.Expanding.median`, :meth:`.Rolling.max`, :meth:`.Expanding.max`, :meth:`.Rolling.min`, and :meth:`.Expanding.min` now support Numba execution with the engine keyword (:issue:`38895`)
:meth:`DataFrame.apply` can now accept NumPy unary operators as strings, e.g. df.apply("sqrt"), which was already the case for :meth:`Series.apply` (:issue:`39116`)
:meth:`DataFrame.apply` can now accept non-callable DataFrame properties as strings, e.g. df.apply("size"), which was already the case for :meth:`Series.apply` (:issue:`39116`)
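
A few of the enhancements listed above can be tried together in a short sketch. The table name ``t``, the column names, and the sample values are made up for illustration, and the in-memory SQLite database is only an assumption chosen to keep the example self-contained.

```python
import sqlite3

import pandas as pd

# MultiIndex.dtypes: one dtype per level, indexed by level name.
mi = pd.MultiIndex.from_arrays([[1, 2], ["a", "b"]], names=["num", "letter"])
level_dtypes = mi.dtypes

# Dict-like names: map an existing level name to a new one.
renamed = mi.set_names({"num": "number"})

# DataFrame.apply with a NumPy unary operator given as a string.
df = pd.DataFrame({"x": [1.0, 4.0, 9.0]})
roots = df.apply("sqrt")

# to_numeric downcasting a nullable integer Series.
small = pd.to_numeric(pd.Series([1, 2, 3], dtype="Int64"), downcast="integer")

# read_sql_query casting a column through the dtype argument.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(1,), (2,)])
from_sql = pd.read_sql_query("SELECT a FROM t", con, dtype={"a": "float64"})
con.close()
```

Results may differ in detail between pandas versions, so treat this as an illustration of the call signatures rather than a reference for exact output.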

Notable bug fixes

These are bug fixes that might have notable behavior changes.

Increased minimum versions for dependencies

Some minimum supported versions of dependencies were updated. If installed, we now require:

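+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
| numpy           | 1.16.5          |    X     |         |
+-----------------+-----------------+----------+---------+
| pytz            | 2017.3          |    X     |         |
+-----------------+-----------------+----------+---------+
| python-dateutil | 2.7.3           |    X     |         |
+-----------------+-----------------+----------+---------+
| bottleneck      | 1.2.1           |          |         |
+-----------------+-----------------+----------+---------+
| numexpr         | 2.6.8           |          |         |
+-----------------+-----------------+----------+---------+
| pytest (dev)    | 5.0.1           |          |         |
+-----------------+-----------------+----------+---------+
| mypy (dev)      | 0.790           |          |    X    |
+-----------------+-----------------+----------+---------+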

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

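+-----------------+-----------------+---------+
| Package         | Minimum Version | Changed |
+=================+=================+=========+
| beautifulsoup4  | 4.6.0           |         |
+-----------------+-----------------+---------+
| fastparquet     | 0.3.2           |         |
+-----------------+-----------------+---------+
| fsspec          | 0.7.4           |         |
+-----------------+-----------------+---------+
| gcsfs           | 0.6.0           |         |
+-----------------+-----------------+---------+
| lxml            | 4.3.0           |         |
+-----------------+-----------------+---------+
| matplotlib      | 2.2.3           |         |
+-----------------+-----------------+---------+
| numba           | 0.46.0          |         |
+-----------------+-----------------+---------+
| openpyxl        | 2.6.0           |         |
+-----------------+-----------------+---------+
| pyarrow         | 0.15.0          |         |
+-----------------+-----------------+---------+
| pymysql         | 0.7.11          |         |
+-----------------+-----------------+---------+
| pytables        | 3.5.1           |         |
+-----------------+-----------------+---------+
| s3fs            | 0.4.0           |         |
+-----------------+-----------------+---------+
| scipy           | 1.2.0           |         |
+-----------------+-----------------+---------+
| sqlalchemy      | 1.2.8           |         |
+-----------------+-----------------+---------+
| tabulate        | 0.8.7           |    X    |
+-----------------+-----------------+---------+
| xarray          | 0.12.0          |         |
+-----------------+-----------------+---------+
| xlrd            | 1.2.0           |         |
+-----------------+-----------------+---------+
| xlsxwriter      | 1.0.2           |         |
+-----------------+-----------------+---------+
| xlwt            | 1.3.0           |         |
+-----------------+-----------------+---------+
| pandas-gbq      | 0.12.0          |         |
+-----------------+-----------------+---------+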

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.

Other API changes

Partially initialized :class:`CategoricalDtype` objects (i.e. those with categories=None) will no longer compare as equal to fully initialized dtype objects.
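
As a rough illustration of the comparison rule above, the following sketch builds one partially initialized and one fully initialized dtype. The category labels ``"a"`` and ``"b"`` are arbitrary placeholders.

```python
import pandas as pd

partial = pd.CategoricalDtype()          # categories=None
full = pd.CategoricalDtype(["a", "b"])   # categories given explicitly

# A partially initialized dtype no longer equals a fully initialized one.
partial_equals_full = partial == full

# Two partially initialized dtypes still compare as equal to each other.
partial_equals_partial = partial == pd.CategoricalDtype()
```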

Deprecations

:meth:`~DataFrame.groupby` no longer ignores `group_keys` for transform-like `apply`

If group_keys=True is specified when calling :meth:`~DataFrame.groupby`, functions passed to apply that return like-indexed outputs will have the group keys added to the result index. Previous versions of pandas would add the group keys only when the result from the applied function had a different index than the input. If group_keys is not specified, the group keys will not be added for like-indexed outputs.

Previous behavior:

>>> # pandas 1.2
>>> df = pd.DataFrame({"A": [1, 2, 2], "B": [1, 2, 3]})
>>> df
   A  B
0  1  1
1  2  2
2  2  3
>>> df.groupby("A").apply(lambda x: x.rename(np.exp))  # Different index
            A  B
A
1 1.000000  1  1
2 2.718282  2  2
  7.389056  2  3

>>> df.groupby("A").apply(lambda x: x)  # Same index
   A  B
0  1  1
1  2  2
2  2  3

In the future, this behavior will change to always respect group_keys, which defaults to True.

New behavior:

.. ipython:: python

   df = pd.DataFrame({"A": [1, 2, 2], "B": [1, 2, 3]})
   df.groupby("A", group_keys=True).apply(lambda x: x)
   df.groupby("A", group_keys=True).apply(lambda x: x.rename(np.exp))

A warning will be issued if the result would change from pandas 1.2:

.. ipython:: python
   :okwarning:

   df.groupby("A").apply(lambda x: x)

Other Deprecations

Deprecated allowing scalars passed to the :class:`Categorical` constructor (:issue:`38433`)
Deprecated allowing subclass-specific keyword arguments in the :class:`Index` constructor, use the specific subclass directly instead (:issue:`14093`, :issue:`21311`, :issue:`22315`, :issue:`26974`)
Deprecated astype of datetimelike (timedelta64[ns], datetime64[ns], Datetime64TZDtype, PeriodDtype) to integer dtypes, use values.view(...) instead (:issue:`38544`)
Deprecated :meth:`MultiIndex.is_lexsorted` and :meth:`MultiIndex.lexsort_depth` as public methods, users should use :meth:`MultiIndex.is_monotonic_increasing` instead (:issue:`32259`)
Deprecated keyword try_cast in :meth:`Series.where`, :meth:`Series.mask`, :meth:`DataFrame.where`, :meth:`DataFrame.mask`; cast results manually if desired (:issue:`38836`)
Deprecated comparison of :class:`Timestamp` object with datetime.date objects. Instead of e.g. ts <= mydate use ts <= pd.Timestamp(mydate) or ts.date() <= mydate (:issue:`36131`)
Deprecated :attr:`Rolling.win_type` returning "freq" (:issue:`38963`)
Deprecated :attr:`Rolling.is_datetimelike` (:issue:`38963`)
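
The replacements suggested in the deprecation notes above might look like the following sketch. The variable names and sample dates are illustrative only, and ``ser.values.view("int64")`` is one reading of the "use values.view(...)" advice, applied here to timezone-naive data.

```python
import datetime as dt

import numpy as np
import pandas as pd

ts = pd.Timestamp("2021-01-15 12:00")
mydate = dt.date(2021, 1, 15)

# Instead of ``ts <= mydate``, convert one side so both operands match.
cmp_as_timestamps = ts <= pd.Timestamp(mydate)  # compares against midnight
cmp_as_dates = ts.date() <= mydate              # compares calendar dates only

# Instead of astype to an integer dtype, view the underlying values.
ser = pd.Series(np.array(["2021-01-01", "2021-01-02"], dtype="datetime64[ns]"))
nanos = ser.values.view("int64")

# Instead of MultiIndex.is_lexsorted, check monotonicity.
mi = pd.MultiIndex.from_tuples([(1, "a"), (1, "b"), (2, "a")])
is_sorted = mi.is_monotonic_increasing
```

Note that the two comparison forms are not interchangeable: here the first is false because noon falls after midnight, while the second is true because both dates are the same day.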

Performance improvements

Performance improvement in :meth:`IntervalIndex.isin` (:issue:`38353`)
Performance improvement in :meth:`Series.mean` for nullable data types (:issue:`34814`)

Bug fixes

Categorical

Bug in :class:`CategoricalIndex` incorrectly failing to raise TypeError when scalar data is passed (:issue:`38614`)
Bug in :meth:`CategoricalIndex.reindex` failing when an :class:`Index` was passed whose elements were all in the categories (:issue:`28690`)
Bug where constructing a :class:`Categorical` from an object-dtype array of date objects did not round-trip correctly with astype (:issue:`38552`)
Bug in constructing a :class:`DataFrame` from an ndarray and a :class:`CategoricalDtype` (:issue:`38857`)
Bug in :meth:`DataFrame.reindex` was throwing IndexError when new index contained duplicates and old index was :class:`CategoricalIndex` (:issue:`38906`)

Datetimelike

Bug in :class:`DataFrame` and :class:`Series` constructors sometimes dropping nanoseconds from :class:`Timestamp` (resp. :class:`Timedelta`) data, with dtype=datetime64[ns] (resp. timedelta64[ns]) (:issue:`38032`)
Bug in :meth:`DataFrame.first` and :meth:`Series.first` returning two months for offset one month when first day is last calendar day (:issue:`29623`)
Bug in constructing a :class:`DataFrame` or :class:`Series` with mismatched datetime64 data and timedelta64 dtype, or vice-versa, failing to raise TypeError (:issue:`38575`, :issue:`38764`, :issue:`38792`)
Bug in constructing a :class:`Series` or :class:`DataFrame` with a datetime object out of bounds for datetime64[ns] dtype or a timedelta object out of bounds for timedelta64[ns] dtype (:issue:`38792`, :issue:`38965`)
Bug in :meth:`DatetimeIndex.intersection`, :meth:`DatetimeIndex.symmetric_difference`, :meth:`PeriodIndex.intersection`, :meth:`PeriodIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38741`)
Bug in :meth:`Series.where` incorrectly casting datetime64 values to int64 (:issue:`37682`)
Bug in :class:`Categorical` incorrectly typecasting datetime object to Timestamp (:issue:`38878`)

Timedelta

Bug in constructing :class:`Timedelta` from np.timedelta64 objects with non-nanosecond units that are out of bounds for timedelta64[ns] (:issue:`38965`)

Timezones

Numeric

Bug in :meth:`DataFrame.quantile`, :meth:`DataFrame.sort_values` causing incorrect subsequent indexing behavior (:issue:`38351`)
Bug in :meth:`DataFrame.select_dtypes` with include=np.number failing to retain numeric ExtensionDtype columns (:issue:`35340`)
Bug in :meth:`DataFrame.mode` and :meth:`Series.mode` not keeping consistent integer :class:`Index` for empty input (:issue:`33321`)
Bug in :meth:`DataFrame.rank` with np.inf and mixture of np.nan and np.inf (:issue:`32593`)
Bug in :meth:`DataFrame.rank` with axis=0 and columns holding incomparable types raising IndexError (:issue:`38932`)

Conversion

Strings

Interval

Bug in :meth:`IntervalIndex.intersection` and :meth:`IntervalIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38653`, :issue:`38741`)
Bug in :meth:`IntervalIndex.intersection` returning duplicates when at least one of the indexes has duplicates that are present in the other (:issue:`38743`)

Indexing

Bug in :meth:`CategoricalIndex.get_indexer` failing to raise InvalidIndexError when non-unique (:issue:`38372`)
Bug in inserting many new columns into a :class:`DataFrame` causing incorrect subsequent indexing behavior (:issue:`38380`)
Bug in :meth:`DataFrame.loc`, :meth:`Series.loc`, :meth:`DataFrame.__getitem__` and :meth:`Series.__getitem__` returning incorrect elements for non-monotonic :class:`DatetimeIndex` for string slices (:issue:`33146`)
Bug in :meth:`DataFrame.reindex` and :meth:`Series.reindex` with timezone aware indexes raising TypeError for method="ffill" and method="bfill" and specified tolerance (:issue:`38566`)
Bug in :meth:`DataFrame.__setitem__` raising ValueError with empty :class:`DataFrame` and specified columns for string indexer and non empty :class:`DataFrame` to set (:issue:`38831`)
Bug in :meth:`DataFrame.iloc.__setitem__` and :meth:`DataFrame.loc.__setitem__` with mixed dtypes when setting with a dictionary value (:issue:`38335`)
Bug in :meth:`DataFrame.loc` dropping levels of :class:`MultiIndex` when :class:`DataFrame` used as input has only one row (:issue:`10521`)
Bug in setting timedelta64 values into numeric :class:`Series` failing to cast to object dtype (:issue:`39086`)

Missing

Bug in :class:`Grouper` not correctly propagating the dropna argument; :meth:`DataFrameGroupBy.transform` now correctly handles missing values for dropna=True (:issue:`35612`)

MultiIndex

Bug in :meth:`DataFrame.drop` raising TypeError when :class:`MultiIndex` is non-unique and no level is provided (:issue:`36293`)
Bug in :meth:`MultiIndex.intersection` duplicating NaN in result (:issue:`38623`)
Bug in :meth:`MultiIndex.equals` incorrectly returning True when the :class:`MultiIndex` objects contain NaN, even when they are differently ordered (:issue:`38439`)
Bug in :meth:`MultiIndex.intersection` always returning empty when intersecting with :class:`CategoricalIndex` (:issue:`38653`)

I/O

Bug in :meth:`Index.__repr__` when display.max_seq_items=1 (:issue:`38415`)
Bug in :func:`read_csv` not recognizing scientific notation if decimal is set for engine="python" (:issue:`31920`)
Bug in :func:`read_csv` interpreting an NA value as a comment when the NA value contains the comment string, for engine="python" (:issue:`34002`)
Bug in :func:`read_csv` raising IndexError with multiple header columns and index_col specified when file has no data rows (:issue:`38292`)
Bug in :func:`read_csv` not accepting usecols with different length than names for engine="python" (:issue:`16469`)
Bug in :func:`read_csv` returning object dtype when delimiter="," with usecols and parse_dates specified for engine="python" (:issue:`35873`)
Bug in :func:`read_csv` raising TypeError when names and parse_dates are specified for engine="c" (:issue:`33699`)
Bug in :func:`read_clipboard`, :func:`DataFrame.to_clipboard` not working in WSL (:issue:`38527`)
Allow custom error values for parse_dates argument of :func:`read_sql`, :func:`read_sql_query` and :func:`read_sql_table` (:issue:`35185`)
Bug in :func:`to_hdf` raising KeyError when applied to subclasses of DataFrame or Series (:issue:`33748`)
Bug in :meth:`~HDFStore.put` raising a wrong TypeError when saving a DataFrame with non-string dtype (:issue:`34274`)
Bug in :func:`json_normalize` resulting in the first element of a generator object not being included in the returned DataFrame (:issue:`35923`)
Bug in :func:`read_excel` forward filling :class:`MultiIndex` names with multiple header and index columns specified (:issue:`34673`)
:func:`pandas.read_excel` now respects :func:`pandas.set_option` (:issue:`34252`)
Bug in :func:`read_csv` not switching true_values and false_values for nullable boolean dtype (:issue:`34655`)
Bug in :func:`read_json` when orient="split" not maintaining a numeric string index (:issue:`28556`)

Period

Plotting

Bug in :func:`scatter_matrix` raising when 2d ax argument passed (:issue:`16253`)

Groupby/resample/rolling

Bug in :meth:`SeriesGroupBy.value_counts` where unobserved categories in a grouped categorical series were not tallied (:issue:`38672`)
Bug in :meth:`.GroupBy.indices` would contain non-existent indices when null values were present in the groupby keys (:issue:`9304`)
Fixed bug in :meth:`DataFrameGroupBy.sum` and :meth:`SeriesGroupBy.sum` causing loss of precision; these methods now use Kahan summation (:issue:`38778`)
Fixed bug in :meth:`DataFrameGroupBy.cumsum`, :meth:`SeriesGroupBy.cumsum`, :meth:`DataFrameGroupBy.mean` and :meth:`SeriesGroupBy.mean` causing loss of precision; these methods now use Kahan summation (:issue:`38934`)
Bug in :meth:`.Resampler.aggregate` and :meth:`DataFrame.transform` raising TypeError instead of SpecificationError when missing keys having mixed dtypes (:issue:`39025`)
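
For readers unfamiliar with Kahan summation, mentioned in the groupby entries above, here is a minimal pure-Python sketch of the technique. It is not the pandas implementation (which lives in compiled code); the function name ``kahan_sum`` and the sample data are illustrative.

```python
def kahan_sum(values):
    """Sum floats while tracking the rounding error lost at each step."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for value in values:
        corrected = value - compensation
        new_total = total + corrected
        # (new_total - total) is what was actually added; subtracting
        # ``corrected`` leaves the rounding error of this addition.
        compensation = (new_total - total) - corrected
        total = new_total
    return total


# One large value followed by many values too small to register on their own.
values = [1.0] + [1e-16] * 100_000

naive_total = 0.0
for value in values:
    naive_total += value  # each tiny addend is rounded away

kahan_total = kahan_sum(values)
```

In the naive loop every small addend is lost, so the total stays at exactly 1.0, whereas the compensated sum accumulates them and lands much closer to the true value.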

Reshaping

Bug in :func:`merge` raising error when performing an inner join with partial index and right_index when no overlap between indices (:issue:`33814`)
Bug in :meth:`DataFrame.unstack` with missing levels led to incorrect index names (:issue:`37510`)
Bug in :func:`join` over :class:`MultiIndex` returning a wrong result when one of the indexes had only one level (:issue:`36909`)
:meth:`merge_asof` raises ValueError instead of cryptic TypeError in case of non-numerical merge columns (:issue:`29130`)

Sparse

Bug in :meth:`DataFrame.sparse.to_coo` raising KeyError with columns that are a numeric :class:`Index` without a 0 (:issue:`18414`)
Bug in :meth:`SparseArray.astype` with copy=False producing incorrect results when going from integer dtype to floating dtype (:issue:`34456`)

ExtensionArray

Bug in :meth:`DataFrame.where` when other is a :class:`Series` with ExtensionArray dtype (:issue:`38729`)
Fixed bug where :meth:`Series.idxmax`, :meth:`Series.idxmin` and argmax/min fail when the underlying data is :class:`ExtensionArray` (:issue:`32749`, :issue:`33719`, :issue:`36566`)

Other

Bug in :class:`Index` constructor sometimes silently ignoring a specified dtype (:issue:`38879`)
