What's new in 1.3.0 (??)

These are the changes in pandas 1.3.0. See :ref:`release` for a full changelog including other versions of pandas.

Enhancements

Custom HTTP(s) headers when reading csv or json files

When reading from a remote URL that is not handled by fsspec (i.e. HTTP and HTTPS), the dictionary passed to storage_options will be used to create the headers included in the request. This can be used to control the User-Agent header or send other custom headers (:issue:`36688`). For example:

.. ipython:: python

    headers = {"User-Agent": "pandas"}
    df = pd.read_csv(
        "https://download.bls.gov/pub/time.series/cu/cu.item",
        sep="\t",
        storage_options=headers
    )

:class:`Rolling` and :class:`Expanding` now support a method argument with a 'table' option that performs the windowing operation over an entire :class:`DataFrame`. See :ref:`window.overview` for performance and functional benefits (:issue:`15095`).

Other enhancements

Added :meth:`MultiIndex.dtypes` (:issue:`37062`)
Added end and end_day options for origin in :meth:`DataFrame.resample` (:issue:`37804`)
Improved error message when usecols and names do not match for :func:`read_csv` and engine="c" (:issue:`29042`)
Improved consistency of error message when passing an invalid win_type argument in :class:`Window` (:issue:`15969`)
:func:`pandas.read_sql_query` now accepts a dtype argument to cast the columnar data from the SQL database based on user input (:issue:`10285`)
Improved integer type mapping from pandas to SQLAlchemy when using :meth:`DataFrame.to_sql` (:issue:`35076`)
:func:`to_numeric` now supports downcasting of nullable ExtensionDtype objects (:issue:`33013`)
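A few of the enhancements above can be tried out with a short sketch. It assumes pandas 1.3.0 or later; the level names, the in-memory SQLite table ``readings`` and its column ``n`` are made up for illustration and are not part of the release notes.

```python
import sqlite3

import pandas as pd

# MultiIndex.dtypes returns a Series of dtypes keyed by level name.
mi = pd.MultiIndex.from_product([[1, 2], [0.5, 1.5]], names=["count", "ratio"])
level_dtypes = mi.dtypes
print(level_dtypes)

# to_numeric can downcast nullable (masked) integer data.
nullable = pd.Series([1, 2, 3], dtype="Int64")
downcast = pd.to_numeric(nullable, downcast="integer")
print(downcast.dtype)  # smallest signed nullable integer dtype that fits

# read_sql_query can cast result columns through the dtype argument.
# The table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (n INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?)", [(1,), (2,)])
as_float = pd.read_sql_query("SELECT n FROM readings", conn, dtype={"n": "float64"})
conn.close()
print(as_float.dtypes)
```

The nullable input keeps its masked representation after downcasting, so missing values would still be supported in the smaller dtype.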

Notable bug fixes

These are bug fixes that might have notable behavior changes.

Assigning with DataFrame.__setitem__ consistently creates a new array

Assigning values with DataFrame.__setitem__ now consistently assigns a new array, rather than mutating inplace (:issue:`33457`, :issue:`35271`, :issue:`35266`)

Previously, DataFrame.__setitem__ would sometimes operate inplace on the underlying array, and sometimes assign a new array. Fixing this inconsistency can have behavior-changing implications for workloads that relied on inplace mutation. The two most common cases are creating a DataFrame from an array and slicing a DataFrame.

Previous Behavior

The array would be mutated inplace for some dtypes, like NumPy's int64 dtype.

>>> import pandas as pd
>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> df = pd.DataFrame(a, columns=['a'])
>>> df['a'] = 0
>>> a  # mutated inplace
array([0, 0, 0])

But not others, like :class:`Int64Dtype`.

>>> import pandas as pd
>>> import numpy as np
>>> a = pd.array([1, 2, 3], dtype="Int64")
>>> df = pd.DataFrame(a, columns=['a'])
>>> df['a'] = 0
>>> a  # not mutated
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64

New Behavior

In pandas 1.3.0, DataFrame.__setitem__ consistently sets on a new array rather than mutating the existing array inplace.

For NumPy's int64 dtype:

.. ipython:: python

   import pandas as pd
   import numpy as np
   a = np.array([1, 2, 3])
   df = pd.DataFrame(a, columns=['a'])
   df['a'] = 0
   a  # not mutated

For :class:`Int64Dtype`:

.. ipython:: python

   import pandas as pd
   import numpy as np
   a = pd.array([1, 2, 3], dtype="Int64")
   df = pd.DataFrame(a, columns=['a'])
   df['a'] = 0
   a  # not mutated

This also affects cases where a second Series or DataFrame is a view on a first DataFrame.

.. code-block:: python

   df = pd.DataFrame({"A": [1, 2, 3]})
   df2 = df[['A']]
   df['A'] = np.array([0, 0, 0])

Previously, whether df2 was mutated depended on the dtype of the array being assigned to. Now, a new array is consistently assigned, so df2 is not mutated.
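The new behavior can be checked directly. This is a minimal sketch that assumes pandas 1.3.0 or later; it repeats the snippet above and prints both frames after the assignment.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})
df2 = df[["A"]]  # second object derived from the first
df["A"] = np.array([0, 0, 0])  # assigns a new array to column "A"

print(df["A"].tolist())   # the assigned values
print(df2["A"].tolist())  # df2 keeps the original values
```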

Increased minimum versions for dependencies

Some minimum supported versions of dependencies were updated. If installed, we now require:

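+-----------------+-----------------+----------+
| Package         | Minimum Version | Required |
+=================+=================+==========+
| numpy           | 1.16.5          |    X     |
+-----------------+-----------------+----------+
| pytz            | 2017.3          |    X     |
+-----------------+-----------------+----------+
| python-dateutil | 2.7.3           |    X     |
+-----------------+-----------------+----------+
| bottleneck      | 1.2.1           |          |
+-----------------+-----------------+----------+
| numexpr         | 2.6.8           |          |
+-----------------+-----------------+----------+
| pytest (dev)    | 5.0.1           |          |
+-----------------+-----------------+----------+
| mypy (dev)      | 0.782           |          |
+-----------------+-----------------+----------+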

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

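+-----------------+-----------------+---------+
| Package         | Minimum Version | Changed |
+=================+=================+=========+
| beautifulsoup4  | 4.6.0           |         |
+-----------------+-----------------+---------+
| fastparquet     | 0.3.2           |         |
+-----------------+-----------------+---------+
| fsspec          | 0.7.4           |         |
+-----------------+-----------------+---------+
| gcsfs           | 0.6.0           |         |
+-----------------+-----------------+---------+
| lxml            | 4.3.0           |         |
+-----------------+-----------------+---------+
| matplotlib      | 2.2.3           |         |
+-----------------+-----------------+---------+
| numba           | 0.46.0          |         |
+-----------------+-----------------+---------+
| openpyxl        | 2.6.0           |         |
+-----------------+-----------------+---------+
| pyarrow         | 0.15.0          |         |
+-----------------+-----------------+---------+
| pymysql         | 0.7.11          |         |
+-----------------+-----------------+---------+
| pytables        | 3.5.1           |         |
+-----------------+-----------------+---------+
| s3fs            | 0.4.0           |         |
+-----------------+-----------------+---------+
| scipy           | 1.2.0           |         |
+-----------------+-----------------+---------+
| sqlalchemy      | 1.2.8           |         |
+-----------------+-----------------+---------+
| tabulate        | 0.8.7           |    X    |
+-----------------+-----------------+---------+
| xarray          | 0.12.0          |         |
+-----------------+-----------------+---------+
| xlrd            | 1.2.0           |         |
+-----------------+-----------------+---------+
| xlsxwriter      | 1.0.2           |         |
+-----------------+-----------------+---------+
| xlwt            | 1.3.0           |         |
+-----------------+-----------------+---------+
| pandas-gbq      | 0.12.0          |         |
+-----------------+-----------------+---------+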

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.

Other API changes

Partially initialized :class:`CategoricalDtype` objects (i.e. those with categories=None) will no longer compare as equal to fully initialized dtype objects.
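As an illustration of the comparison change, the sketch below assumes pandas 1.3.0 or later. The category labels are arbitrary, and the last line shows that comparison against the string "category" is a separate code path that still matches any categorical dtype.

```python
import pandas as pd

partial = pd.CategoricalDtype()  # categories=None, not fully initialized
full = pd.CategoricalDtype(categories=["a", "b"])

print(partial == full)                    # no longer equal
print(partial == pd.CategoricalDtype())   # two partial dtypes still match
print(full == "category")                 # string comparison is unaffected
```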

Deprecations

Deprecated allowing scalars to be passed to the :class:`Categorical` constructor (:issue:`38433`)
Deprecated allowing subclass-specific keyword arguments in the :class:`Index` constructor, use the specific subclass directly instead (:issue:`14093`, :issue:`21311`, :issue:`22315`, :issue:`26974`)
Deprecated astype of datetimelike (timedelta64[ns], datetime64[ns], Datetime64TZDtype, PeriodDtype) to integer dtypes, use values.view(...) instead (:issue:`38544`)
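For the datetimelike deprecation, a minimal sketch of the suggested replacement might look as follows. It assumes a timezone-naive datetime64[ns] Series, so that values is a plain NumPy array; the sample date is arbitrary.

```python
import numpy as np
import pandas as pd

ser = pd.Series(np.array(["2021-01-01"], dtype="datetime64[ns]"))

# Reinterpret the underlying datetime64[ns] data as int64
# (nanoseconds since the Unix epoch) instead of calling astype.
as_int = ser.values.view("int64")
print(as_int)
```

Note that view reinterprets the existing buffer rather than converting values, so the integers are only meaningful together with the unit of the original dtype.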

Performance improvements

Performance improvement in :meth:`IntervalIndex.isin` (:issue:`38353`)

Bug fixes

Categorical

Bug in :class:`CategoricalIndex` incorrectly failing to raise TypeError when scalar data is passed (:issue:`38614`)
Bug in :meth:`CategoricalIndex.reindex` failing when the :class:`Index` passed has elements that are all in the categories (:issue:`28690`)
Bug where constructing a :class:`Categorical` from an object-dtype array of date objects did not round-trip correctly with astype (:issue:`38552`)

Datetimelike

Bug in :class:`DataFrame` and :class:`Series` constructors sometimes dropping nanoseconds from :class:`Timestamp` (resp. :class:`Timedelta`) data, with dtype=datetime64[ns] (resp. timedelta64[ns]) (:issue:`38032`)
Bug in :meth:`DataFrame.first` and :meth:`Series.first` returning two months for offset one month when first day is last calendar day (:issue:`29623`)
Bug in constructing a :class:`DataFrame` or :class:`Series` with mismatched datetime64 data and timedelta64 dtype, or vice-versa, failing to raise TypeError (:issue:`38575`, :issue:`38764`)
Bug in :meth:`DatetimeIndex.intersection`, :meth:`DatetimeIndex.symmetric_difference`, :meth:`PeriodIndex.intersection`, :meth:`PeriodIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38741`)
Bug in :meth:`Series.where` incorrectly casting datetime64 values to int64 (:issue:`37682`)

Timedelta

Timezones

Numeric

Bug in :meth:`DataFrame.quantile`, :meth:`DataFrame.sort_values` causing incorrect subsequent indexing behavior (:issue:`38351`)
Bug in :meth:`DataFrame.select_dtypes` with include=np.number not retaining numeric ExtensionDtype columns (:issue:`35340`)
Bug in :meth:`DataFrame.mode` and :meth:`Series.mode` not keeping consistent integer :class:`Index` for empty input (:issue:`33321`)

Conversion

Strings

Interval

Bug in :meth:`IntervalIndex.intersection` and :meth:`IntervalIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38653`, :issue:`38741`)

Indexing

Bug in :meth:`CategoricalIndex.get_indexer` failing to raise InvalidIndexError when non-unique (:issue:`38372`)
Bug in inserting many new columns into a :class:`DataFrame` causing incorrect subsequent indexing behavior (:issue:`38380`)
Bug in :meth:`DataFrame.iloc.__setitem__` and :meth:`DataFrame.loc.__setitem__` with mixed dtypes when setting with a dictionary value (:issue:`38335`)
Bug in :meth:`DataFrame.loc` dropping levels of :class:`MultiIndex` when :class:`DataFrame` used as input has only one row (:issue:`10521`)
Bug in :meth:`DataFrame.iloc.__setitem__` creating a new array instead of overwriting Categorical values in-place (:issue:`35417`)

Missing

Bug in :class:`Grouper` not correctly propagating the dropna argument; :meth:`DataFrameGroupBy.transform` now correctly handles missing values for dropna=True (:issue:`35612`)

MultiIndex

Bug in :meth:`DataFrame.drop` raising TypeError when :class:`MultiIndex` is non-unique and no level is provided (:issue:`36293`)
Bug in :meth:`MultiIndex.equals` incorrectly returning True when comparing :class:`MultiIndex` objects containing NaN, even when they are differently ordered (:issue:`38439`)
Bug in :meth:`MultiIndex.intersection` always returning empty when intersecting with :class:`CategoricalIndex` (:issue:`38653`)

I/O

Bug in :meth:`Index.__repr__` when display.max_seq_items=1 (:issue:`38415`)
Bug in :func:`read_csv` interpreting an NA value as a comment when the NA value contains the comment string, fixed for engine="python" (:issue:`34002`)
Bug in :func:`read_csv` raising IndexError with multiple header columns and index_col specified when file has no data rows (:issue:`38292`)
Bug in :func:`read_csv` not accepting usecols with different length than names for engine="python" (:issue:`16469`)
Bug in :func:`read_csv` raising TypeError when names and parse_dates are specified for engine="c" (:issue:`33699`)
Bug in :func:`read_clipboard`, :func:`DataFrame.to_clipboard` not working in WSL (:issue:`38527`)
Allow custom error values for parse_dates argument of :func:`read_sql`, :func:`read_sql_query` and :func:`read_sql_table` (:issue:`35185`)
Bug in :func:`to_hdf` raising KeyError when applied to subclasses of DataFrame or Series (:issue:`33748`)
Bug in :func:`json_normalize` resulting in the first element of a generator object not being included in the returned DataFrame (:issue:`35923`)

Period

Plotting

Bug in :func:`scatter_matrix` raising when 2d ax argument passed (:issue:`16253`)

Groupby/resample/rolling

Reshaping

Sparse

Bug in :meth:`DataFrame.sparse.to_coo` raising KeyError with columns that are a numeric :class:`Index` without a 0 (:issue:`18414`)

ExtensionArray

Bug in :meth:`DataFrame.where` when other is a :class:`Series` with ExtensionArray dtype (:issue:`38729`)
