These are the changes in pandas 1.3.0. See :ref:`release` for a full changelog including other versions of pandas.
{{ header }}
When reading from a remote URL that is not handled by fsspec (i.e. HTTP and
HTTPS), the dictionary passed to ``storage_options`` will be used to create the
headers included in the request. This can be used to control the User-Agent
header or send other custom headers (:issue:`36688`).
For example:

.. ipython:: python

    headers = {"User-Agent": "pandas"}
    df = pd.read_csv(
        "https://download.bls.gov/pub/time.series/cu/cu.item",
        sep="\t",
        storage_options=headers
    )

- :class:`Rolling` and :class:`Expanding` now support a ``method`` argument with a ``'table'`` option that performs the windowing operation over an entire :class:`DataFrame`. See :ref:`window.overview` for performance and functional benefits (:issue:`15095`, :issue:`38995`)
- Added :meth:`MultiIndex.dtypes` (:issue:`37062`)
- Added ``end`` and ``end_day`` options for ``origin`` in :meth:`DataFrame.resample` (:issue:`37804`)
- Improve error message when ``usecols`` and ``names`` do not match for :func:`read_csv` and ``engine="c"`` (:issue:`29042`)
- Improved consistency of error message when passing an invalid ``win_type`` argument in :class:`Window` (:issue:`15969`)
- :func:`pandas.read_sql_query` now accepts a ``dtype`` argument to cast the columnar data from the SQL database based on user input (:issue:`10285`)
- Improved integer type mapping from pandas to SQLAlchemy when using :meth:`DataFrame.to_sql` (:issue:`35076`)
- :func:`to_numeric` now supports downcasting of nullable ``ExtensionDtype`` objects (:issue:`33013`)
- Add support for dict-like names in :class:`MultiIndex.set_names` and :class:`MultiIndex.rename` (:issue:`20421`)
- :func:`pandas.read_excel` can now auto detect .xlsb files (:issue:`35416`)
- :meth:`.Rolling.sum`, :meth:`.Expanding.sum`, :meth:`.Rolling.mean`, :meth:`.Expanding.mean`, :meth:`.Rolling.median`, :meth:`.Expanding.median`, :meth:`.Rolling.max`, :meth:`.Expanding.max`, :meth:`.Rolling.min`, and :meth:`.Expanding.min` now support ``Numba`` execution with the ``engine`` keyword (:issue:`38895`)
- :meth:`DataFrame.apply` can now accept NumPy unary operators as strings, e.g. ``df.apply("sqrt")``, which was already the case for :meth:`Series.apply` (:issue:`39116`)
- :meth:`DataFrame.apply` can now accept non-callable DataFrame properties as strings, e.g. ``df.apply("size")``, which was already the case for :meth:`Series.apply` (:issue:`39116`)
- :meth:`Series.apply` can now accept list-like or dictionary-like arguments that aren't lists or dictionaries, e.g. ``ser.apply(np.array(["sum", "mean"]))``, which was already the case for :meth:`DataFrame.apply` (:issue:`39140`)
- :meth:`.Styler.set_tooltips` allows on hover tooltips to be added to styled HTML dataframes.

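A few of the enhancements listed above can be tried in a short session. The following is a minimal sketch, assuming pandas 1.3.0 or later; the index, table, and variable names (``mi``, ``t``, ``con``, ``ser``) are illustrative and do not come from the release notes.

```python
import sqlite3

import pandas as pd

# MultiIndex.dtypes reports the dtype of each level, keyed by level name.
mi = pd.MultiIndex.from_arrays([[1, 2], ["x", "y"]], names=["num", "letter"])
level_dtypes = mi.dtypes

# MultiIndex.set_names accepts a dict-like mapping of old name -> new name.
renamed = mi.set_names({"num": "number"})

# to_numeric can downcast nullable extension data such as Int64.
ser = pd.Series([1, 2, None], dtype="Int64")
small = pd.to_numeric(ser, downcast="integer")

# read_sql_query can cast result columns through the dtype argument.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b TEXT)")
con.execute("INSERT INTO t VALUES (1, 'x'), (2, 'y')")
df = pd.read_sql_query("SELECT a, b FROM t", con, dtype={"a": "float64"})
con.close()

print(level_dtypes)
print(list(renamed.names))
print(small.dtype)
print(df.dtypes)
```

The in-memory SQLite database is used only so the sketch has no external dependencies; any DBAPI or SQLAlchemy connection accepted by :func:`pandas.read_sql_query` should behave the same way with respect to ``dtype``.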
These are bug fixes that might have notable behavior changes.
Preserve dtypes in :meth:`~pandas.DataFrame.combine_first`
:meth:`~pandas.DataFrame.combine_first` will now preserve dtypes (:issue:`7509`)

.. ipython:: python

    df1 = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=[0, 1, 2])
    df1
    df2 = pd.DataFrame({"B": [4, 5, 6], "C": [1, 2, 3]}, index=[2, 3, 4])
    df2
    combined = df1.combine_first(df2)

pandas 1.2.x

.. code-block:: ipython

    In [1]: combined.dtypes
    Out[1]:
    A    float64
    B    float64
    C    float64
    dtype: object

pandas 1.3.0

.. ipython:: python

    combined.dtypes

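As a standalone check of the behavior described above, the sketch below repeats the same two frames and collects the resulting dtypes into a plain dictionary. It assumes pandas 1.3.0 or later; the ``dtypes`` variable name is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({"B": [4, 5, 6], "C": [1, 2, 3]}, index=[2, 3, 4])
combined = df1.combine_first(df2)

# "B" has a value for every row of the union index, so it can stay integer.
# "A" and "C" pick up missing values and therefore still become float64.
dtypes = combined.dtypes.astype(str).to_dict()
print(dtypes)
```

Only columns that end up fully populated keep their original dtype; columns that gain missing values are still upcast, because the default integer dtype cannot hold ``NaN``.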
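Some minimum supported versions of dependencies were updated. If installed, we now require:

+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
| numpy           | 1.16.5          |    X     |         |
+-----------------+-----------------+----------+---------+
| pytz            | 2017.3          |    X     |         |
+-----------------+-----------------+----------+---------+
| python-dateutil | 2.7.3           |    X     |         |
+-----------------+-----------------+----------+---------+
| bottleneck      | 1.2.1           |          |         |
+-----------------+-----------------+----------+---------+
| numexpr         | 2.6.8           |          |         |
+-----------------+-----------------+----------+---------+
| pytest (dev)    | 5.0.1           |          |         |
+-----------------+-----------------+----------+---------+
| mypy (dev)      | 0.790           |          |    X    |
+-----------------+-----------------+----------+---------+

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

+-----------------+-----------------+---------+
| Package         | Minimum Version | Changed |
+=================+=================+=========+
| beautifulsoup4  | 4.6.0           |         |
+-----------------+-----------------+---------+
| fastparquet     | 0.3.2           |         |
+-----------------+-----------------+---------+
| fsspec          | 0.7.4           |         |
+-----------------+-----------------+---------+
| gcsfs           | 0.6.0           |         |
+-----------------+-----------------+---------+
| lxml            | 4.3.0           |         |
+-----------------+-----------------+---------+
| matplotlib      | 2.2.3           |         |
+-----------------+-----------------+---------+
| numba           | 0.46.0          |         |
+-----------------+-----------------+---------+
| openpyxl        | 2.6.0           |         |
+-----------------+-----------------+---------+
| pyarrow         | 0.15.0          |         |
+-----------------+-----------------+---------+
| pymysql         | 0.7.11          |         |
+-----------------+-----------------+---------+
| pytables        | 3.5.1           |         |
+-----------------+-----------------+---------+
| s3fs            | 0.4.0           |         |
+-----------------+-----------------+---------+
| scipy           | 1.2.0           |         |
+-----------------+-----------------+---------+
| sqlalchemy      | 1.2.8           |         |
+-----------------+-----------------+---------+
| tabulate        | 0.8.7           |    X    |
+-----------------+-----------------+---------+
| xarray          | 0.12.0          |         |
+-----------------+-----------------+---------+
| xlrd            | 1.2.0           |         |
+-----------------+-----------------+---------+
| xlsxwriter      | 1.0.2           |         |
+-----------------+-----------------+---------+
| xlwt            | 1.3.0           |         |
+-----------------+-----------------+---------+
| pandas-gbq      | 0.12.0          |         |
+-----------------+-----------------+---------+

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.
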
- Partially initialized :class:`CategoricalDtype` objects (i.e. those with ``categories=None``) will no longer compare as equal to fully initialized dtype objects.
- Deprecated allowing scalars to be passed to the :class:`Categorical` constructor (:issue:`38433`)
- Deprecated allowing subclass-specific keyword arguments in the :class:`Index` constructor, use the specific subclass directly instead (:issue:`14093`, :issue:`21311`, :issue:`22315`, :issue:`26974`)
- Deprecated ``astype`` of datetimelike (``timedelta64[ns]``, ``datetime64[ns]``, ``Datetime64TZDtype``, ``PeriodDtype``) to integer dtypes, use ``values.view(...)`` instead (:issue:`38544`)
- Deprecated :meth:`MultiIndex.is_lexsorted` and :meth:`MultiIndex.lexsort_depth`, use :meth:`MultiIndex.is_monotonic_increasing` instead (:issue:`32259`)
- Deprecated keyword ``try_cast`` in :meth:`Series.where`, :meth:`Series.mask`, :meth:`DataFrame.where`, :meth:`DataFrame.mask`; cast results manually if desired (:issue:`38836`)
- Deprecated comparison of :class:`Timestamp` object with ``datetime.date`` objects. Instead of e.g. ``ts <= mydate`` use ``ts <= pd.Timestamp(mydate)`` or ``ts.date() <= mydate`` (:issue:`36131`)
- Deprecated :attr:`Rolling.win_type` returning ``"freq"`` (:issue:`38963`)
- Deprecated :attr:`Rolling.is_datetimelike` (:issue:`38963`)

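The two replacements suggested for comparing a :class:`Timestamp` with a ``datetime.date`` are not interchangeable, which the following minimal sketch illustrates. The values of ``ts`` and ``mydate`` and the result variable names are illustrative assumptions.

```python
import datetime

import pandas as pd

ts = pd.Timestamp("2021-01-15 12:00")
mydate = datetime.date(2021, 1, 15)

# Option 1: promote the date to a Timestamp at midnight, then compare instants.
as_instant = ts <= pd.Timestamp(mydate)

# Option 2: drop the time of day from the Timestamp, then compare calendar dates.
as_calendar_date = ts.date() <= mydate

print(as_instant, as_calendar_date)
```

Here noon on the 15th is later than midnight on the 15th, so the first comparison is false, while both values fall on the same calendar day, so the second is true. Which form to use depends on whether the time of day should count.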
- Performance improvement in :meth:`IntervalIndex.isin` (:issue:`38353`)
- Performance improvement in :meth:`Series.mean` for nullable data types (:issue:`34814`)
- Bug in :class:`CategoricalIndex` incorrectly failing to raise ``TypeError`` when scalar data is passed (:issue:`38614`)
- Bug in ``CategoricalIndex.reindex`` failing when the ``Index`` passed has elements all in category (:issue:`28690`)
- Bug where constructing a :class:`Categorical` from an object-dtype array of ``date`` objects did not round-trip correctly with ``astype`` (:issue:`38552`)
- Bug in constructing a :class:`DataFrame` from an ``ndarray`` and a :class:`CategoricalDtype` (:issue:`38857`)
- Bug in :meth:`DataFrame.reindex` raising ``IndexError`` when new index contained duplicates and old index was :class:`CategoricalIndex` (:issue:`38906`)
- Bug in setting categorical values into an object-dtype column in a :class:`DataFrame` (:issue:`39136`)

- Bug in :class:`DataFrame` and :class:`Series` constructors sometimes dropping nanoseconds from :class:`Timestamp` (resp. :class:`Timedelta`) ``data``, with ``dtype=datetime64[ns]`` (resp. ``timedelta64[ns]``) (:issue:`38032`)
- Bug in :meth:`DataFrame.first` and :meth:`Series.first` returning two months for offset one month when first day is last calendar day (:issue:`29623`)
- Bug in constructing a :class:`DataFrame` or :class:`Series` with mismatched ``datetime64`` data and ``timedelta64`` dtype, or vice-versa, failing to raise ``TypeError`` (:issue:`38575`, :issue:`38764`, :issue:`38792`)
- Bug in constructing a :class:`Series` or :class:`DataFrame` with a ``datetime`` object out of bounds for ``datetime64[ns]`` dtype or a ``timedelta`` object out of bounds for ``timedelta64[ns]`` dtype (:issue:`38792`, :issue:`38965`)
- Bug in :meth:`DatetimeIndex.intersection`, :meth:`DatetimeIndex.symmetric_difference`, :meth:`PeriodIndex.intersection`, :meth:`PeriodIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38741`)
- Bug in :meth:`Series.where` incorrectly casting ``datetime64`` values to ``int64`` (:issue:`37682`)
- Bug in :class:`Categorical` incorrectly typecasting ``datetime`` object to ``Timestamp`` (:issue:`38878`)

- Bug in constructing :class:`Timedelta` from ``np.timedelta64`` objects with non-nanosecond units that are out of bounds for ``timedelta64[ns]`` (:issue:`38965`)

- Bug in :meth:`DataFrame.quantile`, :meth:`DataFrame.sort_values` causing incorrect subsequent indexing behavior (:issue:`38351`)
- Bug in :meth:`DataFrame.select_dtypes` with ``include=np.number`` not retaining numeric ``ExtensionDtype`` columns (:issue:`35340`)
- Bug in :meth:`DataFrame.mode` and :meth:`Series.mode` not keeping consistent integer :class:`Index` for empty input (:issue:`33321`)
- Bug in :meth:`DataFrame.rank` with ``np.inf`` and mixture of ``np.nan`` and ``np.inf`` (:issue:`32593`)
- Bug in :meth:`DataFrame.rank` with ``axis=0`` and columns holding incomparable types raising ``IndexError`` (:issue:`38932`)

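The :meth:`DataFrame.select_dtypes` entry above can be illustrated with a small frame that mixes a nullable integer column, a NumPy float column, and a string column. This is a minimal sketch assuming pandas 1.3.0 or later; the column names are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "nullable_int": pd.array([1, 2, None], dtype="Int64"),
        "plain_float": [0.5, 1.5, 2.5],
        "label": ["a", "b", "c"],
    }
)

# With the fix, the nullable Int64 column is treated as numeric
# and kept alongside the NumPy-backed float column.
numeric = df.select_dtypes(include=np.number)
print(list(numeric.columns))
```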
- Bug in :meth:`IntervalIndex.intersection` and :meth:`IntervalIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38653`, :issue:`38741`)
- Bug in :meth:`IntervalIndex.intersection` returning duplicates when at least one of the Indexes has duplicates which are present in the other (:issue:`38743`)
- Bug in :meth:`CategoricalIndex.get_indexer` failing to raise ``InvalidIndexError`` when non-unique (:issue:`38372`)
- Bug in inserting many new columns into a :class:`DataFrame` causing incorrect subsequent indexing behavior (:issue:`38380`)
- Bug in :meth:`DataFrame.loc`, :meth:`Series.loc`, :meth:`DataFrame.__getitem__` and :meth:`Series.__getitem__` returning incorrect elements for non-monotonic :class:`DatetimeIndex` for string slices (:issue:`33146`)
- Bug in :meth:`DataFrame.reindex` and :meth:`Series.reindex` with timezone aware indexes raising ``TypeError`` for ``method="ffill"`` and ``method="bfill"`` and specified ``tolerance`` (:issue:`38566`)
- Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` with empty :class:`DataFrame` and specified columns for string indexer and non empty :class:`DataFrame` to set (:issue:`38831`)
- Bug in :meth:`DataFrame.iloc.__setitem__` and :meth:`DataFrame.loc.__setitem__` with mixed dtypes when setting with a dictionary value (:issue:`38335`)
- Bug in :meth:`DataFrame.loc` dropping levels of :class:`MultiIndex` when :class:`DataFrame` used as input has only one row (:issue:`10521`)
- Bug in setting ``timedelta64`` values into numeric :class:`Series` failing to cast to object dtype (:issue:`39086`)
- Bug in setting :class:`Interval` values into a :class:`Series` or :class:`DataFrame` with mismatched :class:`IntervalDtype` incorrectly casting the new values to the existing dtype (:issue:`39120`)
- Bug in setting ``datetime64`` values into a :class:`Series` with integer-dtype incorrectly casting the ``datetime64`` values to integers (:issue:`39266`)

- :class:`Grouper` now correctly propagates the ``dropna`` argument, and :meth:`DataFrameGroupBy.transform` now correctly handles missing values for ``dropna=True`` (:issue:`35612`)

- Bug in :meth:`DataFrame.drop` raising ``TypeError`` when :class:`MultiIndex` is non-unique and ``level`` is not provided (:issue:`36293`)
- Bug in :meth:`MultiIndex.intersection` duplicating ``NaN`` in result (:issue:`38623`)
- Bug in :meth:`MultiIndex.equals` incorrectly returning ``True`` when the :class:`MultiIndex` objects contain ``NaN``, even when they are differently ordered (:issue:`38439`)
- Bug in :meth:`MultiIndex.intersection` always returning empty when intersecting with :class:`CategoricalIndex` (:issue:`38653`)

- Bug in :meth:`Index.__repr__` when ``display.max_seq_items=1`` (:issue:`38415`)
- Bug in :func:`read_csv` not recognizing scientific notation if decimal is set for ``engine="python"`` (:issue:`31920`)
- Bug in :func:`read_csv` interpreting ``NA`` value as comment when ``NA`` contains the comment string, fixed for ``engine="python"`` (:issue:`34002`)
- Bug in :func:`read_csv` raising ``IndexError`` with multiple header columns and ``index_col`` specified when file has no data rows (:issue:`38292`)
- Bug in :func:`read_csv` not accepting ``usecols`` with different length than ``names`` for ``engine="python"`` (:issue:`16469`)
- Bug in :func:`read_csv` returning object dtype when ``delimiter=","`` with ``usecols`` and ``parse_dates`` specified for ``engine="python"`` (:issue:`35873`)
- Bug in :func:`read_csv` raising ``TypeError`` when ``names`` and ``parse_dates`` are specified for ``engine="c"`` (:issue:`33699`)
- Bug in :func:`read_clipboard`, :func:`DataFrame.to_clipboard` not working in WSL (:issue:`38527`)
- Allow custom error values for parse_dates argument of :func:`read_sql`, :func:`read_sql_query` and :func:`read_sql_table` (:issue:`35185`)
- Bug in :func:`to_hdf` raising ``KeyError`` when trying to apply for subclasses of ``DataFrame`` or ``Series`` (:issue:`33748`)
- Bug in :meth:`~HDFStore.put` raising a wrong ``TypeError`` when saving a DataFrame with non-string dtype (:issue:`34274`)
- Bug in :func:`json_normalize` resulting in the first element of a generator object not being included in the returned ``DataFrame`` (:issue:`35923`)
- Bug in :func:`read_excel` forward filling :class:`MultiIndex` names with multiple header and index columns specified (:issue:`34673`)
- :func:`read_excel` now respects :func:`set_option` (:issue:`34252`)
- Bug in :func:`read_csv` not switching ``true_values`` and ``false_values`` for nullable ``boolean`` dtype (:issue:`34655`)
- Bug in :func:`read_json` not maintaining a numeric string index when ``orient="split"`` (:issue:`28556`)
- :func:`read_sql` returned an empty generator if ``chunksize`` was non-zero and the query returned no results. It now returns a generator with a single empty dataframe (:issue:`34411`)

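The change to :func:`read_sql` with ``chunksize`` can be observed with an empty table. The sketch below assumes pandas 1.3.0 or later and uses an in-memory SQLite database; the table ``t`` and its columns are hypothetical.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b TEXT)")

# The query matches no rows; consume the chunk generator before closing.
chunks = list(pd.read_sql("SELECT a, b FROM t", con, chunksize=10))
con.close()

print(len(chunks))
print(chunks[0].columns.tolist())
```

Because one empty frame is always produced, code that concatenates the chunks with :func:`concat` no longer has to special-case an empty result, and the column labels remain available.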
- Bug in :func:`scatter_matrix` raising when 2d ``ax`` argument passed (:issue:`16253`)

- Bug in :meth:`SeriesGroupBy.value_counts` where unobserved categories in a grouped categorical series were not tallied (:issue:`38672`)
- Bug where :meth:`.GroupBy.indices` would contain non-existent indices when null values were present in the groupby keys (:issue:`9304`)
- Fixed bug in :meth:`DataFrameGroupBy.sum` and :meth:`SeriesGroupBy.sum` causing loss of precision; these now use Kahan summation (:issue:`38778`)
- Fixed bug in :meth:`DataFrameGroupBy.cumsum`, :meth:`SeriesGroupBy.cumsum`, :meth:`DataFrameGroupBy.mean` and :meth:`SeriesGroupBy.mean` causing loss of precision; these now use Kahan summation (:issue:`38934`)
- Bug in :meth:`.Resampler.aggregate` and :meth:`DataFrame.transform` raising ``TypeError`` instead of ``SpecificationError`` when missing keys had mixed dtypes (:issue:`39025`)

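The groupby precision fixes above refer to Kahan (compensated) summation. The following pure-Python sketch shows the general technique only; it is not the pandas implementation, and the function names and input values are illustrative assumptions.

```python
def naive_sum(values):
    """Plain left-to-right floating point accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Compensated summation that tracks the rounding error of each add."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # re-inject the previously lost part
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


# One large value followed by many values too small to change it one at a time.
values = [1.0] + [1e-16] * 100_000
naive = naive_sum(values)
compensated = kahan_sum(values)
print(naive, compensated)
```

In the naive loop each tiny addend is rounded away, so the total never moves from ``1.0``. The compensated loop carries the lost part forward until it is large enough to register, which is why it stays close to the mathematically expected sum.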
- Bug in :func:`merge` raising error when performing an inner join with partial index and ``right_index`` when no overlap between indices (:issue:`33814`)
- Bug in :meth:`DataFrame.unstack` with missing levels led to incorrect index names (:issue:`37510`)
- Bug in :func:`join` over :class:`MultiIndex` returning wrong result when one of the indexes had only one level (:issue:`36909`)
- :func:`merge_asof` raises ``ValueError`` instead of cryptic ``TypeError`` in case of non-numerical merge columns (:issue:`29130`)
- Bug in :meth:`DataFrame.join` not assigning values correctly when having :class:`MultiIndex` where at least one dimension is from dtype ``Categorical`` with non-alphabetically sorted categories (:issue:`38502`)

- Bug in :meth:`DataFrame.sparse.to_coo` raising ``KeyError`` with columns that are a numeric :class:`Index` without a 0 (:issue:`18414`)
- Bug in :meth:`SparseArray.astype` with ``copy=False`` producing incorrect results when going from integer dtype to floating dtype (:issue:`34456`)

- Bug in :meth:`DataFrame.where` when ``other`` is a :class:`Series` with :class:`ExtensionArray` dtype (:issue:`38729`)
- Fixed bug where :meth:`Series.idxmax`, :meth:`Series.idxmin` and ``argmax/min`` fail when the underlying data is :class:`ExtensionArray` (:issue:`32749`, :issue:`33719`, :issue:`36566`)

- Bug in :class:`Index` constructor sometimes silently ignoring a specified ``dtype`` (:issue:`38879`)