These are the changes in pandas 2.0.0. See :ref:`release` for a full changelog including other versions of pandas.
{{ header }}
When installing pandas using pip, sets of optional dependencies can also be installed by specifying extras.
.. code-block:: none

  pip install "pandas[performance, aws]>=2.0.0"

The available extras, found in the :ref:`installation guide<install.dependencies>`, are
``[all, performance, computation, timezone, fss, aws, gcp, excel, parquet, feather, hdf5, spss, postgresql, mysql,
sql-other, html, xml, plot, output_formatting, clipboard, compression, test]`` (:issue:`39164`).
The ``use_nullable_dtypes`` keyword argument has been expanded to the following functions to enable automatic conversion to nullable dtypes (:issue:`36712`)

Additionally a new global configuration, ``io.nullable_backend``, can now be used in conjunction with the parameter ``use_nullable_dtypes=True`` in the following functions
to select the nullable dtypes implementation.

- :func:`read_csv` (with ``engine="pyarrow"``)
- :func:`read_excel`
- :func:`read_parquet`
- :func:`read_orc`

By default, ``io.nullable_backend`` is set to ``"pandas"`` to return existing, numpy-backed nullable dtypes, but it can also
be set to ``"pyarrow"`` to return pyarrow-backed, nullable :class:`ArrowDtype` (:issue:`48957`).

.. ipython:: python

    import io
    data = io.StringIO("""a,b,c,d,e,f,g,h,i
        1,2.5,True,a,,,,,
        3,4.5,False,b,6,7.5,True,a,
    """)
    with pd.option_context("io.nullable_backend", "pandas"):
        df = pd.read_csv(data, use_nullable_dtypes=True)
    df.dtypes

    data.seek(0)
    with pd.option_context("io.nullable_backend", "pyarrow"):
        df_pyarrow = pd.read_csv(data, use_nullable_dtypes=True, engine="pyarrow")
    df_pyarrow.dtypes

- :func:`read_sas` now supports using ``encoding='infer'`` to correctly read and use the encoding specified by the sas file. (:issue:`48048`)
- :meth:`.DataFrameGroupBy.quantile` and :meth:`.SeriesGroupBy.quantile` now preserve nullable dtypes instead of casting to numpy dtypes (:issue:`37493`)
- :meth:`Series.add_suffix`, :meth:`DataFrame.add_suffix`, :meth:`Series.add_prefix` and :meth:`DataFrame.add_prefix` support an ``axis`` argument. If ``axis`` is set, the default behaviour of which axis to consider can be overridden (:issue:`47819`)
- :func:`assert_frame_equal` now shows the first element where the DataFrames differ, analogously to ``pytest``'s output (:issue:`47910`)
- Added ``index`` parameter to :meth:`DataFrame.to_dict` (:issue:`46398`)
- Added support for extension array dtypes in :func:`merge` (:issue:`44240`)
- Added metadata propagation for binary operators on :class:`DataFrame` (:issue:`28283`)
- :class:`.CategoricalConversionWarning`, :class:`.InvalidComparison`, :class:`.InvalidVersion`, :class:`.LossySetitemError`, and :class:`.NoBufferPresent` are now exposed in ``pandas.errors`` (:issue:`27656`)
- Fix ``test`` optional_extra by adding missing test package ``pytest-asyncio`` (:issue:`48361`)
- The exception message raised by :meth:`DataFrame.astype` now includes the column name when type conversion is not possible. (:issue:`47571`)
- :func:`date_range` now supports a ``unit`` keyword ("s", "ms", "us", or "ns") to specify the desired resolution of the output index (:issue:`49106`)
- :func:`timedelta_range` now supports a ``unit`` keyword ("s", "ms", "us", or "ns") to specify the desired resolution of the output index (:issue:`49824`)
- :meth:`DataFrame.to_json` now supports a ``mode`` keyword with supported inputs ``'w'`` and ``'a'``. The default is ``'w'``; ``'a'`` can be used when ``lines=True`` and ``orient='records'`` to append record-oriented JSON lines to an existing JSON file. (:issue:`35849`)
- Added ``name`` parameter to :meth:`IntervalIndex.from_breaks`, :meth:`IntervalIndex.from_arrays` and :meth:`IntervalIndex.from_tuples` (:issue:`48911`)
- Added :meth:`Index.infer_objects` analogous to :meth:`Series.infer_objects` (:issue:`50034`)
- Added ``copy`` parameter to :meth:`Series.infer_objects` and :meth:`DataFrame.infer_objects`, passing ``False`` will avoid making copies for series or columns that are already non-object or where no better dtype can be inferred (:issue:`50096`)
- :meth:`DataFrame.plot.hist` now recognizes ``xlabel`` and ``ylabel`` arguments (:issue:`49793`)
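
A few of the enhancements listed above can be tried out with a short sketch. The frame, labels, and variable names below are made up for illustration, and the snippet assumes an installed pandas 2.0 or later.

```python
import pandas as pd

# date_range / timedelta_range with an explicit resolution via ``unit``
idx = pd.date_range("2016-01-01", periods=3, unit="s")
tdi = pd.timedelta_range("1 Day", periods=3, unit="ms")
print(idx.dtype)
print(tdi.dtype)

# Illustrative frame (not from the release notes)
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

# to_dict with the new ``index`` parameter; index=False drops the index entry
as_split = df.to_dict(orient="split", index=False)
print(as_split)

# add_suffix with an explicit ``axis``: here the row labels are renamed
renamed = df.add_suffix("_row", axis=0)
print(list(renamed.index))
```

Passing ``axis=0`` to ``add_suffix`` on a DataFrame targets the index rather than the columns, which is the override the ``axis`` argument was added for.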
These are bug fixes that might have notable behavior changes.
:meth:`.GroupBy.cumsum` and :meth:`.GroupBy.cumprod` overflow instead of lossy casting to float
In previous versions we cast to float when applying ``cumsum`` and ``cumprod``, which
led to incorrect results even if the result could be held by ``int64`` dtype.
Additionally, the aggregation now overflows, consistent with numpy and the regular
:meth:`DataFrame.cumprod` and :meth:`DataFrame.cumsum` methods, when the limit of
``int64`` is reached (:issue:`37493`).
Old Behavior
In [1]: df = pd.DataFrame({"key": ["b"] * 7, "value": 625})
In [2]: df.groupby("key")["value"].cumprod()[5]
Out[2]: 5.960464477539062e+16
We return incorrect results with the 6th value.
New Behavior

.. ipython:: python

    df = pd.DataFrame({"key": ["b"] * 7, "value": 625})
    df.groupby("key")["value"].cumprod()

We overflow with the 7th value, but the 6th value is still correct.
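
The claim about the 6th and 7th values can be checked against exact Python integers. This is a minimal sketch that assumes pandas 2.0 or later; the name ``result`` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"key": ["b"] * 7, "value": 625})
result = df.groupby("key")["value"].cumprod()

# The result stays integer-typed instead of being cast to float
print(result.dtype)

# 625 ** 6 still fits in int64, so the 6th entry is exact
print(int(result.iloc[5]) == 625 ** 6)

# 625 ** 7 exceeds the int64 range, so the 7th entry has wrapped around
print(int(result.iloc[6]) == 625 ** 7)
```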
:meth:`.DataFrameGroupBy.nth` and :meth:`.SeriesGroupBy.nth` now behave as filtrations
In previous versions of pandas, :meth:`.DataFrameGroupBy.nth` and
:meth:`.SeriesGroupBy.nth` acted as if they were aggregations. However, for most
inputs ``n``, they may return either zero or multiple rows per group. This means
that they are filtrations, similar to e.g. :meth:`.DataFrameGroupBy.head`. pandas
now treats them as filtrations (:issue:`13666`).

.. ipython:: python

    df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [np.nan, 2.0, 3.0, 4.0, 5.0]})
    gb = df.groupby("a")

Old Behavior
In [5]: gb.nth(n=1)
Out[5]:
A B
1 1 2.0
4 2 5.0
New Behavior

.. ipython:: python

    gb.nth(n=1)

In particular, the index of the result is derived from the input by selecting
the appropriate rows. Also, when ``n`` is larger than the group, no rows are
returned instead of ``NaN``.
Old Behavior
In [5]: gb.nth(n=3, dropna="any")
Out[5]:
B
A
1 NaN
2 NaN
New Behavior

.. ipython:: python

    gb.nth(n=3, dropna="any")

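
The filtration semantics can be seen by inspecting the index of the result. The sketch below reuses the frame from this section and assumes pandas 2.0 or later; the names ``second`` and ``too_far`` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [np.nan, 2.0, 3.0, 4.0, 5.0]})
gb = df.groupby("a")

# The second row of each group, still labelled by its original row label
second = gb.nth(1)
print(list(second.index))

# No group has a fourth non-null row, so nothing is selected
too_far = gb.nth(3, dropna="any")
print(len(too_far))
```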
In past versions, slicing a :class:`Series` with integers (e.g. ``ser[1:2]`` or ``ser[2:]``) was *usually*
positional but not always. Starting in pandas 2.0, standard integer slices are always
treated as being positional in :meth:`Series.__getitem__` and :meth:`Series.__setitem__`.
Importantly, this means the deprecation in (:issue:`45324`) is reverted.
Previous behavior:
In [5]: ser = pd.Series(range(10), index=[x / 2 for x in range(10)])
In [6]: ser[1:3]
Out[6]:
1.0 2
1.5 3
2.0 4
2.5 5
3.0 6
dtype: int64
New behavior:

.. ipython:: python

    ser = pd.Series(range(10), index=[x / 2 for x in range(10)])
    ser[1:3]

To treat slice keys as labels, explicitly use ``loc``, e.g. ``ser.loc[1:3]`` (:issue:`49612`).
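
Because the meaning of a plain ``ser[1:3]`` has differed between pandas versions, spelling out the intent with ``.iloc`` (positions) or ``.loc`` (labels) is the unambiguous option. A minimal sketch using the same Series as above; the names ``by_position`` and ``by_label`` are illustrative:

```python
import pandas as pd

ser = pd.Series(range(10), index=[x / 2 for x in range(10)])

# Positional: the elements at positions 1 and 2
by_position = ser.iloc[1:3]
print(by_position.tolist())

# Label-based: every label from 1.0 through 3.0, endpoints included
by_label = ser.loc[1:3]
print(by_label.tolist())
```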
In past versions, when constructing a :class:`Series` or :class:`DataFrame` and passing a "datetime64" or "timedelta64" dtype with unsupported resolution (i.e. anything other than "ns"), pandas would silently replace the given dtype with its nanosecond analogue:
Previous behavior:
In [5]: pd.Series(["2016-01-01"], dtype="datetime64[s]")
Out[5]:
0 2016-01-01
dtype: datetime64[ns]
In [6]: pd.Series(["2016-01-01"], dtype="datetime64[D]")
Out[6]:
0 2016-01-01
dtype: datetime64[ns]
In pandas 2.0 we support resolutions "s", "ms", "us", and "ns". When passing a supported dtype (e.g. "datetime64[s]"), the result now has exactly the requested dtype:
New behavior:

.. ipython:: python

    pd.Series(["2016-01-01"], dtype="datetime64[s]")

With an un-supported dtype, pandas now raises instead of silently swapping in a supported dtype:
New behavior:

.. ipython:: python
    :okexcept:

    pd.Series(["2016-01-01"], dtype="datetime64[D]")

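
Code that has to cope with both outcomes can guard the construction. The sketch below assumes pandas 2.0 or later; the exact exception class is not stated in these notes, so the ``except`` clause catching ``TypeError`` and ``ValueError`` is an assumption.

```python
import pandas as pd

# A supported resolution is kept exactly as requested
ser = pd.Series(["2016-01-01"], dtype="datetime64[s]")
print(ser.dtype)

# An unsupported resolution is expected to raise rather than fall back silently.
# Assumption: the error is a TypeError or ValueError.
try:
    pd.Series(["2016-01-01"], dtype="datetime64[D]")
except (TypeError, ValueError) as err:
    outcome = f"raised {type(err).__name__}"
else:
    outcome = "did not raise"
print(outcome)
```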
In previous versions, converting a :class:`Series` or :class:`DataFrame`
from ``datetime64[ns]`` to a different ``datetime64[X]`` dtype would return
with ``datetime64[ns]`` dtype instead of the requested dtype. In pandas 2.0,
support is added for "datetime64[s]", "datetime64[ms]", and "datetime64[us]" dtypes,
so converting to those dtypes gives exactly the requested dtype:

.. ipython:: python

    idx = pd.date_range("2016-01-01", periods=3)
    ser = pd.Series(idx)

Previous behavior:
In [4]: ser.astype("datetime64[s]")
Out[4]:
0 2016-01-01
1 2016-01-02
2 2016-01-03
dtype: datetime64[ns]
With the new behavior, we get exactly the requested dtype:
New behavior:

.. ipython:: python

    ser.astype("datetime64[s]")

For non-supported resolutions e.g. "datetime64[D]", we raise instead of silently ignoring the requested dtype:
New behavior:

.. ipython:: python
    :okexcept:

    ser.astype("datetime64[D]")

For conversion from ``timedelta64[ns]`` dtypes, the old behavior converted
to a floating point format.

.. ipython:: python

    idx = pd.timedelta_range("1 Day", periods=3)
    ser = pd.Series(idx)

Previous behavior:
In [7]: ser.astype("timedelta64[s]")
Out[7]:
0 86400.0
1 172800.0
2 259200.0
dtype: float64
In [8]: ser.astype("timedelta64[D]")
Out[8]:
0 1.0
1 2.0
2 3.0
dtype: float64
The new behavior, as for datetime64, either gives exactly the requested dtype or raises:
New behavior:

.. ipython:: python
    :okexcept:

    ser.astype("timedelta64[s]")
    ser.astype("timedelta64[D]")

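
Code that relied on the old float results can compute them explicitly instead of going through ``astype``. The following is one possible sketch, assuming pandas 2.0 or later; ``tds``, ``seconds``, and ``days`` are illustrative names, and this replacement is a suggestion rather than something prescribed by these notes.

```python
import pandas as pd

tds = pd.Series(pd.timedelta_range("1 Day", periods=3))

# astype now changes only the resolution; the result is still timedelta-typed
as_seconds = tds.astype("timedelta64[s]")
print(as_seconds.dtype)

# Float seconds, matching what the old astype("timedelta64[s]") returned
seconds = tds.dt.total_seconds()
print(seconds.tolist())

# Float days, matching what the old astype("timedelta64[D]") returned
days = tds / pd.Timedelta(days=1)
print(days.tolist())
```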
Before, constructing an empty (where ``data`` is ``None`` or an empty list-like argument) :class:`Series` or :class:`DataFrame` without
specifying the axes (``index=None``, ``columns=None``) would return the axes as empty :class:`Index` with object dtype.
Now, the axes return an empty :class:`RangeIndex`.
Previous behavior:
In [8]: pd.Series().index
Out[8]:
Index([], dtype='object')
In [9]: pd.DataFrame().axes
Out[9]:
[Index([], dtype='object'), Index([], dtype='object')]
New behavior:

.. ipython:: python

    pd.Series().index
    pd.DataFrame().axes

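
A quick way to confirm the axis types on an installed pandas 2.0 or later is sketched below. The explicit ``dtype="float64"`` is only there to keep the empty Series construction unambiguous; it is not part of the change described above.

```python
import pandas as pd

empty_ser = pd.Series(dtype="float64")
empty_df = pd.DataFrame()

# All axes of the empty objects are RangeIndex instances of length 0
print(type(empty_ser.index).__name__, len(empty_ser.index))
print([type(ax).__name__ for ax in empty_df.axes])
```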
Some minimum supported versions of dependencies were updated. If installed, we now require:
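
+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
| mypy (dev)      | 0.990           |    X     |         |
+-----------------+-----------------+----------+---------+
| python-dateutil | 2.8.2           |    X     |    X    |
+-----------------+-----------------+----------+---------+
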
For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.
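
+-------------+-----------------+---------+
| Package     | Minimum Version | Changed |
+=============+=================+=========+
| pyarrow     | 6.0.0           |    X    |
+-------------+-----------------+---------+
| matplotlib  | 3.6.1           |    X    |
+-------------+-----------------+---------+
| fastparquet | 0.6.3           |    X    |
+-------------+-----------------+---------+
| xarray      | 0.21.0          |    X    |
+-------------+-----------------+---------+
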
See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.
- The ``freq``, ``tz``, ``nanosecond``, and ``unit`` keywords in the :class:`Timestamp` constructor are now keyword-only (:issue:`45307`)
- Passing ``nanoseconds`` greater than 999 or less than 0 in :class:`Timestamp` now raises a ``ValueError`` (:issue:`48538`, :issue:`48255`)
- :func:`read_csv`: specifying an incorrect number of columns with ``index_col`` now raises ``ParserError`` instead of ``IndexError`` when using the c parser.
- Default value of ``dtype`` in :func:`get_dummies` is changed to ``bool`` from ``uint8`` (:issue:`45848`)
- :meth:`DataFrame.astype`, :meth:`Series.astype`, and :meth:`DatetimeIndex.astype` casting datetime64 data to any of "datetime64[s]", "datetime64[ms]", "datetime64[us]" will return an object with the given resolution instead of coercing back to "datetime64[ns]" (:issue:`48928`)
- :meth:`DataFrame.astype`, :meth:`Series.astype`, and :meth:`DatetimeIndex.astype` casting timedelta64 data to any of "timedelta64[s]", "timedelta64[ms]", "timedelta64[us]" will return an object with the given resolution instead of coercing to "float64" dtype (:issue:`48963`)
- :meth:`Index.astype` now allows casting from ``float64`` dtype to datetime-like dtypes, matching :class:`Series` behavior (:issue:`49660`)
- Passing data with dtype of "timedelta64[s]", "timedelta64[ms]", or "timedelta64[us]" to :class:`TimedeltaIndex`, :class:`Series`, or :class:`DataFrame` constructors will now retain that dtype instead of casting to "timedelta64[ns]"; timedelta64 data with lower resolution will be cast to the lowest supported resolution "timedelta64[s]" (:issue:`49014`)
- Passing ``dtype`` of "timedelta64[s]", "timedelta64[ms]", or "timedelta64[us]" to :class:`TimedeltaIndex`, :class:`Series`, or :class:`DataFrame` constructors will now retain that dtype instead of casting to "timedelta64[ns]"; passing a dtype with lower resolution for :class:`Series` or :class:`DataFrame` will be cast to the lowest supported resolution "timedelta64[s]" (:issue:`49014`)
- Passing a ``np.datetime64`` object with non-nanosecond resolution to :class:`Timestamp` will retain the input resolution if it is "s", "ms", or "ns"; otherwise it will be cast to the closest supported resolution (:issue:`49008`)
- The ``other`` argument in :meth:`DataFrame.mask` and :meth:`Series.mask` now defaults to ``no_default`` instead of ``np.nan``, consistent with :meth:`DataFrame.where` and :meth:`Series.where`. Entries will be filled with the corresponding NULL value (``np.nan`` for numpy dtypes, ``pd.NA`` for extension dtypes). (:issue:`49111`)
- Changed behavior of :meth:`Series.quantile` and :meth:`DataFrame.quantile` with :class:`SparseDtype` to retain sparse dtype (:issue:`49583`)
- When creating a :class:`Series` with an object-dtype :class:`Index` of datetime objects, pandas no longer silently converts the index to a :class:`DatetimeIndex` (:issue:`39307`, :issue:`23598`)
- :meth:`Series.unique` with dtype "timedelta64[ns]" or "datetime64[ns]" now returns :class:`TimedeltaArray` or :class:`DatetimeArray` instead of ``numpy.ndarray`` (:issue:`49176`)
- :func:`to_datetime` and :class:`DatetimeIndex` now allow sequences containing both ``datetime`` objects and numeric entries, matching :class:`Series` behavior (:issue:`49037`)
- :func:`pandas.api.dtypes.is_string_dtype` now only returns ``True`` for array-likes with ``dtype=object`` when the elements are inferred to be strings (:issue:`15585`)
- Passing a sequence containing ``datetime`` objects and ``date`` objects to :class:`Series` constructor will return with ``object`` dtype instead of ``datetime64[ns]`` dtype, consistent with :class:`Index` behavior (:issue:`49341`)
- Passing strings that cannot be parsed as datetimes to :class:`Series` or :class:`DataFrame` with ``dtype="datetime64[ns]"`` will raise instead of silently ignoring the keyword and returning ``object`` dtype (:issue:`24435`)
- Passing a sequence containing a type that cannot be converted to :class:`Timedelta` to :func:`to_timedelta` or to the :class:`Series` or :class:`DataFrame` constructor with ``dtype="timedelta64[ns]"`` or to :class:`TimedeltaIndex` now raises ``TypeError`` instead of ``ValueError`` (:issue:`49525`)
- Changed behavior of :class:`Index` constructor with sequence containing at least one ``NaT`` and everything else either ``None`` or ``NaN`` to infer ``datetime64[ns]`` dtype instead of ``object``, matching :class:`Series` behavior (:issue:`49340`)
- :func:`read_stata` with parameter ``index_col`` set to ``None`` (the default) will now set the index on the returned :class:`DataFrame` to a :class:`RangeIndex` instead of a :class:`Int64Index` (:issue:`49745`)
- Changed behavior of :class:`Index`, :class:`Series`, and :class:`DataFrame` arithmetic methods when working with object-dtypes: the results no longer do type inference on the result of the array operations; use ``result.infer_objects()`` to do type inference on the result (:issue:`49999`)
- Changed behavior of :class:`Index` constructor with an object-dtype ``numpy.ndarray`` containing all-``bool`` values or all-complex values: this will now retain object dtype, consistent with the :class:`Series` behavior (:issue:`49594`)
- Changed behavior of :meth:`DataFrame.shift` with ``axis=1``, an integer ``fill_value``, and homogeneous datetime-like dtype: this now fills new columns with integer dtypes instead of casting to datetimelike (:issue:`49842`)
- Files are now closed when encountering an exception in :func:`read_json` (:issue:`49921`)
- Changed behavior of :func:`read_csv`, :func:`read_json` & :func:`read_fwf`, where the index will now always be a :class:`RangeIndex`, when no index is specified. Previously the index would be a :class:`Index` with dtype ``object`` if the new DataFrame/Series has length 0 (:issue:`49572`)
- :meth:`DataFrame.values`, :meth:`DataFrame.to_numpy`, :meth:`DataFrame.xs`, :meth:`DataFrame.reindex`, :meth:`DataFrame.fillna`, and :meth:`DataFrame.replace` no longer silently consolidate the underlying arrays; do ``df = df.copy()`` to ensure consolidation (:issue:`49356`)
- Creating a new DataFrame using a full slice on both axes with :attr:`~DataFrame.loc`
  or :attr:`~DataFrame.iloc` (thus, ``df.loc[:, :]`` or ``df.iloc[:, :]``) now returns a
  new DataFrame (shallow copy) instead of the original DataFrame, consistent with other
  methods to get a full slice (for example ``df.loc[:]`` or ``df[:]``) (:issue:`49469`)
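
Two of the changes in the list above are easy to check directly: the new ``bool`` default of :func:`get_dummies` and the full-slice copy. This is a minimal sketch with made-up data, assuming pandas 2.0 or later. Passing ``dtype="uint8"`` explicitly is one way to keep the previous output dtype.

```python
import pandas as pd

# Illustrative data (not from the release notes)
labels = pd.Series(["a", "b", "a"])

# New default: indicator columns are bool
dummies = pd.get_dummies(labels)
print(dummies.dtypes.astype(str).tolist())

# Requesting the old dtype explicitly
as_uint8 = pd.get_dummies(labels, dtype="uint8")
print(as_uint8.dtypes.astype(str).tolist())

# A full slice on both axes returns a new object, not the original DataFrame
df = pd.DataFrame({"x": [1, 2]})
full = df.loc[:, :]
print(full is df)
```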
- Removed deprecated :attr:`Timestamp.freq`, :attr:`Timestamp.freqstr` and argument ``freq`` from the :class:`Timestamp` constructor and :meth:`Timestamp.fromordinal` (:issue:`14146`)
- Removed deprecated :class:`CategoricalBlock`, :meth:`Block.is_categorical`, require datetime64 and timedelta64 values to be wrapped in :class:`DatetimeArray` or :class:`TimedeltaArray` before passing to :meth:`Block.make_block_same_class`, require ``DatetimeTZBlock.values`` to have the correct ndim when passing to the :class:`BlockManager` constructor, and removed the "fastpath" keyword from the :class:`SingleBlockManager` constructor (:issue:`40226`, :issue:`40571`)
- Removed deprecated global option ``use_inf_as_null`` in favor of ``use_inf_as_na`` (:issue:`17126`)
- Removed deprecated module ``pandas.core.index`` (:issue:`30193`)
- Removed deprecated alias ``pandas.core.tools.datetimes.to_time``, import the function directly from ``pandas.core.tools.times`` instead (:issue:`34145`)
- Removed deprecated :meth:`Categorical.to_dense`, use ``np.asarray(cat)`` instead (:issue:`32639`)
- Removed deprecated :meth:`Categorical.take_nd` (:issue:`27745`)
- Removed deprecated :meth:`Categorical.mode`, use ``Series(cat).mode()`` instead (:issue:`45033`)
- Removed deprecated :meth:`Categorical.is_dtype_equal` and :meth:`CategoricalIndex.is_dtype_equal` (:issue:`37545`)
- Removed deprecated :meth:`CategoricalIndex.take_nd` (:issue:`30702`)
- Removed deprecated :meth:`Index.is_type_compatible` (:issue:`42113`)
- Removed deprecated :meth:`Index.is_mixed`, check ``index.inferred_type`` directly instead (:issue:`32922`)
- Removed deprecated :func:`pandas.api.types.is_categorical`; use :func:`pandas.api.types.is_categorical_dtype` instead (:issue:`33385`)
- Removed deprecated :meth:`Index.asi8` (:issue:`37877`)
- Enforced deprecation changing behavior when passing ``datetime64[ns]`` dtype data and timezone-aware dtype to :class:`Series`, interpreting the values as wall-times instead of UTC times, matching :class:`DatetimeIndex` behavior (:issue:`41662`)
- Removed deprecated :meth:`DataFrame._AXIS_NUMBERS`, :meth:`DataFrame._AXIS_NAMES`, :meth:`Series._AXIS_NUMBERS`, :meth:`Series._AXIS_NAMES` (:issue:`33637`)
- Removed deprecated :meth:`Index.to_native_types`, use ``obj.astype(str)`` instead (:issue:`36418`)
- Removed deprecated :meth:`Series.iteritems`, :meth:`DataFrame.iteritems`, use ``obj.items`` instead (:issue:`45321`)
- Removed deprecated :meth:`DataFrame.lookup` (:issue:`35224`)
- Removed deprecated :meth:`Series.append`, :meth:`DataFrame.append`, use :func:`concat` instead (:issue:`35407`)
- Removed deprecated :meth:`Series.iteritems`, :meth:`DataFrame.iteritems` and :meth:`HDFStore.iteritems`, use ``obj.items`` instead (:issue:`45321`)
- Removed deprecated :meth:`DatetimeIndex.union_many` (:issue:`45018`)
- Removed deprecated ``weekofyear`` and ``week`` attributes of :class:`DatetimeArray`, :class:`DatetimeIndex` and ``dt`` accessor in favor of ``isocalendar().week`` (:issue:`33595`)
- Removed deprecated :meth:`RangeIndex._start`, :meth:`RangeIndex._stop`, :meth:`RangeIndex._step`, use ``start``, ``stop``, ``step`` instead (:issue:`30482`)
- Removed deprecated :meth:`DatetimeIndex.to_perioddelta`, use ``dtindex - dtindex.to_period(freq).to_timestamp()`` instead (:issue:`34853`)
- Removed deprecated :meth:`.Styler.hide_index` and :meth:`.Styler.hide_columns` (:issue:`49397`)
- Removed deprecated :meth:`.Styler.set_na_rep` and :meth:`.Styler.set_precision` (:issue:`49397`)
- Removed deprecated :meth:`.Styler.where` (:issue:`49397`)
- Removed deprecated :meth:`.Styler.render` (:issue:`49397`)
- Removed deprecated argument ``null_color`` in :meth:`.Styler.highlight_null` (:issue:`49397`)
- Removed deprecated argument ``check_less_precise`` in :meth:`.testing.assert_frame_equal`, :meth:`.testing.assert_extension_array_equal`, :meth:`.testing.assert_series_equal`, :meth:`.testing.assert_index_equal` (:issue:`30562`)
- Removed deprecated ``null_counts`` argument in :meth:`DataFrame.info`. Use ``show_counts`` instead (:issue:`37999`)
- Removed deprecated :meth:`Index.is_monotonic`, and :meth:`Series.is_monotonic`; use ``obj.is_monotonic_increasing`` instead (:issue:`45422`)
- Removed deprecated :meth:`Index.is_all_dates` (:issue:`36697`)
- Enforced deprecation disallowing passing a timezone-aware :class:`Timestamp` and ``dtype="datetime64[ns]"`` to :class:`Series` or :class:`DataFrame` constructors (:issue:`41555`)
- Enforced deprecation disallowing passing a sequence of timezone-aware values and ``dtype="datetime64[ns]"`` to :class:`Series` or :class:`DataFrame` constructors (:issue:`41555`)
- Enforced deprecation disallowing ``numpy.ma.mrecords.MaskedRecords`` in the :class:`DataFrame` constructor; pass ``{name: data[name] for name in data.dtype.names}`` instead (:issue:`40363`)
- Enforced deprecation disallowing unit-less "datetime64" dtype in :meth:`Series.astype` and :meth:`DataFrame.astype` (:issue:`47844`)
- Enforced deprecation disallowing using ``.astype`` to convert a ``datetime64[ns]`` :class:`Series`, :class:`DataFrame`, or :class:`DatetimeIndex` to timezone-aware dtype, use ``obj.tz_localize`` or ``ser.dt.tz_localize`` instead (:issue:`39258`)
- Enforced deprecation disallowing using ``.astype`` to convert a timezone-aware :class:`Series`, :class:`DataFrame`, or :class:`DatetimeIndex` to timezone-naive ``datetime64[ns]`` dtype, use ``obj.tz_localize(None)`` or ``obj.tz_convert("UTC").tz_localize(None)`` instead (:issue:`39258`)
- Enforced deprecation disallowing passing non boolean argument to sort in :func:`concat` (:issue:`44629`)
- Removed Date parser functions :func:`~pandas.io.date_converters.parse_date_time`, :func:`~pandas.io.date_converters.parse_date_fields`, :func:`~pandas.io.date_converters.parse_all_fields` and :func:`~pandas.io.date_converters.generic_parser` (:issue:`24518`)
- Removed argument ``index`` from the :class:`core.arrays.SparseArray` constructor (:issue:`43523`)
- Remove argument ``squeeze`` from :meth:`DataFrame.groupby` and :meth:`Series.groupby` (:issue:`32380`)
- Removed deprecated ``apply``, ``apply_index``, ``__call__``, ``onOffset``, and ``isAnchored`` attributes from :class:`DateOffset` (:issue:`34171`)
- Removed ``keep_tz`` argument in :meth:`DatetimeIndex.to_series` (:issue:`29731`)
- Remove arguments ``names`` and ``dtype`` from :meth:`Index.copy` and ``levels`` and ``codes`` from :meth:`MultiIndex.copy` (:issue:`35853`, :issue:`36685`)
- Remove argument ``inplace`` from :meth:`MultiIndex.set_levels` and :meth:`MultiIndex.set_codes` (:issue:`35626`)
- Removed arguments ``verbose`` and ``encoding`` from :meth:`DataFrame.to_excel` and :meth:`Series.to_excel` (:issue:`47912`)
- Removed argument ``line_terminator`` from :meth:`DataFrame.to_csv` and :meth:`Series.to_csv`, use ``lineterminator`` instead (:issue:`45302`)
- Removed argument ``inplace`` from :meth:`DataFrame.set_axis` and :meth:`Series.set_axis`, use ``obj = obj.set_axis(..., copy=False)`` instead (:issue:`48130`)
- Disallow passing positional arguments to :meth:`MultiIndex.set_levels` and :meth:`MultiIndex.set_codes` (:issue:`41485`)
- Disallow parsing to Timedelta strings with components with units "Y", "y", or "M", as these do not represent unambiguous durations (:issue:`36838`)
- Removed :meth:`MultiIndex.is_lexsorted` and :meth:`MultiIndex.lexsort_depth` (:issue:`38701`)
- Removed argument ``how`` from :meth:`PeriodIndex.astype`, use :meth:`PeriodIndex.to_timestamp` instead (:issue:`37982`)
- Removed argument ``try_cast`` from :meth:`DataFrame.mask`, :meth:`DataFrame.where`, :meth:`Series.mask` and :meth:`Series.where` (:issue:`38836`)
- Removed argument ``tz`` from :meth:`Period.to_timestamp`, use ``obj.to_timestamp(...).tz_localize(tz)`` instead (:issue:`34522`)
- Removed argument ``sort_columns`` in :meth:`DataFrame.plot` and :meth:`Series.plot` (:issue:`47563`)
- Removed argument ``is_copy`` from :meth:`DataFrame.take` and :meth:`Series.take` (:issue:`30615`)
- Removed argument ``kind`` from :meth:`Index.get_slice_bound`, :meth:`Index.slice_indexer` and :meth:`Index.slice_locs` (:issue:`41378`)
- Removed arguments ``prefix``, ``squeeze``, ``error_bad_lines`` and ``warn_bad_lines`` from :func:`read_csv` (:issue:`40413`, :issue:`43427`)
- Removed argument ``datetime_is_numeric`` from :meth:`DataFrame.describe` and :meth:`Series.describe` as datetime data will always be summarized as numeric data (:issue:`34798`)
- Disallow passing list ``key`` to :meth:`Series.xs` and :meth:`DataFrame.xs`, pass a tuple instead (:issue:`41789`)
- Disallow subclass-specific keywords (e.g. "freq", "tz", "names", "closed") in the :class:`Index` constructor (:issue:`38597`)
- Removed argument ``inplace`` from :meth:`Categorical.remove_unused_categories` (:issue:`37918`)
- Disallow passing non-round floats to :class:`Timestamp` with ``unit="M"`` or ``unit="Y"`` (:issue:`47266`)
- Remove keywords ``convert_float`` and ``mangle_dupe_cols`` from :func:`read_excel` (:issue:`41176`)
- Remove keyword ``mangle_dupe_cols`` from :func:`read_csv` and :func:`read_table` (:issue:`48137`)
- Removed ``errors`` keyword from :meth:`DataFrame.where`, :meth:`Series.where`, :meth:`DataFrame.mask` and :meth:`Series.mask` (:issue:`47728`)
- Disallow passing non-keyword arguments to :func:`read_excel` except ``io`` and ``sheet_name`` (:issue:`34418`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.drop` and :meth:`Series.drop` except ``labels`` (:issue:`41486`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.fillna` and :meth:`Series.fillna` except ``value`` (:issue:`41485`)
- Disallow passing non-keyword arguments to :meth:`StringMethods.split` and :meth:`StringMethods.rsplit` except for ``pat`` (:issue:`47448`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.set_index` except ``keys`` (:issue:`41495`)
- Disallow passing non-keyword arguments to :meth:`Resampler.interpolate` except ``method`` (:issue:`41699`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.reset_index` and :meth:`Series.reset_index` except ``level`` (:issue:`41496`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.dropna` and :meth:`Series.dropna` (:issue:`41504`)
- Disallow passing non-keyword arguments to :meth:`ExtensionArray.argsort` (:issue:`46134`)
- Disallow passing non-keyword arguments to :meth:`Categorical.sort_values` (:issue:`47618`)
- Disallow passing non-keyword arguments to :meth:`Index.drop_duplicates` and :meth:`Series.drop_duplicates` (:issue:`41485`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.drop_duplicates` except for ``subset`` (:issue:`41485`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.sort_index` and :meth:`Series.sort_index` (:issue:`41506`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.interpolate` and :meth:`Series.interpolate` except for ``method`` (:issue:`41510`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.any` and :meth:`Series.any` (:issue:`44896`)
- Disallow passing non-keyword arguments to :meth:`Index.set_names` except for ``names`` (:issue:`41551`)
- Disallow passing non-keyword arguments to :meth:`Index.join` except for ``other`` (:issue:`46518`)
- Disallow passing non-keyword arguments to :func:`concat` except for ``objs`` (:issue:`41485`)
- Disallow passing non-keyword arguments to :func:`pivot` except for ``data`` (:issue:`48301`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.pivot` (:issue:`48301`)
- Disallow passing non-keyword arguments to :func:`read_html` except for ``io`` (:issue:`27573`)
- Disallow passing non-keyword arguments to :func:`read_json` except for ``path_or_buf`` (:issue:`27573`)
- Disallow passing non-keyword arguments to :func:`read_sas` except for ``filepath_or_buffer`` (:issue:`47154`)
- Disallow passing non-keyword arguments to :func:`read_stata` except for ``filepath_or_buffer`` (:issue:`48128`)
- Disallow passing non-keyword arguments to :func:`read_csv` except ``filepath_or_buffer`` (:issue:`41485`)
- Disallow passing non-keyword arguments to :func:`read_table` except ``filepath_or_buffer`` (:issue:`41485`)
- Disallow passing non-keyword arguments to :func:`read_fwf` except ``filepath_or_buffer`` (:issue:`44710`)
- Disallow passing non-keyword arguments to :func:`read_xml` except for ``path_or_buffer`` (:issue:`45133`)
- Disallow passing non-keyword arguments to :meth:`Series.mask` and :meth:`DataFrame.mask` except ``cond`` and ``other`` (:issue:`41580`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.to_stata` except for ``path`` (:issue:`48128`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.where` and :meth:`Series.where` except for ``cond`` and ``other`` (:issue:`41523`)
- Disallow passing non-keyword arguments to :meth:`Series.set_axis` and :meth:`DataFrame.set_axis` except for ``labels`` (:issue:`41491`)
- Disallow passing non-keyword arguments to :meth:`Series.rename_axis` and :meth:`DataFrame.rename_axis` except for ``mapper`` (:issue:`47587`)
- Disallow passing non-keyword arguments to :meth:`Series.clip` and :meth:`DataFrame.clip` (:issue:`41511`)
- Disallow passing non-keyword arguments to :meth:`Series.bfill`, :meth:`Series.ffill`, :meth:`DataFrame.bfill` and :meth:`DataFrame.ffill` (:issue:`41508`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.replace`, :meth:`Series.replace` except for ``to_replace`` and ``value`` (:issue:`47587`)
- Disallow passing non-keyword arguments to :meth:`DataFrame.sort_values` except for ``by`` (:issue:`41505`)
- Disallow passing non-keyword arguments to :meth:`Series.sort_values` (:issue:`41505`)
- Disallow :meth:`Index.reindex` with non-unique :class:`Index` objects (:issue:`42568`)
- Disallowed constructing :class:`Categorical` with scalar ``data`` (:issue:`38433`)
- Disallowed constructing :class:`CategoricalIndex` without passing ``data`` (:issue:`38944`)
- Removed :meth:`.Rolling.validate`, :meth:`.Expanding.validate`, and :meth:`.ExponentialMovingWindow.validate` (:issue:`43665`)
- Removed :attr:`Rolling.win_type` returning ``"freq"`` (:issue:`38963`)
- Removed :attr:`Rolling.is_datetimelike` (:issue:`38963`)
- Removed the ``level`` keyword in :class:`DataFrame` and :class:`Series` aggregations; use ``groupby`` instead (:issue:`39983`)
- Removed deprecated :meth:`Timedelta.delta`, :meth:`Timedelta.is_populated`, and :attr:`Timedelta.freq` (:issue:`46430`, :issue:`46476`)
- Removed deprecated :attr:`NaT.freq` (:issue:`45071`)
- Removed deprecated :meth:`Categorical.replace`, use :meth:`Series.replace` instead (:issue:`44929`)
- Removed the ``numeric_only`` keyword from :meth:`Categorical.min` and :meth:`Categorical.max` in favor of ``skipna`` (:issue:`48821`)
- Changed behavior of :meth:`DataFrame.median` and :meth:`DataFrame.mean` with ``numeric_only=None`` to not exclude datetime-like columns THIS NOTE WILL BE IRRELEVANT ONCE ``numeric_only=None`` DEPRECATION IS ENFORCED (:issue:`29941`)
- Removed :func:`is_extension_type` in favor of :func:`is_extension_array_dtype` (:issue:`29457`)
- Removed ``.ExponentialMovingWindow.vol`` (:issue:`39220`)
- Removed :meth:`Index.get_value` and :meth:`Index.set_value` (:issue:`33907`, :issue:`28621`)
- Removed :meth:`Series.slice_shift` and :meth:`DataFrame.slice_shift` (:issue:`37601`)
- Remove :meth:`DataFrameGroupBy.pad` and :meth:`DataFrameGroupBy.backfill` (:issue:`45076`)
- Remove ``numpy`` argument from :func:`read_json` (:issue:`30636`)
- Disallow passing abbreviations for ``orient`` in :meth:`DataFrame.to_dict` (:issue:`32516`)
- Disallow partial slicing on a non-monotonic :class:`DatetimeIndex` with keys which are not in Index. This now raises a ``KeyError`` (:issue:`18531`)
- Removed ``get_offset`` in favor of :func:`to_offset` (:issue:`30340`)
- Removed the ``warn`` keyword in :func:`infer_freq` (:issue:`45947`)
- Removed the ``include_start`` and ``include_end`` arguments in :meth:`DataFrame.between_time` in favor of ``inclusive`` (:issue:`43248`)
- Removed the ``closed`` argument in :meth:`date_range` and :meth:`bdate_range` in favor of ``inclusive`` argument (:issue:`40245`)
- Removed the ``center`` keyword in :meth:`DataFrame.expanding` (:issue:`20647`)
- Removed the ``truediv`` keyword from :func:`eval` (:issue:`29812`)
- Removed the ``method`` and ``tolerance`` arguments in :meth:`Index.get_loc`. Use ``index.get_indexer([label], method=..., tolerance=...)`` instead (:issue:`42269`)
- Removed the ``pandas.datetime`` submodule (:issue:`30489`)
- Removed the ``pandas.np`` submodule (:issue:`30296`)
- Removed ``pandas.util.testing`` in favor of ``pandas.testing`` (:issue:`30745`)
- Removed :meth:`Series.str.__iter__` (:issue:`28277`)
- Removed ``pandas.SparseArray`` in favor of :class:`arrays.SparseArray` (:issue:`30642`)
- Removed ``pandas.SparseSeries`` and ``pandas.SparseDataFrame``, including pickle support. (:issue:`30642`)
- Enforced disallowing passing an integer ``fill_value`` to :meth:`DataFrame.shift` and :meth:`Series.shift` with datetime64, timedelta64, or period dtypes (:issue:`32591`)
- Enforced disallowing a string column label into ``times`` in :meth:`DataFrame.ewm` (:issue:`43265`)
- Enforced disallowing passing ``True`` and ``False`` into ``inclusive`` in :meth:`Series.between` in favor of ``"both"`` and ``"neither"`` respectively (:issue:`40628`)
- Enforced disallowing using ``usecols`` with out of bounds indices for ``read_csv`` with ``engine="c"`` (:issue:`25623`)
- Enforced disallowing the use of ``**kwargs`` in :class:`.ExcelWriter`; use the keyword argument ``engine_kwargs`` instead (:issue:`40430`)
- Enforced disallowing a tuple of column labels into :meth:`.DataFrameGroupBy.__getitem__` (:issue:`30546`)
- Enforced disallowing missing labels when indexing with a sequence of labels on a level of a :class:`MultiIndex`. This now raises a ``KeyError`` (:issue:`42351`)
- Enforced disallowing setting values with ``.loc`` using a positional slice. Use ``.loc`` with labels or ``.iloc`` with positions instead (:issue:`31840`)
- Enforced disallowing positional indexing with a ``float`` key even if that key is a round number; manually cast to integer instead (:issue:`34193`)
- Enforced disallowing using a :class:`DataFrame` indexer with ``.iloc``; use ``.loc`` instead for automatic alignment (:issue:`39022`)
- Enforced disallowing ``set`` or ``dict`` indexers in ``__getitem__`` and ``__setitem__`` methods (:issue:`42825`)
- Enforced disallowing indexing on a :class:`Index` or positional indexing on a :class:`Series` producing multi-dimensional objects e.g. ``obj[:, None]``; convert to numpy before indexing instead (:issue:`35141`)
- Enforced disallowing ``dict`` or ``set`` objects in ``suffixes`` in :func:`merge` (:issue:`34810`)
- Enforced disallowing :func:`merge` to produce duplicated columns through the ``suffixes`` keyword and already existing columns (:issue:`22818`)
- Enforced disallowing using :func:`merge` or :func:`join` on a different number of levels (:issue:`34862`)
- Enforced disallowing ``value_name`` argument in :func:`DataFrame.melt` to match an element in the :class:`DataFrame` columns (:issue:`35003`)
- Enforced disallowing passing ``showindex`` into ``**kwargs`` in :func:`DataFrame.to_markdown` and :func:`Series.to_markdown` in favor of ``index`` (:issue:`33091`)
- Removed setting ``Categorical._codes`` directly (:issue:`41429`)
- Removed setting ``Categorical.categories`` directly (:issue:`47834`)
- Removed argument ``inplace`` from :meth:`Categorical.add_categories`, :meth:`Categorical.remove_categories`, :meth:`Categorical.set_categories`, :meth:`Categorical.rename_categories`, :meth:`Categorical.reorder_categories`, :meth:`Categorical.set_ordered`, :meth:`Categorical.as_ordered`, :meth:`Categorical.as_unordered` (:issue:`37981`, :issue:`41118`, :issue:`41133`, :issue:`47834`)
- Enforced :meth:`Rolling.count` with ``min_periods=None`` to default to the size of the window (:issue:`31302`)
- Renamed ``fname`` to ``path`` in :meth:`DataFrame.to_parquet`, :meth:`DataFrame.to_stata` and :meth:`DataFrame.to_feather` (:issue:`30338`)
- Enforced disallowing indexing a :class:`Series` with a single item list with a slice (e.g. ``ser[[slice(0, 2)]]``). Either convert the list to tuple, or pass the slice directly instead (:issue:`31333`)
- Changed behavior indexing on a :class:`DataFrame` with a :class:`DatetimeIndex` index using a string indexer, previously this operated as a slice on rows, now it operates like any other column key; use ``frame.loc[key]`` for the old behavior (:issue:`36179`)
- Enforced the ``display.max_colwidth`` option to not accept negative integers (:issue:`31569`)
- Removed the ``display.column_space`` option in favor of ``df.to_string(col_space=...)`` (:issue:`47280`)
- Removed the deprecated method ``mad`` from pandas classes (:issue:`11787`)
- Removed the deprecated method ``tshift`` from pandas classes (:issue:`11631`)
- Changed behavior of empty data passed into :class:`Series`; the default dtype will be ``object`` instead of ``float64`` (:issue:`29405`)
- Changed the behavior of :meth:`DatetimeIndex.union`, :meth:`DatetimeIndex.intersection`, and :meth:`DatetimeIndex.symmetric_difference` with mismatched timezones to convert to UTC instead of casting to object dtype (:issue:`39328`)
- Changed the behavior of :func:`to_datetime` with argument "now" with ``utc=False`` to match ``Timestamp("now")`` (:issue:`18705`)
- Changed the behavior of indexing on a timezone-aware :class:`DatetimeIndex` with a timezone-naive ``datetime`` object or vice-versa; these now behave like any other non-comparable type by raising ``KeyError`` (:issue:`36148`)
- Changed the behavior of :meth:`Index.reindex`, :meth:`Series.reindex`, and :meth:`DataFrame.reindex` with a ``datetime64`` dtype and a ``datetime.date`` object for ``fill_value``; these are no longer considered equivalent to ``datetime.datetime`` objects so the reindex casts to object dtype (:issue:`39767`)
- Changed behavior of :meth:`SparseArray.astype` when given a dtype that is not explicitly ``SparseDtype``, cast to the exact requested dtype rather than silently using a ``SparseDtype`` instead (:issue:`34457`)
- Changed behavior of :meth:`Index.ravel` to return a view on the original :class:`Index` instead of a ``np.ndarray`` (:issue:`36900`)
- Changed behavior of :meth:`Series.to_frame` and :meth:`Index.to_frame` with explicit ``name=None`` to use ``None`` for the column name instead of the index's name or default ``0`` (:issue:`45523`)
- Changed behavior of :func:`concat` with one array of ``bool``-dtype and another of integer dtype, this now returns ``object`` dtype instead of integer dtype; explicitly cast the bool object to integer before concatenating to get the old behavior (:issue:`45101`)
- Changed behavior of :class:`DataFrame` constructor given floating-point ``data`` and an integer ``dtype``, when the data cannot be cast losslessly, the floating point dtype is retained, matching :class:`Series` behavior (:issue:`41170`)
- Changed behavior of :class:`Index` constructor when given a ``np.ndarray`` with object-dtype containing numeric entries; this now retains object dtype rather than inferring a numeric dtype, consistent with :class:`Series` behavior (:issue:`42870`)
- Changed behavior of :meth:`Index.__and__`, :meth:`Index.__or__` and :meth:`Index.__xor__` to behave as logical operations (matching :class:`Series` behavior) instead of aliases for set operations (:issue:`37374`)
- Changed behavior of :class:`DataFrame` constructor when passed a list whose first element is a :class:`Categorical`, this now treats the elements as rows casting to ``object`` dtype, consistent with behavior for other types (:issue:`38845`)
- Changed behavior of :class:`DataFrame` constructor when passed a ``dtype`` (other than int) that the data cannot be cast to; it now raises instead of silently ignoring the dtype (:issue:`41733`)
- Changed the behavior of :class:`Series` constructor, it will no longer infer a datetime64 or timedelta64 dtype from string entries (:issue:`41731`)
- Changed behavior of :class:`Timestamp` constructor with a ``np.datetime64`` object and a ``tz`` passed to interpret the input as a wall-time as opposed to a UTC time (:issue:`42288`)
- Changed behavior of :meth:`Timestamp.utcfromtimestamp` to return a timezone-aware object satisfying ``Timestamp.utcfromtimestamp(val).timestamp() == val`` (:issue:`45083`)
- Changed behavior of :class:`Index` constructor when passed a ``SparseArray`` or ``SparseDtype`` to retain that dtype instead of casting to ``numpy.ndarray`` (:issue:`43930`)
- Changed behavior of setitem-like operations (``__setitem__``, ``fillna``, ``where``, ``mask``, ``replace``, ``insert``, fill_value for ``shift``) on an object with :class:`DatetimeTZDtype` when using a value with a non-matching timezone, the value will be cast to the object's timezone instead of casting both to object-dtype (:issue:`44243`)
- Changed behavior of :class:`Index`, :class:`Series`, :class:`DataFrame` constructors with floating-dtype data and a :class:`DatetimeTZDtype`, the data are now interpreted as UTC-times instead of wall-times, consistent with how integer-dtype data are treated (:issue:`45573`)
- Changed behavior of :class:`Series` and :class:`DataFrame` constructors with integer dtype and floating-point data containing ``NaN``, this now raises ``IntCastingNaNError`` (:issue:`40110`)
- Changed behavior of :class:`Series` and :class:`DataFrame` constructors with an integer ``dtype`` and values that are too large to losslessly cast to this dtype, this now raises ``ValueError`` (:issue:`41734`)
- Changed behavior of :class:`Series` and :class:`DataFrame` constructors with an integer ``dtype`` and values having either ``datetime64`` or ``timedelta64`` dtypes, this now raises ``TypeError``, use ``values.view("int64")`` instead (:issue:`41770`)
- Removed the deprecated ``base`` and ``loffset`` arguments from :meth:`pandas.DataFrame.resample`, :meth:`pandas.Series.resample` and :class:`pandas.Grouper`. Use ``offset`` or ``origin`` instead (:issue:`31809`)
- Changed behavior of :meth:`Series.fillna` and :meth:`DataFrame.fillna` with ``timedelta64[ns]`` dtype and an incompatible ``fill_value``; this now casts to ``object`` dtype instead of raising, consistent with the behavior with other dtypes (:issue:`45746`)
- Changed the default argument of ``regex`` for :meth:`Series.str.replace` from ``True`` to ``False``. Additionally, a single character ``pat`` with ``regex=True`` is now treated as a regular expression instead of a string literal. (:issue:`36695`, :issue:`24804`)
- Changed behavior of :meth:`DataFrame.any` and :meth:`DataFrame.all` with ``bool_only=True``; object-dtype columns with all-bool values will no longer be included, manually cast to ``bool`` dtype first (:issue:`46188`)
- Changed behavior of comparison of a :class:`Timestamp` with a ``datetime.date`` object; these now compare as un-equal and raise on inequality comparisons, matching the ``datetime.datetime`` behavior (:issue:`36131`)
- Changed behavior of comparison of ``NaT`` with a ``datetime.date`` object; these now raise on inequality comparisons (:issue:`39196`)
- Enforced deprecation of silently dropping columns that raised a ``TypeError`` in :meth:`Series.transform` and :meth:`DataFrame.transform` when used with a list or dictionary (:issue:`43740`)
- Changed behavior of :meth:`DataFrame.apply` with list-like so that any partial failure will raise an error (:issue:`43740`)
- Changed behavior of :meth:`Series.__setitem__` with an integer key and a :class:`Float64Index` when the key is not present in the index; previously we treated the key as positional (behaving like ``series.iloc[key] = val``), now we treat it as a label (behaving like ``series.loc[key] = val``), consistent with :meth:`Series.__getitem__` behavior (:issue:`33469`)
- Removed ``na_sentinel`` argument from :func:`factorize`, :meth:`.Index.factorize`, and :meth:`.ExtensionArray.factorize` (:issue:`47157`)
- Changed behavior of :meth:`Series.diff` and :meth:`DataFrame.diff` with :class:`ExtensionDtype` dtypes whose arrays do not implement ``diff``, these now raise ``TypeError`` rather than casting to numpy (:issue:`31025`)
- Enforced deprecation of calling numpy "ufunc"s on :class:`DataFrame` with ``method="outer"``; this now raises ``NotImplementedError`` (:issue:`36955`)
- Enforced deprecation disallowing passing ``numeric_only=True`` to :class:`Series` reductions (``rank``, ``any``, ``all``, ...) with non-numeric dtype (:issue:`47500`)
- Changed behavior of :meth:`DataFrameGroupBy.apply` and :meth:`SeriesGroupBy.apply` so that ``group_keys`` is respected even if a transformer is detected (:issue:`34998`)
- Comparisons between a :class:`DataFrame` and a :class:`Series` where the frame's columns do not match the series's index raise ``ValueError`` instead of automatically aligning, do ``left, right = left.align(right, axis=1, copy=False)`` before comparing (:issue:`36795`)
- Enforced deprecation ``numeric_only=None`` (the default) in DataFrame reductions that would silently drop columns that raised; ``numeric_only`` now defaults to ``False`` (:issue:`41480`)
- Changed default of ``numeric_only`` to ``False`` in all DataFrame methods with that argument (:issue:`46096`, :issue:`46906`)
- Changed default of ``numeric_only`` to ``False`` in :meth:`Series.rank` (:issue:`47561`)
- Enforced deprecation of silently dropping nuisance columns in groupby and resample operations when ``numeric_only=False`` (:issue:`41475`)
- Changed default of ``numeric_only`` in various :class:`.DataFrameGroupBy` methods; all methods now default to ``numeric_only=False`` (:issue:`46072`)
- Changed default of ``numeric_only`` to ``False`` in :class:`.Resampler` methods (:issue:`47177`)
- Using the method :meth:`DataFrameGroupBy.transform` with a callable that returns DataFrames will align to the input's index (:issue:`47244`)
- When providing a list of columns of length one to :meth:`DataFrame.groupby`, the keys that are returned by iterating over the resulting :class:`DataFrameGroupBy` object will now be tuples of length one (:issue:`47761`)
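A few of the removals above have direct replacement spellings. The following is a minimal sketch, assuming made-up data and variable names, of how the removed ``level`` keyword, the ``method``/``tolerance`` arguments of :meth:`Index.get_loc`, and the ``closed`` argument of :func:`date_range` might be migrated.

```python
import pandas as pd

# Aggregating by index level: groupby(level=...) stands in for the removed
# ``level`` keyword of Series/DataFrame aggregations.
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["key", "num"]
)
ser = pd.Series([10, 20, 30], index=idx)
by_key = ser.groupby(level="key").sum()

# Nearest-label lookup: get_indexer accepts ``method`` and ``tolerance``,
# which Index.get_loc no longer does.
index = pd.Index([10, 20, 30])
pos = index.get_indexer([19], method="nearest", tolerance=2)

# Endpoint handling in date_range: ``inclusive`` replaces ``closed``.
days = pd.date_range("2023-01-01", "2023-01-04", inclusive="left")

print(by_key)
print(pos)
print(days)
```

Note that ``get_indexer`` returns an array of positions (with ``-1`` for labels that are not found), so code that previously relied on :meth:`Index.get_loc` raising ``KeyError`` may need an explicit check.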
- Performance improvement in :meth:`.DataFrameGroupBy.median` and :meth:`.SeriesGroupBy.median` and :meth:`.GroupBy.cumprod` for nullable dtypes (:issue:`37493`)
- Performance improvement in :meth:`MultiIndex.argsort` and :meth:`MultiIndex.sort_values` (:issue:`48406`)
- Performance improvement in :meth:`MultiIndex.size` (:issue:`48723`)
- Performance improvement in :meth:`MultiIndex.union` without missing values and without duplicates (:issue:`48505`, :issue:`48752`)
- Performance improvement in :meth:`MultiIndex.difference` (:issue:`48606`)
- Performance improvement in :class:`MultiIndex` set operations with ``sort=None`` (:issue:`49010`)
- Performance improvement in :meth:`.DataFrameGroupBy.mean`, :meth:`.SeriesGroupBy.mean`, :meth:`.DataFrameGroupBy.var`, and :meth:`.SeriesGroupBy.var` for extension array dtypes (:issue:`37493`)
- Performance improvement in :meth:`MultiIndex.isin` when ``level=None`` (:issue:`48622`, :issue:`49577`)
- Performance improvement in :meth:`MultiIndex.putmask` (:issue:`49830`)
- Performance improvement in :meth:`Index.union` and :meth:`MultiIndex.union` when index contains duplicates (:issue:`48900`)
- Performance improvement in :meth:`Series.fillna` for extension array dtypes (:issue:`49722`, :issue:`50078`)
- Performance improvement for :meth:`Series.value_counts` with nullable dtype (:issue:`48338`)
- Performance improvement for :class:`Series` constructor passing integer numpy array with nullable dtype (:issue:`48338`)
- Performance improvement for :class:`DatetimeIndex` constructor passing a list (:issue:`48609`)
- Performance improvement in :func:`merge` and :meth:`DataFrame.join` when joining on a sorted :class:`MultiIndex` (:issue:`48504`)
- Performance improvement in :func:`to_datetime` when parsing strings with timezone offsets (:issue:`50107`)
- Performance improvement in :meth:`DataFrame.loc` and :meth:`Series.loc` for tuple-based indexing of a :class:`MultiIndex` (:issue:`48384`)
- Performance improvement for :meth:`MultiIndex.unique` (:issue:`48335`)
- Performance improvement for :func:`concat` with extension array backed indexes (:issue:`49128`, :issue:`49178`)
- Reduce memory usage of :meth:`DataFrame.to_pickle`/:meth:`Series.to_pickle` when using BZ2 or LZMA (:issue:`49068`)
- Performance improvement for :class:`~arrays.StringArray` constructor passing a numpy array with type ``np.str_`` (:issue:`49109`)
- Performance improvement in :meth:`~arrays.ArrowExtensionArray.factorize` (:issue:`49177`)
- Performance improvement in :meth:`DataFrame.join` when joining on a subset of a :class:`MultiIndex` (:issue:`48611`)
- Performance improvement for :meth:`MultiIndex.intersection` (:issue:`48604`)
- Performance improvement in ``var`` for nullable dtypes (:issue:`48379`)
- Performance improvement when iterating over pyarrow and nullable dtypes (:issue:`49825`, :issue:`49851`)
- Performance improvements to :func:`read_sas` (:issue:`47403`, :issue:`47405`, :issue:`47656`, :issue:`48502`)
- Memory improvement in :meth:`RangeIndex.sort_values` (:issue:`48801`)
- Performance improvement in :class:`DataFrameGroupBy` and :class:`SeriesGroupBy` when ``by`` is a categorical type and ``sort=False`` (:issue:`48976`)
- Performance improvement in :class:`DataFrameGroupBy` and :class:`SeriesGroupBy` when ``by`` is a categorical type and ``observed=False`` (:issue:`49596`)
- Performance improvement in :func:`read_stata` with parameter ``index_col`` set to ``None`` (the default). Now the index will be a :class:`RangeIndex` instead of :class:`Int64Index` (:issue:`49745`)
- Performance improvement in :func:`merge` when not merging on the index; the new index will now be :class:`RangeIndex` instead of :class:`Int64Index` (:issue:`49478`)
- Performance improvement in :meth:`DataFrame.to_dict` and :meth:`Series.to_dict` when using any non-object dtypes (:issue:`46470`)
- Performance improvement in :func:`read_html` when there are multiple tables (:issue:`49929`)
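The :func:`merge` entry above also changes the type of index a caller sees. A small sketch, assuming two made-up frames joined on a hypothetical ``key`` column, shows how that could be checked:

```python
import pandas as pd

# Both inputs carry arbitrary integer indexes; the merge is on a column,
# not on the index.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]}, index=[10, 20, 30])
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [4, 5, 6]}, index=[7, 8, 9])

merged = pd.merge(left, right, on="key")

# The result gets a fresh index rather than one derived from the inputs.
print(type(merged.index).__name__)
print(merged)
```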
- Bug in :meth:`Categorical.set_categories` losing dtype information (:issue:`48812`)
- Bug in :meth:`DataFrame.groupby` and :meth:`Series.groupby` would reorder categories when used as a grouper (:issue:`48749`)
- Bug in :class:`Categorical` constructor when constructing from a :class:`Categorical` object and ``dtype="category"`` losing ordered-ness (:issue:`49309`)
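To illustrate the constructor fix above, here is a minimal sketch with made-up values: an ordered :class:`Categorical` is passed back through the constructor together with ``dtype="category"``.

```python
import pandas as pd

# An ordered categorical with an explicit category order.
sizes = pd.Categorical(
    ["small", "large", "small"], categories=["small", "large"], ordered=True
)

# Re-wrapping with dtype="category" is expected to keep the ordering.
rewrapped = pd.Categorical(sizes, dtype="category")

print(rewrapped.ordered)
print(list(rewrapped.categories))
```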
- Bug in :func:`pandas.infer_freq`, raising ``TypeError`` when inferred on :class:`RangeIndex` (:issue:`47084`)
- Bug in :func:`to_datetime` was raising on invalid offsets with ``errors='coerce'`` and ``infer_datetime_format=True`` (:issue:`48633`)
- Bug in :class:`DatetimeIndex` constructor failing to raise when ``tz=None`` is explicitly specified in conjunction with timezone-aware ``dtype`` or data (:issue:`48659`)
- Bug in subtracting a ``datetime`` scalar from :class:`DatetimeIndex` failing to retain the original ``freq`` attribute (:issue:`48818`)
- Bug in ``pandas.tseries.holiday.Holiday`` where a half-open date interval causes inconsistent return types from :meth:`USFederalHolidayCalendar.holidays` (:issue:`49075`)
- Bug in rendering :class:`DatetimeIndex` and :class:`Series` and :class:`DataFrame` with timezone-aware dtypes with ``dateutil`` or ``zoneinfo`` timezones near daylight-savings transitions (:issue:`49684`)
- Bug in :func:`to_datetime` was raising ``ValueError`` when parsing :class:`Timestamp`, ``datetime``, or ``np.datetime64`` objects with non-ISO8601 ``format`` (:issue:`49298`, :issue:`50036`)
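As a sketch of the last entry above, the snippet below passes already-parsed datetime objects alongside a string to :func:`to_datetime` with a non-ISO8601 ``format``. The values are made up, and the sketch assumes a pandas version that includes this fix.

```python
from datetime import datetime

import numpy as np
import pandas as pd

values = [
    datetime(2020, 1, 31),        # standard-library datetime
    pd.Timestamp("2020-02-01"),   # pandas Timestamp
    np.datetime64("2020-02-02"),  # numpy datetime64
    "28/02/2020",                 # string that actually needs the format
]

# The format only applies to the string; the datetime-like objects
# are taken as they are.
parsed = pd.to_datetime(values, format="%d/%m/%Y")
print(parsed)
```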
- Bug in :func:`to_timedelta` raising error when input has nullable dtype ``Float64`` (:issue:`48796`)
- Bug in :class:`Timedelta` constructor incorrectly raising instead of returning ``NaT`` when given a ``np.timedelta64("nat")`` (:issue:`48898`)
- Bug in :class:`Timedelta` constructor failing to raise when passed both a :class:`Timedelta` object and keywords (e.g. days, seconds) (:issue:`48898`)
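The first two fixes above can be exercised with a short sketch. The input values are illustrative only, and the snippet assumes a pandas version that includes these fixes.

```python
import numpy as np
import pandas as pd

# A numpy "not a time" timedelta is expected to come back as pandas NaT.
missing = pd.Timedelta(np.timedelta64("nat"))

# Nullable Float64 input, including a missing value, interpreted as seconds.
seconds = pd.array([1.0, None], dtype="Float64")
deltas = pd.to_timedelta(seconds, unit="s")

print(missing)
print(deltas)
```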
- Bug in :meth:`Series.astype` and :meth:`DataFrame.astype` with object-dtype containing multiple timezone-aware ``datetime`` objects with heterogeneous timezones to a :class:`DatetimeTZDtype` incorrectly raising (:issue:`32581`)
- Bug in :func:`to_datetime` was failing to parse date strings with timezone name when ``format`` was specified with ``%Z`` (:issue:`49748`)
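A minimal sketch of the ``%Z`` case described above, using a made-up timestamp string that carries the timezone name ``UTC``:

```python
import pandas as pd

# The trailing "UTC" is matched by the %Z directive in the format.
parsed = pd.to_datetime(
    ["2020-01-01 00:00:00 UTC"], format="%Y-%m-%d %H:%M:%S %Z"
)

print(parsed)
print(parsed.tz)
```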
- Bug in :meth:`DataFrame.add` where a ufunc could not be applied when inputs contained a mix of DataFrame and Series types (:issue:`39853`)
- Bug in DataFrame reduction methods (e.g. :meth:`DataFrame.sum`) with object dtype, ``axis=1`` and ``numeric_only=False`` would not be coerced to float (:issue:`49551`)
- Bug in :meth:`DataFrame.sem` and :meth:`Series.sem` where an erroneous ``TypeError`` would always raise when using data backed by an :class:`ArrowDtype` (:issue:`49759`)
- Bug in constructing :class:`Series` with ``int64`` dtype from a string list raising instead of casting (:issue:`44923`)
- Bug in :meth:`DataFrame.eval` incorrectly raising an ``AttributeError`` when there are negative values in function call (:issue:`46471`)
- Bug in :meth:`Series.convert_dtypes` not converting dtype to nullable dtype when :class:`Series` contains ``NA`` and has dtype ``object`` (:issue:`48791`)
- Bug where any :class:`ExtensionDtype` subclass with ``kind="M"`` would be interpreted as a timezone type (:issue:`34986`)
- Bug in :class:`.arrays.ArrowExtensionArray` that would raise ``NotImplementedError`` when passed a sequence of strings or binary (:issue:`49172`)
- Bug in :func:`to_datetime` was not respecting ``exact`` argument when ``format`` was an ISO8601 format (:issue:`12649`)
- Bug in :meth:`TimedeltaArray.astype` raising ``TypeError`` when converting to a pyarrow duration type (:issue:`49795`)
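The first conversion fix above concerns building an integer :class:`Series` from strings. A small sketch with made-up values:

```python
import pandas as pd

# Strings that represent integers exactly are cast to the requested dtype.
ser = pd.Series(["1", "2", "3"], dtype="int64")

print(ser.dtype)
print(ser.tolist())
```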
- Bug in :func:`pandas.api.types.is_string_dtype` that would not return ``True`` for :class:`StringDtype` (:issue:`15585`)
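For reference, a short sketch of the check described above, applied to a :class:`StringDtype` instance, to a Series with the ``"string"`` dtype, and to a non-string dtype for contrast:

```python
import pandas as pd
from pandas.api.types import is_string_dtype

from_dtype = is_string_dtype(pd.StringDtype())
from_series = is_string_dtype(pd.Series(["a", "b"], dtype="string"))
from_int = is_string_dtype("int64")

print(from_dtype, from_series, from_int)
```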
- Bug in :meth:`IntervalIndex.is_overlapping` incorrect output if interval has duplicate left boundaries (:issue:`49581`)
- Bug in :meth:`Series.infer_objects` failing to infer :class:`IntervalDtype` for an object series of :class:`Interval` objects (:issue:`50090`)
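A minimal sketch of the :meth:`Series.infer_objects` fix above, assuming an object-dtype Series that holds only :class:`Interval` values (the intervals are made up):

```python
import pandas as pd

# Intervals stored with object dtype, e.g. after a generic construction step.
raw = pd.Series([pd.Interval(0, 1), pd.Interval(1, 2)], dtype=object)

# infer_objects is expected to recognise the intervals and pick IntervalDtype.
inferred = raw.infer_objects()

print(raw.dtype)
print(inferred.dtype)
```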
- Bug in :meth:`DataFrame.reindex` filling with wrong values when indexing columns and index for ``uint`` dtypes (:issue:`48184`)
- Bug in :meth:`DataFrame.loc` coercing dtypes when setting values with a list indexer (:issue:`49159`)
- Bug in :meth:`DataFrame.loc` raising ``ValueError`` with ``bool`` indexer and :class:`MultiIndex` (:issue:`47687`)
- Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` when right hand side is :class:`DataFrame` with :class:`MultiIndex` columns (:issue:`49121`)
- Bug in :meth:`DataFrame.reindex` casting dtype to ``object`` when :class:`DataFrame` has single extension array column when re-indexing ``columns`` and ``index`` (:issue:`48190`)
- Bug in :func:`~DataFrame.describe` when formatting percentiles in the resulting index showed more decimals than needed (:issue:`46362`)
- Bug in :meth:`DataFrame.compare` does not recognize differences when comparing ``NA`` with value in nullable dtypes (:issue:`48939`)
- Bug in :meth:`Index.equals` raising ``TypeError`` when :class:`Index` consists of tuples that contain ``NA`` (:issue:`48446`)
- Bug in :meth:`Series.map` caused incorrect result when data has NaNs and defaultdict mapping was used (:issue:`48813`)
- Bug in :class:`NA` raising a ``TypeError`` instead of returning :class:`NA` when performing a binary operation with a ``bytes`` object (:issue:`49108`)
- Bug in :meth:`DataFrame.update` with ``overwrite=False`` raising ``TypeError`` when ``self`` has column with ``NaT`` values and column not present in ``other`` (:issue:`16713`)
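The :class:`NA` fix above can be sketched in a couple of lines. The ``bytes`` value is arbitrary, and the snippet assumes a pandas version that includes the fix.

```python
import pandas as pd

# Binary operations between NA and bytes propagate NA instead of raising.
added = pd.NA + b"abc"
compared = pd.NA == b"abc"

print(added, compared)
```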
- Bug in :meth:`MultiIndex.get_indexer` not matching ``NaN`` values (:issue:`29252`, :issue:`37222`, :issue:`38623`, :issue:`42883`, :issue:`43222`, :issue:`46173`, :issue:`48905`)
- Bug in :meth:`MultiIndex.argsort` raising ``TypeError`` when index contains :attr:`NA` (:issue:`48495`)
- Bug in :meth:`MultiIndex.difference` losing extension array dtype (:issue:`48606`)
- Bug in :meth:`MultiIndex.set_levels` raising ``IndexError`` when setting empty level (:issue:`48636`)
- Bug in :meth:`MultiIndex.unique` losing extension array dtype (:issue:`48335`)
- Bug in :meth:`MultiIndex.intersection` losing extension array (:issue:`48604`)
- Bug in :meth:`MultiIndex.union` losing extension array (:issue:`48498`, :issue:`48505`, :issue:`48900`)
- Bug in :meth:`MultiIndex.union` not sorting when ``sort=None`` and index contains missing values (:issue:`49010`)
- Bug in :meth:`MultiIndex.append` not checking names for equality (:issue:`48288`)
- Bug in :meth:`MultiIndex.symmetric_difference` losing extension array (:issue:`48607`)
- Bug in :meth:`MultiIndex.join` losing dtypes when :class:`MultiIndex` has duplicates (:issue:`49830`)
- Bug in :meth:`MultiIndex.putmask` losing extension array (:issue:`49830`)
- Bug in :meth:`MultiIndex.value_counts` returning a :class:`Series` indexed by flat index of tuples instead of a :class:`MultiIndex` (:issue:`49558`)
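To illustrate the :meth:`MultiIndex.value_counts` entry above, the following sketch counts repeated tuples in a small made-up :class:`MultiIndex` and inspects the index of the result:

```python
import pandas as pd

mi = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 1), ("b", 2)], names=["letter", "number"]
)

counts = mi.value_counts()

# The result is indexed by a MultiIndex rather than a flat index of tuples,
# so individual levels remain addressable.
print(type(counts.index).__name__)
print(counts)
```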
- Bug in :func:`read_sas` caused fragmentation of :class:`DataFrame` and raised :class:`.errors.PerformanceWarning` (:issue:`48595`)
- Improved error message in :func:`read_excel` by including the offending sheet name when an exception is raised while reading a file (:issue:`48706`)
- Bug when pickling a subset of PyArrow-backed data that would serialize the entire data instead of the subset (:issue:`42600`)
- Bug in :func:`read_csv` for a single-line csv with fewer columns than ``names`` raised :class:`.errors.ParserError` with ``engine="c"`` (:issue:`47566`)
- Bug in :func:`DataFrame.to_string` with ``header=False`` that printed the index name on the same line as the first row of the data (:issue:`49230`)
- Fixed memory leak which stemmed from the initialization of the internal JSON module (:issue:`49222`)
- Fixed issue where :func:`json_normalize` would incorrectly remove leading characters from column names that matched the ``sep`` argument (:issue:`49861`)
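A sketch of the :func:`json_normalize` fix above, assuming a made-up record whose nested key starts with the same character that is passed as ``sep``:

```python
import pandas as pd

record = {"name": "x", "_id": {"a": 1}}

# With sep="_", the nested key is flattened to "<outer><sep><inner>";
# the leading underscore of "_id" is expected to be kept.
flat = pd.json_normalize(record, sep="_")

print(flat.columns.tolist())
```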
- Bug in :meth:`Period.strftime` and :meth:`PeriodIndex.strftime`, raising ``UnicodeDecodeError`` when a locale-specific directive was passed (:issue:`46319`)
- ``ax.set_xlim`` was sometimes raising ``UserWarning`` which users couldn't address due to ``set_xlim`` not accepting parsing arguments; the converter now uses :func:`Timestamp` instead (:issue:`49148`)
- Bug in :class:`.ExponentialMovingWindow` with ``online`` not raising a ``NotImplementedError`` for unsupported operations (:issue:`48834`)
- Bug in :meth:`DataFrameGroupBy.sample` raises ``ValueError`` when the object is empty (:issue:`48459`)
- Bug in :meth:`Series.groupby` raises ``ValueError`` when an entry of the index is equal to the name of the index (:issue:`48567`)
- Bug in :meth:`DataFrameGroupBy.resample` produces inconsistent results when passing empty DataFrame (:issue:`47705`)
- Bug in :class:`.DataFrameGroupBy` and :class:`.SeriesGroupBy` would not include unobserved categories in result when grouping by categorical indexes (:issue:`49354`)
- Bug in :class:`.DataFrameGroupBy` and :class:`.SeriesGroupBy` would change result order depending on the input index when grouping by categoricals (:issue:`49223`)
- Bug in :class:`.DataFrameGroupBy` and :class:`.SeriesGroupBy` when grouping on categorical data would sort result values even when used with ``sort=False`` (:issue:`42482`)
- Bug in :meth:`.DataFrameGroupBy.apply` and :meth:`SeriesGroupBy.apply` with ``as_index=False`` would not attempt the computation without using the grouping keys when using them failed with a ``TypeError`` (:issue:`49256`)
- Bug in :meth:`.DataFrameGroupBy.describe` would describe the group keys (:issue:`49256`)
- Bug in :meth:`.SeriesGroupBy.describe` with ``as_index=False`` would have the incorrect shape (:issue:`49256`)
- Bug in :class:`.DataFrameGroupBy` and :class:`.SeriesGroupBy` with ``dropna=False`` would drop NA values when the grouper was categorical (:issue:`36327`)
- Bug in :meth:`.SeriesGroupBy.nunique` would incorrectly raise when the grouper was an empty categorical and ``observed=True`` (:issue:`21334`)
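As an illustration of the ``dropna=False`` fix above, here is a minimal sketch that groups by a categorical column containing a missing value. The frame and column names are hypothetical, and ``observed=True`` is passed explicitly so the example does not depend on that keyword's default.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "key": pd.Categorical(["x", None, "x"]),
        "val": [1, 2, 3],
    }
)

# With dropna=False the missing key is expected to form its own group.
totals = df.groupby("key", dropna=False, observed=True)["val"].sum()

print(totals)
```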
- Bug in :meth:`DataFrame.pivot_table` raising ``TypeError`` for nullable dtype and ``margins=True`` (:issue:`48681`)
- Bug in :meth:`DataFrame.unstack` and :meth:`Series.unstack` unstacking wrong level of :class:`MultiIndex` when :class:`MultiIndex` has mixed names (:issue:`48763`)
- Bug in :meth:`DataFrame.pivot` not respecting ``None`` as column name (:issue:`48293`)
- Bug in :func:`join` when ``left_on`` or ``right_on`` is or includes a :class:`CategoricalIndex` incorrectly raising ``AttributeError`` (:issue:`48464`)
- Bug in :meth:`DataFrame.pivot_table` raising ``ValueError`` with parameter ``margins=True`` when result is an empty :class:`DataFrame` (:issue:`49240`)
- Clarified error message in :func:`merge` when passing invalid ``validate`` option (:issue:`49417`)
- Bug in :meth:`DataFrame.explode` raising ``ValueError`` on multiple columns with ``NaN`` values or empty lists (:issue:`46084`)
- Bug in :meth:`DataFrame.transpose` with ``IntervalDtype`` column with ``timedelta64[ns]`` endpoints (:issue:`44917`)
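The :meth:`DataFrame.explode` fix above covers multi-column explodes where some rows hold ``NaN`` or an empty list. A small sketch with made-up data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "id": ["r0", "r1", "r2"],
        "a": [[1, 2], np.nan, []],
        "b": [[3, 4], np.nan, []],
    }
)

# Rows with NaN or an empty list each contribute a single row of missing
# values; the list row contributes one row per element.
exploded = df.explode(["a", "b"])

print(exploded)
```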
- Bug in :meth:`Series.astype` when converting a ``SparseDtype`` with ``datetime64[ns]`` subtype to ``int64`` dtype raising, inconsistent with the non-sparse behavior (:issue:`49631`, :issue:`50087`)
- Bug in :meth:`Series.astype` when converting from ``datetime64[ns]`` to ``Sparse[datetime64[ns]]`` incorrectly raising (:issue:`50082`)
- Bug in :meth:`Series.mean` overflowing unnecessarily with nullable integers (:issue:`48378`)
- Bug in :meth:`Series.tolist` for nullable dtypes returning numpy scalars instead of python scalars (:issue:`49890`)
- Bug when concatenating an empty DataFrame with an ExtensionDtype to another DataFrame with the same ExtensionDtype, the resulting dtype turned into object (:issue:`48510`)
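For the :meth:`Series.tolist` entry above, a minimal sketch with a nullable integer Series (values made up) shows the kind of check a caller might run on the returned scalars:

```python
import pandas as pd

ser = pd.Series([1, 2, None], dtype="Int64")
values = ser.tolist()

# Non-missing entries are expected to be plain Python ints;
# the missing entry stays as pandas NA.
print([type(v).__name__ for v in values])
```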
- Fixed metadata propagation in :meth:`DataFrame.corr` and :meth:`DataFrame.cov` (:issue:`28283`)