These are the changes in pandas 3.0.0. See :ref:`release` for a full changelog including other versions of pandas.
{{ header }}
- :func:`DataFrame.to_excel` now raises a ``UserWarning`` when the character count in a cell exceeds Excel's limitation of 32767 characters (:issue:`56954`)
- :func:`read_stata` now returns ``datetime64`` resolutions better matching those natively stored in the Stata format (:issue:`55642`)
- :meth:`Styler.set_tooltips` provides an alternative method of storing tooltips by using the ``title`` attribute of ``td`` elements (:issue:`56981`)
- Allow dictionaries to be passed to :meth:`pandas.Series.str.replace` via the ``pat`` parameter (:issue:`51748`)
- Support passing a :class:`Series` input to :func:`json_normalize` that retains the :class:`Series` :class:`Index` (:issue:`51452`)
- Users can globally disable any ``PerformanceWarning`` by setting the option ``mode.performance_warnings`` to ``False`` (:issue:`56920`)
These are bug fixes that might have notable behavior changes.
Some minimum supported versions of dependencies were updated. If installed, we now require:
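
+---------+-----------------+----------+---------+
| Package | Minimum Version | Required | Changed |
+=========+=================+==========+=========+
| numpy   | 1.23.5          | X        | X       |
+---------+-----------------+----------+---------+
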
For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.
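
+-------------+---------------------+
| Package     | New Minimum Version |
+=============+=====================+
| fastparquet | 2023.04.0           |
+-------------+---------------------+
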
See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.
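
As a rough illustration of the minimums listed above, the sketch below compares installed package versions against them using only the standard library. The helper names ``numeric_parts``, ``meets_minimum`` and ``check_installed`` are hypothetical and not part of pandas. The comparison only looks at the leading dotted-integer part of a version string, which is an assumption that ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the tables above.
MINIMUMS = {"numpy": "1.23.5", "fastparquet": "2023.04.0"}


def numeric_parts(ver: str) -> tuple:
    """Return the leading dotted-integer part of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", ver)
    return tuple(int(p) for p in match.group().split(".")) if match else ()


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings by their numeric components only."""
    return numeric_parts(installed) >= numeric_parts(minimum)


def check_installed(minimums=MINIMUMS) -> dict:
    """Map each package to True/False, or None when it is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(version(name), minimum)
        except PackageNotFoundError:
            report[name] = None  # optional dependency that is absent
    return report
```

Calling ``check_installed()`` returns a dictionary keyed by package name. For an authoritative listing of what pandas itself detects, ``pandas.show_versions()`` is the better tool.
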
- 3rd party ``py.path`` objects are no longer explicitly supported in IO methods. Use :py:class:`pathlib.Path` objects instead (:issue:`57091`)
- :attr:`MultiIndex.codes`, :attr:`MultiIndex.levels`, and :attr:`MultiIndex.names` now return a ``tuple`` instead of a ``FrozenList`` (:issue:`53531`)
- :func:`read_table`'s ``parse_dates`` argument defaults to ``None`` to improve consistency with :func:`read_csv` (:issue:`57476`)
- Made ``dtype`` a required argument in :meth:`ExtensionArray._from_sequence_of_strings` (:issue:`56519`)
- Updated :meth:`DataFrame.to_excel` so that the output spreadsheet has no styling. Custom styling can still be done using :meth:`Styler.to_excel` (:issue:`54154`)
- pickle and HDF (``.h5``) files created with Python 2 are no longer explicitly supported (:issue:`57387`)
- pickled objects from pandas versions older than ``1.0.0`` are no longer supported (:issue:`57155`)
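
A minimal sketch of two of the changes above: passing :py:class:`pathlib.Path` objects to IO methods, and reading ``MultiIndex.names`` in a way that does not depend on whether a ``tuple`` or a ``FrozenList`` comes back. The file name and column names are made up for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# pathlib.Path objects are accepted by IO readers and writers.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "example.csv"  # hypothetical file name
    df.to_csv(csv_path, index=False)
    roundtrip = pd.read_csv(csv_path)

# Converting to a list works whether names is a tuple or a FrozenList.
mi = pd.MultiIndex.from_arrays([[1, 2], ["x", "y"]], names=["num", "let"])
level_names = list(mi.names)
```

Code that relied on ``FrozenList``-specific behavior may need adjusting. Converting to a plain ``list`` or ``tuple`` first is one way to stay independent of the return type.
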
- Deprecated :meth:`Timestamp.utcfromtimestamp`, use ``Timestamp.fromtimestamp(ts, "UTC")`` instead (:issue:`56680`)
- Deprecated :meth:`Timestamp.utcnow`, use ``Timestamp.now("UTC")`` instead (:issue:`56680`)
- Deprecated allowing non-keyword arguments in :meth:`Series.to_markdown` except ``buf``. (:issue:`57280`)
- Deprecated allowing non-keyword arguments in :meth:`Series.to_string` except ``buf``. (:issue:`57280`)
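
The replacements named in the deprecations above can be sketched as follows. The variable names are illustrative only, and ``na_rep`` is just one example of an argument that is now best passed by keyword.

```python
import pandas as pd

# Instead of Timestamp.utcfromtimestamp(ts): pass the time zone explicitly.
epoch_seconds = 0
ts = pd.Timestamp.fromtimestamp(epoch_seconds, "UTC")

# Instead of Timestamp.utcnow(): request the current time in UTC.
now_utc = pd.Timestamp.now("UTC")

# Pass everything other than buf by keyword to Series.to_string.
ser = pd.Series([1.5, None])
text = ser.to_string(na_rep="missing", index=False)
```

Both timestamps are timezone-aware, unlike the naive values produced by the standard library's ``datetime.utcnow``.
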
- :func:`read_excel`, :func:`read_json`, :func:`read_html`, and :func:`read_xml` no longer accept raw string or byte representation of the data. That type of data must be wrapped in a :py:class:`StringIO` or :py:class:`BytesIO` (:issue:`53767`)
- :meth:`Series.dt.to_pydatetime` now returns a :class:`Series` of :py:class:`datetime.datetime` objects (:issue:`52459`)
- All arguments except ``name`` in :meth:`Index.rename` are now keyword only (:issue:`56493`)
- All arguments except the first ``path``-like argument in IO writers are now keyword only (:issue:`54229`)
- All arguments in :meth:`Index.sort_values` are now keyword only (:issue:`56493`)
- All arguments in :meth:`Series.to_dict` are now keyword only (:issue:`56493`)
- Changed the default value of ``observed`` in :meth:`DataFrame.groupby` and :meth:`Series.groupby` to ``True`` (:issue:`51811`)
- Enforced deprecation disallowing parsing datetimes with mixed time zones unless user passes ``utc=True`` to :func:`to_datetime` (:issue:`57275`)
- Enforced silent-downcasting deprecation for :ref:`all relevant methods <whatsnew_220.silent_downcasting>` (:issue:`54710`)
- In :meth:`DataFrame.stack`, the default value of ``future_stack`` is now ``True``; specifying ``False`` will raise a ``FutureWarning`` (:issue:`55448`)
- Methods ``apply``, ``agg``, and ``transform`` will no longer replace NumPy functions (e.g. ``np.sum``) and built-in functions (e.g. ``min``) with the equivalent pandas implementation; use string aliases (e.g. ``"sum"`` and ``"min"``) if you desire to use the pandas implementation (:issue:`53974`)
- Passing both ``freq`` and ``fill_value`` in :meth:`DataFrame.shift` and :meth:`Series.shift` and :meth:`.DataFrameGroupBy.shift` now raises a ``ValueError`` (:issue:`54818`)
- Removed :meth:`DateOffset.is_anchored` and :meth:`offsets.Tick.is_anchored` (:issue:`56594`)
- Removed ``DataFrame.applymap``, ``Styler.applymap`` and ``Styler.applymap_index`` (:issue:`52364`)
- Removed ``DataFrame.bool`` and ``Series.bool`` (:issue:`51756`)
- Removed ``DataFrame.first`` and ``DataFrame.last`` (:issue:`53710`)
- Removed ``DataFrame.swapaxes`` and ``Series.swapaxes`` (:issue:`51946`)
- Removed ``DataFrameGroupBy.grouper`` and ``SeriesGroupBy.grouper`` (:issue:`56521`)
- Removed ``DataFrameGroupBy.fillna`` and ``SeriesGroupBy.fillna`` (:issue:`55719`)
- Removed ``Index.format``, use :meth:`Index.astype` with ``str`` or :meth:`Index.map` with a ``formatter`` function instead (:issue:`55439`)
- Removed ``Resampler.fillna`` (:issue:`55719`)
- Removed ``Series.__int__`` and ``Series.__float__``. Call ``int(Series.iloc[0])`` or ``float(Series.iloc[0])`` instead. (:issue:`51131`)
- Removed ``Series.ravel`` (:issue:`56053`)
- Removed ``Series.view`` (:issue:`56054`)
- Removed ``StataReader.close`` (:issue:`49228`)
- Removed ``_data`` from :class:`DataFrame`, :class:`Series`, :class:`.arrays.ArrowExtensionArray` (:issue:`52003`)
- Removed ``axis`` argument from :meth:`DataFrame.groupby`, :meth:`Series.groupby`, :meth:`DataFrame.rolling`, :meth:`Series.rolling`, :meth:`DataFrame.resample`, and :meth:`Series.resample` (:issue:`51203`)
- Removed ``axis`` argument from all groupby operations (:issue:`50405`)
- Removed ``convert_dtype`` from :meth:`Series.apply` (:issue:`52257`)
- Removed ``method``, ``limit``, ``fill_axis`` and ``broadcast_axis`` keywords from :meth:`DataFrame.align` (:issue:`51968`)
- Removed ``pandas.api.types.is_interval`` and ``pandas.api.types.is_period``, use ``isinstance(obj, pd.Interval)`` and ``isinstance(obj, pd.Period)`` instead (:issue:`55264`)
- Removed ``pandas.io.sql.execute`` (:issue:`50185`)
- Removed ``pandas.value_counts``, use :meth:`Series.value_counts` instead (:issue:`53493`)
- Removed ``read_gbq`` and ``DataFrame.to_gbq``. Use ``pandas_gbq.read_gbq`` and ``pandas_gbq.to_gbq`` instead https://pandas-gbq.readthedocs.io/en/latest/api.html (:issue:`55525`)
- Removed ``use_nullable_dtypes`` from :func:`read_parquet` (:issue:`51853`)
- Removed ``year``, ``month``, ``quarter``, ``day``, ``hour``, ``minute``, and ``second`` keywords in the :class:`PeriodIndex` constructor, use :meth:`PeriodIndex.from_fields` instead (:issue:`55960`)
- Removed deprecated argument ``obj`` in :meth:`.DataFrameGroupBy.get_group` and :meth:`.SeriesGroupBy.get_group` (:issue:`53545`)
- Removed deprecated behavior of :meth:`Series.agg` using :meth:`Series.apply` (:issue:`53325`)
- Removed option ``mode.use_inf_as_na``, convert inf entries to ``NaN`` before instead (:issue:`51684`)
- Removed support for :class:`DataFrame` in :meth:`DataFrame.from_records` (:issue:`51697`)
- Removed support for ``errors="ignore"`` in :func:`to_datetime`, :func:`to_timedelta` and :func:`to_numeric` (:issue:`55734`)
- Removed support for ``slice`` in :meth:`DataFrame.take` (:issue:`51539`)
- Removed the ``ArrayManager`` (:issue:`55043`)
- Removed the ``fastpath`` argument from the :class:`Series` constructor (:issue:`55466`)
- Removed the ``is_boolean``, ``is_integer``, ``is_floating``, ``holds_integer``, ``is_numeric``, ``is_categorical``, ``is_object``, and ``is_interval`` attributes of :class:`Index` (:issue:`50042`)
- Removed the ``ordinal`` keyword in :class:`PeriodIndex`, use :meth:`PeriodIndex.from_ordinals` instead (:issue:`55960`)
- Removed unused arguments ``*args`` and ``**kwargs`` in :class:`Resampler` methods (:issue:`50977`)
- Unrecognized timezones when parsing strings to datetimes now raise a ``ValueError`` (:issue:`51477`)
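
Several of the removals above name a replacement. The sketch below collects a few of them in one place, written so that it should also run on recent pandas 2.x releases. The sample data and the helper ``to_numeric_or_original`` are hypothetical. The helper is only one possible stand-in for the removed ``errors="ignore"`` option, under the assumption that returning the input unchanged on failure is the behavior wanted.

```python
import io

import numpy as np
import pandas as pd

# Raw JSON text must be wrapped in a file-like object before parsing.
raw_json = '[{"a": 1, "b": 2.5}, {"a": 3, "b": 4.5}]'
df = pd.read_json(io.StringIO(raw_json))

# A string alias selects the pandas implementation explicitly.
totals = df.agg("sum")

# int(Series) is removed; select the single element first.
first_a = int(df["a"].iloc[0])

# isinstance checks replace api.types.is_interval and is_period.
is_interval = isinstance(pd.Interval(0, 1), pd.Interval)
is_period = isinstance(pd.Period("2024-01", freq="M"), pd.Period)

# Without mode.use_inf_as_na, convert inf entries to NaN up front.
with_inf = pd.Series([1.0, np.inf, -np.inf])
n_missing = int(with_inf.replace([np.inf, -np.inf], np.nan).isna().sum())


def to_numeric_or_original(values):
    """Hypothetical helper: return the input unchanged if parsing fails."""
    try:
        return pd.to_numeric(values)
    except (ValueError, TypeError):
        return values


# Strings with mixed UTC offsets need utc=True.
mixed = pd.to_datetime(
    ["2024-01-01 00:00+00:00", "2024-01-01 00:00+01:00"], utc=True
)
```

With ``utc=True`` every parsed value is converted to UTC, so the second entry lands one hour earlier on the previous day.
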
- Performance improvement in :class:`DataFrame` when ``data`` is a ``dict`` and ``columns`` is specified (:issue:`24368`)
- Performance improvement in :meth:`DataFrame.join` for sorted but non-unique indexes (:issue:`56941`)
- Performance improvement in :meth:`DataFrame.join` when left and/or right are non-unique and ``how`` is ``"left"``, ``"right"``, or ``"inner"`` (:issue:`56817`)
- Performance improvement in :meth:`DataFrame.join` with ``how="left"`` or ``how="right"`` and ``sort=True`` (:issue:`56919`)
- Performance improvement in :meth:`DataFrameGroupBy.ffill`, :meth:`DataFrameGroupBy.bfill`, :meth:`SeriesGroupBy.ffill`, and :meth:`SeriesGroupBy.bfill` (:issue:`56902`)
- Performance improvement in :meth:`Index.join` by propagating cached attributes in cases where the result matches one of the inputs (:issue:`57023`)
- Performance improvement in :meth:`Index.take` when ``indices`` is a full range indexer from zero to length of index (:issue:`56806`)
- Performance improvement in :meth:`MultiIndex.equals` for equal length indexes (:issue:`56990`)
- Performance improvement in :meth:`RangeIndex.append` when appending the same index (:issue:`57252`)
- Performance improvement in :meth:`RangeIndex.take` returning a :class:`RangeIndex` instead of a :class:`Index` when possible. (:issue:`57445`)
- Performance improvement in indexing operations for string dtypes (:issue:`56997`)
- :meth:`Series.str.extract` returns :class:`RangeIndex` columns instead of :class:`Index` columns when possible (:issue:`?`)
- Performance improvement in ``DataFrameGroupBy.__len__`` and ``SeriesGroupBy.__len__`` (:issue:`57595`)
- Fixed bug in :meth:`DataFrame.join` inconsistently setting result index name (:issue:`55815`)
- Fixed bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
- Fixed bug in :meth:`Series.diff` allowing non-integer values for the ``periods`` argument. (:issue:`56607`)
- Bug in :func:`date_range` where the last valid timestamp would sometimes not be produced (:issue:`56134`)
- Bug in ``np.matmul`` with :class:`Index` inputs raising a ``TypeError`` (:issue:`57079`)
- Bug in :meth:`Series.astype` that might modify a read-only array in place when casting to a string dtype (:issue:`57212`)
- Bug in :meth:`Series.reindex` not maintaining ``float32`` type when a ``reindex`` introduces a missing value (:issue:`45857`)
- Bug in :meth:`Series.value_counts` where ``sort=False`` was not respected for series having ``string`` dtype (:issue:`55224`)
- Bug in :func:`interval_range` where start and end numeric types were always cast to 64 bit (:issue:`57268`)
- Bug in :meth:`.DataFrameGroupBy.quantile` when ``interpolation="nearest"`` is inconsistent with :meth:`DataFrame.quantile` (:issue:`47942`)
- Bug in :meth:`DataFrame.ewm` and :meth:`Series.ewm` when passed ``times`` and aggregation functions other than mean (:issue:`51695`)
- Bug in :meth:`.DataFrameGroupBy.groups` and :meth:`.SeriesGroupBy.groups` that would not respect groupby arguments ``dropna`` and ``sort`` (:issue:`55919`, :issue:`56966`, :issue:`56851`)
- Bug in :meth:`.DataFrameGroupBy.nunique` and :meth:`.SeriesGroupBy.nunique` would fail with multiple categorical groupings when ``as_index=False`` (:issue:`52848`)
- Bug in :meth:`.DataFrameGroupBy.prod`, :meth:`.DataFrameGroupBy.any`, and :meth:`.DataFrameGroupBy.all` would result in NA values on unobserved groups; they now result in ``1``, ``False``, and ``True`` respectively (:issue:`55783`)
- Bug in :meth:`.DataFrameGroupBy.value_counts` would produce incorrect results when used with some categorical and some non-categorical groupings and ``observed=False`` (:issue:`56016`)
- Fixed bug in :meth:`api.types.is_datetime64_any_dtype` where a custom :class:`ExtensionDtype` would return ``False`` for array-likes (:issue:`57055`)
- Bug in :class:`DataFrame` when passing a ``dict`` with a NA scalar and ``columns`` that would always return ``np.nan`` (:issue:`57205`)
- Bug in :func:`tseries.api.guess_datetime_format` would fail to infer time format when "%Y" == "%H%M" (:issue:`57452`)
- Bug in :meth:`DataFrame.sort_index` when passing ``axis="columns"`` and ``ignore_index=True`` and ``ascending=False`` not returning a :class:`RangeIndex` columns (:issue:`57293`)
- Bug in :meth:`DataFrame.where` where using a non-bool type array in the function would raise a ``ValueError`` instead of a ``TypeError`` (:issue:`56330`)
- Bug in the DataFrame Interchange Protocol implementation returning incorrect results for data buffers' associated dtype, for string and datetime columns (:issue:`54781`)