These are the changes in pandas 2.1.0. See :ref:`release` for a full changelog including other versions of pandas.
{{ header }}
When given a callable, :meth:`Series.map` applies the callable to all elements of the :class:`Series`. Similarly, :meth:`DataFrame.applymap` applies the callable to all elements of the :class:`DataFrame`, while :meth:`Index.map` applies the callable to all elements of the :class:`Index`.
Frequently, it is not desirable to apply the callable to NaN-like values of the array. To avoid doing so, the ``map`` method could be called with ``na_action="ignore"``, i.e. ``ser.map(func, na_action="ignore")``. However, ``na_action="ignore"`` was not implemented for many ``ExtensionArray`` and ``Index`` types, and it did not work correctly for any ``ExtensionArray`` subclass except the nullable numeric ones (i.e. with dtype :class:`Int64` etc.).

``na_action="ignore"`` now works for all array types (:issue:`52219`, :issue:`51645`, :issue:`51809`, :issue:`51936`, :issue:`52033`, :issue:`52096`).
Previous behavior:

.. code-block:: ipython

    In [1]: ser = pd.Series(["a", "b", np.nan], dtype="category")
    In [2]: ser.map(str.upper, na_action="ignore")
    NotImplementedError
    In [3]: df = pd.DataFrame(ser)
    In [4]: df.applymap(str.upper, na_action="ignore")  # worked for DataFrame
         0
    0    A
    1    B
    2  NaN
    In [5]: idx = pd.Index(ser)
    In [6]: idx.map(str.upper, na_action="ignore")
    TypeError: CategoricalIndex.map() got an unexpected keyword argument 'na_action'

New behavior:

.. ipython:: python

    ser = pd.Series(["a", "b", np.nan], dtype="category")
    ser.map(str.upper, na_action="ignore")
    df = pd.DataFrame(ser)
    df.applymap(str.upper, na_action="ignore")
    idx = pd.Index(ser)
    idx.map(str.upper, na_action="ignore")

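To make the effect of ``na_action`` concrete, the following minimal sketch maps a labelling function over a small float :class:`Series`. The data and the ``label`` helper are invented for this illustration; only :meth:`Series.map` and its ``na_action`` parameter come from the text above.

```python
import numpy as np
import pandas as pd

ser = pd.Series([1.0, 2.0, np.nan])


def label(x):
    # Hypothetical helper: any comparison with NaN is False,
    # so a missing value would silently be labelled "small".
    return "big" if x > 1 else "small"


# Default (na_action=None): the callable is also applied to NaN.
applied = ser.map(label)

# na_action="ignore": NaN is propagated without calling the function.
ignored = ser.map(label, na_action="ignore")

print(applied.tolist())
print(ignored.tolist())
```

With the default, the missing entry ends up labelled like a real value, which is the kind of silent error ``na_action="ignore"`` is meant to prevent.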
Also, note that :meth:`Categorical.map` implicitly has had its ``na_action`` set to ``"ignore"`` by default. This has been deprecated, and the default for :meth:`Categorical.map` will change to ``na_action=None`` in the future, as for all the other array types.
- :meth:`Categorical.map` and :meth:`CategoricalIndex.map` now have a ``na_action`` parameter. :meth:`Categorical.map` implicitly had a default value of ``"ignore"`` for ``na_action``. This has formally been deprecated and will be changed to ``None`` in the future. Also notice that :meth:`Series.map` has default ``na_action=None`` and calls to series with categorical data will now use ``na_action=None`` unless explicitly set otherwise (:issue:`44279`)
- Implemented ``__pandas_priority__`` to allow custom types to take precedence over :class:`DataFrame`, :class:`Series`, :class:`Index`, or :class:`ExtensionArray` for arithmetic operations, :ref:`see the developer guide <extending.pandas_priority>` (:issue:`48347`)
- :meth:`MultiIndex.sort_values` now supports ``na_position`` (:issue:`51612`)
- :meth:`MultiIndex.sortlevel` and :meth:`Index.sortlevel` gained a new keyword ``na_position`` (:issue:`51612`)
- :meth:`arrays.DatetimeArray.map`, :meth:`arrays.TimedeltaArray.map` and :meth:`arrays.PeriodArray.map` can now take a ``na_action`` argument (:issue:`51644`)
- Improve error message when setting :class:`DataFrame` with wrong number of columns through :meth:`DataFrame.isetitem` (:issue:`51701`)
- Let :meth:`DataFrame.to_feather` accept a non-default :class:`Index` and non-string column names (:issue:`51787`)
- :class:`api.extensions.ExtensionArray` now has a :meth:`~api.extensions.ExtensionArray.map` method (:issue:`51809`)
- Improve error message when having incompatible columns using :meth:`DataFrame.merge` (:issue:`51861`)
- The escape mode "latex-math" of the formatter now preserves, without escaping, all characters between "(" and ")" (:issue:`51903`)
- Improved error message when creating a DataFrame with empty data (0 rows), no index and an incorrect number of columns. (:issue:`52084`)
- :meth:`DataFrame.applymap` now uses the :meth:`~api.extensions.ExtensionArray.map` method of underlying :class:`api.extensions.ExtensionArray` instances (:issue:`52219`)
- :meth:`arrays.SparseArray.map` now supports ``na_action`` (:issue:`52096`).
- Add dtype of categories to ``repr`` information of :class:`CategoricalDtype` (:issue:`52179`)
These are bug fixes that might have notable behavior changes.
Some minimum supported versions of dependencies were updated. If installed, we now require:
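
+-----------------+-----------------+----------+---------+
| Package         | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
|                 |                 |    X     |    X    |
+-----------------+-----------------+----------+---------+
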
For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.
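
+-----------------+-----------------+---------+
| Package         | Minimum Version | Changed |
+=================+=================+=========+
|                 |                 |    X    |
+-----------------+-----------------+---------+
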
See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.
- Deprecated silently dropping unrecognized timezones when parsing strings to datetimes (:issue:`18702`)
- Deprecated :meth:`DataFrame._data` and :meth:`Series._data`, use public APIs instead (:issue:`33333`)
- Deprecated :meth:`.GroupBy.all` and :meth:`.GroupBy.any` with datetime64 or :class:`PeriodDtype` values, matching the :class:`Series` and :class:`DataFrame` deprecations (:issue:`34479`)
- Deprecated pinning ``group.name`` to each group in :meth:`SeriesGroupBy.aggregate` aggregations; if your operation requires utilizing the groupby keys, iterate over the groupby object instead (:issue:`41090`)
- Deprecated the behavior of :func:`concat` when ``len(keys) != len(objs)``; in a future version this will raise instead of truncating to the shorter of the two sequences (:issue:`43485`)
- Deprecated the default of ``observed=False`` in :meth:`DataFrame.groupby` and :meth:`Series.groupby`; this will default to ``True`` in a future version (:issue:`43999`)
- Deprecated explicit support for subclassing :class:`Index` (:issue:`45289`)
- Deprecated :meth:`DataFrameGroupBy.dtypes`, check ``dtypes`` on the underlying object instead (:issue:`51045`)
- Deprecated ``axis=1`` in :meth:`DataFrame.groupby` and in :class:`Grouper` constructor, do ``frame.T.groupby(...)`` instead (:issue:`51203`)
- Deprecated :meth:`Categorical.to_list`, use ``obj.tolist()`` instead (:issue:`51254`)
- Deprecated passing a :class:`DataFrame` to :meth:`DataFrame.from_records`, use :meth:`DataFrame.set_index` or :meth:`DataFrame.drop` instead (:issue:`51353`)
- Deprecated accepting slices in :meth:`DataFrame.take`, call ``obj[slicer]`` or pass a sequence of integers instead (:issue:`51539`)
- Deprecated ``axis=1`` in :meth:`DataFrame.ewm`, :meth:`DataFrame.rolling`, :meth:`DataFrame.expanding`, transpose before calling the method instead (:issue:`51778`)
- Deprecated the ``axis`` keyword in :meth:`DataFrame.ewm`, :meth:`Series.ewm`, :meth:`DataFrame.rolling`, :meth:`Series.rolling`, :meth:`DataFrame.expanding`, :meth:`Series.expanding` (:issue:`51778`)
- Deprecated the ``axis`` keyword in :meth:`DataFrame.resample`, :meth:`Series.resample` (:issue:`51778`)
- Deprecated 'method', 'limit', and 'fill_axis' keywords in :meth:`DataFrame.align` and :meth:`Series.align`, explicitly call ``fillna`` on the alignment results instead (:issue:`51856`)
- Deprecated 'broadcast_axis' keyword in :meth:`Series.align` and :meth:`DataFrame.align`, upcast before calling ``align`` with ``left = DataFrame({col: left for col in right.columns}, index=right.index)`` (:issue:`51856`)
- Deprecated the 'axis' keyword in :meth:`.GroupBy.idxmax`, :meth:`.GroupBy.idxmin`, :meth:`.GroupBy.fillna`, :meth:`.GroupBy.take`, :meth:`.GroupBy.skew`, :meth:`.GroupBy.rank`, :meth:`.GroupBy.cumprod`, :meth:`.GroupBy.cumsum`, :meth:`.GroupBy.cummax`, :meth:`.GroupBy.cummin`, :meth:`.GroupBy.pct_change`, :meth:`.GroupBy.diff`, :meth:`.GroupBy.shift`, and :meth:`DataFrameGroupBy.corrwith`; for ``axis=1`` operate on the underlying :class:`DataFrame` instead (:issue:`50405`, :issue:`51046`)
- Deprecated the "fastpath" keyword in :class:`Categorical` constructor, use :meth:`Categorical.from_codes` instead (:issue:`20110`)
- Deprecated behavior of :meth:`Series.dt.to_pydatetime`, in a future version this will return a :class:`Series` containing python ``datetime`` objects instead of an ``ndarray`` of datetimes; this matches the behavior of other :meth:`Series.dt` properties (:issue:`20306`)
- Deprecated passing a dictionary to :meth:`.SeriesGroupBy.agg`; pass a list of aggregations instead (:issue:`50684`)
- Deprecated logical operations (``|``, ``&``, ``^``) between pandas objects and dtype-less sequences (e.g. ``list``, ``tuple``), wrap a sequence in a :class:`Series` or numpy array before operating instead (:issue:`51521`)
- Deprecated the methods :meth:`Series.bool` and :meth:`DataFrame.bool` (:issue:`51749`)
- Deprecated :meth:`DataFrame.swapaxes` and :meth:`Series.swapaxes`, use :meth:`DataFrame.transpose` or :meth:`Series.transpose` instead (:issue:`51946`)
- Deprecated parameter ``convert_type`` in :meth:`Series.apply` (:issue:`52140`)
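Several of the deprecations above recommend transposing instead of passing ``axis=1``. The following sketch shows that pattern for ``groupby`` and ``rolling``; the frame and the ``col_groups`` mapping are made up for the illustration.

```python
import pandas as pd

frame = pd.DataFrame({"a1": [1, 2], "a2": [3, 4], "b1": [5, 6]})

# Hypothetical mapping from column label to group label.
col_groups = {"a1": "a", "a2": "a", "b1": "b"}

# Instead of frame.groupby(col_groups, axis=1).sum():
# transpose, group the rows, then transpose back.
by_columns = frame.T.groupby(col_groups).sum().T

# Instead of frame.rolling(2, axis=1).sum():
# transpose, roll over the rows, then transpose back.
rolled = frame.T.rolling(2).sum().T

print(by_columns)
print(rolled)
```

Both results keep the original row labels, with the operation applied across what were the columns of ``frame``.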
- Performance improvement in :func:`read_parquet` on string columns when using ``use_nullable_dtypes=True`` (:issue:`47345`)
- Performance improvement in :meth:`DataFrame.clip` and :meth:`Series.clip` (:issue:`51472`)
- Performance improvement in :meth:`DataFrame.first_valid_index` and :meth:`DataFrame.last_valid_index` for extension array dtypes (:issue:`51549`)
- Performance improvement in :meth:`DataFrame.where` when ``cond`` is backed by an extension dtype (:issue:`51574`)
- Performance improvement in :func:`read_orc` when reading a remote URI file path (:issue:`51609`)
- Performance improvement in :func:`read_parquet` and :meth:`DataFrame.to_parquet` when reading a remote file with ``engine="pyarrow"`` (:issue:`51609`)
- Performance improvement in :meth:`MultiIndex.sortlevel` when ``ascending`` is a list (:issue:`51612`)
- Performance improvement in :meth:`~arrays.ArrowExtensionArray.isna` when array has zero nulls or is all nulls (:issue:`51630`)
- Performance improvement in :meth:`~arrays.ArrowExtensionArray.fillna` when array does not contain nulls (:issue:`51635`)
- Performance improvement when parsing strings to ``boolean[pyarrow]`` dtype (:issue:`51730`)
- Performance improvement when searching an :class:`Index` sliced from other indexes (:issue:`51738`)
- Performance improvement in :meth:`Series.combine_first` (:issue:`51777`)
- Performance improvement in :meth:`MultiIndex.set_levels` and :meth:`MultiIndex.set_codes` when ``verify_integrity=True`` (:issue:`51873`)
- Performance improvement in :func:`factorize` for object columns not containing strings (:issue:`51921`)
- Performance improvement in :class:`Series` reductions (:issue:`52341`)
- Performance improvement in :meth:`Series.to_numpy` when dtype is a numpy float dtype and ``na_value`` is ``np.nan`` (:issue:`52430`)
- Bug in :meth:`Series.map`, where the value of the ``na_action`` parameter was not used if the series held a :class:`Categorical` (:issue:`22527`)
- Bug in :meth:`Timestamp.round` with values close to the implementation bounds returning incorrect results instead of raising ``OutOfBoundsDatetime`` (:issue:`51494`)
- :meth:`DatetimeIndex.map` with ``na_action="ignore"`` now works as expected (:issue:`51644`)
- Bug in :meth:`arrays.DatetimeArray.map` and :meth:`DatetimeIndex.map`, where the supplied callable operated array-wise instead of element-wise (:issue:`51977`)
- Bug in :meth:`Timedelta.round` with values close to the implementation bounds returning incorrect results instead of raising ``OutOfBoundsTimedelta`` (:issue:`51494`)
- Bug in :class:`TimedeltaIndex` division or multiplication leading to ``.freq`` of "0 Days" instead of ``None`` (:issue:`51575`)
- :meth:`TimedeltaIndex.map` with ``na_action="ignore"`` now works as expected (:issue:`51644`)
- Bug in :meth:`arrays.TimedeltaArray.map` and :meth:`TimedeltaIndex.map`, where the supplied callable operated array-wise instead of element-wise (:issue:`51977`)
- Bug in :meth:`Series.corr` and :meth:`Series.cov` raising ``AttributeError`` for masked dtypes (:issue:`51422`)
- Bug in :meth:`DataFrame.corrwith` raising ``NotImplementedError`` for pyarrow-backed dtypes (:issue:`52314`)
- Bug in :meth:`ArrowDtype.numpy_dtype` returning nanosecond units for non-nanosecond ``pyarrow.timestamp`` and ``pyarrow.duration`` types (:issue:`51800`)
- Bug in :meth:`DataFrame.info` raising ``ValueError`` when ``use_numba`` is set (:issue:`51922`)
- Bug in :meth:`MultiIndex.set_levels` not preserving dtypes for :class:`Categorical` (:issue:`52125`)
- Bug in :func:`read_html`, tail texts were removed together with elements containing ``display:none`` style (:issue:`51629`)
- :meth:`DataFrame.to_orc` now raises ``ValueError`` when a non-default :class:`Index` is given (:issue:`51828`)
- Bug in :func:`read_html`, style elements were read into DataFrames (:issue:`52197`)
- Bug in :class:`PeriodDtype` constructor failing to raise ``TypeError`` when no argument is passed or when ``None`` is passed (:issue:`27388`)
- :meth:`PeriodIndex.map` with ``na_action="ignore"`` now works as expected (:issue:`51644`)
- Bug in :class:`PeriodDtype` constructor raising ``ValueError`` instead of ``TypeError`` when an invalid type is passed (:issue:`51790`)
- Bug in :meth:`arrays.PeriodArray.map` and :meth:`PeriodIndex.map`, where the supplied callable operated array-wise instead of element-wise (:issue:`51977`)
- Bug in :meth:`Series.plot` when invoked with ``color=None`` (:issue:`51953`)
- Bug in :meth:`DataFrameGroupBy.idxmin`, :meth:`SeriesGroupBy.idxmin`, :meth:`DataFrameGroupBy.idxmax`, :meth:`SeriesGroupBy.idxmax` returning the wrong dtype when used on an empty DataFrameGroupBy or SeriesGroupBy (:issue:`51423`)
- Bug in weighted rolling aggregations when specifying ``min_periods=0`` (:issue:`51449`)
- Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` incorrectly allowing non-fixed ``freq`` when resampling on a :class:`TimedeltaIndex` (:issue:`51896`)
- Bug in :meth:`DataFrame.groupby` and :meth:`Series.groupby`, where, when the index of the grouped :class:`Series` or :class:`DataFrame` was a :class:`DatetimeIndex`, :class:`TimedeltaIndex` or :class:`PeriodIndex`, and the ``groupby`` method was given a function as its first argument, the function operated on the whole index rather than each element of the index (:issue:`51979`)
- Bug in :meth:`GroupBy.var` failing to raise ``TypeError`` when called with datetime64 or :class:`PeriodDtype` values (:issue:`52128`)
- Bug in :meth:`DataFrameGroupBy.apply` causing an error to be raised when the input :class:`DataFrame` was subset as a :class:`DataFrame` after groupby (``[['a']]`` and not ``['a']``) and the given callable returned :class:`Series` that were not all indexed the same (:issue:`52444`)
- Bug in :meth:`DataFrame.stack` losing extension dtypes when columns is a :class:`MultiIndex` and frame contains mixed dtypes (:issue:`45740`)
- Bug in :meth:`DataFrame.transpose` inferring dtype for object column (:issue:`51546`)
- Bug in :meth:`Series.combine_first` converting ``int64`` dtype to ``float64`` and losing precision on very large integers (:issue:`51764`)
- Bug in :meth:`arrays.SparseArray.map` allowed the fill value to be included in the sparse values (:issue:`52095`)
- :func:`assert_almost_equal` now raises an assertion error for two unequal sets (:issue:`51727`)
- Bug in :meth:`Series.memory_usage` where ``deep=True`` raised an error for a :class:`Series` of objects and the returned value was incorrect, as it did not take GC corrections into account (:issue:`51858`)
- Bug in :func:`assert_frame_equal` checking category dtypes even when asked not to check index type (:issue:`52126`)
- Bug in :meth:`Series.map` when giving a callable to an empty series, the returned series had ``object`` dtype. It now keeps the original dtype (:issue:`52384`)