These are the changes in pandas 1.5.1. See :ref:`release` for a full changelog including other versions of pandas.
{{ header }}
Behavior of groupby
with categorical groupers (:issue:`48645`)
In versions of pandas prior to 1.5, ``groupby`` with ``dropna=False`` would still drop
NA values when the grouper was a categorical dtype. A fix for this was attempted in
1.5, however it introduced a regression where passing ``observed=False`` and
``dropna=False`` to ``groupby`` would result in only observed categories. It was found
that the patch fixing the ``dropna=False`` bug is incompatible with ``observed=False``,
and decided that the best resolution is to restore the correct ``observed=False``
behavior at the cost of reintroducing the ``dropna=False`` bug.

.. ipython:: python

   df = pd.DataFrame(
       {
           "x": pd.Categorical([1, None], categories=[1, 2, 3]),
           "y": [3, 4],
       }
   )
   df

*1.5.0 behavior*:

.. code-block:: ipython

   In [3]: # Correct behavior, NA values are not dropped
           df.groupby("x", observed=True, dropna=False).sum()
   Out[3]:
        y
   x
   1    3
   NaN  4


   In [4]: # Incorrect behavior, only observed categories present
           df.groupby("x", observed=False, dropna=False).sum()
   Out[4]:
        y
   x
   1    3
   NaN  4

*1.5.1 behavior*:

.. ipython:: python

   # Incorrect behavior, NA values are dropped
   df.groupby("x", observed=True, dropna=False).sum()

   # Correct behavior, unobserved categories present (NA values still dropped)
   df.groupby("x", observed=False, dropna=False).sum()
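
Because ``dropna=False`` does not retain NA values for categorical groupers in 1.5.1,
one possible workaround is to replace the NA values with an explicit category before
grouping. The following is a minimal sketch, not an officially recommended recipe; the
``"missing"`` label and the ``placeholder`` and ``x_filled`` names are assumptions
chosen for illustration.

.. code-block:: python

   import pandas as pd

   df = pd.DataFrame(
       {
           "x": pd.Categorical([1, None], categories=[1, 2, 3]),
           "y": [3, 4],
       }
   )

   # "missing" is a hypothetical placeholder label, not a pandas convention.
   placeholder = "missing"

   # Register the placeholder as a category, then fill the NA values with it.
   x_filled = df["x"].cat.add_categories([placeholder]).fillna(placeholder)

   # With no NA values left in the grouper, observed=False keeps the
   # unobserved categories and the former NA rows form their own group.
   result = df.assign(x=x_filled).groupby("x", observed=False).sum()
   print(result)

The result has one row per category, including the unobserved categories ``2`` and
``3`` and the placeholder group that collects the rows where ``x`` was NA.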

- Fixed regression in :meth:`Series.__setitem__` casting ``None`` to ``NaN`` for object dtype (:issue:`48665`)
- Fixed regression in :meth:`DataFrame.loc` when setting values as a :class:`DataFrame` with all ``True`` indexer (:issue:`48701`)
- Fixed regression in :func:`.read_csv` causing an ``EmptyDataError`` when using a UTF-8 file handle that was already read from (:issue:`48646`)
- Fixed regression in :meth:`DataFrame.describe` raising ``TypeError`` when result contains ``NA`` (:issue:`48778`)
- Fixed regression in :meth:`DataFrame.plot` ignoring invalid ``colormap`` for ``kind="scatter"`` (:issue:`48726`)
- Fixed performance regression in :func:`factorize` when ``na_sentinel`` is not ``None`` and ``sort=False`` (:issue:`48620`)
- Fixed regression causing an ``AttributeError`` while emitting the warning that the table name provided to :meth:`DataFrame.to_sql` does not match the table name actually used in the database (:issue:`48733`)
- Fixed :meth:`.DataFrameGroupBy.size` not returning a :class:`Series` when ``axis=1`` (:issue:`48738`)
- Bug in :meth:`Series.__getitem__` not falling back to positional for integer keys and boolean :class:`Index` (:issue:`48653`)
- Bug in :meth:`DataFrame.to_hdf` raising ``AssertionError`` with boolean index (:issue:`48667`)
- Bug in :func:`assert_index_equal` for extension arrays with non-matching ``NA`` raising ``ValueError`` (:issue:`48608`)
- Bug in :meth:`DataFrame.pivot_table` raising unexpected ``FutureWarning`` when setting datetime column as index (:issue:`48683`)
- Bug in :meth:`DataFrame.sort_values` emitting unnecessary ``FutureWarning`` when called on :class:`DataFrame` with boolean sparse columns (:issue:`48784`)
- Avoid showing deprecated signatures when introspecting functions with warnings about arguments becoming keyword-only (:issue:`48692`)