doc/source/whatsnew/v0.24.0.txt (+46 -7)
@@ -381,6 +381,37 @@ is the case with :attr:`Period.end_time`, for example
 
     p.end_time
 
+.. _whatsnew_0240.api_breaking.sparse_values:
+
+Sparse Data Structure Refactor
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``SparseArray``, the array backing ``SparseSeries`` and the columns in a ``SparseDataFrame``,
+is now an extension array (:issue:`21978`, :issue:`19056`, :issue:`22835`).
+To conform to this interface and for consistency with the rest of pandas, some API breaking
+changes were made:
+
+- ``SparseArray`` is no longer a subclass of :class:`numpy.ndarray`. To convert a ``SparseArray`` to a NumPy array, use :func:`numpy.asarray`.
+- ``SparseArray.dtype`` and ``SparseSeries.dtype`` are now instances of :class:`SparseDtype`, rather than ``np.dtype``. Access the underlying dtype with ``SparseDtype.subtype``.
+- ``numpy.asarray(sparse_array)`` now returns a dense array with all the values, not just the non-fill-value values (:issue:`14167`)
+- ``SparseArray.take`` now matches the API of :meth:`pandas.api.extensions.ExtensionArray.take` (:issue:`19506`):
+
+  * The default value of ``allow_fill`` has changed from ``False`` to ``True``.
+  * The ``out`` and ``mode`` parameters are no longer accepted (previously, this raised if they were specified).
+  * Passing a scalar for ``indices`` is no longer allowed.
+
+- The result of concatenating a mix of sparse and dense Series is a Series with sparse values, rather than a ``SparseSeries``.
+- ``SparseDataFrame.combine`` and ``DataFrame.combine_first`` no longer support combining a sparse column with a dense column while preserving the sparse subtype. The result will be an object-dtype ``SparseArray``.
+- Setting :attr:`SparseArray.fill_value` to a fill value with a different dtype is now allowed.
+
+
+Some new warnings are issued for operations that require or are likely to materialize a large dense array:
+
+- A :class:`errors.PerformanceWarning` is issued when using ``fillna`` with a ``method``, as a dense array is constructed to create the filled array. Filling with a ``value`` is the efficient way to fill a sparse array.
+- A :class:`errors.PerformanceWarning` is now issued when concatenating sparse Series with differing fill values. The fill value from the first sparse array continues to be used.
+
+In addition to these API breaking changes, many :ref:`performance improvements and bug fixes have been made <whatsnew_0240.bug_fixes.sparse>`.
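For readers reviewing this hunk, a few of the API changes listed above can be sketched as follows. This is an illustrative snippet, not part of the patch. It assumes a recent pandas release in which the class is exposed as ``pd.arrays.SparseArray`` (0.24 itself exposed ``pd.SparseArray``), and it passes ``allow_fill`` explicitly rather than relying on any default.

```python
import numpy as np
import pandas as pd

# Assumption: recent pandas, where SparseArray lives under pd.arrays.
sparse = pd.arrays.SparseArray([0, 0, 1, 2], fill_value=0)

# SparseArray is an extension array, not an ndarray subclass.
is_ndarray = isinstance(sparse, np.ndarray)

# np.asarray densifies: fill values are included in the result.
dense = np.asarray(sparse)

# The dtype is a SparseDtype; the NumPy dtype is available as .subtype.
dtype = sparse.dtype
subtype = dtype.subtype

# take() follows the ExtensionArray.take API; indices must be array-like.
taken = sparse.take([0, 2], allow_fill=False)
```

Code that previously relied on ``SparseArray`` being an ``ndarray`` would need an explicit ``np.asarray`` call like the one shown.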
Raise ValueError in ``DataFrame.to_dict(orient='index')``
@@ -574,6 +605,7 @@ update the ``ExtensionDtype._metadata`` tuple to match the signature of your
 - Added :meth:`pandas.api.types.register_extension_dtype` to register an extension type with pandas (:issue:`22664`)
 - Series backed by an ``ExtensionArray`` now work with :func:`util.hash_pandas_object` (:issue:`23066`)
 - Updated the ``.type`` attribute for ``PeriodDtype``, ``DatetimeTZDtype``, and ``IntervalDtype`` to be instances of the dtype (``Period``, ``Timestamp``, and ``Interval`` respectively) (:issue:`22938`)
+- :func:`ExtensionArray.isna` is allowed to return an ``ExtensionArray`` (:issue:`22325`).
 - Support for reduction operations such as ``sum``, ``mean`` via opt-in base class method override (:issue:`22762`)
 
 .. _whatsnew_0240.api.incompatibilities:
@@ -656,6 +688,7 @@ Other API Changes
 - :class:`pandas.io.formats.style.Styler` supports a ``number-format`` property when using :meth:`~pandas.io.formats.style.Styler.to_excel` (:issue:`22015`)
 - :meth:`DataFrame.corr` and :meth:`Series.corr` now raise a ``ValueError`` along with a helpful error message instead of a ``KeyError`` when supplied with an invalid method (:issue:`22298`)
 - :meth:`shift` will now always return a copy, instead of the previous behaviour of returning self when shifting by 0 (:issue:`22397`)
+- Slicing a single row of a DataFrame with multiple ExtensionArrays of the same type now preserves the dtype, rather than coercing to object (:issue:`22784`)
 
 .. _whatsnew_0240.deprecations:
 
@@ -897,13 +930,6 @@ Groupby/Resample/Rolling
 - :func:`RollingGroupby.agg` and :func:`ExpandingGroupby.agg` now support multiple aggregation functions as parameters (:issue:`15072`)
 - Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` when resampling by a weekly offset (``'W'``) across a DST transition (:issue:`9119`, :issue:`21459`)
 
-Sparse
-^^^^^^
-
--
--
--
-
 Reshaping
 ^^^^^^^^^
 
@@ -922,6 +948,19 @@ Reshaping
 - Bug in :func:`merge_asof` when merging on float values within defined tolerance (:issue:`22981`)
 - Bug in :func:`pandas.concat` when concatenating a multicolumn DataFrame with tz-aware data against a DataFrame with a different number of columns (:issue:`22796`)
 
+.. _whatsnew_0240.bug_fixes.sparse:
+
+Sparse
+^^^^^^
+
+- Updating a boolean, datetime, or timedelta column to be Sparse now works (:issue:`22367`)
+- Bug in :meth:`Series.to_sparse` with Series already holding sparse data not constructing properly (:issue:`22389`)
+- Providing a ``sparse_index`` to the ``SparseArray`` constructor no longer defaults the na-value to ``np.nan`` for all dtypes. The correct ``na_value`` for ``data.dtype`` is now used.
+- Bug in ``SparseArray.nbytes`` under-reporting its memory usage by not including the size of its sparse index.
+- Improved performance of :meth:`Series.shift` for non-NA ``fill_value``, as values are no longer converted to a dense array.
+- Bug in ``DataFrame.groupby`` not including ``fill_value`` in the groups for non-NA ``fill_value`` when grouping by a sparse column (:issue:`5078`)
+- Bug in unary inversion operator (``~``) on a ``SparseSeries`` with boolean values. The performance of this has also been improved (:issue:`22835`)
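Two of the fixes above (``nbytes`` accounting and boolean inversion) can be checked with a short snippet. This is a sketch for reviewers rather than part of the patch. It assumes a recent pandas release exposing ``pd.arrays.SparseArray``, and it uses a bare ``SparseArray`` instead of a ``SparseSeries``.

```python
import numpy as np
import pandas as pd

# Assumption: recent pandas, where SparseArray lives under pd.arrays.
sparse = pd.arrays.SparseArray([0, 0, 1, 2], fill_value=0)

# nbytes counts the stored values plus the sparse index that locates them.
values_bytes = sparse.sp_values.nbytes
index_bytes = sparse.sp_index.nbytes
total_bytes = sparse.nbytes

# Unary inversion on boolean sparse data; densified only for inspection.
flags = pd.arrays.SparseArray([True, False, False])
inverted = np.asarray(~flags)
```

Here ``total_bytes`` should equal ``values_bytes + index_bytes``, which is the accounting the ``nbytes`` fix describes.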