doc/source/whatsnew/v0.19.0.txt (+54 -1)
@@ -17,6 +17,7 @@ Highlights include:
- ``.rolling()`` is now time-series aware, see :ref:`here <whatsnew_0190.enhancements.rolling_ts>`
- pandas development API, see :ref:`here <whatsnew_0190.dev_api>`
- ``PeriodIndex`` now has its own ``period`` dtype and is more consistent with other ``Index`` classes. See :ref:`here <whatsnew_0190.api.period>`
- Sparse data structures have gained enhanced support for ``int`` and ``bool`` dtypes, see :ref:`here <whatsnew_0190.sparse>`

.. contents:: What's new in v0.19.0
    :local:
@@ -975,6 +976,51 @@ Sparse Changes
These changes allow pandas to handle sparse data with more dtypes and make data handling a smoother experience.

``int64`` and ``bool`` support enhancements
"""""""""""""""""""""""""""""""""""""""""""

Sparse data structures have gained enhanced support for ``int64`` and ``bool`` dtypes (:issue:`667`, :issue:`13849`).

Previously, sparse data were ``float64`` dtype by default, even if all inputs were ``int`` or ``bool`` dtype. You had to specify ``dtype`` explicitly to create sparse data with ``int64`` dtype. Also, ``fill_value`` had to be specified explicitly because its default was ``np.nan``, which doesn't appear in ``int64`` or ``bool`` data.

.. code-block:: ipython

   In [1]: pd.SparseArray([1, 2, 0, 0])
   Out[1]:
   [1.0, 2.0, 0.0, 0.0]
   Fill: nan
   IntIndex
   Indices: array([0, 1, 2, 3], dtype=int32)

   # specifying int64 dtype, but all values are stored in sp_values because
   # fill_value default is np.nan
   In [2]: pd.SparseArray([1, 2, 0, 0], dtype=np.int64)
   Out[2]:
   [1, 2, 0, 0]
   Fill: nan
   IntIndex
   Indices: array([0, 1, 2, 3], dtype=int32)

   In [3]: pd.SparseArray([1, 2, 0, 0], dtype=np.int64, fill_value=0)
   Out[3]:
   [1, 2, 0, 0]
   Fill: 0
   IntIndex
   Indices: array([0, 1], dtype=int32)

As of v0.19.0, sparse data keeps the input dtype and uses a more appropriate default ``fill_value`` (``0`` for ``int64`` dtype, ``False`` for ``bool`` dtype).

.. ipython:: python

   pd.SparseArray([1, 2, 0, 0], dtype=np.int64)
   pd.SparseArray([True, False, False, False])

See the :ref:`docs <sparse.dtype>` for more details.
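
The effect of the new defaults can be illustrated with a toy model in plain Python. This is a minimal sketch under stated assumptions, not pandas' implementation: ``default_fill_value`` and ``to_sparse`` are hypothetical helpers that only mimic how a sparse container keeps the values differing from ``fill_value`` (the ``sp_values``) together with their positions.

```python
import math


def _is_nan(value):
    """True only for a float NaN."""
    return isinstance(value, float) and math.isnan(value)


def default_fill_value(values):
    """Hypothetical default: False for bool input, 0 for int input, NaN otherwise."""
    # bool is checked first because bool is a subclass of int in Python.
    if values and all(isinstance(v, bool) for v in values):
        return False
    if values and all(isinstance(v, int) for v in values):
        return 0
    return math.nan


def to_sparse(values, fill_value=None):
    """Return (sp_values, sp_index, fill_value) for a list of values."""
    if fill_value is None:
        fill_value = default_fill_value(values)
    if _is_nan(fill_value):
        # NaN never compares equal to itself, so test for it explicitly.
        stored = [(i, v) for i, v in enumerate(values) if not _is_nan(v)]
    else:
        stored = [(i, v) for i, v in enumerate(values) if v != fill_value]
    sp_index = [i for i, _ in stored]
    sp_values = [v for _, v in stored]
    return sp_values, sp_index, fill_value


# Emulating the old default: a NaN fill value compresses nothing.
old_values, old_index, _ = to_sparse([1, 2, 0, 0], fill_value=math.nan)

# Emulating the new default: the fill value follows the input type.
int_values, int_index, int_fill = to_sparse([1, 2, 0, 0])
bool_values, bool_index, bool_fill = to_sparse([True, False, False, False])

print(old_index)              # [0, 1, 2, 3]
print(int_index, int_fill)    # [0, 1] 0
print(bool_index, bool_fill)  # [0] False
```

With a NaN fill value every integer has to be stored, which corresponds to the four indices in the ``Out[2]`` example; with a fill value of ``0`` only the two non-zero entries remain.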

Operators now preserve dtypes
"""""""""""""""""""""""""""""

- Sparse data structures can now preserve ``dtype`` after arithmetic ops (:issue:`13848`)

.. ipython:: python
@@ -1001,6 +1047,9 @@ Note that the limitation is applied to ``fill_value`` which default is ``np.nan`
   Out[7]:
   ValueError: unable to coerce current fill_value nan to int64 dtype

Other sparse fixes
""""""""""""""""""

- Subclassed ``SparseDataFrame`` and ``SparseSeries`` now preserve class types when slicing or transposing. (:issue:`13787`)
- ``SparseArray`` with ``bool`` dtype now supports logical (bool) operators (:issue:`14000`)
- Bug in ``SparseSeries`` with ``MultiIndex`` ``[]`` indexing may raise ``IndexError`` (:issue:`13144`)
@@ -1011,6 +1060,11 @@ Note that the limitation is applied to ``fill_value`` which default is ``np.nan`
- Bug where ``SparseArray`` and ``SparseSeries`` don't apply ufunc to ``fill_value`` (:issue:`13853`)
- Bug in ``SparseSeries.abs`` incorrectly keeps negative ``fill_value`` (:issue:`13853`)
- Bug in single row slicing on a multi-type ``SparseDataFrame``, types were previously forced to float (:issue:`13917`)
- Bug in ``SparseSeries`` slicing changes integer dtype to float (:issue:`8292`)
- Bug in ``SparseDataFrame`` comparison ops may raise ``TypeError`` (:issue:`13001`)
- Bug in ``SparseDataFrame.isnull`` raises ``ValueError`` (:issue:`8276`)
- Bug in ``SparseSeries`` representation with ``bool`` dtype may raise ``IndexError`` (:issue:`13110`)
- Bug in ``SparseSeries`` and ``SparseDataFrame`` of ``bool`` or ``int64`` dtype may display their values like ``float64`` dtype (:issue:`13110`)
- Bug in sparse indexing using ``SparseArray`` with ``bool`` dtype may return incorrect result (:issue:`13985`)
- Bug in ``SparseArray`` created from ``SparseSeries`` may lose ``dtype`` (:issue:`13999`)
- Bug in ``SparseSeries`` comparison with dense returns normal ``Series`` rather than ``SparseSeries`` (:issue:`13999`)
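
Two of the fixes above (:issue:`13853`) concern element-wise functions that were not applied to ``fill_value``. The plain-Python sketch below shows why that matters. It is an illustration only, assuming a toy representation of stored values, positions and a fill value; ``to_dense`` and ``apply_elementwise`` are hypothetical helpers, not pandas functions.

```python
def to_dense(sp_values, sp_index, fill_value, length):
    """Expand a toy sparse representation back into a plain list."""
    dense = [fill_value] * length
    for position, value in zip(sp_index, sp_values):
        dense[position] = value
    return dense


def apply_elementwise(func, sp_values, fill_value, include_fill=True):
    """Apply func to the stored values and, optionally, to fill_value."""
    new_values = [func(v) for v in sp_values]
    new_fill = func(fill_value) if include_fill else fill_value
    return new_values, new_fill


# Dense data [-1, 3, -1, -2] stored sparsely with fill_value=-1.
sp_values, sp_index, fill_value, length = [3, -2], [1, 3], -1, 4

# Skipping fill_value leaves the implicit entries negative.
buggy_values, buggy_fill = apply_elementwise(
    abs, sp_values, fill_value, include_fill=False
)
buggy = to_dense(buggy_values, sp_index, buggy_fill, length)

# Transforming fill_value as well matches the dense result.
fixed_values, fixed_fill = apply_elementwise(abs, sp_values, fill_value)
fixed = to_dense(fixed_values, sp_index, fixed_fill, length)

print(buggy)  # [-1, 3, -1, 2]
print(fixed)  # [1, 3, 1, 2]
```

Because most entries of a sparse structure exist only implicitly through ``fill_value``, a function that skips it silently leaves those entries untransformed.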