DOC: Remove old SparseDataFrame/SparseSeries references #52092

Merged
merged 4 commits into from
Mar 21, 2023
94 changes: 2 additions & 92 deletions doc/source/user_guide/sparse.rst
@@ -153,70 +153,14 @@ the correct dense result.
np.abs(arr)
np.abs(arr).to_dense()

.. _sparse.migration:

Migrating
---------

.. note::

``SparseSeries`` and ``SparseDataFrame`` were removed in pandas 1.0.0. This migration
guide is present to aid in migrating from previous versions.

In older versions of pandas, the ``SparseSeries`` and ``SparseDataFrame`` classes (documented below)
were the preferred way to work with sparse data. With the advent of extension arrays, these subclasses
are no longer needed. Their purpose is better served by using a regular Series or DataFrame with
sparse values instead.

.. note::

There's no performance or memory penalty to using a Series or DataFrame with sparse values,
rather than a SparseSeries or SparseDataFrame.

This section provides some guidance on migrating your code to the new style. As a reminder,
you can use the Python warnings module to control warnings. But we recommend modifying
your code, rather than ignoring the warning.

**Construction**

From an array-like, use the regular :class:`Series` or
:class:`DataFrame` constructors with :class:`arrays.SparseArray` values.

.. code-block:: python

# Previous way
>>> pd.SparseDataFrame({"A": [0, 1]})

.. ipython:: python

# New way
pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})

From a SciPy sparse matrix, use :meth:`DataFrame.sparse.from_spmatrix`,

.. code-block:: python

# Previous way
>>> from scipy import sparse
>>> mat = sparse.eye(3)
>>> df = pd.SparseDataFrame(mat, columns=['A', 'B', 'C'])

.. ipython:: python

# New way
from scipy import sparse
mat = sparse.eye(3)
df = pd.DataFrame.sparse.from_spmatrix(mat, columns=['A', 'B', 'C'])
df.dtypes

**Conversion**

From sparse to dense, use the ``.sparse`` accessors
To convert data from sparse to dense, use the ``.sparse`` accessors

.. ipython:: python

df.sparse.to_dense()
df.sparse.to_coo()
sdf.sparse.to_dense()

From dense to sparse, use :meth:`DataFrame.astype` with a :class:`SparseDtype`.

@@ -226,40 +170,6 @@ From dense to sparse, use :meth:`DataFrame.astype` with a :class:`SparseDtype`.
dtype = pd.SparseDtype(int, fill_value=0)
dense.astype(dtype)

**Sparse Properties**

Sparse-specific properties, like ``density``, are available on the ``.sparse`` accessor.

.. ipython:: python

df.sparse.density

**General differences**

In a ``SparseDataFrame``, *all* columns were sparse. A :class:`DataFrame` can have a mixture of
sparse and dense columns. As a consequence, assigning new columns to a :class:`DataFrame` with sparse
values will not automatically convert the input to be sparse.

.. code-block:: python

# Previous Way
>>> df = pd.SparseDataFrame({"A": [0, 1]})
>>> df['B'] = [0, 0] # implicitly becomes Sparse
>>> df['B'].dtype
Sparse[int64, nan]

Instead, you'll need to ensure that the values being assigned are sparse

.. ipython:: python

df = pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})
df['B'] = [0, 0] # remains dense
df['B'].dtype
df['B'] = pd.arrays.SparseArray([0, 0])
df['B'].dtype

The ``SparseDataFrame.default_kind`` and ``SparseDataFrame.default_fill_value`` attributes
have no replacement.

.. _sparse.scipysparse:

2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.25.0.rst
@@ -921,7 +921,7 @@ by a ``Series`` or ``DataFrame`` with sparse values.
df = pd.DataFrame({"A": pd.arrays.SparseArray([0, 0, 1, 2])})
df.dtypes

The memory usage of the two approaches is identical. See :ref:`sparse.migration` for more (:issue:`19239`).
The memory usage of the two approaches is identical (:issue:`19239`).

msgpack format
^^^^^^^^^^^^^^
3 changes: 1 addition & 2 deletions doc/source/whatsnew/v1.0.0.rst
@@ -872,8 +872,7 @@ Removal of prior version deprecations/changes

``SparseSeries``, ``SparseDataFrame`` and the ``DataFrame.to_sparse`` method
have been removed (:issue:`28425`). We recommend using a ``Series`` or
``DataFrame`` with sparse values instead. See :ref:`sparse.migration` for help
with migrating existing code.
``DataFrame`` with sparse values instead.

.. _whatsnew_100.matplotlib_units:
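
For readers who relied on the removed migration section, the pattern it described (a regular ``DataFrame`` holding sparse values) can be sketched as below. This is an illustration only and is not part of the diff. It assumes a pandas version that provides ``pd.arrays.SparseArray``, ``pd.SparseDtype``, and the ``.sparse`` accessor; the column names and values are made up for the example.

```python
import pandas as pd

# Construction: a regular DataFrame whose column holds SparseArray values
# (this replaces the removed pd.SparseDataFrame constructor).
df = pd.DataFrame({"A": pd.arrays.SparseArray([0, 0, 1, 2])})

# Sparse-specific properties are exposed on the .sparse accessor.
# The accessor is only available while every column is sparse,
# so read it before adding any dense column.
density = df.sparse.density

# Sparse -> dense conversion.
dense = df.sparse.to_dense()

# Dense -> sparse conversion, via astype with a SparseDtype.
sparse_again = dense.astype(pd.SparseDtype("int64", fill_value=0))

# Assigning a plain list creates a dense column; it is not converted
# to sparse automatically.
df["B"] = [0, 0, 0, 0]

# To get a sparse column, assign sparse values explicitly.
df["C"] = pd.arrays.SparseArray([0, 0, 0, 0])

print(df.dtypes)
```

Because a ``DataFrame`` can mix sparse and dense columns, checking ``df.dtypes`` after each assignment is a simple way to confirm which columns are actually sparse.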
