From 23696b248c22aa044b566bce37ddcd9878f0f9a6 Mon Sep 17 00:00:00 2001
From: Irv Lustig
Date: Fri, 3 Jan 2020 10:16:44 -0500
Subject: [PATCH] DOC: Deprecate pandas.SparseArray

---
 .../development/contributing_docstring.rst |  2 +-
 doc/source/getting_started/basics.rst      |  2 +-
 doc/source/getting_started/dsintro.rst     |  2 +-
 doc/source/reference/arrays.rst            |  4 ++--
 doc/source/user_guide/sparse.rst           | 16 ++++++++--------
 doc/source/whatsnew/v1.0.0.rst             |  1 +
 6 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/doc/source/development/contributing_docstring.rst b/doc/source/development/contributing_docstring.rst
index 34bc5f44eb0c0..d897889ed9eff 100644
--- a/doc/source/development/contributing_docstring.rst
+++ b/doc/source/development/contributing_docstring.rst
@@ -399,7 +399,7 @@ DataFrame:
 * DataFrame
 * pandas.Index
 * pandas.Categorical
-* pandas.SparseArray
+* pandas.arrays.SparseArray
 
 If the exact type is not relevant, but must be compatible with a numpy
 array, array-like can be specified. If Any type that can be iterated is
diff --git a/doc/source/getting_started/basics.rst b/doc/source/getting_started/basics.rst
index f47fa48eb6202..4fef5efbd1551 100644
--- a/doc/source/getting_started/basics.rst
+++ b/doc/source/getting_started/basics.rst
@@ -1951,7 +1951,7 @@ documentation sections for more on each type.
 | period            | :class:`PeriodDtype`      | :class:`Period`    | :class:`arrays.PeriodArray`   | ``'period[]'``,                         | :ref:`timeseries.periods`     |
 | (time spans)      |                           |                    |                               | ``'Period[]'``                          |                               |
 +-------------------+---------------------------+--------------------+-------------------------------+-----------------------------------------+-------------------------------+
-| sparse            | :class:`SparseDtype`      | (none)             | :class:`SparseArray`          | ``'Sparse'``, ``'Sparse[int]'``,        | :ref:`sparse`                 |
+| sparse            | :class:`SparseDtype`      | (none)             | :class:`arrays.SparseArray`   | ``'Sparse'``, ``'Sparse[int]'``,        | :ref:`sparse`                 |
 |                   |                           |                    |                               | ``'Sparse[float]'``                     |                               |
 +-------------------+---------------------------+--------------------+-------------------------------+-----------------------------------------+-------------------------------+
 | intervals         | :class:`IntervalDtype`    | :class:`Interval`  | :class:`arrays.IntervalArray` | ``'interval'``, ``'Interval'``,         | :ref:`advanced.intervalindex` |
diff --git a/doc/source/getting_started/dsintro.rst b/doc/source/getting_started/dsintro.rst
index a07fcbd8b67c4..82d4b5e34e4f8 100644
--- a/doc/source/getting_started/dsintro.rst
+++ b/doc/source/getting_started/dsintro.rst
@@ -741,7 +741,7 @@ implementation takes precedence and a Series is returned.
    np.maximum(ser, idx)
 
 NumPy ufuncs are safe to apply to :class:`Series` backed by non-ndarray arrays,
-for example :class:`SparseArray` (see :ref:`sparse.calculation`). If possible,
+for example :class:`arrays.SparseArray` (see :ref:`sparse.calculation`). If possible,
 the ufunc is applied without converting the underlying data to an ndarray.
 
 Console display
diff --git a/doc/source/reference/arrays.rst b/doc/source/reference/arrays.rst
index 2c8382e916ed8..c71350ecd73b3 100644
--- a/doc/source/reference/arrays.rst
+++ b/doc/source/reference/arrays.rst
@@ -444,13 +444,13 @@ Sparse data
 -----------
 
 Data where a single value is repeated many times (e.g. ``0`` or ``NaN``) may
-be stored efficiently as a :class:`SparseArray`.
+be stored efficiently as a :class:`arrays.SparseArray`.
 
 .. autosummary::
    :toctree: api/
    :template: autosummary/class_without_autosummary.rst
 
-   SparseArray
+   arrays.SparseArray
 
 .. autosummary::
    :toctree: api/
diff --git a/doc/source/user_guide/sparse.rst b/doc/source/user_guide/sparse.rst
index c258a8840b714..8588fac4a18d0 100644
--- a/doc/source/user_guide/sparse.rst
+++ b/doc/source/user_guide/sparse.rst
@@ -15,7 +15,7 @@ can be chosen, including 0) is omitted. The compressed values are not actually s
 
    arr = np.random.randn(10)
    arr[2:-2] = np.nan
-   ts = pd.Series(pd.SparseArray(arr))
+   ts = pd.Series(pd.arrays.SparseArray(arr))
    ts
 
 Notice the dtype, ``Sparse[float64, nan]``. The ``nan`` means that elements in the
@@ -51,7 +51,7 @@ identical to their dense counterparts.
 SparseArray
 -----------
 
-:class:`SparseArray` is a :class:`~pandas.api.extensions.ExtensionArray`
+:class:`arrays.SparseArray` is a :class:`~pandas.api.extensions.ExtensionArray`
 for storing an array of sparse values (see :ref:`basics.dtypes` for more
 on extension arrays). It is a 1-dimensional ndarray-like object storing
 only values distinct from the ``fill_value``:
@@ -61,7 +61,7 @@ only values distinct from the ``fill_value``:
    arr = np.random.randn(10)
    arr[2:5] = np.nan
    arr[7:8] = np.nan
-   sparr = pd.SparseArray(arr)
+   sparr = pd.arrays.SparseArray(arr)
    sparr
 
 A sparse array can be converted to a regular (dense) ndarray with :meth:`numpy.asarray`
@@ -144,7 +144,7 @@ to ``SparseArray`` and get a ``SparseArray`` as a result.
 
 .. ipython:: python
 
-   arr = pd.SparseArray([1., np.nan, np.nan, -2., np.nan])
+   arr = pd.arrays.SparseArray([1., np.nan, np.nan, -2., np.nan])
    np.abs(arr)
 
 
@@ -153,7 +153,7 @@ the correct dense result.
 
 .. ipython:: python
 
-   arr = pd.SparseArray([1., -1, -1, -2., -1], fill_value=-1)
+   arr = pd.arrays.SparseArray([1., -1, -1, -2., -1], fill_value=-1)
    np.abs(arr)
    np.abs(arr).to_dense()
 
@@ -194,7 +194,7 @@ From an array-like, use the regular :class:`Series` or
 .. ipython:: python
 
    # New way
-   pd.DataFrame({"A": pd.SparseArray([0, 1])})
+   pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})
 
 From a SciPy sparse matrix, use :meth:`DataFrame.sparse.from_spmatrix`,
 
@@ -256,10 +256,10 @@ Instead, you'll need to ensure that the values being assigned are sparse
 
 .. ipython:: python
 
-   df = pd.DataFrame({"A": pd.SparseArray([0, 1])})
+   df = pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})
    df['B'] = [0, 0]  # remains dense
    df['B'].dtype
-   df['B'] = pd.SparseArray([0, 0])
+   df['B'] = pd.arrays.SparseArray([0, 0])
    df['B'].dtype
 
 The ``SparseDataFrame.default_kind`` and ``SparseDataFrame.default_fill_value`` attributes
diff --git a/doc/source/whatsnew/v1.0.0.rst b/doc/source/whatsnew/v1.0.0.rst
index a5ea60d0a0d19..59ce49b9144c7 100755
--- a/doc/source/whatsnew/v1.0.0.rst
+++ b/doc/source/whatsnew/v1.0.0.rst
@@ -577,6 +577,7 @@ Deprecations
   it is recommended to use ``json_normalize`` as :func:`pandas.json_normalize` instead (:issue:`27586`).
 - :meth:`DataFrame.to_stata`, :meth:`DataFrame.to_feather`, and :meth:`DataFrame.to_parquet` argument "fname" is deprecated, use "path" instead (:issue:`23574`)
 - The deprecated internal attributes ``_start``, ``_stop`` and ``_step`` of :class:`RangeIndex` now raise a ``FutureWarning`` instead of a ``DeprecationWarning`` (:issue:`26581`)
+- ``pandas.SparseArray`` has been deprecated. Use ``pandas.arrays.SparseArray`` (:class:`arrays.SparseArray`) instead. (:issue:`30642`)
 
 **Selecting Columns from a Grouped DataFrame**
 
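
Reviewer note: the spelling this patch moves the docs to can be checked with a short script. What follows is only a minimal sketch, assuming a pandas installation that exposes ``pandas.arrays.SparseArray``; the variable names (``arr``, ``sparr``, ``ts``, ``abs_sparr``, ``dense``) are illustrative and not part of the patch.

```python
import numpy as np
import pandas as pd

# Mostly-NaN data, in the spirit of the user guide examples touched above.
arr = np.array([1.0, np.nan, np.nan, -2.0, np.nan])

# Spelling after this patch: the class is reached through pandas.arrays.
sparr = pd.arrays.SparseArray(arr)

# Wrapping the sparse array in a Series keeps the sparse dtype.
ts = pd.Series(sparr)

# NumPy ufuncs applied to a SparseArray return a SparseArray.
abs_sparr = np.abs(sparr)

# numpy.asarray converts back to a regular (dense) ndarray.
dense = np.asarray(sparr)

print(ts.dtype)
```

Only the import path changes; the constructor arguments used in the old ``pd.SparseArray(...)`` calls carry over unchanged in every hunk of this patch.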