Commit 1dd05cc

gioiab authored and datapythonista committed
DOC: update the pandas.DataFrame.to_sparse docstring (#20193)
* Updates the documentation for pandas.DataFrame.to_sparse.
* Minor fixes and adding more real world examples
1 parent 3fa171f commit 1dd05cc

1 file changed: +45 -4 lines changed

pandas/core/frame.py

@@ -1599,16 +1599,57 @@ def from_csv(cls, path, header=0, sep=',', index_col=0, parse_dates=True,
 
     def to_sparse(self, fill_value=None, kind='block'):
         """
-        Convert to SparseDataFrame
+        Convert to SparseDataFrame.
+
+        Implement the sparse version of the DataFrame meaning that any data
+        matching a specific value it's omitted in the representation.
+        The sparse DataFrame allows for a more efficient storage.
 
         Parameters
         ----------
-        fill_value : float, default NaN
-        kind : {'block', 'integer'}
+        fill_value : float, default None
+            The specific value that should be omitted in the representation.
+        kind : {'block', 'integer'}, default 'block'
+            The kind of the SparseIndex tracking where data is not equal to
+            the fill value:
+
+            - 'block' tracks only the locations and sizes of blocks of data.
+            - 'integer' keeps an array with all the locations of the data.
+
+            In most cases 'block' is recommended, since it's more memory
+            efficient.
 
         Returns
         -------
-        y : SparseDataFrame
+        SparseDataFrame
+            The sparse representation of the DataFrame.
+
+        See Also
+        --------
+        DataFrame.to_dense :
+            Converts the DataFrame back to the its dense form.
+
+        Examples
+        --------
+        >>> df = pd.DataFrame([(np.nan, np.nan),
+        ...                    (1., np.nan),
+        ...                    (np.nan, 1.)])
+        >>> df
+             0    1
+        0  NaN  NaN
+        1  1.0  NaN
+        2  NaN  1.0
+        >>> type(df)
+        <class 'pandas.core.frame.DataFrame'>
+
+        >>> sdf = df.to_sparse()
+        >>> sdf
+             0    1
+        0  NaN  NaN
+        1  1.0  NaN
+        2  NaN  1.0
+        >>> type(sdf)
+        <class 'pandas.core.sparse.frame.SparseDataFrame'>
         """
         from pandas.core.sparse.frame import SparseDataFrame
         return SparseDataFrame(self._series, index=self.index,
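
The new docstring distinguishes the 'block' and 'integer' kinds of sparse index only in prose. The difference may be easier to see in a small pure-Python sketch. This is not pandas' SparseIndex implementation: the helper names (`is_fill`, `integer_index`, `block_index`, `to_dense`) are hypothetical and exist only for illustration. The sketch assumes NaN is the value being omitted, as in the docstring's example.

```python
import math


def is_fill(value, fill_value):
    """Return True if value equals fill_value, treating NaN as equal to NaN."""
    if isinstance(fill_value, float) and math.isnan(fill_value):
        return isinstance(value, float) and math.isnan(value)
    return value == fill_value


def integer_index(values, fill_value=float("nan")):
    """'integer' style: list every position that holds a non-fill value."""
    return [i for i, v in enumerate(values) if not is_fill(v, fill_value)]


def block_index(values, fill_value=float("nan")):
    """'block' style: record only (start, length) of each run of non-fill values."""
    blocks = []
    start = None
    for i, v in enumerate(values):
        if not is_fill(v, fill_value):
            if start is None:
                start = i  # a new run of stored data begins here
        elif start is not None:
            blocks.append((start, i - start))  # the current run just ended
            start = None
    if start is not None:
        blocks.append((start, len(values) - start))  # run reaching the end
    return blocks


def to_dense(stored, positions, length, fill_value=float("nan")):
    """Rebuild the dense list from the stored values and their positions."""
    dense = [fill_value] * length
    for pos, val in zip(positions, stored):
        dense[pos] = val
    return dense


nan = float("nan")
column = [nan, 1.0, 2.0, 3.0, nan, nan, 4.0, 5.0]

positions = integer_index(column)        # [1, 2, 3, 6, 7]
blocks = block_index(column)             # [(1, 3), (6, 2)]
stored = [column[i] for i in positions]  # [1.0, 2.0, 3.0, 4.0, 5.0]
dense = to_dense(stored, positions, len(column))
```

In this toy column the five stored values sit in two contiguous runs, so the block form needs two (start, length) pairs where the integer form needs five positions. That is the intuition behind the docstring's remark that 'block' is usually more memory efficient. When the stored values are scattered one by one, the block form loses that advantage.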
