SparseSeries accepts scipy.sparse.spmatrix in constructor #16617
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,6 +25,10 @@ New features | |
- Added `__fspath__` method to :class:`pandas.HDFStore`, :class:`pandas.ExcelFile`, | ||
and :class:`pandas.ExcelWriter` to work properly with the file system path protocol (:issue:`13823`) | ||
|
||
- ``SparseSeries`` and ``SparseArray`` now support 1d ``scipy.sparse.spmatrix`` in constructor. | ||
Review comment: in the constructor
||
Additionally, ``SparseDataFrame`` can be assigned columns of ``scipy.sparse.spmatrix``; | ||
Review comment: make this second sentence a separate bullet point (you can use the same issue number on both of them, or the second one should maybe be the PR number)
||
see :ref:`here <sparse.scipysparse_series>`. (:issue:`15634`) | ||
|
||
.. _whatsnew_0210.enhancements.other: | ||
|
||
Other Enhancements | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -433,6 +433,16 @@ def __getitem__(self, key): | |
else: | ||
return self._get_item_cache(key) | ||
|
||
def __setitem__(self, key, value): | ||
if is_scipy_sparse(value): | ||
if any(ax == 1 for ax in value.shape): # 1d spmatrix | ||
value = SparseArray(value, fill_value=self._default_fill_value, | ||
kind=self._default_kind) | ||
else: | ||
# 2d; make it iterable | ||
value = list(value.tocsc().T) | ||
Review comment: does this materialize?
||
super(SparseDataFrame, self).__setitem__(key, value) | ||
|
||
@Appender(DataFrame.get_value.__doc__, indents=0) | ||
def get_value(self, index, col, takeable=False): | ||
if takeable is True: | ||
|
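The two branches of `__setitem__` above, and the reviewer's question about materialization, can be explored outside pandas. The following is a minimal sketch that assumes only NumPy and SciPy; `split_columns` is a hypothetical helper that copies the branch logic from the diff and is not part of pandas. It suggests that iterating over the transposed CSC matrix yields one sparse row per original column rather than dense arrays, although the list itself holds one object per column.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def split_columns(value):
    """Hypothetical helper mirroring the branches in the diff above."""
    if any(ax == 1 for ax in value.shape):
        # A single row or a single column is treated as one 1-d column.
        return [value]
    # 2-d input: transposing the CSC form puts each original column in a
    # row, and iterating over a sparse matrix yields its rows one by one.
    return list(value.tocsc().T)


spm = csr_matrix(np.eye(4)[:, :2])  # 4 rows, 2 columns
cols = split_columns(spm)

for col in cols:
    # Each piece is still a scipy sparse matrix of shape (1, n_rows).
    print(type(col).__name__, col.shape, issparse(col))
```

Each element would still need to be converted to a `SparseArray` by the parent `__setitem__`, which this sketch does not attempt to reproduce.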
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -540,6 +540,37 @@ def test_setitem_array(self): | |
self.frame['F'].reindex(index), | ||
check_names=False) | ||
|
||
def test_setitem_spmatrix(self): | ||
# GH-15634 | ||
tm.skip_if_no_package('scipy') | ||
from scipy.sparse import csr_matrix | ||
|
||
sdf = self.frame.copy(False) | ||
|
||
def _equal(spm1, spm2): | ||
return np.all(spm1.toarray() == spm2.toarray()) | ||
|
||
# 1d -- column | ||
spm = csr_matrix(np.arange(len(sdf))).T | ||
sdf['X'] = spm | ||
assert _equal(sdf[['X']].to_coo(), spm) | ||
|
||
Review comment: this comparison on the scipy side is fine, but also let's compare with assert_sparse_series/frame_equal
||
# 1d -- existing column | ||
sdf['A'] = spm.T | ||
assert _equal(sdf[['A']].to_coo(), spm) | ||
|
||
# 1d row -- changing series contents not yet supported | ||
spm = csr_matrix(np.arange(sdf.shape[1], dtype=float)) | ||
idx = np.zeros(sdf.shape[0], dtype=bool) | ||
Review comment: can you test with
||
idx[1] = True | ||
tm.assert_raises_regex(TypeError, 'assignment', | ||
lambda: sdf.__setitem__(idx, spm)) | ||
|
||
# 2d -- 2 columns | ||
spm = csr_matrix(np.eye(len(sdf))[:, :2]) | ||
sdf[['X', 'A']] = spm | ||
assert _equal(sdf[['X', 'A']].to_coo(), spm) | ||
|
||
def test_delitem(self): | ||
A = self.frame['A'] | ||
C = self.frame['C'] | ||
|
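The `_equal` helper in the test above compares the two matrices by converting both to dense arrays. As an alternative on the scipy side, the sketch below compares them while staying sparse. It assumes only NumPy and SciPy; `sparse_equal` is a hypothetical name and is not used in the PR, and it does not replace the pandas-level `assert_sparse_series/frame_equal` check the reviewer asks for.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sparse_equal(spm1, spm2):
    """Hypothetical helper: compare two sparse matrices without densifying."""
    if spm1.shape != spm2.shape:
        return False
    # `!=` between two sparse matrices returns a sparse boolean matrix that
    # stores only the positions where the operands differ.
    return (spm1 != spm2).nnz == 0


a = csr_matrix(np.arange(5)).T               # column built as in the test
b = csr_matrix(np.arange(5).reshape(-1, 1))  # same values, built directly
c = csr_matrix(np.eye(5)[:, :1])             # different values, same shape

print(sparse_equal(a, b), sparse_equal(a, c), sparse_equal(a, a.T))
```

The explicit shape check keeps a row and a column with the same values from being reported as equal.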
Review comment: say that this is deprecated in 0.21.0