API/DEPR: deprecate SparseSeries.from_coo and accept in constructor #15634

Closed
jreback opened this issue Mar 9, 2017 · 6 comments · Fixed by #28425
Labels
Deprecate Functionality to remove in pandas Sparse Sparse Data Type
Milestone

Comments

@jreback
Contributor

jreback commented Mar 9, 2017

xref #15497

using SparseDataFrame as the model.

@jreback jreback added Deprecate Functionality to remove in pandas Difficulty Intermediate Sparse Sparse Data Type labels Mar 9, 2017
@jreback jreback added this to the 0.20.0 milestone Mar 9, 2017
@jreback
Contributor Author

jreback commented Mar 9, 2017

cc @kernc

@kernc
Contributor

kernc commented Mar 20, 2017

So how would this work? Would it have the exact same functionality as the current .from_coo, which assigns a MultiIndex? Note that the series returned by from_coo currently differs from a series sliced from a sparse data frame:

>>> import numpy as np
>>> import pandas as pd
>>> import scipy.sparse
>>> spm = scipy.sparse.coo_matrix(np.eye(5))

>>> pd.SparseSeries.from_coo(spm)
0  0    1.0
1  1    1.0
2  2    1.0
3  3    1.0
4  4    1.0
dtype: float64
BlockIndex
Block locations: array([0], dtype=int32)
Block lengths: array([5], dtype=int32)

>>> pd.SparseSeries.from_coo(spm.tocsr()[0].tocoo())  # first row only
0  0    1.0
dtype: float64
BlockIndex
Block locations: array([0], dtype=int32)
Block lengths: array([1], dtype=int32)

>>> pd.SparseDataFrame(spm).iloc[0]  # first row sliced from df
0    1.0
1    NaN
2    NaN
3    NaN
4    NaN
Name: 0, dtype: float64
BlockIndex
Block locations: array([0], dtype=int32)
Block lengths: array([1], dtype=int32)
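
The difference between the two layouts above can be sketched without pandas. This is only a minimal pure-Python illustration, not how pandas implements either path; the helper names `coo_to_pair_items` and `dense_row` are hypothetical.

```python
import math

def identity_coo(n):
    """Return (rows, cols, values) triplets for an n x n identity matrix."""
    idx = list(range(n))
    return idx, idx, [1.0] * n

def coo_to_pair_items(rows, cols, values):
    """from_coo-style layout: one entry per stored value, keyed by (row, col)."""
    return {(r, c): v for r, c, v in zip(rows, cols, values)}

def dense_row(rows, cols, values, row, ncols):
    """Frame-slice-style layout: one entry per column, NaN where nothing is stored."""
    out = [math.nan] * ncols
    for r, c, v in zip(rows, cols, values):
        if r == row:
            out[c] = v
    return out

rows, cols, values = identity_coo(5)
by_pair = coo_to_pair_items(rows, cols, values)             # keys are (row, col) pairs
first_row = dense_row(rows, cols, values, row=0, ncols=5)   # full-length row with gaps
print(sorted(by_pair))
print(first_row)
```

The first result has one entry per stored value and a two-level key, while the second always has one entry per column, which is why the two series do not look alike.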

@jreback
Contributor Author

jreback commented Mar 20, 2017

Looking at this again, I think we should just fully deprecate this method. You can do the exact same thing with SparseDataFrame, and it's more natural. Though I can see a MultiIndex with a set of data is pretty much what COO is.

I could see SparseSeries accepting a 1-d sparse structure though (is there a concept in scipy)?

Maybe have a read through the original issue and see what the use case was.

@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017
@kernc
Contributor

kernc commented Apr 24, 2017

> a MultiIndex with a set of data is pretty much what COO is

But that's an implementation detail of what is generally an arbitrary n-d structure. (row, col), value triplets could just as well (or better) be represented with a non-sparse Series.

> I could see SparseSeries accepting a 1-d sparse structure though (is there a concept in scipy)?

All scipy sparse matrices are ndim == 2. Since spm.mean(0) is a 1x5 matrix, we still might accept such a matrix in the constructor when any dimension has length 1. Will prepare something. Is it too late to get it in v0.20.0?
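
The "any dimension has length 1" rule might look roughly like the sketch below. It works on plain COO triplets rather than a real scipy matrix, and `flatten_if_vector` is a hypothetical helper, not an existing pandas or scipy function.

```python
def flatten_if_vector(shape, rows, cols, values):
    """Accept 2-d COO input only when one dimension has length 1.

    Returns (length, {position: value}) describing the 1-d result.
    Hypothetical helper illustrating the rule proposed in this thread.
    """
    nrows, ncols = shape
    if nrows == 1:
        # Row vector: positions come from the column indices.
        return ncols, dict(zip(cols, values))
    if ncols == 1:
        # Column vector: positions come from the row indices.
        return nrows, dict(zip(rows, values))
    raise ValueError(f"expected a 1xN or Nx1 matrix, got shape {shape}")

# A 1x5 row vector with two stored values.
length, data = flatten_if_vector((1, 5), rows=[0, 0], cols=[1, 3], values=[0.2, 0.2])
print(length, data)
```

A genuinely two-dimensional input such as a 5x5 matrix raises `ValueError` here, leaving that case to the frame constructor.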


to_coo/from_coo were introduced in PR #9076, fixing #8048.
The main use case seems to have been the use of SparseSeries.unstack() to shape the MultiIndex sparse series into a sparse data frame. So indeed, using the frame is more natural.
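
The unstack step described above amounts to regrouping (row, col)-keyed values by column. A minimal pure-Python sketch of that idea follows; `unstack_pairs` is a hypothetical name, and this is not the SparseSeries.unstack() implementation.

```python
def unstack_pairs(series):
    """Turn {(row, col): value} into {col: {row: value}}, a column-oriented frame."""
    frame = {}
    for (row, col), value in series.items():
        frame.setdefault(col, {})[row] = value
    return frame

# Only stored values are listed; missing (row, col) pairs stay absent.
series = {(0, 0): 1.0, (1, 1): 1.0, (1, 2): 3.0}
frame = unstack_pairs(series)
print(frame)
```

Building the frame directly from the matrix skips this intermediate (row, col)-keyed series altogether.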

@jreback
Contributor Author

jreback commented Apr 27, 2017

@kernc yeah the more I look at this the more I think we should just remove these from Series entirely. Is there truly a useful case for having them on Series?

@jreback
Contributor Author

jreback commented Apr 27, 2017

> Will prepare something. Is it too late to get it in v0.20.0?

yes; to the extent that something is a true bug fix or simply an enhancement it can go in a point release (e.g. 0.20.1)
