Commit 76e3696

topper-123 authored and pooja-subramaniam committed
API: Harmonize dtype for index levels for Series.sparse.from_coo (pandas-dev#50926)
* API: Harmonize dtype for index levels for Series.sparse.from_coo
* add gh number
1 parent 1317afe commit 76e3696

3 files changed: +5 -10 lines changed

doc/source/whatsnew/v2.0.0.rst

+1
@@ -602,6 +602,7 @@ Other API changes
   methods to get a full slice (for example ``df.loc[:]`` or ``df[:]``) (:issue:`49469`)
 - Disallow computing ``cumprod`` for :class:`Timedelta` object; previously this returned incorrect values (:issue:`50246`)
 - Loading a JSON file with duplicate columns using ``read_json(orient='split')`` renames columns to avoid duplicates, as :func:`read_csv` and the other readers do (:issue:`50370`)
+- The levels of the index of the :class:`Series` returned from ``Series.sparse.from_coo`` now always have dtype ``int32``. Previously they had dtype ``int64`` (:issue:`50926`)
 - :func:`to_datetime` with ``unit`` of either "Y" or "M" will now raise if a sequence contains a non-round ``float`` value, matching the ``Timestamp`` behavior (:issue:`50301`)
 -
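The whatsnew entry above can be checked with a short script. This is a minimal sketch, assuming SciPy and pandas are installed; the helper name `coo_index_level_dtypes` is hypothetical and not part of the commit. On a pandas build that includes this change, both calls are expected to report the same integer dtype for each level, but the exact dtypes printed depend on the installed pandas and SciPy versions.

```python
import pandas as pd
import scipy.sparse


def coo_index_level_dtypes(dense_index: bool) -> list:
    """Return the dtype of each index level produced by Series.sparse.from_coo.

    Hypothetical helper for illustration only.
    """
    # Same input matrix as the test in this commit: a 3x3 identity in COO format.
    A = scipy.sparse.eye(3, format="coo", dtype="float64")
    ser = pd.Series.sparse.from_coo(A, dense_index=dense_index)
    return [level.dtype for level in ser.index.levels]


for dense_index in (False, True):
    # Print the level dtypes for the sparse and the dense index variants.
    print(dense_index, coo_index_level_dtypes(dense_index))
```

Comparing the two printed lines shows whether the installed version still treats `dense_index=True` differently from `dense_index=False`.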

pandas/core/arrays/sparse/scipy_sparse.py

+1 -4
@@ -203,9 +203,6 @@ def coo_to_sparse_series(
     ser = ser.sort_index()
     ser = ser.astype(SparseDtype(ser.dtype))
     if dense_index:
-        # is there a better constructor method to use here?
-        i = range(A.shape[0])
-        j = range(A.shape[1])
-        ind = MultiIndex.from_product([i, j])
+        ind = MultiIndex.from_product([A.row, A.col])
         ser = ser.reindex(ind)
     return ser
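The dtype difference behind this hunk can be reproduced without SciPy. The sketch below assumes that the `row` and `col` arrays of a COO matrix are `int32`, as the removed test comment in this commit states, and stands in for them with hand-built NumPy arrays (`rows` and `cols` are illustrative names). Building the `MultiIndex` from `range` objects gives pandas no narrow dtype to preserve, whereas building it from `int32` arrays lets pandas versions that support `int32` indexes keep that dtype.

```python
import numpy as np
import pandas as pd

# Stand-ins for A.row and A.col of a 3x3 identity matrix in COO format
# (assumed to be int32, per the comment removed from the test in this commit).
rows = np.array([0, 1, 2], dtype=np.int32)
cols = np.array([0, 1, 2], dtype=np.int32)

# Old construction: levels come from Python ranges.
from_ranges = pd.MultiIndex.from_product([range(3), range(3)])

# New construction: levels come from the coordinate arrays themselves.
from_arrays = pd.MultiIndex.from_product([rows, cols])

range_level_dtypes = [level.dtype for level in from_ranges.levels]
array_level_dtypes = [level.dtype for level in from_arrays.levels]

print("from ranges:", range_level_dtypes)
print("from arrays:", array_level_dtypes)
```

Both indexes contain the same nine (row, column) pairs; only the dtype of the levels can differ, and the result depends on the installed pandas version.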

pandas/tests/arrays/sparse/test_accessor.py

+3 -6
@@ -218,14 +218,11 @@ def test_series_from_coo(self, dtype, dense_index):
         A = scipy.sparse.eye(3, format="coo", dtype=dtype)
         result = pd.Series.sparse.from_coo(A, dense_index=dense_index)
 
-        # TODO: GH49560: scipy.sparse.eye always has A.row and A.col dtype as int32.
-        # fix index_dtype to follow scipy.sparse convention (always int32)?
-        index_dtype = np.int64 if dense_index else np.int32
         index = pd.MultiIndex.from_tuples(
             [
-                np.array([0, 0], dtype=index_dtype),
-                np.array([1, 1], dtype=index_dtype),
-                np.array([2, 2], dtype=index_dtype),
+                np.array([0, 0], dtype=np.int32),
+                np.array([1, 1], dtype=np.int32),
+                np.array([2, 2], dtype=np.int32),
             ],
         )
         expected = pd.Series(SparseArray(np.array([1, 1, 1], dtype=dtype)), index=index)
