REGR: concat of Sparse with incompatible dtype now gives Sparse[object] instead of object #34336

jorisvandenbossche · 2020-05-23T12:00:32Z

Regression on master related to the common_dtype mechanism, I suppose:

In [5]: pd.__version__    
Out[5]: '1.0.3'

In [6]: s1 = pd.Series([1, 0, 2], dtype=pd.SparseDtype("int64", 0)) 

In [7]: s2 = pd.Series(["a", "b", "c"], dtype="category") 

In [8]: pd.concat([s1, s2]) 
Out[8]: 
0    1
1    0
2    2
0    a
1    b
2    c
dtype: object

In [9]: pd.concat([s2, s1]) 
Out[9]: 
0    a
1    b
2    c
0    1
1    0
2    2
dtype: object

(and the same on v0.25.3)

But on master:

# raising in SparseDtype._get_common_dtype
In [3]: pd.concat([s1, s2])  
...
TypeError: data type not understood

In [4]: pd.concat([s2, s1])   
Out[4]: 
0    a
1    b
2    c
0    1
1    0
2    2
dtype: Sparse[object, 0]

jorisvandenbossche · 2020-05-23T12:24:51Z

So the behaviour is actually a bit more complicated.

This is also on pandas 1.0.3:

In [9]: s3 = pd.Series(["a", "b", "c"]) 

In [10]: pd.concat([s1, s3]) 
Out[10]: 
0    1
1    0
2    2
0    a
1    b
2    c
dtype: Sparse[object, 0]

So my original example with a string categorical resulted in object dtype, whereas a plain string Series results in Sparse[object].

I am not fully sure whether we should always try to preserve "sparseness" (giving Sparse[object]) or follow the "normal" rule (concatenating incompatible dtypes results in object dtype).
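For anyone who needs the "normal" rule regardless of input order or pandas version, a minimal sketch of a workaround is to densify and cast to object before concatenating. The helper name `concat_as_object` is hypothetical and not part of pandas; it only uses `Series.sparse.to_dense()`, `Series.astype` and `pd.concat`.

```python
import pandas as pd


def concat_as_object(series_list):
    """Concatenate Series after densifying sparse inputs and casting to object.

    Hypothetical helper for illustration only, not a pandas API.
    """
    dense = []
    for s in series_list:
        if isinstance(s.dtype, pd.SparseDtype):
            # Drop the sparse storage first, so the cast below gives a plain
            # NumPy-backed Series instead of a Sparse[object] one.
            s = s.sparse.to_dense()
        dense.append(s.astype(object))
    return pd.concat(dense)


s1 = pd.Series([1, 0, 2], dtype=pd.SparseDtype("int64", 0))
s2 = pd.Series(["a", "b", "c"], dtype="category")

result = concat_as_object([s1, s2])
is_sparse = isinstance(result.dtype, pd.SparseDtype)
```

Here `result.dtype` is plain object and `is_sparse` is False for either input order, at the cost of giving up the sparse storage.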

cc @TomAugspurger

jorisvandenbossche added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and Sparse (Sparse Data Type) labels May 23, 2020

jorisvandenbossche added this to the 1.1 milestone May 23, 2020

jorisvandenbossche self-assigned this May 23, 2020

jorisvandenbossche added the Blocker (blocking issue or pull request for an upcoming release) label May 23, 2020

jorisvandenbossche mentioned this issue May 23, 2020

BUG: fix concat of Sparse with non-sparse dtypes #34338

Merged

jorisvandenbossche closed this as completed in #34338 May 29, 2020
