Regression on master related to the common_dtype mechanism, I suppose:
In [5]: pd.__version__
Out[5]: '1.0.3'
In [6]: s1 = pd.Series([1, 0, 2], dtype=pd.SparseDtype("int64", 0))
In [7]: s2 = pd.Series(["a", "b", "c"], dtype="category")
In [8]: pd.concat([s1, s2])
Out[8]:
0 1
1 0
2 2
0 a
1 b
2 c
dtype: object
In [9]: pd.concat([s2, s1])
Out[9]:
0 a
1 b
2 c
0 1
1 0
2 2
dtype: object
(and the same on v0.25.3)
But on master:
# raising in SparseDtype._get_common_dtype
In [3]: pd.concat([s1, s2])
...
TypeError: data type not understood
In [4]: pd.concat([s2, s1])
Out[4]:
0 a
1 b
2 c
0 1
1 0
2 2
dtype: Sparse[object, 0]
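To compare the two orderings across pandas versions, something like the following sketch may help. The helper name `concat_result_dtype` is hypothetical (it is not part of pandas); it only reports the resulting dtype, or the `TypeError` if one is raised, and makes no claim about which result is correct.

```python
import pandas as pd


def concat_result_dtype(first, second):
    """Return the dtype of pd.concat([first, second]) as a string,
    or a description of the TypeError if concat raises one."""
    try:
        return str(pd.concat([first, second]).dtype)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Same inputs as in the report above.
s1 = pd.Series([1, 0, 2], dtype=pd.SparseDtype("int64", 0))
s2 = pd.Series(["a", "b", "c"], dtype="category")

# Print the installed version and the result for both orderings.
print(pd.__version__)
print("concat([s1, s2]) ->", concat_result_dtype(s1, s2))
print("concat([s2, s1]) ->", concat_result_dtype(s2, s1))
```

The printed dtypes depend on the installed pandas version, so no expected output is given here.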
So actually the behaviour is a bit more complicated.
This is also on pandas 1.0.3:
In [9]: s3 = pd.Series(["a", "b", "c"])
In [10]: pd.concat([s1, s3])
Out[10]:
0 1
1 0
2 2
0 a
1 b
2 c
dtype: Sparse[object, 0]
So my original example with a string categorical resulted in object dtype, whereas a plain string Series results in Sparse[object].
I am not fully sure whether we should always try to preserve "sparseness" (giving Sparse[object]) or follow the "normal" rule (concatenating incompatible dtypes results in object dtype).
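Until that question is settled, a caller who wants the "normal" rule regardless of version can densify the sparse Series before concatenating. This is only a workaround sketch, not a proposal for what `pd.concat` itself should do; `s3` is given an explicit `dtype=object` here (an assumption added for this example) so that the result does not depend on how pandas infers string data.

```python
import pandas as pd

s1 = pd.Series([1, 0, 2], dtype=pd.SparseDtype("int64", 0))
# Explicit object dtype, so the example does not rely on string inference.
s3 = pd.Series(["a", "b", "c"], dtype=object)

# Convert the sparse values to a regular int64 Series first ...
dense = s1.sparse.to_dense()

# ... so that concat only sees int64 and object, which combine to object.
result = pd.concat([dense, s3])
print(result.dtype)
```

This gives up the sparse storage, which is acceptable only when the combined data is small or mostly non-fill values anyway.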