The issue is in df._sanitize_column, which returns a 2-D dense array containing only the non-fill elements:
In [9]: df=pd.DataFrame({'c_1': list('abc')})
In [10]: sp_col=pd.Series([0,0,1]).to_sparse(fill_value=0)
In [11]: df._sanitize_column('n', sp_col)
Out[11]: array([[1]])
Hacking _sanitize_column is easy, but it uncovers yet another issue with ndarray subclassing:
In [54]: sp_arr=pd.SparseArray([0,0,1], fill_value=0)
In [55]: sp_arr
Out[55]:
[0, 0, 1.0]
Fill: 0
IntIndex
Indices: array([2], dtype=int32)
In [56]: np.asarray(sp_arr)
Out[56]: array([ 1.])
This happens because np.asarray checks at the C level that sp_arr provides the PEP 3118 buffer interface (which ndarray does) and uses that representation, which contains only the non-fill elements. This is unfortunate because it cannot be overridden by an inheriting class at the Python level (see python issue).
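The behaviour can be reproduced without pandas. Below is a minimal sketch, assuming a hypothetical `ToySparse` class (not the real SparseArray implementation) that subclasses ndarray and keeps only the non-fill values in its buffer. The class name, its attributes, and `to_dense` are made up for illustration. Even though it defines `__array__`, np.asarray appears to take the ndarray fast path and return the raw buffer:

```python
import numpy as np


class ToySparse(np.ndarray):
    """Hypothetical stand-in for SparseArray: the ndarray buffer
    stores only the non-fill values."""

    def __new__(cls, data, fill_value=0):
        dense = np.asarray(data)
        mask = dense != fill_value
        # The underlying buffer holds just the non-fill elements.
        obj = dense[mask].view(cls)
        obj.fill_value = fill_value
        obj.indices = np.flatnonzero(mask)
        obj.length = len(dense)
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        # Propagate metadata to views and slices.
        self.fill_value = getattr(obj, "fill_value", 0)
        self.indices = getattr(obj, "indices", None)
        self.length = getattr(obj, "length", None)

    def __array__(self, dtype=None, copy=None):
        # Attempted Python-level override; np.asarray does not
        # consult it for ndarray subclasses.
        return self.to_dense()

    def to_dense(self):
        # Explicit conversion that restores the fill values.
        out = np.full(self.length, self.fill_value, dtype=self.dtype)
        out[self.indices] = self.view(np.ndarray)
        return out


sp = ToySparse([0, 0, 1], fill_value=0)
raw = np.asarray(sp)      # base-class view of the buffer
dense = sp.to_dense()     # full-length array with fill values
print(raw, dense)
```

So any code path that calls np.asarray on such a subclass has to be special-cased to call an explicit dense conversion instead.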
from SO
thought this was well tested.....