Behaviour of Categorical inputs to sparse data structures #19278
Comments
why would you actually want to do this? using object dtypes in sparse has very little utility and is barely supported. |
Okay, so make it invalid. I'll admit I don't have a use case for it. It
just came up when looking into the implementation of unstack and how to
make that sparse. I still think the SparseSeries error is inappropriate.
…On 17 Jan 2018 11:40 pm, "Jeff Reback" ***@***.***> wrote:
why would you actually want to do this? using object dtypes in sparse has
very little utility and is barely supported.
|
would appreciate a PR |
@jnothman am happy to have a look at this if it is needed. Am having a little trouble understanding what you mean by 'make it invalid' though? |
I am currently working on this issue and am thinking that the solution is to add a check to the
    elif isinstance(data, CategoricalDtype):
        if dtype is not None:
            data = data.astype(dtype)
        if index is None:
            index = data.index.view()
        else:
            data = data.reindex(index, copy=False)
However the boolean
For reference this elif block would be added at ~ line 174 of |
|
@LEO-E-100 are you still working on this issue? |
We can re-purpose this issue to be for allowing |
Code Sample, a copy-pastable example if possible
Problem description
SparseArray and SparseDataFrame (or when calling Series.to_sparse()). This is inconsistent with the categorical dtype retained by dense Series and DataFrame.
Expected Output
SparseDataFrame({'a': c})['a'].dtype == SparseSeries(c).dtype == SparseArray(c).dtype == Series(c).dtype
or at a minimum: SparseSeries(c) raises no error, and produces object dtype.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 17.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_AU.UTF-8
LOCALE: en_AU.UTF-8
pandas: 0+unknown
pytest: None
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.14.0
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
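For readers who want to see the dense behaviour that the expected output compares against, here is a minimal sketch. The value of `c` is an assumption (the issue's own definition of `c` is not shown here), and only the dense `Series` and `DataFrame` side is exercised, since `SparseSeries` and `SparseDataFrame` were removed in later pandas releases.

```python
import pandas as pd

# Hypothetical input: the issue's own definition of `c` is not available here.
c = pd.Categorical(["a", "b", "a", "c"])

dense_series = pd.Series(c)
dense_frame = pd.DataFrame({"a": c})

# Both dense containers keep the categorical dtype instead of falling back to object.
print(dense_series.dtype)
print(dense_frame["a"].dtype)

# The two dtypes compare equal because they share the same categories and ordering.
print(dense_series.dtype == dense_frame["a"].dtype)
```

This is the consistency the report asks the sparse structures to match, or failing that, to at least accept the input without raising.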