Building an empty SparseDataFrame by columns takes very long compared to by index. #16197
Comments
You can try this (available in 0.20.0, releasing shortly): http://pandas-docs.github.io/pandas-docs-travis/sparse.html#sparsedataframe, though I suspect that it may not work. pandas sparse is row based, NOT column based, so your size would blow up memory.
Thank you.
pandas-dev#16191) (cherry picked from commit 1c0b632)
Code Sample, a copy-pastable example if possible
I want to create a sparse matrix with a 4-level MultiIndex and about 340 000 x 340 000 cells.
It is not possible to build it as a dense frame and then convert it to sparse.
So I tried to build it directly as a SparseDataFrame.
But when I tried to construct it with:
or
a whole night was not enough to build the empty SparseDataFrame.
I don't understand how to build this empty SparseDataFrame in a reasonable time (less than 20 minutes with 8 GB of RAM).
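For scale, here is a rough back-of-the-envelope estimate of why the dense route is out of reach. It is only a sketch: it assumes 8 bytes per cell (float64), which the issue does not state, and it ignores index and object overhead. The variable names are illustrative only.

```python
# Rough estimate of the dense memory footprint of a 340 000 x 340 000 frame.
# Assumption (not stated in the issue): each cell is a float64, i.e. 8 bytes.
n_rows = 340_000
n_cols = 340_000
bytes_per_cell = 8  # assumed float64

n_cells = n_rows * n_cols               # total number of cells
dense_bytes = n_cells * bytes_per_cell  # raw payload, no index overhead
dense_gb = dense_bytes / 1e9            # decimal gigabytes
ram_gb = 8                              # RAM available, as given in the issue

print(f"cells: {n_cells:,}")
print(f"dense size: {dense_gb:,.1f} GB")
print(f"ratio to available RAM: {dense_gb / ram_gb:,.1f}x")
```

Under that assumption the dense frame would need on the order of 900 GB, more than a hundred times the 8 GB available, so only a representation that stores the non-empty cells can fit.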
Output of pd.show_versions()
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.19.0
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: 2.45.0