BUG - sparse dataframes lose multi-index column names #11600
Comments
Sparse has not gotten a lot of love, so pull requests are welcome.
I've been looking for a way to get involved with pandas and contribute, so maybe this will be my start. Although once I get set up, I see some other stuff on the list that looks less challenging :)
That would be great! There are a bunch of sparse issues on the easier side as well (though I don't think this one is terribly involved). Let me know when you need help.
See http://pandas.pydata.org/pandas-docs/stable/contributing.html for how to contribute.
Yup, I already have it forked and cloned to my desktop and am exploring how the code connects. That page was quite useful!
Okay, I think I have it fixed. The test code is longer than the fix code by far. It passes the most obvious testing. I'm going to run the whole suite of tests, then will get you a pull request to review.
OK, pull request submitted. Ran all nosetests and got OK (SKIP=116) on 9172 tests. If this gets integrated I'll post a short update as an answer on SO.
Closed by #11606.
From SO: http://stackoverflow.com/questions/33702198/do-python-pandas-sparse-dataframes-lose-multi-index-column-names-or-am-i-doing-i
The bug is simple in concept: a multi-index with column level names loses those names when converted to a sparse dataframe.
Minimal example - first create a multi-index dataframe:
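The original snippet is not reproduced here, so the following is only a sketch of the kind of frame described. The level names (`col_outer`, `col_inner`, `row_outer`, `row_inner`), the labels, and the data values are all hypothetical and were chosen purely for illustration.

```python
import numpy as np
import pandas as pd

# Level names and labels below are hypothetical, chosen only for illustration.
columns = pd.MultiIndex.from_product(
    [["a", "b"], ["x", "y"]], names=["col_outer", "col_inner"]
)
index = pd.MultiIndex.from_product(
    [["r1", "r2"], [1, 2]], names=["row_outer", "row_inner"]
)

# Mostly-NaN data, so a sparse representation would actually be worthwhile.
data = np.full((4, 4), np.nan)
data[0, 0] = 1.0
data[3, 2] = 2.0

df = pd.DataFrame(data, index=index, columns=columns)

# Both axes carry their level names at this point.
print(df.columns.names)
print(df.index.names)
```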
This gives us a nice test multi-index with column and row level names. Now if I make a sparse matrix out of that and show it, the column level names are gone.
And if I convert the sparse version back to dense those level names are still gone.
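A rough way to check the round trip is sketched below. It is not the original snippet: the helper name `sparse_round_trip` and the level names are hypothetical. The `to_sparse()` branch uses the API that existed when this issue was filed; the fallback assumes a newer pandas where that method is gone and sparse columns are created with `pd.SparseDtype` instead. On a pandas version affected by this bug, the printed column level names would be expected to come back empty.

```python
import numpy as np
import pandas as pd


def sparse_round_trip(df):
    """Convert a dense frame to sparse and back; return (sparse, dense)."""
    if hasattr(df, "to_sparse"):
        # Legacy API, in use when this issue was filed (removed in pandas 1.0).
        sparse = df.to_sparse()
        dense = sparse.to_dense()
    else:
        # Assumption: newer pandas, where sparse columns use SparseDtype
        # and the frame-level .sparse accessor converts back to dense.
        sparse = df.astype(pd.SparseDtype("float64", np.nan))
        dense = sparse.sparse.to_dense()
    return sparse, dense


# Hypothetical level names, chosen only for illustration.
columns = pd.MultiIndex.from_product(
    [["a", "b"], ["x", "y"]], names=["col_outer", "col_inner"]
)
data = np.full((3, 4), np.nan)
data[0, 0] = 1.0
df = pd.DataFrame(data, columns=columns)

sparse, dense = sparse_round_trip(df)

# Compare the column level names at each stage of the round trip.
print("original:", list(df.columns.names))
print("sparse:  ", list(sparse.columns.names))
print("dense:   ", list(dense.columns.names))
```

Comparing `columns.names` directly, rather than relying on how the frame is displayed, shows whether the names are dropped at the conversion to sparse or only on the way back to dense.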
I AM aware that displaying the sparse version calls to_dense(), but the loss appears to be happening at the conversion to sparse. I'm exploring moving to sparse to reduce memory usage for a code base, and my attempts to access the levels within the sparse dataframe generate "KeyError: 'Level not found'".