BUG: SparseDataFrame does not allow single value data #5470

Closed
BrenBarn opened this issue Nov 8, 2013 · 3 comments · Fixed by #28425
Labels
Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), Sparse (Sparse Data Type)

Comments

BrenBarn commented Nov 8, 2013

You can create a normal DataFrame with, e.g., pandas.DataFrame(0, index=[1, 2, 3], columns=["A", "B", "C"]). However, this fails with a SparseDataFrame:

>>> x = pandas.SparseDataFrame(0, index=[1, 2, 3], columns=["A", "B", "C"])
Traceback (most recent call last):
  File "<pyshell#97>", line 1, in <module>
    x = pandas.SparseDataFrame(0, index=[1, 2, 3], columns=["A", "B", "C"])
  File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\sparse\frame.py", line 123, in __init__
    NDFrame.__init__(self, mgr)
UnboundLocalError: local variable 'mgr' referenced before assignment

It looks like SparseDataFrame.__init__ does not handle the case where the data is a single value. It has a series of if statements for the different kinds of data, but if none of them matches, it falls through without ever creating mgr.
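The fall-through can be reproduced without pandas. The sketch below is only an illustration of the pattern: `build_manager_buggy` and `build_manager_fixed` are hypothetical names, not the pandas source, and the branches are simplified stand-ins for the real type checks.

```python
# Hypothetical stand-in for the constructor's type dispatch; NOT the pandas source.
def build_manager_buggy(data):
    if isinstance(data, dict):
        mgr = ("from_dict", data)
    elif isinstance(data, list):
        mgr = ("from_list", data)
    # No branch handles a scalar, so mgr is never bound when data=0.
    return mgr


def build_manager_fixed(data):
    if isinstance(data, dict):
        mgr = ("from_dict", data)
    elif isinstance(data, list):
        mgr = ("from_list", data)
    else:
        # Explicit fallback: fail with a clear message instead of falling through.
        raise TypeError(f"unsupported data type: {type(data).__name__}")
    return mgr


# The buggy version surfaces as UnboundLocalError, as in the traceback above.
try:
    build_manager_buggy(0)
except UnboundLocalError as exc:
    buggy_error = type(exc).__name__

# The fixed version reports the unsupported input directly.
try:
    build_manager_fixed(0)
except TypeError as exc:
    fixed_message = str(exc)

print(buggy_error)
print(fixed_message)
```

A terminal else branch (either handling the scalar or raising a descriptive error) is one way to make sure the variable is always bound or the failure is explicit.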

This makes it difficult to create an "empty" SparseDataFrame whose default fill value is something other than nan, e.g., pandas.SparseDataFrame(0, index=[1, 2, 3], columns=["A", "B", "C"], default_fill_value=0). In fact, I would suggest that if a default_fill_value is provided but no data is provided, the SparseDataFrame should be filled with the fill value (not nan), but perhaps that warrants a separate issue.
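For readers on a recent pandas, where SparseDataFrame no longer exists (it was removed in pandas 1.0), a frame filled with a non-nan fill value can be sketched with a regular DataFrame and a sparse dtype. This assumes pandas 1.0 or later and is a sketch of one possible approach, not necessarily the resolution adopted for this issue.

```python
import pandas as pd

# Build the dense frame from a single value, as in the report.
dense = pd.DataFrame(0, index=[1, 2, 3], columns=["A", "B", "C"])

# Convert every column to a sparse dtype whose fill value is 0 rather than nan.
sparse_df = dense.astype(pd.SparseDtype("int64", fill_value=0))

print(sparse_df.dtypes)
# Fraction of values actually stored (values that differ from the fill value).
print(sparse_df.sparse.density)
```

Because every value equals the fill value, nothing needs to be stored explicitly, which is the "empty" sparse frame the report asks for.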

jreback (Contributor) commented Nov 8, 2013

Would of course welcome PRs and API ideas on this (and other sparse issues).

jreback modified the milestones: 0.15.0, 0.14.0 Feb 15, 2014
jreback modified the milestones: 0.16.0, Next Major Release Mar 1, 2015

Motorrat commented Oct 9, 2016

# "Real world" data that can be used to demonstrate this behavior:
# http://archive.ics.uci.edu/ml/datasets/URL+Reputation
# http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_svmlight_file.html
import pandas as pd
from sklearn.datasets import load_svmlight_file

# load_svmlight_file returns X as a scipy.sparse matrix
X, y = load_svmlight_file('Day0.svm')
dataframe = pd.SparseDataFrame(X)

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Anaconda\lib\site-packages\pandas\sparse\frame.py", line 122, in __init__
    NDFrame.__init__(self, mgr)
UnboundLocalError: local variable 'mgr' referenced before assignment
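On a recent pandas, a scipy sparse matrix like the one above can be wrapped with the DataFrame sparse accessor instead. The sketch below assumes pandas 0.25 or later with scipy installed; the small hand-made matrix `X_small` is a hypothetical stand-in for the matrix loaded from Day0.svm.

```python
import pandas as pd
from scipy import sparse

# Hypothetical stand-in for the matrix returned by load_svmlight_file.
X_small = sparse.csr_matrix([[0.0, 1.0, 0.0],
                             [0.0, 0.0, 2.0]])

# Build a DataFrame whose columns are sparse arrays backed by the matrix data.
frame = pd.DataFrame.sparse.from_spmatrix(X_small)

print(frame.dtypes)
print(frame.sparse.to_dense())
```

Each column keeps a sparse dtype, so zero entries are not stored explicitly.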

jreback (Contributor) commented Oct 9, 2016

Looks like this is not tested, so pull requests are welcome.

This is actually a nice workaround:

In [28]: pd.SparseDataFrame(index=[1, 2, 3], columns=["A", "B", "C"]).fillna(0)
Out[28]: 
     A    B    C
1  0.0  0.0  0.0
2  0.0  0.0  0.0
3  0.0  0.0  0.0

jreback added the Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), and Difficulty Intermediate labels Oct 9, 2016
jreback changed the title from "SparseDataFrame does not allow single value data" to "BUG: SparseDataFrame does not allow single value data" Oct 9, 2016
datapythonista modified the milestones: Contributions Welcome, Someday Jul 8, 2018