BUG: get_dummies not returning SparseDataFrame #10535

artemyk · 2015-07-09T18:14:28Z

artemyk · 2015-07-09T22:52:56Z

@jreback Perhaps this would be more elegantly fixed by having concat drop empty passed-in DataFrames?
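To make the suggestion above concrete, here is a minimal sketch of dropping empty frames before concatenating. It filters on the caller's side and does not show how `concat` itself behaves; `concat_nonempty` is a hypothetical helper name used only for illustration.

```python
import pandas as pd


def concat_nonempty(frames, axis=1):
    """Hypothetical helper: concatenate frames, skipping any that are empty."""
    kept = [frame for frame in frames if not frame.empty]
    if not kept:
        # Nothing left to combine; return an empty frame.
        return pd.DataFrame()
    return pd.concat(kept, axis=axis)


# A frame with an index but no columns counts as empty.
left = pd.DataFrame(index=[0, 1, 2])
right = pd.DataFrame({"b_D": [1, 0, 1], "b_E": [0, 1, 0]})

combined = concat_nonempty([left, right])
print(combined)
```

Because the empty frame never reaches `pd.concat`, it cannot influence the type of the result, which is the effect the comment is asking for.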

jreback · 2015-07-11T16:56:55Z

jreback · 2015-07-11T16:59:08Z

I think #10536 should be fixed first. But this introduces the problem of when/how to convert a concat of series (that happen to be all SparseSeries) into a SparseDataFrame (as opposed to a DataFrame).

artemyk · 2015-07-11T17:48:16Z

@jreback Not sure why #10536 needs to be fixed first; it does not affect this issue (we are concatenating SparseDataFrames here).

jreback · 2015-07-12T14:26:27Z

pandas/core/reshape.py

@@ -957,13 +957,15 @@ def get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False,
        If `columns` is None then all the columns with
        `object` or `category` dtype will be converted.
    sparse : bool, default False
-        Whether the returned DataFrame should be sparse or not.
+        Whether the dummy columns should be sparse or not.  Returns
+        SparseDataFrame if `data` is a Series or if all columns are included.


this is confusing, I think we should just always return a SparseDataFrame.

What about the case when some of the blocks are dense? E.g.:

In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'a':['A','B','C'],'b':['D','E','D']})
In [3]: pd.get_dummies(df, sparse=True, columns='b')
Out[3]:
   a  b_D  b_E
0  A    1    0
1  B    0    1
2  C    1    0

Here `a` is still dense. Is a df where some blocks are dense and some are sparse more accurately called a DataFrame or a SparseDataFrame?
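For readers on a recent pandas, the mixed dense/sparse case can be inspected column by column. This is a sketch under the assumption of pandas 1.0 or later, where `SparseDataFrame` no longer exists and sparsity is carried by each column's dtype; it does not reproduce the 0.17-era block layout discussed in this thread. The `is_sparse` name is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"a": ["A", "B", "C"], "b": ["D", "E", "D"]})

# Encode only column "b"; column "a" is passed through unchanged.
result = pd.get_dummies(df, sparse=True, columns=["b"])

# For each column, record whether its dtype is a sparse dtype.
is_sparse = {
    col: isinstance(dtype, pd.SparseDtype)
    for col, dtype in result.dtypes.items()
}
print(is_sparse)
```

Under that assumption the dummy columns should report a sparse dtype while `a` stays dense, so the result is an ordinary DataFrame holding a mix of column types.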

Tests redo

artemyk · 2015-07-21T01:03:10Z

pandas/tests/test_reshape.py

+        self.assertEqual(type(r[['a_0']]._data.blocks[0]), exp_blk_type)
+        self.assertEqual(type(r[['a_1']]._data.blocks[0]), exp_blk_type)
+        self.assertEqual(type(r[['a_2']]._data.blocks[0]), exp_blk_type)
+


@jreback How about now?

ok looks reasonable

ping when green

FYI: don't take lines out of the whatsnew. I put them there on purpose; it helps avoid merge conflicts (in fact you have one now because of that).

@jreback OK, green now.
Sorry about the whatsnew newlines (though I don't think there's a merge conflict?). So, where is a good place to insert whatsnew notes? Somewhere between two newlines?

jreback · 2015-07-21T10:34:24Z

merged via d065374

thanks!

jreback added the Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), and Sparse (Sparse Data Type) labels Jul 11, 2015

jreback added this to the 0.17.0 milestone Jul 11, 2015

jreback reviewed Jul 12, 2015

artemyk force-pushed the sparse_dataframe branch from 7296068 to ec58429 Compare July 21, 2015 01:02

BUG: get_dummies not returning SparseDataFrame

ec58429

Tests redo

artemyk reviewed Jul 21, 2015

jreback closed this Jul 21, 2015

jreback mentioned this pull request Jul 21, 2015

get_dummies(df,sparse=True) does not return sparse DataFrame #10531

Closed
