Commit 2ffbad6

BUG: fix issue with sparse concatting
This was originally reported in :issue:`18686` and :issue:`18914`. When get_dummies is called with sparse=True on a frame that also has dense columns, it returns a SparseDataFrame holding both sparse and dense columns. That is not what we want: a frame that mixes sparse and dense columns should be a plain DataFrame. The class used to build the result is chosen by a function in pandas.core.dtypes.concat, and that function needed to change.
1 parent dbec3c9 commit 2ffbad6

4 files changed, +10 -2 lines changed

doc/source/whatsnew/v0.23.0.txt

+1
@@ -353,6 +353,7 @@ Reshaping
 - Bug in :func:`Series.rank` where ``Series`` containing ``NaT`` modifies the ``Series`` inplace (:issue:`18521`)
 - Bug in :func:`cut` which fails when using readonly arrays (:issue:`18773`)
 - Bug in :func:`Dataframe.pivot_table` which fails when the ``aggfunc`` arg is of type string. The behavior is now consistent with other methods like ``agg`` and ``apply`` (:issue:`18713`)
+- Bug in :func:`pandas.core.dtypes.concat._get_series_result_type` which returns SparseDataFrame even if not all contained in Frame are sparse. (:issue:`18914` and :issue:`18686`)
 
 
 Numeric

pandas/core/dtypes/concat.py

+1 -1
@@ -92,7 +92,7 @@ def _get_frame_result_type(result, objs):
     if any block is SparseBlock, return SparseDataFrame
     otherwise, return 1st obj
     """
-    if any(b.is_sparse for b in result.blocks):
+    if all(b.is_sparse for b in result.blocks):
         from pandas.core.sparse.api import SparseDataFrame
         return SparseDataFrame
     else:
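
To make the effect of switching from any() to all() concrete, here is a minimal pure-Python sketch. It does not use pandas: `Block` and `pick_frame_type` are hypothetical stand-ins for the internal blocks and for the factory selection in `_get_frame_result_type`, and the column names are borrowed from the test in this commit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical stand-in for a pandas internal block (not the real class)."""
    name: str
    is_sparse: bool


def pick_frame_type(blocks: List[Block], use_all: bool) -> str:
    """Return the name of the container the result would be built with.

    use_all=False models the old check (any block sparse);
    use_all=True models the new check (every block sparse).
    """
    check = all if use_all else any
    if check(b.is_sparse for b in blocks):
        return "SparseDataFrame"
    return "DataFrame"


# A dense column next to two sparse dummy columns, as in the reported case.
mixed = [Block("GDP", False), Block("Nation_AB", True), Block("Nation_CD", True)]
only_sparse = [Block("Nation_AB", True), Block("Nation_CD", True)]

old_mixed = pick_frame_type(mixed, use_all=False)
new_mixed = pick_frame_type(mixed, use_all=True)
new_sparse = pick_frame_type(only_sparse, use_all=True)
print(old_mixed, new_mixed, new_sparse)
```

Under these assumptions the mixed case moves from the sparse container to the plain one, while an all-sparse result is unaffected. One thing to keep in mind when reading the change: all() over an empty iterable is True, so a result with no blocks would take the sparse branch in this sketch; whether that situation can arise in pandas is not established by this commit.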

pandas/core/sparse/series.py

-1
@@ -168,7 +168,6 @@ def __init__(self, data=None, index=None, sparse_index=None, kind='block',
                 if index is None:
                     index = data.index.view()
                 else:
-
                     data = data.reindex(index, copy=False)
 
             else:

pandas/tests/reshape/test_reshape.py

+8
@@ -454,6 +454,14 @@ def test_dataframe_dummies_preserve_categorical_dtype(self, dtype):
 
         tm.assert_frame_equal(result, expected)
 
+    def test_get_dummies_dont_sparsify_all_columns(self):
+        # GH18914
+        df = DataFrame.from_items([('GDP', [1, 2]), ('Nation', ['AB', 'CD'])])
+        df = get_dummies(df, columns=['Nation'], sparse=True)
+        df2 = df.reindex(columns=['GDP'])
+
+        tm.assert_index_equal(df.index, df2.index)
+
 
 class TestCategoricalReshape(object):
 
