Skip to content

ENH: Make explode work for sets #35637

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 5, 2020
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.2.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ For example:
Other enhancements
^^^^^^^^^^^^^^^^^^
- :class:`Index` with object dtype supports division and multiplication (:issue:`34160`)
-
- :meth:`DataFrame.explode` and :meth:`Series.explode` now support exploding of sets (:issue:`35614`)
-


Expand Down
6 changes: 4 additions & 2 deletions pandas/_libs/reshape.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -124,7 +124,8 @@ def explode(ndarray[object] values):
counts = np.zeros(n, dtype='int64')
for i in range(n):
v = values[i]
if c_is_list_like(v, False):

if c_is_list_like(v, True):
if len(v):
counts[i] += len(v)
else:
Expand All @@ -138,8 +139,9 @@ def explode(ndarray[object] values):
for i in range(n):
v = values[i]

if c_is_list_like(v, False):
if c_is_list_like(v, True):
if len(v):
v = list(v)
for j in range(len(v)):
result[count] = v[j]
count += 1
Expand Down
7 changes: 7 additions & 0 deletions pandas/tests/frame/methods/test_explode.py
Original file line number Diff line number Diff line change
Expand Up @@ -172,3 +172,10 @@ def test_ignore_index():
{"id": [0, 0, 10, 10], "values": list("abcd")}, index=[0, 1, 2, 3]
)
tm.assert_frame_equal(result, expected)


def test_explode_sets():
df = pd.DataFrame({"a": [{"x", "y"}], "b": [1]}, index=[1])
result = df.explode(column="a").sort_values(by="a")
expected = pd.DataFrame({"a": ["x", "y"], "b": [1, 1]}, index=[1, 1])
tm.assert_frame_equal(result, expected)
7 changes: 7 additions & 0 deletions pandas/tests/series/methods/test_explode.py
Original file line number Diff line number Diff line change
Expand Up @@ -126,3 +126,10 @@ def test_ignore_index():
result = s.explode(ignore_index=True)
expected = pd.Series([1, 2, 3, 4], index=[0, 1, 2, 3], dtype=object)
tm.assert_series_equal(result, expected)


def test_explode_sets():
s = pd.Series([{"a", "b", "c"}], index=[1])
result = s.explode().sort_values()
expected = pd.Series(["a", "b", "c"], index=[1, 1, 1])
tm.assert_series_equal(result, expected)