Commit d6bb589

debnathshoham authored and mliu08 committed
BUG: df.explode mulitcol with Nan+emptylist (pandas-dev#49680)
* BUG: df.explode mulitcol with Nan+emptylist
* suggested changes
1 parent 7828999 commit d6bb589

3 files changed

+23
-1
lines changed

doc/source/whatsnew/v2.0.0.rst

+1
@@ -721,6 +721,7 @@ Reshaping
 - Bug in :func:`join` when ``left_on`` or ``right_on`` is or includes a :class:`CategoricalIndex` incorrectly raising ``AttributeError`` (:issue:`48464`)
 - Bug in :meth:`DataFrame.pivot_table` raising ``ValueError`` with parameter ``margins=True`` when result is an empty :class:`DataFrame` (:issue:`49240`)
 - Clarified error message in :func:`merge` when passing invalid ``validate`` option (:issue:`49417`)
+- Bug in :meth:`DataFrame.explode` raising ``ValueError`` on multiple columns with ``NaN`` values or empty lists (:issue:`46084`)

 Sparse
 ^^^^^^

pandas/core/frame.py

+1-1
@@ -8848,7 +8848,7 @@ def explode(
         if len(columns) == 1:
             result = df[columns[0]].explode()
         else:
-            mylen = lambda x: len(x) if is_list_like(x) else -1
+            mylen = lambda x: len(x) if (is_list_like(x) and len(x) > 0) else 1
             counts0 = self[columns[0]].apply(mylen)
             for c in columns[1:]:
                 if not all(counts0 == self[c].apply(mylen)):
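To see why this one-line change matters, the following is a minimal sketch (not the pandas implementation) that applies the old and new `mylen` lambdas to the `A` and `C` columns used in the accompanying test. The `is_list_like` function here is a simplified stand-in for the pandas helper and is an assumption of this sketch; `old_mylen`, `new_mylen`, `col_a`, and `col_c` are illustrative names.

```python
import math


def is_list_like(obj):
    # Simplified stand-in for the pandas helper (assumption for this sketch):
    # treat only lists, tuples, and sets as list-like.
    return isinstance(obj, (list, tuple, set))


# Before the change: scalars and NaN count as -1, empty lists count as 0.
old_mylen = lambda x: len(x) if is_list_like(x) else -1

# After the change: scalars, NaN, and empty lists all count as 1.
new_mylen = lambda x: len(x) if (is_list_like(x) and len(x) > 0) else 1

# Columns "A" and "C" from test_multi_columns_nan_empty.
col_a = [[0, 1], [5], [], [2, 3]]
col_c = [[1, 2], math.nan, [], [3, 4]]

old_counts = ([old_mylen(x) for x in col_a], [old_mylen(x) for x in col_c])
new_counts = ([new_mylen(x) for x in col_a], [new_mylen(x) for x in col_c])

print(old_counts)  # ([2, 1, 0, 2], [2, -1, 0, 2])
print(new_counts)  # ([2, 1, 1, 2], [2, 1, 1, 2])
```

With the old lambda, row 1 yields 1 for `A` and -1 for `C`, so the per-column counts disagree and the equality check fails. With the new lambda both columns yield the same counts, and the count of 1 for an empty list or `NaN` matches the single output row each produces in the expected frame of the test.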

pandas/tests/frame/methods/test_explode.py

+21
@@ -280,3 +280,24 @@ def test_multi_columns(input_subset, expected_dict, expected_index):
     result = df.explode(input_subset)
     expected = pd.DataFrame(expected_dict, expected_index)
     tm.assert_frame_equal(result, expected)
+
+
+def test_multi_columns_nan_empty():
+    # GH 46084
+    df = pd.DataFrame(
+        {
+            "A": [[0, 1], [5], [], [2, 3]],
+            "B": [9, 8, 7, 6],
+            "C": [[1, 2], np.nan, [], [3, 4]],
+        }
+    )
+    result = df.explode(["A", "C"])
+    expected = pd.DataFrame(
+        {
+            "A": np.array([0, 1, 5, np.nan, 2, 3], dtype=object),
+            "B": [9, 9, 8, 7, 6, 6],
+            "C": np.array([1, 2, np.nan, np.nan, 3, 4], dtype=object),
+        },
+        index=[0, 0, 1, 2, 3, 3],
+    )
+    tm.assert_frame_equal(result, expected)