Commit c573cb7

BUG: DataFrame.drop_duplicates method fails when a column with a list dtype is present pandas-dev#56784
1 parent e158765 commit c573cb7

1 file changed: +16 -0 lines changed

pandas/core/frame.py

@@ -6748,6 +6748,22 @@ def drop_duplicates(
         DataFrame or None
             DataFrame with duplicates removed or None if ``inplace=True``.
 
+        Notes
+        -------
+        To handle mutable objects such as list, convert the list column
+        to a tuple before using it in the subset.
+
+        >>> df = pd.DataFrame([
+        ...     {'number': 1, 'item_ids': [1, 2, 3]},
+        ...     {'number': 1, 'item_ids': [1, 2, 3]},
+        ... ])
+
+        >>> df['item_ids'] = df['item_ids'].apply(tuple)
+        >>> df.drop_duplicates(inplace=True)
+        >>> df['item_ids'] = df['item_ids'].apply(list)
+           number   item_ids
+        0       1  [1, 2, 3]
+
         See Also
         --------
         DataFrame.value_counts: Count unique combinations of columns.
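
The workaround described in the added Notes text (convert the list column to tuples, drop duplicates, convert back) can also be written without mutating the original frame. The following is a minimal sketch, not part of the commit: the third row and the names `hashable` and `deduped` are illustrative assumptions, and it relies only on `DataFrame.assign` and `DataFrame.duplicated`.

```python
import pandas as pd

# Example data modelled on the docstring; the third row is an added
# assumption so that the result keeps more than one row.
df = pd.DataFrame(
    [
        {"number": 1, "item_ids": [1, 2, 3]},
        {"number": 1, "item_ids": [1, 2, 3]},
        {"number": 2, "item_ids": [4, 5]},
    ]
)

# Lists are unhashable, so build a temporary copy whose list column
# holds tuples and use it only to find the duplicated rows.
hashable = df.assign(item_ids=df["item_ids"].apply(tuple))

# Keep the first occurrence of each row; the original list values in
# `df` are untouched, so no conversion back to list is needed.
deduped = df[~hashable.duplicated()].reset_index(drop=True)

print(deduped)
```

Because the tuple conversion happens on a temporary copy, the `item_ids` column in `deduped` still contains lists, which avoids the final `apply(list)` step shown in the docstring.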
