Commit 916d1f3

DOC: Fix EX01 in DataFrame.drop_duplicates (#33283)
1 parent 4fc8c25 commit 916d1f3

1 file changed: +41 -0 lines changed

pandas/core/frame.py

@@ -4744,6 +4744,47 @@ def drop_duplicates(
         See Also
         --------
         DataFrame.value_counts: Count unique combinations of columns.
+
+        Examples
+        --------
+        Consider a dataset containing ramen ratings.
+
+        >>> df = pd.DataFrame({
+        ...     'brand': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
+        ...     'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
+        ...     'rating': [4, 4, 3.5, 15, 5]
+        ... })
+        >>> df
+             brand style  rating
+        0  Yum Yum   cup     4.0
+        1  Yum Yum   cup     4.0
+        2  Indomie   cup     3.5
+        3  Indomie  pack    15.0
+        4  Indomie  pack     5.0
+
+        By default, it removes duplicate rows based on all columns.
+
+        >>> df.drop_duplicates()
+             brand style  rating
+        0  Yum Yum   cup     4.0
+        2  Indomie   cup     3.5
+        3  Indomie  pack    15.0
+        4  Indomie  pack     5.0
+
+        To remove duplicates on specific column(s), use ``subset``.
+
+        >>> df.drop_duplicates(subset=['brand'])
+             brand style  rating
+        0  Yum Yum   cup     4.0
+        2  Indomie   cup     3.5
+
+        To remove duplicates and keep last occurrences, use ``keep``.
+
+        >>> df.drop_duplicates(subset=['brand', 'style'], keep='last')
+             brand style  rating
+        1  Yum Yum   cup     4.0
+        2  Indomie   cup     3.5
+        4  Indomie  pack     5.0
         """
         if self.empty:
             return self.copy()
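
The semantics that the new examples document can be checked without pandas. The sketch below is a plain-Python illustration, not the pandas implementation: `drop_duplicate_rows` and the `ramen` list are hypothetical names introduced here, and rows are modelled as dicts whose list position stands in for the DataFrame index. It assumes `keep` accepts `'first'`, `'last'`, or `False` (drop every duplicated row), mirroring the documented parameter.

```python
def drop_duplicate_rows(rows, subset=None, keep="first"):
    """Return (index, row) pairs that survive de-duplication.

    Hypothetical helper for illustration only; it mimics the documented
    behaviour of ``subset`` and ``keep`` on a list of dicts.
    """
    if keep not in ("first", "last", False):
        raise ValueError("keep must be 'first', 'last' or False")

    # With no subset, every column takes part in the comparison.
    if subset is not None:
        columns = list(subset)
    elif rows:
        columns = list(rows[0])
    else:
        columns = []

    keys = [tuple(row[c] for c in columns) for row in rows]

    # Record the first and last position of each key, and how often it occurs.
    first_seen, last_seen, counts = {}, {}, {}
    for i, key in enumerate(keys):
        first_seen.setdefault(key, i)
        last_seen[key] = i
        counts[key] = counts.get(key, 0) + 1

    if keep == "first":
        kept = set(first_seen.values())
    elif keep == "last":
        kept = set(last_seen.values())
    else:  # keep is False: only rows whose key is unique survive
        kept = {i for i, key in enumerate(keys) if counts[key] == 1}

    return [(i, row) for i, row in enumerate(rows) if i in kept]


# Same data as the docstring example.
ramen = [
    {"brand": "Yum Yum", "style": "cup", "rating": 4},
    {"brand": "Yum Yum", "style": "cup", "rating": 4},
    {"brand": "Indomie", "style": "cup", "rating": 3.5},
    {"brand": "Indomie", "style": "pack", "rating": 15},
    {"brand": "Indomie", "style": "pack", "rating": 5},
]

print([i for i, _ in drop_duplicate_rows(ramen)])                    # [0, 2, 3, 4]
print([i for i, _ in drop_duplicate_rows(ramen, subset=["brand"])])  # [0, 2]
print([i for i, _ in drop_duplicate_rows(
    ramen, subset=["brand", "style"], keep="last")])                 # [1, 2, 4]
```

The surviving positions match the index labels shown in the three docstring outputs, which is a quick way to confirm the expected results in the diff are consistent with each other.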
