BUG: DataFrame.drop_duplicates method fails when a column with a list dtype is present #56784
Comments
Thanks for the report, but these methods rely on hashing, and mutable objects such as lists are not hashable. As such, I don't think we can support such columns. However, this limitation is not documented, and I think it would be good to add it to the documentation.
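To illustrate the hashing point above, here is a minimal sketch in plain Python (no pandas involved). The helper name is_hashable is hypothetical and used only for illustration.

```python
def is_hashable(value):
    """Return True if Python can compute a hash for value."""
    try:
        hash(value)
    except TypeError:
        # Built-in mutable containers such as list raise TypeError here.
        return False
    return True


list_ok = is_hashable([1, 2, 3])    # a list is mutable
tuple_ok = is_hashable((1, 2, 3))   # a tuple of ints is immutable

print(list_ok, tuple_ok)
```

A tuple holding the same elements is hashable, which is why converting list values to tuples is a common way around this kind of limitation.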
take
take
import pandas as pd
data = {"number": 1, "item_ids": [[1, 2, 3]],
drop_duplicates and duplicated are not working with a subset list.
take
This is a nice issue to work on, since I am a newcomer.
In my code, the function still fails when a column contains lists:
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
DataFrame.drop_duplicates and DataFrame.duplicated fail if there is a column of the list dtype in the subset parameter and there is more than one column in subset.
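One possible workaround, sketched under assumptions: the frame below is hypothetical (the column names number and item_ids are borrowed from a comment on this issue, and this is not the reporter's example). The idea is to compute the duplicate mask on a copy whose list values have been converted to tuples, then apply that mask to the original frame so the lists are preserved.

```python
import pandas as pd

# Hypothetical data: two identical rows whose "item_ids" values are lists.
df = pd.DataFrame({"number": [1, 1], "item_ids": [[1, 2, 3], [1, 2, 3]]})

try:
    df.drop_duplicates(subset=["number", "item_ids"])
except TypeError as exc:
    # Lists are unhashable, so the multi-column path is expected to fail here.
    print(f"drop_duplicates failed: {exc}")

# Workaround: build a key frame with tuples in place of lists,
# compute the duplicate mask on it, and filter the original frame.
key = df.assign(item_ids=df["item_ids"].map(tuple))
deduped = df[~key.duplicated(subset=["number", "item_ids"])]

print(deduped)
```

This keeps the original list objects in the result. It assumes every value in the column is a flat list of hashable elements; nested lists would need a recursive conversion.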
Expected Behavior
should output
Installed Versions
INSTALLED VERSIONS
commit : 04b45b1
python : 3.11.7.final.0
python-bits : 64
OS : Linux
OS-release : 6.5.0-14-generic
Version : #14-Ubuntu SMP PREEMPT_DYNAMIC Tue Nov 14 14:59:49 UTC 2023
machine : x86_64
processor :
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.3.0.dev0+54.g04b45b10b1
numpy : 1.26.2
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 69.0.3
pip : 23.3.1
Cython : None
pytest : 7.4.3
...
zstandard : None
tzdata : 2023.4
qtpy : None
pyqt5 : None