-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Add validation checks for non-unique merge keys #35373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks @LewisDavies for the PR. The current behaviour is documented so this is an breaking API change. Would first need a deprecation. from https://pandas.pydata.org/pandas-docs/version/1.1.0/reference/api/pandas.merge.html
FWIW i'm -1 on changing this. |
Happy to contribute! I didn't realize the current behaviour was intentional so I understand not wanting to implement this. That said, it feels odd that if a user specifies With the benefit of hindsight the names could be clearer, especially for one-to-many merges. Adding |
Thanks @LewisDavies. That's just my opinion. others may disagree so we'll leave this open for a while.
another alternative could be to have, say, a There is functionality currently being added to pandas, to disallow allow duplicates and propagate that property in further operations, see #28394. It could be that in the future the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we would likely break the world with erroring on this now, try making this a warning, and then see what tests you need to catch the warning.
@LewisDavies are you able to address @jreback comment #35373 (review) |
Closing as stale, @LewisDavies let us know if you'd still like to work on this |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
After implementing the suggestion in #27430 I realized it should also apply to
1:m
andm:1
merges. When any type ofmany
merge is specified, an error will be raised if themany
side is actually unique.