BUG: qcut can fail for highly discontinuous data distributions #31626
Needs refactoring
Thanks @puneet29. However, before this PR can be accepted, you need to write tests and make sure you don't break existing ones. See "contributing to the code base" in the contributing guide.
Hey @MarcoGorelli! Kindly guide me on what to do with the duplicates parameter, because this PR deals with duplicate values in bins. Can you review the code? Thanks a lot.
@puneet29 This needs tests so we know that the bug is fixed. They will probably go in pandas.tests.reshape.test_qcut
I think this is stale, so closing for now. @puneet29, ping if you'd like to pick it back up.
Hi, I'm sorry. I will take it up again. I was quite busy before. 😅
No worries - thanks!
Hi @jbrockmendel @MarcoGorelli
From what I've understood reading through the original issue, the output should not change if Anyway, if you write a test case, then it'll be easier for anyone reviewing to tell how the output has changed.
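For anyone picking this up, here is a minimal sketch of what such a test case might look like. It is not the test from this PR: the data and the function name are made up for illustration, and it only pins down the current behaviour of the duplicates parameter on heavily repeated values. A test in pandas' own suite would more likely use pytest.raises than a bare try/except.

```python
import pandas as pd


def test_qcut_heavily_repeated_values():
    # Illustrative data (an assumption, not taken from the issue): ten zeros
    # and three larger values, so the 25%, 50% and 75% quantiles are all 0.
    values = pd.Series([0] * 10 + [1, 2, 3])

    # Default duplicates="raise": repeated bin edges are rejected.
    raised = False
    try:
        pd.qcut(values, 4)
    except ValueError:
        raised = True
    assert raised

    # duplicates="drop": repeated edges are collapsed, leaving fewer bins
    # than requested, and every value still lands in a bin.
    result = pd.qcut(values, 4, duplicates="drop")
    assert result.notna().all()
    assert len(result.cat.categories) == 1
```

Comparing the result of a test like this before and after the change would show reviewers exactly how the binning differs.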
Closing as stale. If you want to continue, please open a new PR.
Fixes #15069. Needs refactoring
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff