-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Add base test for Series[ExtensionArray].sort_values with sort=False #23515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Update fork
Add base test for Series[ExtensionArray].sort_values with sort=False Addresses pandas-dev#23074
Hello @akulagrawal! Thanks for submitting the PR.
|
Some failing tests that'll need to be fixed up. I haven't looked closely. This raises an ExtensionArray API question: should we require that At the very least, we should ensure that our EAs implement In [21]: ser = pd.Series(pd.core.arrays.integer_array([1, 2, 2, 2, 1, 0]))
In [22]: ser.value_counts(sort=False)
Out[22]:
2 3
1 2
0 1
dtype: int64
In [23]: ser.astype(int).value_counts(sort=False)
Out[23]:
0 1
1 2
2 3
dtype: int64 |
Oh my bad! I am a beginner and this is my first contribution to pandas. As for the match, according to my system the output is:
I understand they are incorrect but they don't match with yours atleast |
Yeah, I'm not sure why those don't match. Presumably there's whatever internally does this hashing doesn't preserve order. I'd recommend holding off on this PR and maybe finding another issue until we can decide exactly what we want the requirements on order to be here. |
Oh... first of all there was some error in my last comment. I meant they matched on my system (output updated in the comment), though both seem to be incorrect. |
Closing the PR (to shorten our backlog of open PRs). We still have the issue to track it, so if we decide to keep value_counts on that issue, we can simply reopen this PR. @akulagrawal That is OK? |
yes sure :) |
git diff upstream/master -u -- "*.py" | flake8 --diff