-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
COMPAT/TST: fix group_info dtype issues, xref #10981 #10988
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
COMPAT/TST: fix group_info dtype issues, xref #10981
@behzadnouri it may be less readable, but I didn't have the time to debug. welcome your assistance with that. these win- issues are annoying. if you can spin up a vm would be great (I do this on macosx in fact ,with a win vm). |
if u post the traceback i can look into. |
I took master and reverted that commit
errors
|
you need int64 in BinGrouper.group_info but not in GroupBy.nunique |
|
the type which comes out of numpy calls is correct. nunique values should have the same types as array of indices, which is always these tests are not failing in cython calls or in the actual values. they are failing only because the constructed expected right frame/series is set to have int64 type. you may fix the expected right value in the tests to have |
from completely disagree. You HAVE to return its irrelevant what numpy returns. Pandas should always return the same dtype as input except if an upcast is needed. If I pass in
|
that is why there are so many ugly
are you sure? cause multi-index labels are more indexer than nuinque: >>> MultiIndex.from_arrays([[1, 2], [3, 4]]).labels[0]
FrozenNDArray([0, 1], dtype='int8') |
I agree, but that is a completely separate issue to do that. You cannot simply change it. It STILL has to work on windows
We ALWAYS return |
|
certainly welcome work on this: #3033 but as I said, that is a bit non-trivial and out-of-scope for a fix here. |
well you are doing it wrong. if one place it needs to be forced for unique/count i do not see why it has to deviate from numpy. |
the point is that it is a long-time consistency to return data to the users in We certainly could change this. But again that is out of scope for this issue, and further that would have to be an announced API change. You are changing the user facing API in a potentially unexpected way. It might be ok, it might not. But I think it would be wise to actually push this into master and have people tests it for a while. As far as what is used internally, #3033 covers this. I am certainly open to cleaning this up. I think everyone can agree its fairly ugly and when it was done there were other considerations. |
closes #10981