-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: alignment of CategoricalIndex #13440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Current coverage is 84.34%@@ master #13440 diff @@
==========================================
Files 138 138
Lines 51107 51124 +17
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43103 43119 +16
- Misses 8004 8005 +1
Partials 0 0
|
you are adding a crazy amount of code here. |
"a.categories" must already contain all categories that "values" uses. | ||
""" | ||
|
||
# No vectorized way to build the mapping: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.get_indexer
Thanks for the |
@jreback : what do you want to do?
|
The approach is not bad, but the actual impl seems not very elegant. Further your impl of |
Any constructive criticism? :-) My mood doesn't depend on them, but maybe this PR/issue does. |
maybe @sinhrks has some |
I may misunderstand, but the point is how pandas handle extended dtype? We should decide whether to make them more numpy compat, or use separated logic using if-else branch. I think most of other impl takes latter (if-else) option currently. |
@sinhrks : yes, the bug is due to a numpy method ( Certainly the cleanest option (for code organization, not that it would spare us from including the missing logic) would be that Then sure, we can solve this also with "if-else"s without overriding... I have hard time considering such approach more "elegant" than mine, but if it is the path of least resistance to closing the bug I'll update the PR. |
@toobaz I wasn't looking for a general 'why are we doing things this way', that is obvious. The issue is you are jumping thru hoops and not using the existing |
""" | ||
Must be overridden by Indexes which have non standard | ||
self.values.take() keyword args (e.g. CategoricalIndex). | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really really odd way to do this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... as already said, if you want me I override the entire _join_non_unique
I will, I don't see many other alternatives currently.
OK, now I'm (extending and) using |
@jreback ping Feel free to close, I won't be offended. |
yep, don't really have time to look, but needs some refactoring. if you wan to submit again pls feel free. |
Submit again?! Well... what?! I guess the bug(s) will be fixed by somebody who understands how contributing to this project works. I tried in vain. |
git diff upstream/master | flake8 --diff
@jreback take this just as a wild guess of how to fix #13365 . If the approach makes sense, I will add tests, whatsnew and some more docs.