-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Fix for unequal comparisons of categorical and scalar #9848
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I'll paste this here, just ran into this on
|
7817721
to
86e0376
Compare
values = self.get_values() | ||
other = _index.convert_scalar(values,_values_from_object(other)) | ||
if com.is_categorical_dtype(self): | ||
# cats are a special case as get_values() would return an ndarray, which would then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit hacky to me. Ideally, shouldn't blocks contain all the logic for arithmetic operations? That way, the special casing for different dtypes would just be a different methods defined on block subclasses, not a mess of dtype checks on the series itself.
This is not your technical debt, though, so we can save that cleanup for later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in reaility some of the core/ops.py
should be in internals. @shoyer let's make an issue about this, but @JanSchulz change is ok for now.
@JanSchulz can you add a release note. I'll merge this before CategoricalIndex
Another design question: what is the goal of |
so here's the difference between
|
@jreback Then
|
@JanSchulz not sure what you mean, that's exactly what it should do. In the case of a basic type (e.g. a FloatBlock, this just returns the ndarray). |
@jreback: why not simply Anyway, the problem in the above was, that even when I took that out (by only using |
@JanSchulz for a simple Block, I understand the case for a SparseArray, but I don't know if this should also be the case for a Categorical. With |
It definitely should NOT do that in case of a
|
@JanSchulz the point of these methods is to get 2 differently thing. A ndarray w/o doing any work, and the actual object. I think some of this type inference in |
As far as I understood it, we tried to make |
@JanSchulz as I said, this is ok for now. pls add a release note. |
Before, unequal comparisons were not checking the order of the categories. This was due to a conversion to an ndarray, which turned the comparison to one between ndarray and scalar, which of course has no categories to take into account. Also add test cases and remove the one which actually tested the wrong behaviour.
86e0376
to
725729f
Compare
@jreback: release note added. Haven't run a doc build, though :-/ |
merged via f0ac930 ty! |
Isn't this behavior the exact opposite of what @JanSchulz was saying in #8995? |
@jreback: should I also add an issue about |
@JanSchulz not sure what you are talking about .get_values() it is defined correctly |
@jreback I know you've called [1] #9580 (comment) |
well public for quite a long time |
we can keep |
I am NOT encouraging use :) @JanSchulz brought up something - still not sure about what |
This was more along the line of #9859 which asked that ops, which check for the content of the type should be pushed to the blocks layer. I still think that |
@JanSchulz what's the point ? .values already does that |
If |
docstring I think it is less confusing that and @jreback you could say the same as ".to_dense already does that" :-) |
@jreback @jorisvandenbossche @shoyer Whats teh status of the
There is also the issue of pushing some ops into the |
@JanSchulz Here is the issue I made for pushing ops to the blocks layer: #9859. If we could go back in time and remove |
@JanSchulz I suppose you could deprecatee But if you can figure out a better way, gr8! |
Fixes: #9836