PERF: perform .str operations on categoricals #8627
Comments
I agree this is a good idea, but note that categoricals intentionally do not support arithmetic. That seems inconsistent to me. So I would either consider adding support for arithmetic with numeric categories, or create a more specialized "interned string" array type, which has slightly different meaning than a categorical.
I was thinking more along the lines of this:
FYI, cc @JanSchultz I think it might be nice to have a
@jreback I totally agree with you, but Here's what the categorical docs say about numeric operations:
Maybe this is more a practical statement than a principled one?
not talking about numeric ops
side issue: are you interested in speaking at PyData in November in NYC? (I think 22-23)
@jreback OK, made a new issue for arithmetic. As for PyData NYC, I am sort of intrigued but already doing a lot of travel these days.
@shoyer ok cool np.
Again I think that this is more an argument to implement a 'pandas-string' type. If one wants to have the above gains then doing a
PS: Schulz with 'z' and not 'tz'. :-)
@JanSchulz sorry about that, git is annoying with that :)
IMO, this should be closed in favor of #8640...
@JanSchulz you mean #10661, right? (which mostly solves the problem, though does blow back to
Nope: I understand this issue that you want all
You could actually do that now by simply calling the
Today (as of #10661) you can do
IMO the real solution is to build a "PandasString" class in the same spirit as "Categorical" and use that:
Or get a real numpy string type...
closing in favor of #8640
So huge win to perform .str operations on the .categories of a Categorical (versus actually doing these on an object array) when the number of categories is much smaller than the number of objects.
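A minimal sketch of the idea described above, not the implementation pandas ended up with: run the string method once per category, then broadcast the per-category results back to every row through the integer codes. The helper name `str_upper_via_categories` and the choice of `.str.upper()` are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd


def str_upper_via_categories(s):
    """Upper-case a categorical Series by touching each category only once.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    categories = s.cat.categories      # unique values, length K
    codes = s.cat.codes.to_numpy()     # one integer per row, -1 marks a missing value

    # The string operation runs on K values instead of N rows.
    upper = pd.Series(categories).str.upper().to_numpy(dtype=object)

    # Append NaN so that code -1 (missing) indexes the last slot.
    lookup = np.append(upper, np.nan)

    # Fancy indexing broadcasts the K results back to all N rows.
    return pd.Series(lookup[codes], index=s.index)


s = pd.Series(["a", "b", "a", None, "b"], dtype="category")
result = str_upper_via_categories(s)
```

The work done by the string method scales with the number of categories, so the gain over operating on an object array grows as the ratio of rows to categories grows. The result here is a plain Series rather than a categorical; whether to return a categorical is a separate design question.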