API/BUG: awkward syntax to add categories to a Categorical #9927
Comments
cc @JanSchulz
…duplicates optional, fixes pandas-dev#9927
I'm against … A bad example for "append the difference": take two series, one ["a","b"] and one ["a","c"], and then use the above with ["a","b","c"] -> you get two different sets of categories, one ["a","b","c"] and one ["a","c","b"]. We do have … I think accepting ndarray/Index would be good, but with the same constraints as the current list.
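The ordering concern above can be reproduced with a minimal sketch, assuming a reasonably recent pandas. `append_difference` is a hypothetical helper written for illustration; it is not part of pandas and is not necessarily what the linked PR implements.

```python
import pandas as pd


def append_difference(cat, wanted):
    """Hypothetical helper: append only the labels in `wanted` that `cat` lacks."""
    existing = set(cat.categories)
    missing = [label for label in wanted if label not in existing]
    return cat.add_categories(missing)


full = ["a", "b", "c"]
c1 = pd.Categorical(["a", "b"])  # categories: a, b
c2 = pd.Categorical(["a", "c"])  # categories: a, c

r1 = append_difference(c1, full)
r2 = append_difference(c2, full)

print(list(r1.categories))  # ['a', 'b', 'c']
print(list(r2.categories))  # ['a', 'c', 'b']
```

Both results contain the same labels, but "c" maps to code 2 in the first and code 1 in the second, so the two categoricals do not share a category order.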
cc @JanSchulz I am not actually suggesting anything new here, just appending, where the order is not defined except that it occurs AFTER the existing categories. I think this is a very natural thing to do, and it is pretty awkward to force the user to make sure that they are not adding duplicate categories. Of course they can use …
Regarding "what's the point of add_cats": for the use cases that categoricals should IMO be optimized for ("Likert scales" or "American states"), this method is mostly useless: I can't really think of a use case where … IMO this issue/SO question is another "workaround"/indicator for "categoricals as memory-efficient strings": if you have a need for setting arbitrary-length categories (e.g. …
see #9929; seems reasonable to me. @shoyer @jorisvandenbossche @JanSchulz I'm not sure where you think this doesn't apply to Categoricals. I want to add a bunch of categories. I'm not saying they are in any particular order, just that we don't rewrite the existing category mappings. I think that is a reasonable API guarantee. This is actually what we do in reality now; this just codifies it and makes it convenient to do.
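As a rough illustration of that guarantee, here is a sketch assuming current pandas behaviour, where `add_categories` raises a `ValueError` if any requested label already exists. The variable names are illustrative only, and the filtering step is the manual workaround, not the API proposed in this issue.

```python
import pandas as pd

cat = pd.Categorical(["a", "b", "a"])
wanted = pd.Index(["a", "b", "c", "d"])  # overlaps the existing categories

# Passing labels that already exist is rejected outright.
try:
    cat.add_categories(list(wanted))
    rejected = False
except ValueError:
    rejected = True

# Manual workaround: keep only the labels that are not categories yet.
existing = set(cat.categories)
missing = [label for label in wanted if label not in existing]
extended = cat.add_categories(missing)

print(rejected)                   # True
print(list(extended.categories))  # ['a', 'b', 'c', 'd']
print(list(cat.codes), list(extended.codes))  # [0, 1, 0] [0, 1, 0]
```

The new labels land after the existing ones, and the codes of the existing values are unchanged, so the existing category mappings are not rewritten.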
I just think that a report starting with "I'm trying to reduce the size..." is not something which should be used to influence the API design of … Regarding the use cases: I think that … If you have a cat with … Even if you want "append in the end", a simple …
@JanSchulz ok, closing for now (I did merge the bug fix though, but that is a separate issue)
from SO
.add_categories should be able to take an Index/ndarray; ATM it must be converted to a list.
I would ideally just like to say:
(maybe not the best keyword, but something like this)