astype errors with Categorical #16697
Comments
The correct api is the one that works -
So we could give a better message, but I think this is basically working as it should.
This is purely a user error. You are passing the arguments incorrectly, as shown by @chris-b1.
fixes. This is a bit tricky, as categorical is currently a singleton type.
Makes sense, thanks. Two follow-up questions. Is there a canonical way to pick out categorical columns within a DataFrame? Is there a way to pin a pre-existing Category to a Series? Here I'm thinking of cases where multiple columns of a DataFrame should be explicitly tied to the same dtype.
You can use The problem with
You mean a pre-existing set of categories? Then you can do the
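For the question about tying several columns to the same dtype, here is a minimal sketch. It assumes a pandas release newer than the 0.20.1 discussed in this thread, one where `pd.CategoricalDtype` accepts a `categories` argument; the column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: two columns that should share one set of categories.
df = pd.DataFrame({
    "home": ["red", "blue", "red"],
    "away": ["green", "red", "blue"],
})

# Build the dtype once and reuse the same object for every column.
shared = pd.CategoricalDtype(categories=["red", "green", "blue"])
df = df.astype({"home": shared, "away": shared})

# Both columns now compare equal on dtype and use the same integer codes.
same_dtype = df["home"].dtype == df["away"].dtype
home_codes = df["home"].cat.codes.tolist()
```

Because both columns carry the same categories in the same order, their integer codes are directly comparable across columns.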
A simple solution that always works for me: str(df.dtypes[colname]) == 'category'
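As a possible alternative to comparing dtype strings, the sketch below shows two other ways to pick out categorical columns. It assumes a pandas version that exposes `pd.CategoricalDtype` at the top level (newer than 0.20.1); the frame and its column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one categorical, one float, and one object column.
df = pd.DataFrame({
    "label": pd.Categorical(["a", "b", "a"]),
    "value": np.array([1.0, 2.0, 3.0]),
    "name": ["x", "y", "z"],
})

# Option 1: let pandas filter the columns by dtype.
cat_cols = df.select_dtypes(include="category").columns.tolist()

# Option 2: an isinstance check per column. It never compares a numpy
# dtype against the string 'category', so there is nothing to raise.
cat_cols_isinstance = [
    col for col in df.columns
    if isinstance(df[col].dtype, pd.CategoricalDtype)
]
```

Either form avoids the comparison against the string 'category' that triggers the TypeError reported for np.float64 columns.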
with 0.20.1, this raises:
This doesn't appear to be quite the intended usage.
.astype("category", categories=cat)
also fails, though
.astype("category", categories=cat.categories)
is OK.
I suspect this is related to similar errors in trying to identify which columns of a DataFrame are categorical (possible repeat of #16659):
df.dtypes[colname] == 'category'
evaluates as True for categorical columns and raises
TypeError: data type "category" not understood
for np.float64 columns.
df.dtypes == pd.Categorical
raises
TypeError: Could not compare <type 'type'> type with Series
Also related: #15078
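For readers on later pandas releases, one way to reuse an existing Categorical's categories is to pass its dtype to astype. This is a sketch under the assumption that `Categorical.dtype` carries the categories (true of releases newer than the 0.20.1 above); it is not necessarily what the original poster tried, and the values are illustrative.

```python
import pandas as pd

# An existing Categorical whose categories we want to reuse.
cat = pd.Categorical(["a", "b"], categories=["a", "b", "c"])

# A plain object Series; "d" is deliberately not among the categories.
s = pd.Series(["c", "a", "d"])

# cat.dtype is a CategoricalDtype holding the categories, so astype
# can take it directly.
converted = s.astype(cat.dtype)

# Values outside the categories become missing after the conversion.
missing_mask = converted.isna().tolist()
```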