
BUG: Categoricals shouldn't allow non-strings when object dtype is passed (#13919) #14027


Closed
wants to merge 1 commit into from


wcwagner
Contributor

@wcwagner wcwagner commented Aug 18, 2016

Why this change is needed: the categories of a Categorical are by definition of a single type, so allowing them to hold values of several different types is misleading. Object dtype should only be allowed when ALL values are strings or ALL values are periods (due to the way they are handled internally).

With this PR, creating a categorical that has an object dtype but does not contain all string or all period values raises a TypeError.
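To make the intended rule concrete, here is a minimal standalone sketch of the check. It is not the code in this PR and not pandas' implementation: `validate_object_categories` and its `allowed_types` parameter are hypothetical names, and only `str` is allowed by default so the snippet runs without pandas. With pandas installed, one would presumably pass `allowed_types=(str, pd.Period)`.

```python
def validate_object_categories(categories, allowed_types=(str,)):
    """Hypothetical helper: accept object-dtype categories only if every
    value is an instance of one single allowed type.

    Returns the categories as a list, or raises TypeError for mixed or
    disallowed values. Empty input is accepted.
    """
    values = list(categories)
    for allowed in allowed_types:
        # All values must share the same allowed type; mixing is rejected.
        if all(isinstance(v, allowed) for v in values):
            return values
    names = ", ".join(t.__name__ for t in allowed_types)
    raise TypeError(
        "object-dtype categories must all be of one of these types: " + names
    )


# Usage: all-string categories pass, mixed categories raise.
validate_object_categories(["a", "b", "c"])
try:
    validate_object_categories(["a", 1, 2.5])
except TypeError as exc:
    print("rejected:", exc)
```

Checking one allowed type at a time, rather than `isinstance(v, allowed_types)`, is what rejects a mix of strings and periods even though each type is allowed on its own.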

I have a couple of questions:
MultiIndex.from_arrays creates Categories here, which can have mixed dtypes, and unfortunately my code disallows this. Any tips on how to work around this?

Also, this test in test_constructor is problematic because it converts the categories dtype to object if NaN is in the categories (although this is deprecated). Should I change my code to allow this, or can I assert that this produces a TypeError?

Any feedback is appreciated, thanks

@wcwagner wcwagner closed this Aug 18, 2016
@jreback jreback added the Dtype Conversions (Unexpected or buggy dtype conversions), Error Reporting (Incorrect or improved errors from pandas), and Categorical (Categorical Data Type) labels Aug 18, 2016
Development

Successfully merging this pull request may close these issues.

ERR: Categoricals should not allow non-strings when an object dtype is passed