PERF: removed coercion to int64 in Categorical.from_codes #20961

Closed
wants to merge 1 commit into from
nlee737
Contributor

@nlee737 nlee737 commented May 5, 2018

@WillAyd
Member

WillAyd commented May 5, 2018

Since this is performance related, is there an ASV benchmark for this? If not, can you add one and post the results?

@nlee737
Contributor Author

nlee737 commented May 5, 2018

Different solution needed; removing coercion causes type errors with reduce operations.

@nlee737 nlee737 closed this May 5, 2018
@TomAugspurger
Contributor

@fivemok could you post more details about the issues? Future contributors trying to fix this would appreciate it.

@nlee737
Contributor Author

nlee737 commented May 9, 2018

@TomAugspurger apologies for the lack of details. The from_codes function can take arrays of non-numeric types, which are coerced to an array of np.int64. Removing the coercion completely can break NumPy reduction operations like np.max because the array then has a flexible dtype. I think removing the coercion only for arrays that already have a numeric dtype will work. I'll create a new PR after running the tests.
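
The idea of skipping coercion only for arrays that are already numeric can be sketched roughly as follows. This is a minimal illustration, not the code from this PR or from pandas: `coerce_codes` is a hypothetical helper name, and the sketch assumes that "numeric" means an integer dtype, since codes are integer positions.

```python
import numpy as np


def coerce_codes(codes):
    """Hypothetical helper: keep integer codes as-is, coerce anything else to int64."""
    arr = np.asarray(codes)
    if np.issubdtype(arr.dtype, np.integer):
        # Already an integer dtype (e.g. int8): return it without upcasting.
        return arr
    # Non-integer input (e.g. an object array): fall back to int64 coercion.
    return arr.astype(np.int64)


small = coerce_codes(np.array([0, 1, 0], dtype=np.int8))
other = coerce_codes(np.array([0, 1, 0], dtype=object))
print(small.dtype, other.dtype)
```

With this shape, small integer code arrays keep their dtype, while inputs that are not integer arrays still end up as np.int64 before any reduction such as np.max is applied.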

@WillAyd should the full ASV be posted or just for pd.Categoricals?

@WillAyd
Member

WillAyd commented May 9, 2018

@fivemok just for pd.Categoricals and the relevant method(s)
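
A benchmark limited to the relevant method might look something like the sketch below. It follows the usual asv convention of a class with a `setup` method and `time_`-prefixed methods; the class name `FromCodesInt8`, the array size, and the category values are assumptions made for illustration, not taken from the pandas benchmark suite.

```python
import numpy as np
import pandas as pd


class FromCodesInt8:
    """Hypothetical asv-style benchmark for Categorical.from_codes with int8 codes."""

    def setup(self):
        # 30,000 codes cycling through 0, 1, 2, stored as int8.
        self.categories = ["a", "b", "c"]
        self.codes = np.tile(np.arange(3, dtype=np.int8), 10_000)

    def time_from_codes(self):
        # asv times the body of methods whose names start with "time_".
        pd.Categorical.from_codes(self.codes, self.categories)
```

asv discovers such classes from the benchmark directory, so the file itself needs no asv import.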

@nlee737
Contributor Author

nlee737 commented May 10, 2018

@WillAyd @TomAugspurger I created a new PR here for this issue.

Development

Successfully merging this pull request may close these issues.

Categorical.from_codes shouldn't coerce to int64
3 participants