PERF: avoid casting to float in IntegerArray reducing ops #30436

Closed
jorisvandenbossche opened this issue Dec 23, 2019 · 0 comments · Fixed by #50833
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. NA - MaskedArrays Related to pd.NA and nullable extension arrays Performance Memory or execution speed performance Reduction Operations sum, mean, min, max, etc.

Comments

@jorisvandenbossche
Member

Currently, we cast to float in the IntegerArray reducing ops:

    # coerce to a nan-aware float if needed
    if mask.any():
        data = self._data.astype("float64")
        data[mask] = self._na_value

    op = getattr(nanops, "nan" + name)

However, the nanops functions can already handle a mask. So with an appropriate fill_value, there should be no need to cast the integers to float (or to check for missing values first). At least this is the case for skipna=True; the skipna=False case might need to be handled separately.
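
To illustrate the idea outside of pandas, here is a minimal NumPy-only sketch of a masked sum that stays in the integer dtype. `masked_sum` is a hypothetical helper, not the pandas or nanops implementation, and returning `None` for skipna=False is only a stand-in for whatever NA handling that case would need. The fill value 0 is used because it is the identity for addition; other reductions would need a different fill value.

```python
import numpy as np


def masked_sum(data, mask, skipna=True):
    """Sum integer `data`, skipping positions where `mask` is True.

    Hypothetical helper for illustration only; not part of pandas.
    """
    if not skipna and mask.any():
        # Assumption: stand-in for returning a missing value (e.g. pd.NA).
        return None
    # Fill masked slots with 0, the identity for addition, so the data
    # never has to be cast to float64.
    return np.where(mask, 0, data).sum()


data = np.array([2**53 + 1, 1, 5], dtype="int64")
mask = np.array([False, False, True])

int_result = masked_sum(data, mask)

# The float-casting approach, for comparison: values above 2**53 cannot
# all be represented exactly in float64, so precision is lost.
as_float = data.astype("float64")
as_float[mask] = np.nan
float_result = np.nansum(as_float)

print(int_result, int(float_result))
```

Besides skipping the extra float64 copy, staying in the integer dtype keeps large values exact, which the float path does not guarantee.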

@jorisvandenbossche jorisvandenbossche added Numeric Operations Arithmetic, Comparison, and Logical operations Performance Memory or execution speed performance ExtensionArray Extending pandas with custom dtypes or arrays. labels Dec 23, 2019
@jorisvandenbossche jorisvandenbossche added this to the Contributions Welcome milestone Dec 23, 2019
@jbrockmendel jbrockmendel added the Reduction Operations sum, mean, min, max, etc. label Oct 27, 2020
@mroeschke mroeschke removed the Numeric Operations Arithmetic, Comparison, and Logical operations label Jul 25, 2021
@jbrockmendel jbrockmendel added the NA - MaskedArrays Related to pd.NA and nullable extension arrays label Dec 21, 2021
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022

3 participants