API: IntegerArray reductions: data type of the scalar result? #23106
Comments
This adds the patch, but some tests are failing.
@jreback that changes the output of accessing a single item to a Python int instead of a numpy int. I was actually thinking we should ensure the reductions return a numpy scalar instead of a Python int, to be consistent with accessing items. And you would still have the inconsistency in the return type of …
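The inconsistency being described can be illustrated with plain numpy (a hedged sketch, not the IntegerArray implementation): item access on a numpy-backed array yields a numpy scalar, while a reduction coerced through `int()` yields a plain Python int.

```python
import numpy as np

# Sketch of the type mismatch discussed above, using plain numpy.
a = np.array([1, 2, 3])
item = a[0]            # numpy integer scalar (np.int64 on most platforms)
total = int(a.sum())   # Python int after the kind of int() coercion at issue

print(type(item).__module__)   # 'numpy'
print(type(total))             # <class 'int'>
```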
I'm facing a similar issue with

```
>>> arr = SparseArray([0, 1], fill_value=0)
>>> isinstance(arr[0], arr.dtype.type)
```

which is false, since it's a Python int, and the scalar … Having these scalar types be numpy integers is more consistent with the way things are currently, but do we want to be bound to that going forward?
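The numpy convention this comment is appealing to can be checked directly (plain numpy, not SparseArray): indexing a numpy array returns a scalar that passes the `isinstance(x, arr.dtype.type)` test, whereas a plain Python int does not.

```python
import numpy as np

# Indexing returns a numpy scalar whose type matches the array's dtype.type.
a = np.array([0, 1])
print(isinstance(a[0], a.dtype.type))   # True

# A plain Python int fails the same check, which is the SparseArray
# situation described above (values coming back as Python ints).
print(isinstance(0, a.dtype.type))      # False
```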
So that indeed gives this inconsistency:

…

which is bad, I would say.
I have a WIP doing that (well, making sure that the dtype matches), and this came up. There are some issues with string dtypes, but overall I think it'll be fine to require that the scalar matches.
Follow-up on #22762, which added `_reduce` to the ExtensionArray interface and added an implementation for IntegerArray. Currently, for that IntegerArray we special-case 'sum', 'min' and 'max' to make sure they return ints (because the underlying implementation can be based on a float array, depending on whether there are missing values).
pandas/pandas/core/arrays/integer.py
Lines 549 to 554 in 08e2752
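The special-casing referenced above can be sketched roughly with plain numpy (a hypothetical illustration, not the actual pandas source at those lines): with missing values present, the integer data may live in a float ndarray, so the raw reduction comes back as a float scalar that then gets coerced.

```python
import numpy as np

# Integer data with a missing value, stored as floats (NaN marks the NA slot).
data = np.array([1.0, 2.0, np.nan])   # conceptually Int64 with one NA
raw = np.nansum(data)                 # numpy float64 scalar, value 3.0

# The kind of coercion applied for 'sum', 'min' and 'max' in the issue:
result = int(raw)
print(result, type(result))           # 3 <class 'int'>
```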
That basic check has the following "problem":

…