BUG / ENH: Implement groupby_helper funcs for int #16676
Comments
so this is actually straightforward. we add a groupby_mean that accepts int, but returns float.
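As a rough illustration of the proposal above (not the actual Cython helper), here is a minimal pure-Python sketch of a mean that accepts integers and returns a float. The name `mean_int_to_float` is hypothetical and does not exist in pandas. It also suggests why a float return type does not remove rounding for integers above 2**53.

```python
def mean_int_to_float(values):
    """Hypothetical helper: take integers, return their mean as a float."""
    if not values:
        raise ValueError("cannot take the mean of an empty group")
    total = sum(values)         # exact: Python ints have arbitrary precision
    return total / len(values)  # true division always yields a float


small = mean_int_to_float([1, 2])   # 1.5, representable exactly as a float

big = 2**53 + 1                     # not representable as a 64-bit float
large = mean_int_to_float([big])    # rounded to the nearest float, 2**53
```

The accumulation itself is exact here, so any loss comes only from the final conversion to `float`; a fixed-width Cython implementation would have its own overflow concerns that this sketch does not model.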
I wish it was that simple, though even when implementing those functions, I still see rounding issues, as can be seen from an attempted patch off of 5bf7f9:
I'm not clear on what the ask is here. @gfyoung what would the suggested helper(s) be doing?
In the case the input is an integer (

Removing the downcasting leads to 26 failed tests. I haven't looked into these yet to see if there are any surprises, or if they are all just because of the dtype change for mean. Marking this as a good first issue for investigation.

The downcasting to remove is done in the code below. Once removed, try fixing up the tests and see if there are any cases where not downcasting leads to unexpected / undesirable results.

pandas/pandas/core/reshape/pivot.py Lines 175 to 195 in db27c36
From #15091 (at 028188):

When we pivot, we have to aggregate the values we group together by the `index` and `columns` parameters. When we specify `mean` as the aggregator, we eventually get around to calling `_get_cython_function`, which searches for an implementation of `mean` in `groupby.pyx` for integers. However, `groupby_helper.pxi` only defines them for floats, so the data is then cast to `float` for aggregating before being reconverted back to `int` in the final result, leading to the mysterious increment due to rounding.

Had there been an implementation of `mean` for `int`, then this wouldn't happen. However, implementing `mean` for `int` isn't straightforward because we can't guarantee returning `int` as the `float` implementations can't guarantee returning `float` without losing precision (which is the contract in the `groupby_helper.pxi` template).
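The cast-aggregate-recast path described above can be reproduced without pandas. The following is a minimal sketch in plain Python, assuming 64-bit IEEE 754 floats; `mean_via_float_roundtrip` is a hypothetical stand-in for the float-only aggregation, and the input value is chosen for illustration rather than taken from the original report.

```python
def mean_via_float_roundtrip(values):
    """Mimic the described path: int -> float, aggregate, float -> int."""
    as_floats = [float(v) for v in values]    # cast to float before aggregating
    mean = sum(as_floats) / len(as_floats)    # float-only mean
    return int(mean)                          # reconvert to int in the result


original = 2**53 + 3                          # 9007199254740995
result = mean_via_float_roundtrip([original])
print(result - original)                      # 1: the value comes back incremented
```

A single-element group is enough to trigger the effect, because the rounding happens in the cast to `float`, before any arithmetic is done.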