BUG: GroupBy.std floating point error #51332
While writing test cases for some of my code I hit an oddity. I compute the std of some values directly to work out what a routine doing multiple aggregations over groups should return, and for one set of values the std computed via the groupby came out very slightly different. When I searched the list of open issues I found this bug report and wondered whether what I'm observing is related (or a misunderstanding on my part). I pared it down to a short bit of code to illustrate:
And the output:
I expect the difference is due to Series and GroupBy having different implementations of std; xref #53261.
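To illustrate how two std implementations can disagree in the last bits, here is a minimal sketch in plain Python. It assumes, without confirming it against the pandas source, that one code path uses a two-pass formula and the other a single-pass (Welford-style) update. `std_two_pass` and `std_single_pass` are hypothetical names, not pandas functions, and the sample values are made up.

```python
import math


def std_two_pass(values, ddof=1):
    """Sample std computed in two passes: the mean first, then squared deviations."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_dev / (n - ddof))


def std_single_pass(values, ddof=1):
    """Sample std computed in one pass with Welford's running update."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    return math.sqrt(m2 / (count - ddof))


# Made-up sample values, not the ones from this report.
values = [0.1, 0.2, 0.3, 0.7, 1.9]
two_pass = std_two_pass(values)
single_pass = std_single_pass(values)

print(repr(two_pass), repr(single_pass))
# Exact equality is not guaranteed; agreement within a tolerance is.
print(math.isclose(two_pass, single_pass, rel_tol=1e-12))
```

Both functions are mathematically equivalent, so any mismatch between them is rounding, not a different definition of std.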
I guess the implementation difference in std for GroupBy is also why the GroupBy version of it seems to give different results depending on the order of the values fed into it...
...
It took me a while to figure out why the test I wrote for some code of mine, which aggregates via groupby, was calculating a different answer for std in some cases even when the expected value was computed with groupby itself. I double-checked the problematic group of values and noticed that the order was the only difference.
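Order dependence of this kind is a general property of floating-point accumulation rather than something specific to groupby. A minimal sketch in plain Python (not the pandas code path) showing that the same three values summed in a different order give a different result, and how a test can compare with a tolerance instead of `==`:

```python
import math

# Same three values, accumulated in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)  # False: the results differ in the last bit

# A tolerance-based comparison is the safer assertion in a test.
print(math.isclose(left_to_right, right_to_left, rel_tol=1e-9))  # True

# math.fsum tracks the rounding error, so it does not depend on input order.
print(math.fsum([0.1, 0.2, 0.3]) == math.fsum([0.3, 0.2, 0.1]))  # True
```

For tests that compare pandas results, `pandas.testing.assert_series_equal` and `assert_frame_equal` compare floats with a tolerance by default (`check_exact=False`), which avoids this class of failure.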
In the groupby std method we cast from ints to floats, which I don't think we do in the relevant Series code (in nanops). This is my best guess for the culprit in this mismatch.
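One way to sanity-check whether an int-to-float cast could lose information on its own is to test whether the values round-trip through a 64-bit float. This is a sketch in plain Python, not the pandas code; `cast_is_exact` is a hypothetical helper. It does not settle whether the cast is the cause of the mismatch, only whether the cast changes the input values.

```python
def cast_is_exact(values):
    """Return True if every integer survives a round trip through float."""
    return all(int(float(v)) == v for v in values)


# Integers up to 2**53 in magnitude are exactly representable as 64-bit floats.
small = [1, 2, 3, 1_000_000]
large = [2 ** 53 + 1]  # first positive integer that a 64-bit float cannot hold

print(cast_is_exact(small))  # True
print(cast_is_exact(large))  # False
```

If the cast is exact for the values in question, any remaining difference would have to come from the arithmetic performed after the cast rather than from the cast itself.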