Bug when computing rolling_mean with extreme value #11645
Comments
This is not a bug but rather a feature of floating point math. An efficient rolling mean makes use of a rolling sum. Having numbers that differ in magnitude by more than 1/np.finfo(np.double).eps results in truncation. So when you add the big number in, you effectively lose all information in the small numbers, and when that number is finally removed, there is nothing left in the rolling sum about the small numbers, so it is as if they were 0.
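The mechanism described here can be sketched with a hand-written running-sum rolling mean. This is a minimal illustration in plain Python, not the pandas implementation: the helper name naive_rolling_mean is hypothetical, the data (the values 0 to 9 with the third one replaced by -9e33) is an assumed reading of the report, and the exact wrong values depend on the order in which a real implementation adds and removes terms.

```python
def naive_rolling_mean(values, window):
    """Rolling mean kept as a running sum (hypothetical helper, not pandas code)."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v                       # value entering the window
        if i >= window:
            running -= values[i - window]  # value leaving the window
        # Positions before the first full window have no result.
        out.append(running / window if i >= window - 1 else None)
    return out


clean = [float(i) for i in range(10)]
spiked = list(clean)
spiked[2] = -9e33  # assumed extreme value; it swallows every small term in the sum

clean_means = naive_rolling_mean(clean, 5)
spiked_means = naive_rolling_mean(spiked, 5)

print(clean_means[-3:])   # [5.0, 6.0, 7.0]
print(spiked_means[-3:])  # [0.0, 1.0, 2.0]
```

Once the spike leaves the window the running sum drops to exactly 0.0, so the contributions of 3, 4, 5, 6 and 7 that were added while the spike dominated are gone for good.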
Thank you for your quick answer. I take the point. Best regards
I don't think it is really possible to warn about numeric limits without substantially affecting performance. For example, consider
x = np.array([2e17]) ** 2 + 1 - np.array([2e17]) ** 2
The expressions
np.array([2e17]) ** 2 - np.array([2e17]) ** 2 + 1
and
np.array([2e17]) ** 2 + 1 - np.array([2e17]) ** 2
should be the same but they aren't, and numpy doesn't provide any warning. I think it is a lot to ask them to protect the end user from numerical limits.
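For reference, the two expressions can be evaluated with plain Python floats, which are the same IEEE 754 doubles as np.float64. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
import sys

big = 2e17 ** 2  # about 4e34, far larger than 1 / eps

lost = big + 1 - big  # the 1 is absorbed by big before big is subtracted
kept = big - big + 1  # big cancels first, so the 1 survives

print(lost, kept)                          # 0.0 1.0
print(big > 1 / sys.float_info.epsilon)    # True
```

Both lines run silently: nothing in the arithmetic signals that the added 1 was discarded.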
I think it would be fair to add a note in the docs about the implementation. In this example, a user may not know that previous values affect later values even when the window no longer contains those values. The same goes for other algorithms, and info about time/space complexity can be useful too.
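To make the trade-off concrete, here is a sketch of the alternative: recomputing every window from scratch, so that a value outside the window cannot influence the result. The helper name windowed_mean is hypothetical, the data (0 to 9 with -9e33 in third position) is an assumed reading of the report, and this is not how pandas computes rolling means. It costs O(n * window) time instead of O(n) for the running-sum approach.

```python
import math


def windowed_mean(values, window):
    """Mean of each full window, recomputed independently (hypothetical helper)."""
    out = []
    for i in range(len(values)):
        if i < window - 1:
            out.append(None)  # not enough values for a full window yet
        else:
            # math.fsum sums the slice without accumulating rounding error.
            out.append(math.fsum(values[i - window + 1 : i + 1]) / window)
    return out


spiked = [float(i) for i in range(10)]
spiked[2] = -9e33  # assumed extreme value

means = windowed_mean(spiked, 5)
print(means[-3:])  # [5.0, 6.0, 7.0]
```

Windows that still contain the extreme value are dominated by it, as expected, but the later windows are unaffected.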
I agree, because I did try to find an explanation in the docs before running tests on my own. Some information about the implementation would have helped.
OK, how about we add this to the docs? @julienvienne, are you up for a pull request? Note that #11603 will be merged shortly, so write it against the new structure for the docs (the text is the same in the original docs, but those are going to be deprecated, so work on the new ones).
Hello,
Please consider the following code:
For the last date (2015-01-10), you should obtain 7, which is the mean of [5, 6, 7, 8, 9].
Now, please replace the 2015-01-03 value with the extreme value -9e+33.
Then compute rolling_mean again:
As you can see, from 2015-01-08 onward the computation returns incorrect results, i.e. [1, 2, 3] instead of [5, 6, 7]. The extreme value has perturbed the computation for the following dates.
Best regards,