-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF: Remove rolling window fixed algorithms #36567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REF: Remove rolling window fixed algorithms #36567
Conversation
… with center tests still failing
@@ -767,7 +767,7 @@ def test_rolling_numerical_too_large_numbers(): | |||
ds[2] = -9e33 | |||
result = ds.rolling(5).mean() | |||
expected = pd.Series( | |||
[np.nan, np.nan, np.nan, np.nan, -1.8e33, -1.8e33, -1.8e33, 0.0, 6.0, 7.0], | |||
[np.nan, np.nan, np.nan, np.nan, -1.8e33, -1.8e33, -1.8e33, 5.0, 6.0, 7.0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was thought that the 0
was expected due to numerical precision, but by using the variable algorithm instead of the fixed algorithm we get the correct value
xref #11645 (comment)
so all of your bencharmarks on quantile are a bit misleading; they are actually measuring the perf diff in min/max (which is what percentile 0 or 1 does) |
@@ -414,7 +356,7 @@ cdef inline float64_t calc_var(int64_t minp, int ddof, float64_t nobs, | |||
result = 0 | |||
else: | |||
result = ssqdm_x / (nobs - <float64_t>ddof) | |||
if result < 0: | |||
if result < 1e-15: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this new? ok but can you add a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, turns out var/std doesn't use Kahan Summation so using the variable algorithm has a small numerical imprecision
let's file an issue for followon to see if we can improve min//max rolling with variable. |
any comments @TomAugspurger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you can add a release, e.g. slight performance decrease in min/max fixed algos.
do these changes help / hurt #36132 ? (maybe worth pulling in and xfailing these tests)? though that can certainly be a followon |
Yeah I was thinking #36132 could be simplified once this PR is merged in. Probably best on a follow-on |
All green |
cc @pandas-dev/pandas-core if any comments. |
can you merge master and ping on green. |
@jreback green |
thanks @mroeschke very nice |
* Fix rolling test result, removed fixed algorithms, some rolling apply with center tests still failing * Rename function and move offset to correct location * Impliment center in terms of indexers * Get all the tests to pass * Remove center from _apply as no longer needed * Add better typing * Deal with numeric precision * Remove code related to center now being implimented in the fixed indexer * Add note regarding precision issues * Note performance hit * Remove tilde * Change useage of get_cython_func_type * Remove self.center from count's _apply Co-authored-by: Matt Roeschke <[email protected]>
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Pros:
Cons: