REF: Remove rolling window fixed algorithms #36567

mroeschke · 2020-09-23T06:33:13Z

tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff

Pros:

Less code to maintain + makes internals simpler
Fixed a bug: https://github.com/pandas-dev/pandas/pull/36567/files#diff-7656adc204bcddba0534588cabdab62eR770

Cons:

A performance hit

       before           after         ratio
     [d9722efe]       [198ab9e8]
     <clean/rolling_aggregations^2>       <clean/rolling_aggregations>
+     1.27±0.02ms         4.05±1ms     3.19  rolling.Quantile.time_quantile('Series', 10, 'float', 1, 'linear')
+     1.27±0.01ms       3.67±0.7ms     2.88  rolling.Quantile.time_quantile('Series', 10, 'float', 1, 'nearest')
+     1.27±0.01ms       3.22±0.3ms     2.54  rolling.Quantile.time_quantile('Series', 10, 'float', 1, 'higher')
+     1.27±0.01ms      2.99±0.08ms     2.36  rolling.Quantile.time_quantile('Series', 10, 'float', 1, 'midpoint')
+        1.29±0ms      2.92±0.01ms     2.26  rolling.Quantile.time_quantile('Series', 10, 'float', 0, 'nearest')
+        1.29±0ms      2.91±0.01ms     2.26  rolling.Quantile.time_quantile('Series', 10, 'float', 0, 'midpoint')
+     1.29±0.08ms      2.91±0.01ms     2.25  rolling.Methods.time_rolling('Series', 10, 'float', 'max')
+     1.29±0.01ms         2.91±0ms     2.25  rolling.Quantile.time_quantile('Series', 10, 'float', 0, 'higher')
+     1.29±0.01ms      2.90±0.02ms     2.24  rolling.Quantile.time_quantile('Series', 10, 'float', 0, 'linear')
+     1.30±0.02ms      2.91±0.04ms     2.24  rolling.Quantile.time_quantile('Series', 10, 'float', 0, 'lower')
+     1.30±0.03ms      2.90±0.01ms     2.23  rolling.Quantile.time_quantile('Series', 10, 'float', 1, 'lower')
+     1.32±0.08ms      2.93±0.02ms     2.22  rolling.Methods.time_rolling('Series', 10, 'float', 'min')
+     1.33±0.03ms         2.95±0ms     2.21  rolling.Methods.time_rolling('Series', 10, 'int', 'max')
+     1.44±0.01ms      3.11±0.03ms     2.16  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 1, 'lower')
+     1.37±0.05ms      2.96±0.01ms     2.16  rolling.Methods.time_rolling('Series', 10, 'int', 'min')
+     1.44±0.01ms      3.10±0.02ms     2.15  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 1, 'midpoint')
+     1.44±0.01ms      3.07±0.01ms     2.14  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 1, 'higher')
+        1.44±0ms      3.08±0.01ms     2.14  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 1, 'nearest')
+     1.45±0.01ms      3.07±0.01ms     2.13  rolling.Methods.time_rolling('DataFrame', 10, 'float', 'max')
+     1.46±0.01ms      3.07±0.01ms     2.10  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 1, 'linear')
+        1.47±0ms      3.09±0.02ms     2.10  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 0, 'linear')
+     1.48±0.02ms      3.10±0.02ms     2.10  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 0, 'higher')
+        1.47±0ms         3.08±0ms     2.09  rolling.Methods.time_rolling('DataFrame', 10, 'float', 'min')
+     1.47±0.01ms      3.07±0.01ms     2.09  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 0, 'midpoint')
+      1.48±0.2ms      3.10±0.02ms     2.09  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 0, 'nearest')
+     1.47±0.04ms      3.05±0.02ms     2.08  rolling.Quantile.time_quantile('Series', 1000, 'float', 1, 'linear')
+     1.52±0.01ms      3.15±0.02ms     2.07  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'max')
+     1.54±0.01ms      3.17±0.03ms     2.06  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'min')
+        1.49±0ms      3.05±0.06ms     2.04  rolling.Quantile.time_quantile('Series', 1000, 'float', 1, 'nearest')
+        1.49±0ms      3.03±0.03ms     2.03  rolling.Quantile.time_quantile('Series', 1000, 'float', 1, 'lower')
+        1.49±0ms      2.98±0.03ms     1.99  rolling.Quantile.time_quantile('Series', 1000, 'float', 1, 'midpoint')
+     1.40±0.01ms      2.75±0.01ms     1.97  rolling.ExpandingMethods.time_expanding('Series', 'float', 'max')
+        1.49±0ms      2.93±0.02ms     1.96  rolling.Quantile.time_quantile('Series', 1000, 'float', 1, 'higher')
+     1.44±0.01ms      2.81±0.02ms     1.96  rolling.ExpandingMethods.time_expanding('Series', 'int', 'max')
+      1.59±0.2ms      3.11±0.03ms     1.95  rolling.Quantile.time_quantile('DataFrame', 10, 'float', 0, 'lower')
+         930±3μs      1.80±0.01ms     1.94  rolling.Quantile.time_quantile('Series', 10, 'int', 0, 'higher')
+        1.52±0ms      2.94±0.02ms     1.93  rolling.Quantile.time_quantile('Series', 1000, 'float', 0, 'linear')
+      1.51±0.2ms      2.92±0.01ms     1.93  rolling.Methods.time_rolling('Series', 1000, 'float', 'max')
+         933±4μs      1.80±0.01ms     1.93  rolling.Quantile.time_quantile('Series', 10, 'int', 0, 'nearest')
+         931±4μs      1.79±0.01ms     1.93  rolling.Quantile.time_quantile('Series', 10, 'int', 1, 'nearest')
+     1.69±0.01ms      3.26±0.01ms     1.93  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 1, 'linear')
+         933±4μs      1.80±0.01ms     1.93  rolling.Quantile.time_quantile('Series', 10, 'int', 0, 'midpoint')
+      1.52±0.2ms      2.93±0.01ms     1.93  rolling.Methods.time_rolling('Series', 1000, 'float', 'min')
+        937±10μs      1.80±0.01ms     1.92  rolling.Quantile.time_quantile('Series', 10, 'int', 0, 'lower')
+        1.56±0ms      2.99±0.05ms     1.92  rolling.ExpandingMethods.time_expanding('DataFrame', 'float', 'max')
+        940±30μs      1.80±0.01ms     1.91  rolling.Quantile.time_quantile('Series', 10, 'int', 1, 'midpoint')
+     1.56±0.02ms      2.98±0.01ms     1.91  rolling.Methods.time_rolling('Series', 1000, 'int', 'max')
+         935±3μs      1.79±0.01ms     1.91  rolling.Quantile.time_quantile('Series', 10, 'int', 1, 'higher')
+         934±7μs      1.78±0.01ms     1.91  rolling.Quantile.time_quantile('Series', 10, 'int', 0, 'linear')
+         931±3μs      1.78±0.01ms     1.91  rolling.Quantile.time_quantile('Series', 10, 'int', 1, 'lower')
+     1.45±0.01ms         2.76±0ms     1.91  rolling.ExpandingMethods.time_expanding('Series', 'float', 'min')
+        1.63±0ms      3.11±0.01ms     1.90  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 0, 'nearest')
+         937±4μs      1.78±0.01ms     1.90  rolling.Quantile.time_quantile('Series', 10, 'int', 1, 'linear')
+         961±2μs      1.82±0.04ms     1.90  rolling.Quantile.time_quantile('Series', 1000, 'int', 0, 'midpoint')
+     1.49±0.02ms      2.82±0.01ms     1.89  rolling.ExpandingMethods.time_expanding('Series', 'int', 'min')
+         964±9μs      1.82±0.02ms     1.88  rolling.Quantile.time_quantile('Series', 1000, 'int', 0, 'nearest')
+         963±7μs      1.81±0.05ms     1.88  rolling.Quantile.time_quantile('Series', 1000, 'int', 0, 'linear')
+         962±8μs      1.81±0.01ms     1.88  rolling.Quantile.time_quantile('Series', 1000, 'int', 0, 'higher')
+         967±3μs      1.81±0.01ms     1.88  rolling.Quantile.time_quantile('Series', 1000, 'int', 0, 'lower')
+         966±3μs      1.81±0.02ms     1.87  rolling.Quantile.time_quantile('Series', 1000, 'int', 1, 'linear')
+         964±1μs      1.80±0.02ms     1.87  rolling.Quantile.time_quantile('Series', 1000, 'int', 1, 'nearest')
+     1.66±0.03ms      3.10±0.01ms     1.87  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 1, 'nearest')
+         964±4μs         1.80±0ms     1.87  rolling.Quantile.time_quantile('Series', 1000, 'int', 1, 'higher')
+        1.12±0ms      2.08±0.01ms     1.87  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 1, 'nearest')
+     1.67±0.01ms      3.10±0.01ms     1.86  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 1, 'midpoint')
+     1.63±0.02ms      3.02±0.03ms     1.86  rolling.ExpandingMethods.time_expanding('DataFrame', 'int', 'max')
+     1.68±0.01ms      3.11±0.01ms     1.86  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 1, 'lower')
+         969±3μs      1.80±0.02ms     1.85  rolling.Quantile.time_quantile('Series', 1000, 'int', 1, 'midpoint')
+     1.68±0.01ms      3.11±0.02ms     1.85  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 1, 'higher')
+     1.61±0.09ms      2.97±0.01ms     1.85  rolling.Methods.time_rolling('Series', 1000, 'int', 'min')
+         970±7μs      1.79±0.01ms     1.84  rolling.Quantile.time_quantile('Series', 1000, 'int', 1, 'lower')
+     1.72±0.01ms       3.16±0.1ms     1.84  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 0, 'linear')
+     1.71±0.02ms      3.16±0.07ms     1.84  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'max')
+     1.70±0.08ms      3.13±0.02ms     1.84  rolling.Methods.time_rolling('DataFrame', 1000, 'float', 'min')
+     1.70±0.07ms      3.11±0.01ms     1.83  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 0, 'lower')
+     1.71±0.01ms      3.13±0.02ms     1.83  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 0, 'higher')
+     1.65±0.01ms      3.00±0.02ms     1.82  rolling.ExpandingMethods.time_expanding('DataFrame', 'int', 'min')
+     1.15±0.01ms      2.09±0.01ms     1.82  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 1, 'linear')
+     1.71±0.02ms      3.11±0.01ms     1.82  rolling.Methods.time_rolling('DataFrame', 1000, 'float', 'max')
+        1.71±0ms      3.11±0.01ms     1.82  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 0, 'midpoint')
+     1.62±0.01ms      2.94±0.03ms     1.82  rolling.ExpandingMethods.time_expanding('DataFrame', 'float', 'min')
+        1.12±0ms      2.02±0.05ms     1.80  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 1, 'higher')
+     1.76±0.08ms      3.15±0.01ms     1.80  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'min')
+     1.12±0.01ms      1.99±0.03ms     1.78  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0, 'nearest')
+     1.12±0.01ms      1.99±0.02ms     1.78  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 1, 'lower')
+        1.14±0ms      2.03±0.03ms     1.78  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 0, 'higher')
+        1.12±0ms      1.99±0.01ms     1.78  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0, 'higher')
+        1.15±0ms      2.03±0.05ms     1.77  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 0, 'linear')
+     1.13±0.01ms      2.00±0.05ms     1.77  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 0, 'nearest')
+        1.12±0ms      1.98±0.04ms     1.77  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0, 'lower')
+        1.12±0ms      1.97±0.01ms     1.76  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 1, 'midpoint')
+     1.13±0.01ms      1.99±0.01ms     1.76  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0, 'midpoint')
+        1.12±0ms      1.97±0.01ms     1.76  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 1, 'linear')
+     1.14±0.01ms      1.98±0.01ms     1.74  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 1, 'midpoint')
+     1.14±0.02ms         1.97±0ms     1.74  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 0, 'lower')
+     1.14±0.01ms         1.97±0ms     1.73  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 1, 'higher')
+     1.14±0.01ms      1.97±0.01ms     1.73  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 0, 'midpoint')
+        1.14±0ms      1.97±0.01ms     1.72  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 1, 'nearest')
+     1.14±0.01ms      1.97±0.01ms     1.72  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 1, 'lower')
+     1.46±0.07ms      2.48±0.01ms     1.70  rolling.Methods.time_rolling('Series', 10, 'float', 'std')
+     1.21±0.09ms      2.00±0.02ms     1.66  rolling.Quantile.time_quantile('DataFrame', 10, 'int', 0, 'linear')
+     1.54±0.03ms      2.54±0.01ms     1.65  rolling.Methods.time_rolling('Series', 1000, 'int', 'std')
+     1.62±0.04ms      2.66±0.01ms     1.65  rolling.Methods.time_rolling('DataFrame', 10, 'float', 'std')
+     1.68±0.01ms      2.75±0.02ms     1.64  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'std')
+      1.56±0.1ms      2.53±0.01ms     1.63  rolling.Methods.time_rolling('Series', 10, 'int', 'std')
+     1.66±0.03ms      2.67±0.01ms     1.61  rolling.Methods.time_rolling('DataFrame', 1000, 'float', 'std')
+     1.71±0.03ms      2.73±0.01ms     1.59  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'std')
+        913±40μs         1.42±0ms     1.56  rolling.Methods.time_rolling('Series', 10, 'float', 'mean')
+      1.62±0.2ms      2.49±0.01ms     1.54  rolling.Methods.time_rolling('Series', 1000, 'float', 'std')
+       933±100μs      1.43±0.01ms     1.53  rolling.Methods.time_rolling('Series', 1000, 'float', 'mean')
+        980±30μs      1.48±0.01ms     1.51  rolling.Methods.time_rolling('Series', 10, 'int', 'mean')
+     1.07±0.01ms      1.60±0.01ms     1.50  rolling.Methods.time_rolling('DataFrame', 10, 'float', 'mean')
+        985±60μs      1.47±0.01ms     1.50  rolling.Methods.time_rolling('Series', 1000, 'int', 'mean')
+        1.21±0ms      1.81±0.01ms     1.49  rolling.ExpandingMethods.time_expanding('Series', 'float', 'kurt')
+        1.43±0ms      2.11±0.01ms     1.47  rolling.Methods.time_rolling('Series', 1000, 'int', 'kurt')
+        1.14±0ms      1.68±0.01ms     1.47  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'mean')
+        1.26±0ms         1.86±0ms     1.47  rolling.ExpandingMethods.time_expanding('Series', 'int', 'kurt')
+     1.40±0.08ms      2.06±0.01ms     1.47  rolling.Methods.time_rolling('Series', 10, 'float', 'kurt')
+     1.14±0.01ms         1.67±0ms     1.46  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'mean')
+     1.12±0.01ms      1.63±0.01ms     1.45  rolling.Methods.time_rolling('DataFrame', 1000, 'float', 'mean')
+     1.55±0.01ms      2.24±0.01ms     1.44  rolling.Methods.time_rolling('DataFrame', 10, 'float', 'kurt')
+         846±4μs         1.22±0ms     1.44  rolling.ExpandingMethods.time_expanding('Series', 'float', 'mean')
+     1.38±0.01ms      1.98±0.01ms     1.43  rolling.ExpandingMethods.time_expanding('DataFrame', 'float', 'kurt')
+        1.62±0ms      2.30±0.01ms     1.42  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'kurt')
+     1.27±0.02ms      1.80±0.02ms     1.41  rolling.Methods.time_rolling('Series', 10, 'float', 'skew')
+     1.57±0.02ms      2.22±0.06ms     1.41  rolling.Methods.time_rolling('DataFrame', 1000, 'float', 'kurt')
+     1.32±0.08ms      1.86±0.01ms     1.41  rolling.Methods.time_rolling('Series', 10, 'int', 'skew')
+     1.44±0.01ms      2.03±0.01ms     1.41  rolling.ExpandingMethods.time_expanding('DataFrame', 'int', 'kurt')
+         895±3μs         1.26±0ms     1.40  rolling.ExpandingMethods.time_expanding('Series', 'int', 'mean')
+     1.32±0.06ms         1.84±0ms     1.40  rolling.Methods.time_rolling('Series', 1000, 'int', 'skew')
+     1.64±0.04ms      2.29±0.01ms     1.40  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'kurt')
+     1.42±0.01ms         1.97±0ms     1.39  rolling.Methods.time_rolling('DataFrame', 10, 'float', 'skew')
+        900±50μs      1.25±0.02ms     1.38  rolling.Methods.time_rolling('Series', 10, 'float', 'sum')
+     2.12±0.07ms      2.92±0.01ms     1.38  rolling.Quantile.time_quantile('Series', 1000, 'float', 0, 'lower')
+     1.49±0.01ms         2.04±0ms     1.37  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'skew')
+     1.45±0.04ms      1.98±0.03ms     1.37  rolling.Methods.time_rolling('DataFrame', 1000, 'float', 'skew')
+     1.51±0.02ms      2.05±0.01ms     1.36  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'skew')
+      1.51±0.2ms      2.05±0.01ms     1.36  rolling.Methods.time_rolling('Series', 1000, 'float', 'kurt')
+     1.02±0.01ms      1.38±0.01ms     1.36  rolling.ExpandingMethods.time_expanding('DataFrame', 'float', 'mean')
+     1.08±0.01ms      1.46±0.02ms     1.36  rolling.ExpandingMethods.time_expanding('DataFrame', 'int', 'mean')
+        962±60μs         1.28±0ms     1.34  rolling.Methods.time_rolling('Series', 10, 'int', 'sum')
+         841±5μs      1.12±0.01ms     1.33  rolling.ExpandingMethods.time_expanding('Series', 'float', 'sum')
+     1.26±0.01ms      1.66±0.02ms     1.33  rolling.ExpandingMethods.time_expanding('Series', 'int', 'skew')
+        969±80μs      1.28±0.01ms     1.32  rolling.Methods.time_rolling('Series', 1000, 'int', 'sum')
+     1.20±0.02ms      1.58±0.01ms     1.31  rolling.ExpandingMethods.time_expanding('Series', 'float', 'skew')
+     1.14±0.01ms      1.48±0.01ms     1.30  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'sum')
+         899±5μs         1.17±0ms     1.30  rolling.ExpandingMethods.time_expanding('Series', 'int', 'sum')
+     1.08±0.04ms      1.40±0.01ms     1.30  rolling.Methods.time_rolling('DataFrame', 10, 'float', 'sum')
+     1.09±0.04ms      1.41±0.01ms     1.29  rolling.Methods.time_rolling('DataFrame', 1000, 'float', 'sum')
+        1.36±0ms      1.75±0.01ms     1.29  rolling.ExpandingMethods.time_expanding('DataFrame', 'float', 'skew')
+     1.15±0.06ms      1.47±0.01ms     1.28  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'sum')
+        1.01±0ms         1.28±0ms     1.27  rolling.ExpandingMethods.time_expanding('DataFrame', 'float', 'sum')
+        1.43±0ms      1.81±0.01ms     1.27  rolling.ExpandingMethods.time_expanding('DataFrame', 'int', 'skew')
+     1.22±0.03ms      1.54±0.01ms     1.26  rolling.Methods.time_rolling('Series', 10, 'float', 'count')
+        1.10±0ms      1.38±0.01ms     1.26  rolling.ExpandingMethods.time_expanding('Series', 'int', 'count')
+     1.07±0.01ms         1.35±0ms     1.26  rolling.ExpandingMethods.time_expanding('DataFrame', 'int', 'sum')
+        1.36±0ms      1.71±0.02ms     1.26  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'count')
+     1.40±0.01ms      1.74±0.01ms     1.24  rolling.Methods.time_rolling('DataFrame', 10, 'float', 'count')
+     1.14±0.01ms         1.40±0ms     1.23  rolling.ExpandingMethods.time_expanding('Series', 'float', 'count')
+      1.22±0.1ms         1.49±0ms     1.22  rolling.Methods.time_rolling('Series', 10, 'int', 'count')
+     1.40±0.04ms      1.70±0.01ms     1.21  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'count')
+     1.43±0.09ms      1.73±0.01ms     1.21  rolling.Methods.time_rolling('DataFrame', 1000, 'float', 'count')
+        1.29±0ms      1.56±0.01ms     1.21  rolling.ExpandingMethods.time_expanding('DataFrame', 'int', 'count')
+     1.33±0.01ms      1.60±0.01ms     1.20  rolling.ExpandingMethods.time_expanding('DataFrame', 'float', 'count')
+        1.38±0ms      1.66±0.01ms     1.20  rolling.ExpandingMethods.time_expanding('Series', 'float', 'std')
+     1.44±0.01ms      1.72±0.01ms     1.19  rolling.ExpandingMethods.time_expanding('Series', 'int', 'std')
+     1.56±0.01ms         1.84±0ms     1.18  rolling.ExpandingMethods.time_expanding('DataFrame', 'float', 'std')
+     1.62±0.01ms      1.89±0.01ms     1.17  rolling.ExpandingMethods.time_expanding('DataFrame', 'int', 'std')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

mroeschke · 2020-09-25T04:17:58Z

pandas/tests/window/test_rolling.py

@@ -767,7 +767,7 @@ def test_rolling_numerical_too_large_numbers():
    ds[2] = -9e33
    result = ds.rolling(5).mean()
    expected = pd.Series(
-        [np.nan, np.nan, np.nan, np.nan, -1.8e33, -1.8e33, -1.8e33, 0.0, 6.0, 7.0],
+        [np.nan, np.nan, np.nan, np.nan, -1.8e33, -1.8e33, -1.8e33, 5.0, 6.0, 7.0],


It was thought that the 0 was expected due to numerical precision, but using the variable algorithm instead of the fixed algorithm gives the correct value (5.0).

xref #11645 (comment)
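For illustration only, here is one way a spurious 0.0 can appear in a rolling mean over this data. The sketch is plain Python, not the pandas Cython code, and both function names are hypothetical. It compares a running sum that adds the newest value and subtracts the oldest against a sum recomputed from scratch for each window. It is not claimed to reproduce the removed fixed algorithm.

```python
import math


def rolling_mean_running(values, window):
    """Rolling mean from a single running sum (add newest, subtract oldest)."""
    out, total = [], 0.0
    for i, x in enumerate(values):
        total += x
        if i >= window:
            total -= values[i - window]
        out.append(total / window if i >= window - 1 else math.nan)
    return out


def rolling_mean_recomputed(values, window):
    """Rolling mean that sums every window from scratch."""
    out = []
    for i in range(len(values)):
        if i < window - 1:
            out.append(math.nan)
        else:
            out.append(math.fsum(values[i - window + 1 : i + 1]) / window)
    return out


# Same data as the test: 0..9 as floats, with a huge negative value at index 2.
data = [float(i) for i in range(10)]
data[2] = -9e33

running = rolling_mean_running(data, 5)
recomputed = rolling_mean_recomputed(data, 5)

# Index 7 is the first window that no longer contains -9e33.
print(running[7], recomputed[7])  # 0.0 5.0
```

While -9e33 is in the window, the small values are absorbed into it and lost, so subtracting it again leaves 0.0 in the running sum. Recomputing the window [3, 4, 5, 6, 7] gives 5.0.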

jreback · 2020-09-26T01:41:01Z

so all of your benchmarks on quantile are a bit misleading; they are actually measuring the perf diff in min/max (which is what percentile 0 or 1 does)
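A small sketch of why quantile 0 and quantile 1 reduce to min and max. `linear_quantile` is a hypothetical helper using linear interpolation between order statistics; it is not the pandas implementation.

```python
def linear_quantile(window, q):
    """Quantile with linear interpolation between order statistics."""
    ordered = sorted(window)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


window = [3.0, 1.0, 4.0, 1.5, 9.0]

print(linear_quantile(window, 0.0) == min(window))  # True
print(linear_quantile(window, 1.0) == max(window))  # True
print(linear_quantile(window, 0.5))                 # 3.0
```

At q = 0 and q = 1 the position lands exactly on the first or last order statistic, so no interpolation happens and the choice of interpolation method does not change the result.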

jreback · 2020-09-27T19:34:49Z

pandas/_libs/window/aggregations.pyx

@@ -414,7 +356,7 @@ cdef inline float64_t calc_var(int64_t minp, int ddof, float64_t nobs,
            result = 0
        else:
            result = ssqdm_x / (nobs - <float64_t>ddof)
-            if result < 0:
+            if result < 1e-15:


Is this new? OK, but can you add a comment?

Sure. It turns out var/std doesn't use Kahan summation, so using the variable algorithm has a small numerical imprecision.
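For reference, Kahan (compensated) summation can be sketched as follows. This is a generic textbook version in plain Python with hypothetical function names, not the pandas code. It only shows the kind of rounding residue that accumulates without compensation, which is presumably what the `result < 1e-15` guard in the diff is meant to absorb.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point sum."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan summation: carry the rounding error of each addition forward."""
    total = 0.0
    compensation = 0.0  # low-order bits lost by previous additions
    for x in values:
        y = x - compensation
        t = total + y
        compensation = (t - total) - y
        total = t
    return total


# Each 1e-16 is too small to change 1.0 on its own, so the naive sum never moves.
values = [1.0] + [1e-16] * 1000
exact = math.fsum(values)

print(naive_sum(values) == 1.0)  # True
print(abs(kahan_sum(values) - exact) < abs(naive_sum(values) - exact))  # True
```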

jreback · 2020-10-02T20:11:43Z

let's file an issue as a follow-on to see if we can improve rolling min/max with the variable algorithm.

jreback · 2020-10-02T20:11:55Z

any comments @TomAugspurger

jreback

if you can, add a release note, e.g. slight performance decrease in min/max fixed algos.

jreback · 2020-10-02T20:31:28Z

do these changes help or hurt #36132? (maybe worth pulling in and xfailing these tests?) though that can certainly be a follow-on

mroeschke · 2020-10-02T21:07:11Z

Yeah, I was thinking #36132 could be simplified once this PR is merged in. Probably best as a follow-on.

mroeschke · 2020-10-05T04:50:34Z

All green

jreback · 2020-10-05T21:33:22Z

cc @pandas-dev/pandas-core if any comments.

jreback · 2020-10-09T20:04:43Z

can you merge master and ping on green.

mroeschke · 2020-10-09T22:13:54Z

@jreback green

jreback · 2020-10-09T23:16:50Z

thanks @mroeschke very nice

* Fix rolling test result, removed fixed algorithms, some rolling apply with center tests still failing
* Rename function and move offset to correct location
* Impliment center in terms of indexers
* Get all the tests to pass
* Remove center from _apply as no longer needed
* Add better typing
* Deal with numeric precision
* Remove code related to center now being implimented in the fixed indexer
* Add note regarding precision issues
* Note performance hit
* Remove tilde
* Change useage of get_cython_func_type
* Remove self.center from count's _apply

Co-authored-by: Matt Roeschke <[email protected]>
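One possible reading of "center in terms of indexers" is that the indexer itself shifts each window's start and end bounds, so the aggregation needs no separate centering step. The sketch below works under that assumption. `fixed_window_bounds` is a hypothetical plain-Python helper, not the pandas indexer API.

```python
def fixed_window_bounds(num_values, window_size, center=False):
    """Half-open [start, end) bounds for each position of a fixed-size window.

    With center=True the window is shifted forward by (window_size - 1) // 2
    so that it sits around the current position instead of ending at it.
    """
    offset = (window_size - 1) // 2 if center else 0
    starts, ends = [], []
    for i in range(num_values):
        end = i + 1 + offset
        start = end - window_size
        # Clip to the valid index range so edge windows are simply shorter.
        starts.append(min(max(start, 0), num_values))
        ends.append(min(max(end, 0), num_values))
    return starts, ends


print(fixed_window_bounds(5, 3))
# ([0, 0, 0, 1, 2], [1, 2, 3, 4, 5])
print(fixed_window_bounds(5, 3, center=True))
# ([0, 0, 1, 2, 3], [2, 3, 4, 5, 5])
```

With bounds like these, a single variable-window aggregation loop over `values[start:end]` can serve both the trailing and the centered case.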

Matt Roeschke added 4 commits September 20, 2020 10:32

Fix rolling test result, removed fixed algorithms, some rolling apply with center tests still failing

30d609e

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

1c60dd3

Rename function and move offset to correct location

57d8607

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

198ab9e

jbrockmendel reviewed Sep 23, 2020

Matt Roeschke added 3 commits September 24, 2020 00:20

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

4403980

Impliment center in terms of indexers

f057ac6

Get all the tests to pass

f00f16e

mroeschke mentioned this pull request Sep 25, 2020

API: reimplement FixedWindowIndexer.get_window_bounds to fix groupby bug #36132

Closed

5 tasks

Remove center from _apply as no longer needed

3ba5e8a

mroeschke changed the title ~~WIP: REF: Remove rolling window fixed algorithms~~ REF: Remove rolling window fixed algorithms Sep 25, 2020

Add better typing

980a8ef

mroeschke commented Sep 25, 2020

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

95b1129

mroeschke added the Refactor (Internal refactoring of code) and Window (rolling, ewma, expanding) labels Sep 25, 2020

mroeschke added this to the 1.2 milestone Sep 25, 2020

Deal with numeric precision

21443f7

Matt Roeschke added 2 commits September 27, 2020 12:22

Remove code related to center now being implimented in the fixed indexer

b9ef1a0

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

59f86b5

jreback reviewed Sep 27, 2020

Matt Roeschke added 4 commits September 27, 2020 12:44

Add note regarding precision issues

7670551

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

526dd4d

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

a0bf4d3

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

7b8a333

jreback requested changes Oct 2, 2020

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

ff47e59

Matt Roeschke added 2 commits October 2, 2020 13:25

Note performance hit

e6cf772

Remove tilde

10d5177

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

8b0fa34

Matt Roeschke added 3 commits October 2, 2020 14:27

Change useage of get_cython_func_type

4913c84

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

1926408

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

8007605

mroeschke mentioned this pull request Oct 4, 2020

TRACKER: milestones twosigma/pandas#44

Open

32 tasks

Matt Roeschke added 2 commits October 4, 2020 16:10

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

ec3b2a4

Remove self.center from count's _apply

cbce3b0

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

3f22811

Matt Roeschke added 4 commits October 6, 2020 18:38

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

43a8a02

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

c11d6a9

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

d658fce

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

b399543

jreback approved these changes Oct 9, 2020

Merge remote-tracking branch 'upstream/master' into clean/rolling_aggregations

32667ba

jreback merged commit 846cff9 into pandas-dev:master Oct 9, 2020

mroeschke deleted the clean/rolling_aggregations branch October 9, 2020 23:34

mroeschke mentioned this pull request Oct 10, 2020

PERF: Investivate potential performance improvements for rolling min/max #37022

Closed

This was referenced Oct 10, 2020

API: reimplement FixedWindowIndexer.get_window_bounds #37035

Merged

BUG: Rolling min_periods not working on groupby object #36040

Closed

simonjayhawkins mentioned this pull request Jan 1, 2021

REGR: incorrect results with std on rolling window since 1.2.0 #38874

Closed

3 tasks
