CLN: ASV stat_ops #19049

mroeschke · 2018-01-03T05:41:26Z

stat_ops.py was still benchmarking some old pd.rolling_* functions, so I moved them to rolling.py (or should they just be removed?). Otherwise this is the usual cleanup:

$ asv dev -b ^stat_ops
· Discovering benchmarks
· Running 7 total benchmarks (1 commits * 1 environments * 7 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[ 14.29%] ··· Running stat_ops.Correlation.time_corr                         ok
[ 14.29%] ···· 
               ========== ========
                 method           
               ---------- --------
                spearman   118ms  
                kendall    693ms  
                pearson    5.80ms 
               ========== ========

[ 28.57%] ··· Running stat_ops.FrameMultiIndexOps.time_op                    ok
[ 28.57%] ···· 
               ======== ======== ======== ========
               --                   op            
               -------- --------------------------
                level     mean     sum     median 
               ======== ======== ======== ========
                  0      8.40ms   8.51ms   21.8ms 
                  1      8.57ms   8.52ms   22.6ms 
                [0, 1]   17.3ms   17.0ms   31.9ms 
               ======== ======== ======== ========

[ 42.86%] ··· Running stat_ops.FrameOps.time_op                              ok
[ 42.86%] ···· 
               ======== ================ ======= ======== ========
               --                                       axis      
               --------------------------------- -----------------
                  op     use_bottleneck   dtype     0        1    
               ======== ================ ======= ======== ========
                 mean         True        float   1.16ms   2.07ms 
                 mean         True         int    1.27ms   2.10ms 
                 mean        False        float   11.8ms   12.2ms 
                 mean        False         int    10.2ms   11.0ms 
                 sum          True        float   11.6ms   11.6ms 
                 sum          True         int    7.41ms   8.42ms 
                 sum         False        float   11.7ms   11.6ms 
                 sum         False         int    7.41ms   8.19ms 
                median        True        float   6.86ms   6.05ms 
                median        True         int    4.53ms   5.67ms 
                median       False        float   23.4ms   7.45s  
                median       False         int    24.9ms   7.44s  
                 std          True        float   1.95ms   4.51ms 
                 std          True         int    3.42ms   6.06ms 
                 std         False        float   23.2ms   26.6ms 
                 std         False         int    24.5ms   25.8ms 
               ======== ================ ======= ======== ========

[ 57.14%] ··· Running stat_ops.Rank.time_average_old                         ok
[ 57.14%] ···· 
               ============= ======= =======
               --                  pct      
               ------------- ---------------
                constructor    True   False 
               ============= ======= =======
                 DataFrame    435ms   432ms 
                   Series     432ms   435ms 
               ============= ======= =======

[ 71.43%] ··· Running stat_ops.Rank.time_rank                                ok
[ 71.43%] ···· 
               ============= ======== ========
               --                   pct       
               ------------- -----------------
                constructor    True    False  
               ============= ======== ========
                 DataFrame    18.6ms   18.4ms 
                   Series     18.9ms   18.2ms 
               ============= ======== ========

[ 85.71%] ··· Running stat_ops.SeriesMultiIndexOps.time_op                   ok
[ 85.71%] ···· 
               ======== ======== ======== ========
               --                   op            
               -------- --------------------------
                level     mean     sum     median 
               ======== ======== ======== ========
                  0      21.2ms   20.3ms   23.6ms 
                  1      21.0ms   21.0ms   25.0ms 
                [0, 1]   15.2ms   15.6ms   18.9ms 
               ======== ======== ======== ========

[100.00%] ··· Running stat_ops.SeriesOps.time_op                             ok
[100.00%] ···· 
               ======== ================ ======== ========
               --                              dtype      
               ------------------------- -----------------
                  op     use_bottleneck   float     int   
               ======== ================ ======== ========
                 mean         True        421μs    388μs  
                 mean        False        2.01ms   2.20ms 
                 sum          True        2.02ms   2.09ms 
                 sum         False        2.01ms   2.04ms 
                median        True        2.23ms   1.30ms 
                median       False        6.25ms   6.73ms 
                 std          True        603μs    951μs  
                 std         False        3.26ms   3.67ms 
               ======== ================ ======== ========

$ asv dev -b ^rolling.DepreciatedRolling
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[100.00%] ··· Running rolling.DepreciatedRolling.time_method                 ok
[100.00%] ···· 
               ================ ========
                    method              
               ---------------- --------
                rolling_median   88.8ms 
                 rolling_mean    11.0ms 
                 rolling_min     12.4ms 
                 rolling_max     12.1ms 
                 rolling_var     12.9ms 
                 rolling_skew    16.2ms 
                 rolling_kurt    16.0ms 
                 rolling_std     14.2ms 
               ================ ========

[100.00%] ····· 
                
                For parameters: 'rolling_median'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/rolling.py:56: FutureWarning: pd.rolling_median is deprecated for ndarrays and will be removed in a future version
                  getattr(pd, method)(self.arr, self.win)
                
                For parameters: 'rolling_mean'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/rolling.py:56: FutureWarning: pd.rolling_mean is deprecated for ndarrays and will be removed in a future version
                  getattr(pd, method)(self.arr, self.win)
                
                For parameters: 'rolling_min'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/rolling.py:56: FutureWarning: pd.rolling_min is deprecated for ndarrays and will be removed in a future version
                  getattr(pd, method)(self.arr, self.win)
                
                For parameters: 'rolling_max'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/rolling.py:56: FutureWarning: pd.rolling_max is deprecated for ndarrays and will be removed in a future version
                  getattr(pd, method)(self.arr, self.win)
                
                For parameters: 'rolling_var'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/rolling.py:56: FutureWarning: pd.rolling_var is deprecated for ndarrays and will be removed in a future version
                  getattr(pd, method)(self.arr, self.win)
                
                For parameters: 'rolling_skew'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/rolling.py:56: FutureWarning: pd.rolling_skew is deprecated for ndarrays and will be removed in a future version
                  getattr(pd, method)(self.arr, self.win)
                
                For parameters: 'rolling_kurt'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/rolling.py:56: FutureWarning: pd.rolling_kurt is deprecated for ndarrays and will be removed in a future version
                  getattr(pd, method)(self.arr, self.win)
                
                For parameters: 'rolling_std'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/rolling.py:56: FutureWarning: pd.rolling_std is deprecated for ndarrays and will be removed in a future version
                  getattr(pd, method)(self.arr, self.win)

pep8speaks · 2018-01-03T05:41:29Z

Hello @mroeschke! Thanks for updating the PR.

In the file asv_bench/benchmarks/stat_ops.py, following are the PEP8 issues :

Line 21:9: E722 do not use bare except
Line 59:9: E722 do not use bare except

Comment last updated on January 06, 2018 at 05:37 Hours UTC
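
A minimal sketch of how an E722 warning like the two above is usually resolved: name the exceptions the fallback is meant to handle instead of catching everything. The snippet assumes the flagged lines guard a bottleneck toggle; the helper name `set_use_bottleneck`, the caught exception types, and the private `nanops._USE_BOTTLENECK` fallback are illustrative assumptions, not the contents of stat_ops.py.

```python
import pandas as pd


def set_use_bottleneck(flag):
    """Hypothetical helper: toggle bottleneck without a bare ``except:``."""
    try:
        # Public option available in current pandas releases.
        pd.set_option("compute.use_bottleneck", flag)
    except (AttributeError, KeyError):
        # Assumption: a pandas version without this option raises an error
        # derived from AttributeError/KeyError, so fall back to the private flag.
        from pandas.core import nanops
        nanops._USE_BOTTLENECK = flag


set_use_bottleneck(False)
print(pd.get_option("compute.use_bottleneck"))
```

Naming the exceptions keeps unrelated failures (a typo, a KeyboardInterrupt) from being silently swallowed by the fallback branch.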

codecov · 2018-01-03T07:47:46Z

Codecov Report

Merging #19049 into master will increase coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #19049      +/-   ##
==========================================
+ Coverage   91.51%   91.53%   +0.01%     
==========================================
  Files         148      148              
  Lines       48807    48688     -119     
==========================================
- Hits        44667    44566     -101     
+ Misses       4140     4122      -18

Flag	Coverage Δ
#multiple	`89.9% <ø> (+0.01%)`	⬆️
#single	`41.63% <ø> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/indexes/interval.py	`92.19% <0%> (-0.43%)`	⬇️
pandas/core/indexes/timedeltas.py	`90.92% <0%> (-0.17%)`	⬇️
pandas/util/testing.py	`84.41% <0%> (-0.04%)`	⬇️
pandas/core/ops.py	`91.89% <0%> (-0.02%)`	⬇️
pandas/core/panel.py	`96.83% <0%> (-0.01%)`	⬇️
pandas/core/frame.py	`97.62% <0%> (-0.01%)`	⬇️
pandas/core/strings.py	`98.46% <0%> (-0.01%)`	⬇️
pandas/tseries/offsets.py	`96.97% <0%> (ø)`	⬆️
pandas/core/groupby.py	`92.14% <0%> (ø)`	⬆️
pandas/core/generic.py	`95.9% <0%> (ø)`	⬆️
... and 4 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3198b9d...d2593a3.

jreback · 2018-01-03T11:10:40Z

asv_bench/benchmarks/rolling.py

+class DepreciatedRolling(object):
+
+    sample_time = 0.2
+    params = ['rolling_median', 'rolling_mean', 'rolling_min', 'rolling_max',


I would probably remove these; we already go back pretty far with the newer stuff.
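
For comparison, a sketch of what benchmarking the same eight statistics through the newer `.rolling()` API could look like. The class name `RollingMethods`, the array size, and the window length are assumptions for illustration; only the method names come from the output in the PR description.

```python
import numpy as np
import pandas as pd


class RollingMethods:
    # Same statistics as the pd.rolling_* benchmark, minus the prefix.
    params = ['median', 'mean', 'min', 'max', 'var', 'skew', 'kurt', 'std']
    param_names = ['method']

    def setup(self, method):
        # Assumed sizes; the thread does not show the original values.
        self.roll = pd.Series(np.random.randn(10_000)).rolling(10)

    def time_method(self, method):
        # Look the statistic up on the Rolling object instead of on `pd`.
        getattr(self.roll, method)()
```

Building the `Rolling` object in `setup` keeps window construction out of the timed call, so only the statistic itself is measured.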

jreback · 2018-01-03T11:11:10Z

asv_bench/benchmarks/stat_ops.py

    param_names = ['op', 'use_bottleneck', 'dtype', 'axis']
-    params = [['mean', 'sum', 'median'],
+    params = [['mean', 'sum', 'median', 'std'],


you could make use_bottleneck a param here

mroeschke · 2018-01-04

use_bottleneck is already a param here, over [True, False]
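
For reference, ASV runs a parametrized benchmark once per element of the Cartesian product of `params`. The sketch below expands the grid from the diff with plain `itertools`; the `dtype`, `axis`, and `use_bottleneck` values are read off the FrameOps table in the PR description, and the variable `combos` is only an illustration.

```python
from itertools import product

param_names = ['op', 'use_bottleneck', 'dtype', 'axis']
params = [['mean', 'sum', 'median', 'std'],
          [True, False],
          ['float', 'int'],
          [0, 1]]

# One dict per benchmark invocation, keyed by parameter name.
combos = [dict(zip(param_names, values)) for values in product(*params)]

print(len(combos))   # 4 * 2 * 2 * 2 = 32 timings
print(combos[0])
```

The 32 combinations match the FrameOps table above: 16 rows times the two axis columns.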

jreback

see comments

mroeschke · 2018-01-04T04:20:45Z

Removed the older rolling benchmarks.

jreback · 2018-01-04T15:30:50Z

asv_bench/benchmarks/stat_ops.py

-
-class stats_rank2d_axis1_average(object):
-    goal_time = 0.2
+    def setup(self, op, use_bottleneck, dtype):


you could put the bottleneck tests in a separate function / base class I think
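
One way to read this suggestion, sketched under assumptions: a small base class owns the bottleneck toggle and the op benchmarks inherit it. `BottleneckToggle`, the Series length, and the use of `pd.set_option` are hypothetical choices; the `setup(self, op, use_bottleneck, dtype)` signature is taken from the diff.

```python
import numpy as np
import pandas as pd


class BottleneckToggle:
    """Hypothetical base class: one place that switches bottleneck on or off."""

    def set_bottleneck(self, use_bottleneck):
        pd.set_option("compute.use_bottleneck", use_bottleneck)


class SeriesOps(BottleneckToggle):
    param_names = ['op', 'use_bottleneck', 'dtype']
    params = [['mean', 'sum', 'median', 'std'],
              [True, False],
              ['float', 'int']]

    def setup(self, op, use_bottleneck, dtype):
        self.set_bottleneck(use_bottleneck)
        s = pd.Series(np.random.randn(100_000)).astype(dtype)  # assumed size
        self.s_func = getattr(s, op)

    def time_op(self, op, use_bottleneck, dtype):
        self.s_func()
```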

mroeschke · 2018-01-05T06:06:49Z

Created a new class for the bottleneck benchmarks:

[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[ 12.50%] ··· Running stat_ops.Bottleneck.time_mean                          ok
[ 12.50%] ···· 
               ================ =========== ========
               --                   constructor     
               ---------------- --------------------
                use_bottleneck   DataFrame   Series 
               ================ =========== ========
                     True          2.43ms    2.37ms 
                    False          15.7ms    15.4ms 
               ================ =========== ========

[ 25.00%] ··· Running stat_ops.Correlation.time_corr                         ok
[ 25.00%] ···· 
               ========== ========
                 method           
               ---------- --------
                spearman   119ms  
                kendall    648ms  
                pearson    5.80ms 
               ========== ========

[ 37.50%] ··· Running stat_ops.FrameMultiIndexOps.time_op                    ok
[ 37.50%] ···· 
               ======== ======== ======== ========
               --                   op            
               -------- --------------------------
                level     mean     sum     median 
               ======== ======== ======== ========
                  0      8.42ms   8.51ms   22.2ms 
                  1      8.55ms   8.53ms   22.6ms 
                [0, 1]   17.2ms   17.1ms   31.9ms 
               ======== ======== ======== ========

[ 50.00%] ··· Running stat_ops.FrameOps.time_op                              ok
[ 50.00%] ···· 
               ======== =========== =========== ========= =========
               --                       dtype / axis               
               -------- -------------------------------------------
                  op     float / 0   float / 1   int / 0   int / 1 
               ======== =========== =========== ========= =========
                 mean      1.20ms      2.07ms     1.27ms    2.08ms 
                 sum       11.4ms      11.6ms     7.46ms    8.27ms 
                median     6.92ms      6.10ms     4.41ms    5.58ms 
                 std       1.95ms      4.49ms     3.42ms    6.10ms 
               ======== =========== =========== ========= =========

[ 62.50%] ··· Running stat_ops.Rank.time_average_old                         ok
[ 62.50%] ···· 
               ============= ======= =======
               --                  pct      
               ------------- ---------------
                constructor    True   False 
               ============= ======= =======
                 DataFrame    435ms   436ms 
                   Series     434ms   433ms 
               ============= ======= =======

[ 75.00%] ··· Running stat_ops.Rank.time_rank                                ok
[ 75.00%] ···· 
               ============= ======== ========
               --                   pct       
               ------------- -----------------
                constructor    True    False  
               ============= ======== ========
                 DataFrame    18.9ms   18.5ms 
                   Series     19.2ms   18.7ms 
               ============= ======== ========

[ 87.50%] ··· Running stat_ops.SeriesMultiIndexOps.time_op                   ok
[ 87.50%] ···· 
               ======== ======== ======== ========
               --                   op            
               -------- --------------------------
                level     mean     sum     median 
               ======== ======== ======== ========
                  0      20.7ms   20.7ms   24.9ms 
                  1      21.7ms   20.8ms   24.9ms 
                [0, 1]   15.5ms   15.5ms   19.1ms 
               ======== ======== ======== ========

[100.00%] ··· Running stat_ops.SeriesOps.time_op                             ok
[100.00%] ···· 
               ======== ======== ========
               --             dtype      
               -------- -----------------
                  op     float     int   
               ======== ======== ========
                 mean    415μs    432μs  
                 sum     2.03ms   2.11ms 
                median   2.19ms   1.33ms 
                 std     616μs    980μs  
               ======== ======== ========

jreback · 2018-01-05T14:04:45Z

asv_bench/benchmarks/stat_ops.py

    goal_time = 0.2
+    param_names = ['op', 'dtype']


what I meant about bottleneck was make it a parameter in ops, e.g. here

jreback · 2018-01-05T14:05:13Z

asv_bench/benchmarks/stat_ops.py

    goal_time = 0.2
+    param_names = ['op', 'dtype', 'axis']
+    params = [['mean', 'sum', 'median', 'std'],


and use bottleneck here, also expand these to all of the stat ops (min, max, var, kurt, etc)

mroeschke · 2018-01-06T05:40:04Z

Defined a list of ops at the top of the file that each benchmark params over. Also, FrameOps and SeriesOps now params over use_bottleneck in [True, False]
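
A rough sketch of the layout described here, assuming a module-level `ops` list shared by the benchmark classes. The exact list, the frame and Series sizes, the parameter order, and `pd.set_option` as the toggle are assumptions; the extra ops follow the reviewer's examples (min, max, var, kurt).

```python
import numpy as np
import pandas as pd

# Assumed list; the PR's actual contents are not shown in this thread.
ops = ['mean', 'sum', 'median', 'std', 'min', 'max', 'var', 'kurt']
N = 10_000  # assumed number of rows


class FrameOps:
    goal_time = 0.2
    param_names = ['op', 'dtype', 'axis', 'use_bottleneck']
    params = [ops, ['float', 'int'], [0, 1], [True, False]]

    def setup(self, op, dtype, axis, use_bottleneck):
        pd.set_option("compute.use_bottleneck", use_bottleneck)
        df = pd.DataFrame(np.random.randn(N, 4)).astype(dtype)
        self.df_func = getattr(df, op)

    def time_op(self, op, dtype, axis, use_bottleneck):
        self.df_func(axis=axis)


class SeriesOps:
    goal_time = 0.2
    param_names = ['op', 'dtype', 'use_bottleneck']
    params = [ops, ['float', 'int'], [True, False]]

    def setup(self, op, dtype, use_bottleneck):
        pd.set_option("compute.use_bottleneck", use_bottleneck)
        s = pd.Series(np.random.randn(N)).astype(dtype)
        self.s_func = getattr(s, op)

    def time_op(self, op, dtype, use_bottleneck):
        self.s_func()
```

Sharing one `ops` list means a newly added statistic is picked up by every class that parametrizes over it.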

jreback reviewed Jan 3, 2018

jreback requested changes Jan 3, 2018

jreback added the Benchmark (Performance (ASV) benchmarks) and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Jan 3, 2018

mroeschke added 2 commits January 3, 2018 20:03

CLN: ASV stat_ops

c10c00e

Remove old rolling method benchmarks

5fd9420

mroeschke force-pushed the asv_clean_stats_ops branch from c9ec227 to 5fd9420 Compare January 4, 2018 04:17

jreback requested changes Jan 4, 2018

create bottleneck class

a44d059

jreback added this to the 0.23.0 milestone Jan 5, 2018

jreback approved these changes Jan 5, 2018

jreback requested changes Jan 5, 2018

jreback removed this from the 0.23.0 milestone Jan 5, 2018

Add bottleneck params and more ops

d2593a3

jreback approved these changes Jan 6, 2018

jreback added this to the 0.23.0 milestone Jan 6, 2018

jreback merged commit d539bdd into pandas-dev:master Jan 6, 2018

mroeschke deleted the asv_clean_stats_ops branch January 7, 2018 01:22
