BENCH: fix noisy asv benchmarks that were running on exhausted generators #26772


Merged: 1 commit into pandas-dev:master on Jun 21, 2019

Conversation


@qwhelan (Contributor) commented Jun 10, 2019

Generators are consumed on first use, yielding abnormally fast benchmark times on every iteration after the first. Fortunately, we can instruct asv to call setup() before every sample by setting number = 1 and choosing repeat appropriately. My local runs suggest the typical number of samples is ~150-250, so an upper limit of 250 appears to be a good fit.
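
To illustrate the failure mode, here is a minimal hypothetical repro (not code from this PR): once the generator is exhausted, re-running the timed statement constructs an empty Series, so all but the first sample measure essentially nothing.

```python
import timeit

import pandas as pd

gen = (str(x) for x in range(100_000))

# The first call consumes the generator and measures the real construction cost.
print(timeit.timeit(lambda: pd.Series(gen), number=1))

# The generator is now exhausted: this call builds an empty Series and reports
# a misleadingly tiny time, which is exactly the noise seen in the asv runs.
print(timeit.timeit(lambda: pd.Series(gen), number=1))
```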

Here is current master with old benchmark version:

[ 75.00%] ··· Running (ctors.SeriesConstructors.time_series_constructor--).
[100.00%] ··· ctors.SeriesConstructors.time_series_constructor                                                                                                   4/40 failed
[100.00%] ··· ===================================== =============== ============= ============== =============
              --                                                        with_index / dtype
              ------------------------------------- ----------------------------------------------------------
                             data_fmt                False / float   False / int   True / float    True / int
              ===================================== =============== ============= ============== =============
                      <function gen_of_str>            130±0.9μs       116±6μs        failed         failed
                     <function gen_of_tuples>           113±3μs        112±20μs       failed         failed
              ===================================== =============== ============= ============== =============

And comparing against the fixed benchmarks, we can see the old ones were falsely reporting a ~40x speedup:

[ 50.00%] ··· ===================================== =============== ============= ============== ============
              --                                                        with_index / dtype
              ------------------------------------- ---------------------------------------------------------
                             data_fmt                False / float   False / int   True / float   True / int
              ===================================== =============== ============= ============== ============
                      <function gen_of_str>            4.61±0.1ms    4.21±0.07ms       n/a           n/a
                     <function gen_of_tuples>          3.39±0.1ms    3.43±0.07ms       n/a           n/a
              ===================================== =============== ============= ============== ============

Additionally, this PR resolves a few failing asv tests that I introduced in the first iteration.

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@WillAyd added the Benchmark Performance (ASV) benchmarks label on Jun 10, 2019
@@ -55,7 +55,14 @@ class SeriesConstructors:
              [False, True],
              ['float', 'int']]

    # Generators get exhausted on use, so run setup before every call
    number = 1
    repeat = (3, 250, 10)

Member:

Why should we care about setting repeat for these? Can't we just use the default max_time?

@qwhelan (Contributor, Author):

The default repeat is (min_n=1, max_n=10, max_time=20.0), which assumes number can be much greater than 1. We're not trying to modify the total number of samples (number * repeat), just how they're collected, and setting repeat is required to do that.
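
For intuition, here is a rough, hypothetical sketch of how an asv-style runner interprets these settings (simplified; run_one_benchmark, bench, and setup are illustrative names, and asv treats a repeat tuple as (min_repeat, max_repeat, max_time)). With number = 1, setup() runs before every timed sample, so each sample starts from a fresh generator:

```python
import time

def run_one_benchmark(bench, setup, number=1, repeat=(3, 250, 10)):
    """Simplified, illustrative sampling loop in the spirit of asv."""
    min_repeat, max_repeat, max_time = repeat
    samples, start = [], time.perf_counter()
    for i in range(max_repeat):
        setup()  # with number=1, a fresh setup() precedes every sample
        t0 = time.perf_counter()
        for _ in range(number):  # number > 1 would reuse an exhausted generator
            bench()
        samples.append((time.perf_counter() - t0) / number)
        # Stop early once min_repeat samples exist and max_time has elapsed.
        if i + 1 >= min_repeat and time.perf_counter() - start > max_time:
            break
    return samples
```

Under this reading, repeat = (3, 250, 10) collects between 3 and 250 samples, capping total sampling time at roughly 10 seconds.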

Member:

I think I'd still prefer if we just used the default max_time here (assuming that to be the bottleneck). Understood it may look weird historically with these, but they were wrong anyway.

Contributor:

IIUC, @qwhelan's approach is preferred since we should get a more stable benchmark. With the default repeat, we'd observe higher variance in the timings.

codecov bot commented Jun 10, 2019

Codecov Report

Merging #26772 into master will decrease coverage by <.01%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master   #26772      +/-   ##
==========================================
- Coverage   91.71%   91.71%   -0.01%     
==========================================
  Files         178      178              
  Lines       50771    50771              
==========================================
- Hits        46567    46563       -4     
- Misses       4204     4208       +4
Flag        Coverage      Δ
#multiple   90.3% <ø>     (ø) ⬆️
#single     41.19% <ø>    (-0.09%) ⬇️

Impacted Files         Coverage      Δ
pandas/io/gbq.py       78.94% <0%>   (-10.53%) ⬇️
pandas/core/frame.py   96.88% <0%>   (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d47fc0c...bd5b433. Read the comment docs.

1 similar comment

@jreback jreback added this to the 0.25.0 milestone Jun 21, 2019
@jreback jreback merged commit 984514e into pandas-dev:master Jun 21, 2019
@jreback (Contributor) commented Jun 21, 2019

thanks @qwhelan

@qwhelan deleted the asv_generators branch on August 14, 2019 07:00
Labels
Benchmark Performance (ASV) benchmarks
4 participants