BENCH: fix noisy asv benchmarks that were running on exhausted generators #26772
@@ -55,7 +55,14 @@ class SeriesConstructors:
              [False, True],
              ['float', 'int']]

    # Generators get exhausted on use, so run setup before every call
    number = 1
    repeat = (3, 250, 10)
Why should we care about setting `repeat` for these? Can't we just use the default `max_time`?
The default `repeat` is `(min_n=1, max_n=10, max_time=20.0)`, which assumes `number` can be much greater than `1`. We're not trying to modify the number of samples (`number * repeat`), just how they're collected, and setting `repeat` is required to do that.
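As a rough illustration of the point above, here is a minimal sketch of how `number` interacts with per-sample setup. The class name `TimeGeneratorConsumer` and the helper `run_samples` are hypothetical; `run_samples` is a simplified stand-in written for this example, not asv's actual runner. It only assumes what the thread states: `setup()` runs once per sample and the benchmark body runs `number` times within that sample.

```python
class TimeGeneratorConsumer:
    """Hypothetical asv-style benchmark; not part of the pandas suite."""

    # One call per sample, so setup() rebuilds the generator every time.
    number = 1
    # Tuple form discussed in the thread: (min samples, max samples, max_time in seconds).
    repeat = (3, 250, 10)

    def setup(self):
        # A fresh generator; it can only be consumed once.
        self.data = (i for i in range(1000))

    def time_consume(self):
        self.result = list(self.data)


def run_samples(bench, samples, number=None):
    """Simplified stand-in for a runner: setup once per sample, then `number` calls."""
    number = bench.number if number is None else number
    sizes = []
    for _ in range(samples):
        bench.setup()
        for _ in range(number):
            bench.time_consume()
            sizes.append(len(bench.result))
    return sizes


# With number = 1 every timed call sees a full generator.
sizes_number_1 = run_samples(TimeGeneratorConsumer(), samples=3)

# With number = 2 the second call in each sample sees an exhausted generator.
sizes_number_2 = run_samples(TimeGeneratorConsumer(), samples=2, number=2)
```

Under these assumptions, only the `number = 1` configuration measures real work on every call; how many samples are collected is then governed by `repeat`.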
I think I'd still prefer if we just used the default `max_time` here (assuming that to be the bottleneck). Understood it may look weird historically with these, but they were wrong anyway.
IIUC, @qwhelan's approach is preferred since we should get more stable benchmarks. With the default `repeat`, we'd observe higher variance in the timings.
Codecov Report
@@            Coverage Diff             @@
##           master   #26772      +/-   ##
==========================================
- Coverage   91.71%   91.71%   -0.01%
==========================================
  Files         178      178
  Lines       50771    50771
==========================================
- Hits        46567    46563       -4
- Misses       4204     4208       +4
thanks @qwhelan
Generators get consumed on first use, yielding abnormally fast benchmark times on the `n>1` iterations. Fortunately we can instruct `asv` to call `setup()` prior to every sample by setting `number = 1` and `repeat` appropriately. My local runs suggest the typical number of samples is `~150-250`, so an upper limit of `250` appears to be a good fit.

Here is current `master` with old benchmark version:

And with the fixed benchmarks, we see the existing ones falsely report a `40x` speedup:

Additionally, this PR resolves a few failing `asv` tests that I introduced in the first iteration.

`git diff upstream/master -u -- "*.py" | flake8 --diff`
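The exhaustion effect described in this PR can be reproduced without `asv` or pandas. The following is a minimal sketch using only the standard library; the helper names `make_gen` and `consume` are made up for illustration. It counts how many items each call actually processes, which is why repeated calls on a shared generator look abnormally fast.

```python
def make_gen(n=100_000):
    """Build a fresh generator; stands in for the data created in setup()."""
    return (i for i in range(n))


def consume(gen):
    """Count the items the generator still yields; stands in for the timed call."""
    return sum(1 for _ in gen)


# One shared generator, called repeatedly: only the first call does real work.
shared = make_gen()
counts_shared = [consume(shared) for _ in range(3)]

# A fresh generator per call, which is what rerunning setup() before
# every sample achieves.
counts_fresh = [consume(make_gen()) for _ in range(3)]

print(counts_shared)
print(counts_fresh)
```

The later calls on the shared generator iterate over nothing, so any timing taken over them measures an empty loop rather than the operation being benchmarked.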