ASV Benchmark - Time Standards #29165

Open
WillAyd opened this issue Oct 22, 2019 · 6 comments
Labels
Benchmark Performance (ASV) benchmarks Performance Memory or execution speed performance

Comments

@WillAyd
Member

WillAyd commented Oct 22, 2019

TLDR - I think we need to cap our benchmarks at a maximum of 0.2 seconds. That's a long way off though, so I think we should start with a cap of 1 second per benchmark.

Right now we have some very long running benchmarks:

https://pandas.pydata.org/speed/pandas/#summarylist?sort=1&dir=desc

I haven't seen a definitive answer, but I think ASV leverages the builtin timeit functionality to figure out how long a given benchmark should run.

https://docs.python.org/3.7/library/timeit.html#command-line-interface

Quoting what I think is important:

If -n is not given, a suitable number of loops is calculated by trying successive powers of 10 until the total time is at least 0.2 seconds.

So IIUC a particular statement is executed n times (where n is a power of 10) until the total run time reaches at least 0.2 seconds, and that batch is then repeated `repeat` times to get a reading. asv continuous would do this 4 times (2 runs for each commit being compared). In Python 3.6 repeat is 3 (we currently pin ASVs to 3.6) but in later versions that gets bumped to 5.
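To make the quoted timeit behavior concrete, here is a minimal sketch of the "successive powers of 10" loop selection. It only illustrates the rule as described in the quoted docs; it is not asv's implementation, and `pick_number` and its `target` parameter are hypothetical names introduced for this example.

```python
import time
import timeit


def pick_number(stmt, target=0.2):
    """Try number = 1, 10, 100, ... until one batch takes at least `target` seconds.

    Hypothetical helper mirroring the rule quoted from the timeit docs.
    """
    timer = timeit.Timer(stmt)
    number = 1
    while True:
        elapsed = timer.timeit(number)  # total time for `number` executions
        if elapsed >= target:
            return number, elapsed
        number *= 10


# A fast statement needs many loops to reach the target
# (a small target keeps this example quick).
fast_number, fast_elapsed = pick_number(lambda: sum(range(1000)), target=0.05)
print(fast_number, fast_elapsed)

# A statement that already exceeds the target on its own stops at number = 1.
slow_number, slow_elapsed = pick_number(lambda: time.sleep(0.06), target=0.05)
print(slow_number, slow_elapsed)
```

The second call is the case that matters for this issue: once a single execution is slower than the target, the loop count cannot go below 1, so the cost of the benchmark is driven entirely by how many times it is repeated.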

We have a handful of benchmarks that take 20s apiece to run, so if we stick to the 3.6 timing these statements would run n=1 times, repeated 3 times per benchmark session, with 4 sessions per continuous run. 20s * 3 repeats * 4 sessions = 240s = 4 minutes for one benchmark alone.
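The same arithmetic as a small sketch, using the numbers assumed in this comment (n=1, 3 repeats, 4 sessions). The function name `continuous_cost_seconds` is made up for illustration, and the result is only a rough estimate under those assumptions.

```python
def continuous_cost_seconds(per_run_s, number=1, repeat=3, sessions=4):
    """Rough wall time for one benchmark across an `asv continuous` run.

    Defaults follow the assumptions in the comment above, not measured asv behavior.
    """
    return per_run_s * number * repeat * sessions


# A 20 second benchmark: 20 * 1 * 3 * 4 = 240 seconds.
print(continuous_cost_seconds(20) / 60)  # 4.0 minutes

# The same benchmark capped at 1 second: 1 * 1 * 3 * 4 = 12 seconds.
print(continuous_cost_seconds(1))  # 12
```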

rolling.Apply.time_rolling is a serious offender here, so I think we can start with that. Would take community PRs to improve performance of any of these, though maybe we should prioritize anything currently taking over 1 second.

cc @qwhelan and @pv who may have additional insights

@WillAyd WillAyd added Performance Memory or execution speed performance Benchmark Performance (ASV) benchmarks labels Oct 22, 2019
@WillAyd WillAyd changed the title ASV Benchmark - Time Standard ASV Benchmark - Time Standards Oct 22, 2019
@pv
Contributor

pv commented Oct 22, 2019 via email

@WillAyd
Member Author

WillAyd commented Oct 22, 2019

Thanks for the link - reading through it definitely gives more guidance.

So if we track something that itself takes more than 10 milliseconds to run, do you know the number of times it is run within a sample? The documentation mentions that asv selects a number by approximating how many runs it will take to reach the sample_time, but it's not clear what happens if one run exceeds sample_time altogether.

Alternatively, do you have thoughts here on general best practices? Right now our benchmarks are pretty slow (e.g. running the groupby module alone takes over an hour).

@pv
Contributor

pv commented Oct 23, 2019

If it takes longer than sample_time, number = 1. You probably want to adjust repeat, as the default (2, 10, 20.0) runs until 10 samples are collected or 20 seconds have elapsed --- you can e.g. make the max time shorter.
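For reference, a minimal sketch of what shortening the max time could look like on a benchmark class. `repeat` and `sample_time` are the asv timing attributes discussed above; the class, the helper `rolling_sums`, and the specific values `(1, 3, 5.0)` are hypothetical choices for illustration, not a recommendation from this thread.

```python
def rolling_sums(values, window):
    """Plain-Python rolling sum, standing in for a real pandas workload."""
    total = sum(values[:window])
    out = [total]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        out.append(total)
    return out


class RollingSumSuite:
    # (min_repeat, max_repeat, max_time): stop after 3 samples or 5 seconds,
    # instead of the longer default mentioned above. Values are assumptions.
    repeat = (1, 3, 5.0)
    # Target duration of one sample in seconds; if a single call is slower
    # than this, number is 1.
    sample_time = 0.01

    def setup(self):
        # Keep the input small so a single call stays well under a second.
        self.values = list(range(10_000))

    def time_rolling_sum(self):
        rolling_sums(self.values, window=100)
```

Lowering the last element of `repeat` bounds how long a slow benchmark can keep collecting samples, while shrinking the data built in `setup` reduces the cost of each individual call.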

@qwhelan
Contributor

qwhelan commented Oct 26, 2019

@WillAyd It appears there's a few issues:

I'll submit a PR shortly that pares down the test size so each iteration runs in under a second.

@TomAugspurger
Contributor

TomAugspurger commented Oct 28, 2019 via email

@WillAyd
Member Author

WillAyd commented Oct 28, 2019

@TomAugspurger lmk if you need help with that; might not be a bad idea to refresh knowledge on that env
