-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CI: split Windows Azure tests in half #43468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
how'd this work out? |
For the Windows py38 tests, which are "not slow and not network", we get: Doesn't make sense that skipping more tests in the expression would yield less tests skipped. That test is running now.... The other issue is that it isn't clear to me how many cores are actually assigned to the Azure instances. Using |
38 and 39 might have different deps though |
Just noting from looking at this in #42236, seemed like the opposite was actually true based on looking at a few runs using different numbers of cores. |
Revised hypothesis. The number of cores that we get from Azure is variable. But the issue here is that if you only have 2 cores and use both of them, performance can get worse than if you use just 1 core. Here is some evidence. On my 16 core laptop, I ran
For "auto", it created 8 processes. Note that using all 16 cores doesn't help - that's because these are hyperthreaded cores, and using them usually doesn't help computational performance. Secondly, even going from 1 core to 8 only sped things up by a factor of 3. So if Azure gives us 2 cores, and we use both, that could be worse than just using 1 core. My latest commit is doing that, and also (hopefully) printing out the CPU info. |
oh so this really could be a timeout? also is there a way to request 4 cores (min) from azure for these builds? |
Yes. The last test confirms it. Here is the processor info that was obtained (which was the same for the
In the last test, I had it not do any parallelization, and the
I don't know the answer to that! Who is responsible for the relationship we have with azure? |
umm, don't really have anything specific |
Another possible solution is to split the tests for Azure. Here's an example from Microsoft on how to do that: |
maybe best is just to remove some tests (e.g. have a marker that we dont' use here) and/or move some to a slow build. |
Issue here is that the slowest test is 10 seconds. Our issue is volume (over 124K tests) and on Azure, we're running on a "slow" machine compared to Travis. |
right one way is for example to NOT run any pandas/tests/windows (which @mroeschke disabled for a while) |
If we cant get more cores/build, could we run more azure builds, e.g. for each of the current azure builds split it in two with one ignoring tests/window and the other doing only tests/window? |
Or maybe we can test a separate Github Actions Window runner for a subset of tests (seems to only be a 2 core CPU https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources) |
I'm trying something else here by setting up a |
looking good can u close and reopen to force a run again (just checking on the timeouts) |
Closing and reopening.. |
@Dr-Irv this looks good. i think splitting the macos tests along the same lines makes sense as well (in this PR or followon) |
@simonjayhawkins lmk if you want to backport (either way ok by me) |
@jreback all green We also have the option of speeding up some of the other checks by doing a similar split of the tests. Some of them are taking close to an hour. That would allow people to get quicker feedback on their PRs. If you think that's a good idea, I could do another PR for that. |
yep, we actually have a pretty large azure allocation, so a few more jobs per PR is fine. |
going to backport this (@simonjayhawkins can always not merge it on 1.3 if too many conflicts) |
thanks @Dr-Irv |
@meeseeksdev backport 1.3.x |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
@Dr-Irv if you can push the backport as above (guess we have a conflict) |
Awesome, thanks for figuring this out @Dr-Irv |
Trying an experiment for Windows CI where
PYTEST_WORKERS
isauto
rather than 2.