CI: split Windows Azure tests in half #43468

Dr-Irv · 2021-09-09T03:19:52Z

Trying an experiment for Windows CI where PYTEST_WORKERS is auto rather than 2.

jreback · 2021-09-09T11:57:28Z

how'd this work out?

Dr-Irv · 2021-09-09T12:24:30Z

> how'd this work out?

auto doesn't solve the problem. Current hypothesis is that the expression for skipping tests is wrong. Here's the evidence:

For the Windows py38 tests, which are "not slow and not network", we get: 122956 passed, 6965 skipped
For the Windows py39 tests, which are "not slow and not network and not high_memory", we get: 124788 passed, 4943 skipped

It doesn't make sense that excluding more tests in the expression would yield fewer tests skipped.

That test is running now....
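To make the reasoning behind that hypothesis concrete, here is a minimal sketch that models a `not a and not b` marker expression as a set test. The test names and marker assignments are hypothetical, and this is a toy model rather than pytest's real expression evaluator; it only shows that adding one more `and not ...` clause can never increase the set of tests that run.

```python
# Toy model of "-m 'not slow and not network ...'" selection.
# Test names and their markers below are made up for illustration.
TESTS = {
    "test_fast": set(),
    "test_slow": {"slow"},
    "test_net": {"network"},
    "test_big": {"high_memory"},
    "test_slow_big": {"slow", "high_memory"},
}


def selected(markers, excluded):
    """A test survives 'not a and not b ...' if it carries none of the excluded markers."""
    return markers.isdisjoint(excluded)


def run(excluded):
    """Return the sorted names of the tests that would run under the given exclusions."""
    return sorted(name for name, marks in TESTS.items() if selected(marks, excluded))


py38 = run({"slow", "network"})
py39 = run({"slow", "network", "high_memory"})

# The stricter expression selects a subset of what the looser one selects.
print(py38, py39, set(py39) <= set(py38))
```

Under this model the py39 expression should leave at least as many tests unrun as the py38 one, which is the opposite of the counts above. As noted later in the thread, differing dependencies between the two builds could also explain part of the gap.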

The other issue is that it isn't clear to me how many cores are actually assigned to the Azure instances. Using auto means use as many cores as are available. In the first set of tests under this PR, py38 and py39 both used two cores. In the latest push, py38 is using 4 cores and py39 is using 2. The more cores we can get, the faster these tests will run.
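One way to see what a given agent offers is to print the CPU counts from Python in a CI step. This is a sketch, not what the PR does (a later commit uses `wmic`). It assumes `psutil` may or may not be installed; the function name is hypothetical. As far as I know, pytest-xdist derives `-n auto` from a detected CPU count, but treat that as an assumption and check the xdist documentation for the exact rule.

```python
import os


def describe_cpus():
    """Report logical and (if psutil is available) physical CPU counts."""
    logical = os.cpu_count()  # may be None if it cannot be determined
    try:
        import psutil  # optional dependency, assumed for this sketch

        physical = psutil.cpu_count(logical=False)
    except ImportError:
        physical = None
    return {"logical": logical, "physical": physical}


if __name__ == "__main__":
    print(describe_cpus())
```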

jreback · 2021-09-09T13:04:02Z

38 and 39 might have different deps though

mzeitlin11 · 2021-09-09T13:05:04Z

> In the latest push, py38 is using 4 cores and py39 is using 2. The more cores we can get, the faster these tests will run.

Just noting from looking at this in #42236, seemed like the opposite was actually true based on looking at a few runs using different numbers of cores.

Dr-Irv · 2021-09-09T14:51:50Z

Revised hypothesis. The number of cores that we get from Azure is variable. But the issue here is that if you only have 2 cores and use both of them, performance can get worse than if you use just 1 core. Here is some evidence. On my 16 core laptop, I ran pytest --skip-slow -m "not single" -n XX pandas/tests/groupby for XX=1,2,4,8,16 and auto. Here are the timing results:

| XX (`-n`) | Time (secs) |
|-----------|-------------|
| 16        | 32.09       |
| 8         | 30.16       |
| 4         | 37.56       |
| 2         | 55.04       |
| 1         | 90.53       |
| auto      | 28.87       |

For "auto", it created 8 processes.

Note that using all 16 cores doesn't help - that's because these are hyperthreaded cores, and using them usually doesn't help computational performance. Secondly, even going from 1 core to 8 only sped things up by a factor of 3.
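The speedup and per-worker efficiency implied by those laptop timings can be computed directly. A minimal sketch using only the numbers reported above (the `auto` row is left out because its worker count is reported separately):

```python
# Wall-clock seconds per worker count, copied from the timings above.
TIMINGS = {1: 90.53, 2: 55.04, 4: 37.56, 8: 30.16, 16: 32.09}
BASELINE = TIMINGS[1]


def speedup(workers):
    """How many times faster than the single-worker run."""
    return BASELINE / TIMINGS[workers]


def efficiency(workers):
    """Speedup divided by worker count; 1.0 would be perfect scaling."""
    return speedup(workers) / workers


for n in sorted(TIMINGS):
    print(f"-n {n:>2}: speedup {speedup(n):.2f}x, efficiency {efficiency(n):.0%}")
```

These figures describe one laptop and one test directory, so they say nothing direct about a 2-core Azure agent; they only show that scaling is far from linear.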

So if Azure gives us 2 cores, and we use both, that could be worse than just using 1 core. My latest commit is doing that, and also (hopefully) printing out the CPU info.

jreback · 2021-09-09T15:03:05Z

oh so this really could be a timeout?

also is there a way to request 4 cores (min) from azure for these builds?

Dr-Irv · 2021-09-09T15:55:21Z

> oh so this really could be a timeout?

Yes. The last test confirms it. Here is the processor info that was obtained (which was the same for the py38 and py39 runs):

| Caption | DeviceID | MaxClockSpeed | Name | NumberOfCores |
|---------|----------|---------------|------|---------------|
| Intel64 Family 6 Model 85 Stepping 7 | CPU0 | 2594 | Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz | 2 |

In the last test, I had it not do any parallelization, and the py38 run timed out, the py39 run did not. So I think this has to do with where the virtual container is sitting on the hardware. Most of the time before, I'd see py38 succeed and py39 time out.

> also is there a way to request 4 cores (min) from azure for these builds?

I don't know the answer to that! Who is responsible for the relationship we have with azure?

jreback · 2021-09-09T16:18:45Z

> I don't know the answer to that! Who is responsible for the relationship we have with azure?

umm, don't really have anything specific

Dr-Irv · 2021-09-09T17:57:34Z

Another possible solution is to split the tests for Azure. Here's an example from Microsoft on how to do that:
https://github.com/PBoraMSFT/ParallelTestingSample-Python
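The general idea of slicing a test list across parallel jobs can be sketched as follows. This is not copied from the linked sample; the function name and the 1-based position convention are assumptions for illustration, and in a real pipeline the position and total would come from whatever variables the CI system exposes.

```python
def slice_for_job(items, position, total):
    """Return the items assigned to job `position` (1-based) out of `total` jobs.

    Items are dealt round-robin, so every item lands in exactly one slice.
    """
    if total < 1 or not 1 <= position <= total:
        raise ValueError("position must satisfy 1 <= position <= total")
    return items[position - 1 :: total]


# Hypothetical list of test targets, split across two jobs.
targets = ["a", "b", "c", "d", "e"]
first = slice_for_job(targets, 1, 2)
second = slice_for_job(targets, 2, 2)
print(first, second)
```

Round-robin dealing keeps the slices close in size, though not necessarily in run time, since individual targets can differ a lot in how long they take.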

jreback · 2021-09-09T18:59:34Z

maybe best is just to remove some tests (e.g. have a marker that we don't use here) and/or move some to a slow build.

Dr-Irv · 2021-09-09T21:54:01Z

> maybe best is just to remove some tests (e.g. have a marker that we don't use here) and/or move some to a slow build.

Issue here is that the slowest test is 10 seconds. Our issue is volume (over 124K tests) and on Azure, we're running on a "slow" machine compared to Travis.

jreback · 2021-09-09T21:55:26Z

> > maybe best is just to remove some tests (e.g. have a marker that we don't use here) and/or move some to a slow build.
>
> Issue here is that the slowest test is 10 seconds. Our issue is volume (over 124K tests) and on Azure, we're running on a "slow" machine compared to Travis.

right, one way is for example to NOT run any pandas/tests/window (which @mroeschke disabled for a while)
not great, but we do really test these

jbrockmendel · 2021-09-09T22:19:44Z

If we can't get more cores per build, could we run more azure builds, e.g. split each of the current azure builds in two, with one ignoring tests/window and the other running only tests/window?

mroeschke · 2021-09-10T01:41:06Z

Or maybe we can test a separate GitHub Actions Windows runner for a subset of tests (though it seems to have only a 2-core CPU: https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources)

Dr-Irv · 2021-09-10T02:22:20Z

I'm trying something else here by setting up a PYTEST_TARGET in the azure pipelines and then for Windows having separate jobs where I separate the tests into "pandas/tests/[a-i]*" and "pandas/tests/[j-z]*"
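A quick way to sanity-check that two glob patterns like these cover every test directory exactly once is to partition the names with `fnmatch`. A minimal sketch: the directory names below are a hypothetical sample rather than a real listing of `pandas/tests`, and how the shell or pytest expands the pattern in CI may differ from `fnmatchcase`.

```python
from fnmatch import fnmatchcase

PATTERNS = ["[a-i]*", "[j-z]*"]

# Hypothetical sample of entries under a tests directory.
NAMES = ["arrays", "frame", "groupby", "indexes", "io",
         "reshape", "series", "window", "__init__.py"]


def partition(names, patterns=PATTERNS):
    """Assign each name to the single pattern it matches; collect the rest."""
    buckets = {pattern: [] for pattern in patterns}
    unmatched = []
    for name in names:
        hits = [p for p in patterns if fnmatchcase(name, p)]
        if len(hits) == 1:
            buckets[hits[0]].append(name)
        else:
            # Matched no pattern or more than one: needs a closer look.
            unmatched.append(name)
    return buckets, unmatched


buckets, unmatched = partition(NAMES)
print(buckets)
print("not covered by exactly one pattern:", unmatched)
```

Names that start with something other than a lowercase letter fall through both patterns, so it is worth checking that nothing containing tests ends up in the unmatched list.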

jreback · 2021-09-10T10:48:34Z

looking good

can u close and reopen to force a run again (just checking on the timeouts)

Dr-Irv · 2021-09-10T11:18:12Z

> can u close and reopen to force a run again (just checking on the timeouts)

Closing and reopening..

jreback · 2021-09-10T12:15:27Z

@Dr-Irv this looks good. i think splitting the macos tests along the same lines makes sense as well (in this PR or a follow-on)

jreback · 2021-09-10T12:15:56Z

@simonjayhawkins lmk if you want to backport (either way ok by me)

Dr-Irv · 2021-09-10T12:37:47Z

@jreback all green

We also have the option of speeding up some of the other checks by doing a similar split of the tests. Some of them are taking close to an hour. That would allow people to get quicker feedback on their PRs. If you think that's a good idea, I could do another PR for that.

jreback · 2021-09-10T12:38:45Z

> @jreback all green
>
> We also have the option of speeding up some of the other checks by doing a similar split of the tests. Some of them are taking close to an hour. That would allow people to get quicker feedback on their PRs. If you think that's a good idea, I could do another PR for that.

yep, we actually have a pretty large azure allocation, so a few more jobs per PR is fine.

jreback · 2021-09-10T12:39:17Z

going to backport this (@simonjayhawkins can always not merge it on 1.3 if too many conflicts)

jreback · 2021-09-10T12:39:32Z

thanks @Dr-Irv

jreback · 2021-09-10T12:39:40Z

@meeseeksdev backport 1.3.x

jreback · 2021-09-10T12:40:20Z

@Dr-Irv if you can push the backport as above (guess we have a conflict)

Dr-Irv · 2021-09-10T13:22:40Z

> @Dr-Irv if you can push the backport as above (guess we have a conflict)

Created this PR #43496

Not sure which labels and milestones to put on it, and the backport PR is showing conflicts

Dr-Irv · 2021-09-10T13:49:38Z

@jreback Bad PR #43496 (was merging to master). New PR #43497 merges to 1.3.x .

jbrockmendel · 2021-09-10T14:55:46Z

Awesome, thanks for figuring this out @Dr-Irv

* Backport PR #43468: CI: split Windows Azure tests in half * split macos tests in 2 (#43517) * fix macos backport to use 3.7

try windows with auto workers

9bdfa01

Dr-Irv added the CI (Continuous Integration) and Windows (Windows OS) labels Sep 9, 2021

change pattern for skipping multiple tests

cf75504

Dr-Irv added 3 commits September 9, 2021 09:10

put back 2 workers

a7d65b6

Merge remote-tracking branch 'upstream/master' into wintests

f34a128

use one worker. Use wmic to get CPU info

70bdf08

try splitting test targets via REs

6c3075e

add target for github workflows

a5d56b9

Dr-Irv changed the title ~~CI: try windows with auto workers~~ CI: split Windows Azure tests in half Sep 10, 2021

Dr-Irv marked this pull request as ready for review September 10, 2021 04:09

Dr-Irv closed this Sep 10, 2021

Dr-Irv reopened this Sep 10, 2021

jreback added this to the 1.4 milestone Sep 10, 2021

jreback modified the milestones: 1.4, 1.3.3 Sep 10, 2021

jreback merged commit 6e19bdc into pandas-dev:master Sep 10, 2021

lumberbot-app bot added the Still Needs Manual Backport label Sep 10, 2021

Dr-Irv added a commit to Dr-Irv/pandas that referenced this pull request Sep 10, 2021

Backport PR pandas-dev#43468: CI: split Windows Azure tests in half

bf02177

lithomas1 removed the Still Needs Manual Backport label Sep 10, 2021

simonjayhawkins mentioned this pull request Sep 10, 2021

Backport PR #43468: CI: split Windows Azure tests in half #43497

Merged

jreback pushed a commit that referenced this pull request Sep 10, 2021

Backport PR #43468: CI: split Windows Azure tests in half (#43497)

6fec9a5

Dr-Irv mentioned this pull request Sep 11, 2021

split macos tests in 2 #43517

Merged

lithomas1 pushed a commit that referenced this pull request Sep 13, 2021

Backport PR #43517 on branch 1.3.x (split macos tests in 2) (#43535)

f4d1e90

Dr-Irv mentioned this pull request Oct 26, 2021

CI: 310-dev build seems to be timing out #44173

Closed

Dr-Irv deleted the wintests branch February 13, 2023 20:53
