Commit 3701c52

Merge pull request #1762 from verilog-to-routing/slurm-submission-script
Add a template submission script for SLURM-managed clusters
2 parents 2659048 + 4702d9f commit 3701c52

3 files changed: +64 -0 lines changed

README.developers.md

Lines changed: 3 additions & 0 deletions
@@ -280,6 +280,9 @@ You can also run multiple regression tests together:
 #Run both the basic and strong regression, with up to 4 tests in parallel
 $ ./run_reg_test.py vtr_reg_basic vtr_reg_strong -j4
 ```
+## Running on a large cluster using SLURM
+For very large runs, you can submit your jobs to a large cluster. A template submission script for
+a Slurm-managed cluster can be found under vtr_flow/scripts/slurm/

 ## Odin Functionality Tests

vtr_flow/scripts/slurm/README.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+submission_template.sh is a template used to submit jobs using SLURM on a large cluster
+- This script was tested on the Compute Canada Niagara cluster
+
+To use this script:
+- First, generate all the shell scripts to run a task using:
+* run_vtr_task.py <task_name> -system scripts (Sec. 3.9.3 in the documentation)
+* this will create a new run directory under the specified task (e.g. run001)
+
+- Second, edit the submission_template.sh script to match your run
+* All lines starting with #SBATCH are directives read by SLURM
+* Edit these lines according to the number of cores you want, the number of parallel jobs you want,
+the job name, your email and the job time limit
+* Edit the relative path to your task (path_to_task) and the specific run directory where the
+generated shell scripts are (dir)
+
+- Third, submit these jobs to SLURM using the following command:
+* sbatch submission_template.sh
+
+You can check the state of the submitted job (or all the jobs that you submitted) using
+squeue --me
+
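The three README steps above can be sketched as a dry-run shell helper. This is a minimal sketch and not part of the commit: `submit_task`, the `run` wrapper, the `DRY_RUN` switch, and the copy name `my_submission.sh` are hypothetical, and passing the task name as a positional argument (relative to `vtr_flow/tasks`) is an assumption. Only `-system scripts`, `sbatch`, and `squeue --me` come from the README. With `DRY_RUN=1` (the default here) each command is printed rather than executed.

```shell
#!/bin/bash
# Dry-run sketch of the three README steps (hypothetical helper names).
# DRY_RUN=1 only prints each command; set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

submit_task() {
    # Step 1: generate the per-run shell scripts instead of running the flow.
    # Assumption: the task name is passed as a positional argument.
    run run_vtr_task.py "$1" -system scripts
    # Step 2: work on a private copy of the template, then edit that copy
    # (the #SBATCH lines, path_to_task and dir) by hand.
    run cp submission_template.sh my_submission.sh
    # Step 3: hand the edited copy to SLURM and check the queue.
    run sbatch my_submission.sh
    run squeue --me
}

# Assumed example: the task that the template's path_to_task points at.
submit_task "regression_tests/vtr_reg_nightly/vtr_reg_qor_chain_large"
```

Copying the template before editing follows the note in the template itself, which asks that edited copies stay out of the repo.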
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+#! /bin/bash
+
+#======================================================================================
+#Please make a copy of this template in the same directory and edit it to fit your run.
+#Please do not submit your copy of this template to the repo.
+#======================================================================================
+
+
+#The following section controls the parameters of your run, feel free to edit:
+#=============================================================================
+#SBATCH --nodes=1 # Number of nodes in our pool
+#SBATCH --ntasks=80 # Number of parallel tasks
+#SBATCH --ntasks-per-node=40 # Maximum number of tasks simultaneously running on a node
+#SBATCH --job-name=vtr_test # Job name
+#SBATCH --mail-type=FAIL,END # When to send an email
+#SBATCH --mail-user=[email protected] # The user's email
+#SBATCH --output=parallel_%j.log # Determine the output log file; (%j) is the jobid
+#SBATCH --error=error_%j.log # Determine the error log file
+#SBATCH --time=10:00:00 # The job time limit in hh:mm:ss
+
+#You can also override the values of some of these parameters using environment variables
+#========================================================================================
+# - SBATCH_JOB_NAME instead of --job-name
+# - SBATCH_TIMELIMIT instead of --time
+
+#Preload the required environment modules
+module load gcc/9.2.0
+module load python/3.8.5
+
+#Choose the run directory
+path_to_task="../../tasks/regression_tests/vtr_reg_nightly/vtr_reg_qor_chain_large"
+dir="run001"
+
+
+for script in "${path_to_task}/${dir}"/*/*/common*/vtr_flow.sh
+do
+    echo "$script"
+    "$script" &
+done
+wait

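The loop at the end of the template starts every `vtr_flow.sh` in the background at once. If a run directory holds more scripts than the allocation has cores, one possible variant caps how many run concurrently. The sketch below is not part of the commit: `run_throttled` is a hypothetical helper, the fallback of 4 is arbitrary, it assumes bash 4.3 or newer for `wait -n`, and it assumes SLURM exports `SLURM_NTASKS` inside the job.

```shell
#!/bin/bash
# Hypothetical helper: run the given scripts in the background,
# keeping at most max_jobs of them running at the same time.
run_throttled() {
    local max_jobs="$1"
    shift
    local running=0
    local script
    for script in "$@"; do
        if [ "$running" -ge "$max_jobs" ]; then
            wait -n                   # bash >= 4.3: returns when any one job exits
            running=$((running - 1))
        fi
        echo "$script"
        "$script" &
        running=$((running + 1))
    done
    wait                              # wait for the remaining jobs
}

# Same variables as in the template.
path_to_task="../../tasks/regression_tests/vtr_reg_nightly/vtr_reg_qor_chain_large"
dir="run001"

# nullglob makes the pattern expand to nothing when no script matches.
shopt -s nullglob
scripts=("${path_to_task}/${dir}"/*/*/common*/vtr_flow.sh)
if [ "${#scripts[@]}" -gt 0 ]; then
    # Assumption: SLURM_NTASKS is set inside the job; fall back to 4 otherwise.
    run_throttled "${SLURM_NTASKS:-4}" "${scripts[@]}"
fi
```

A plain counter is used instead of parsing `jobs` output, so the cap does not depend on job-table formatting; the trade-off is that the helper only tracks jobs it started itself.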