Commit 67e2f1c

Parallelize & Partition Verification (#229)
Shard the Kani verification workflow across multiple runners by:

1. Running `kani list --format json`, which outputs a JSON file like this:
   ```json
   {
       "kani-version": "0.56.0",
       "file-version": "0.1",
       "standard-harnesses": {
           "src/lib.rs": [
               "proof3",
               "verify::proof2"
           ],
           "src/test.rs": [
               "test::proof4",
               "test::proof5"
           ]
       },
       "contract-harnesses": {
           "src/lib.rs": [
               "proof"
           ]
       },
       "contracts": [
           {
               "function": "bar",
               "file": "src/lib.rs",
               "harnesses": [
                   "proof"
               ]
           }
       ],
       "totals": {
           "standard-harnesses": 4,
           "contract-harnesses": 1,
           "functions-under-contract": 1
       }
   }
   ```
2. Extracting the harnesses inside `"standard-harnesses"` and `"contract-harnesses"` into an array called `ALL_HARNESSES` and the length of that array into `HARNESS_COUNT`. (So in this example, `ALL_HARNESSES = [proof3, verify::proof2, test::proof4, test::proof5, proof]` and `HARNESS_COUNT=5`).
3. Dividing the harnesses evenly between four workers. For example, if worker 1's harnesses are `proof3` and `verify::proof2`, then we call `kani verify-std --harness proof3 --harness verify::proof2 --exact`. The `--exact` flag makes Kani look for the exact harness name. This is important so that we don't match on partial patterns: if there is a harness called `foo` and a harness called `foo_bar`, passing `--harness foo` without `--exact` would match both harnesses, and then `foo_bar` would run twice.
4. Parallelizing verification within a single runner by passing `-j` to Kani.

I chose four workers somewhat arbitrarily; it makes each worker take about 45 minutes to an hour to finish. I thought it was good to strike a balance between too few workers (which still makes us wait a while) and too many workers (which makes you look through more logs to find where a given harness is being verified). But I'm happy to play with this number if people have opinions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.
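The division in step 3 can be sketched in isolation. The snippet below is a minimal illustration that applies the same ceiling-division arithmetic as the script to the five example harnesses; `shard_slice` is a hypothetical helper introduced only for this sketch and is not part of `run-kani.sh`.

```shell
#!/usr/bin/env bash
# Sketch of the sharding arithmetic. shard_slice is a hypothetical helper,
# not a function defined in run-kani.sh.

# Print the harnesses assigned to one worker, one per line.
# Usage: shard_slice WORKER_INDEX WORKER_TOTAL HARNESS...
shard_slice() {
    local worker_index="$1" worker_total="$2"
    shift 2
    local -a all=("$@")
    local count=${#all[@]}
    # Ceiling division so that every harness lands in some slice
    local chunk_size=$(( (count + worker_total - 1) / worker_total ))
    # Worker indices are 1-based, array offsets are 0-based
    local start_idx=$(( (worker_index - 1) * chunk_size ))
    local -a slice=("${all[@]:start_idx:chunk_size}")
    # Guard: printf with an empty array would still print one blank line
    if (( ${#slice[@]} > 0 )); then
        printf '%s\n' "${slice[@]}"
    fi
}

harnesses=(proof3 verify::proof2 test::proof4 test::proof5 proof)
for worker in 1 2 3 4; do
    echo "worker $worker: $(shard_slice "$worker" 4 "${harnesses[@]}" | tr '\n' ' ')"
done
```

With five harnesses and four workers the chunk size is two, so workers 1 and 2 get two harnesses each, worker 3 gets one, and worker 4 gets an empty slice. The commit does not say how Kani behaves when `--exact` is passed with no `--harness` flags, so a guard for an empty slice may be worth considering; with a large harness count and only four workers the case should not arise.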
1 parent 866a195 commit 67e2f1c

2 files changed, +106 -18 lines changed

.github/workflows/kani.yml

+17 -3
@@ -17,25 +17,39 @@ defaults:
 
 jobs:
   check-kani-on-std:
-    name: Verify std library
+    name: Verify std library (partition ${{ matrix.partition }})
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
         os: [ubuntu-latest, macos-latest]
+        partition: [1, 2, 3, 4]
         include:
           - os: ubuntu-latest
             base: ubuntu
           - os: macos-latest
             base: macos
+      fail-fast: false
+
+    env:
+      # Define the index of this particular worker [1-WORKER_TOTAL]
+      WORKER_INDEX: ${{ matrix.partition }}
+      # Total number of workers running this step
+      WORKER_TOTAL: 4
+
     steps:
       # Step 1: Check out the repository
       - name: Checkout Repository
         uses: actions/checkout@v4
         with:
           path: head
           submodules: true
-
-      # Step 2: Run Kani on the std library (default configuration)
+
+      # Step 2: Install jq
+      - name: Install jq
+        if: matrix.os == 'ubuntu-latest'
+        run: sudo apt-get install -y jq
+
+      # Step 3: Run Kani on the std library (default configuration)
       - name: Run Kani Verification
         run: head/scripts/run-kani.sh --path ${{github.workspace}}/head
 
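The workflow passes `WORKER_INDEX` and `WORKER_TOTAL` to the script through the environment, and the matrix of two operating systems by four partitions yields eight jobs. The script itself only checks that both variables are non-empty. A pre-flight range check might look like the sketch below; `validate_worker_env` is a hypothetical helper introduced for illustration, not something this commit adds.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; validate_worker_env is not part of run-kani.sh.

# Return 0 when both values are positive integers with
# 1 <= WORKER_INDEX <= WORKER_TOTAL; otherwise print an error and return 1.
validate_worker_env() {
    local index="$1" total="$2"
    local int_re='^[1-9][0-9]*$'
    if ! [[ $index =~ $int_re && $total =~ $int_re ]]; then
        echo "Error: WORKER_INDEX and WORKER_TOTAL must be positive integers" >&2
        return 1
    fi
    if (( index > total )); then
        echo "Error: WORKER_INDEX ($index) exceeds WORKER_TOTAL ($total)" >&2
        return 1
    fi
    return 0
}

# Example: simulate partition 2 of 4, as the matrix would set it
WORKER_INDEX=2
WORKER_TOTAL=4
if validate_worker_env "$WORKER_INDEX" "$WORKER_TOTAL"; then
    echo "worker $WORKER_INDEX of $WORKER_TOTAL is valid"
fi
```

Such a check would matter mostly for local runs, where the variables are set by hand rather than by the matrix.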
scripts/run-kani.sh

+89 -15
@@ -77,6 +77,22 @@ TOML_FILE=${KANI_TOML_FILE:-$DEFAULT_TOML_FILE}
 REPO_URL=${KANI_REPO_URL:-$DEFAULT_REPO_URL}
 BRANCH_NAME=${KANI_BRANCH_NAME:-$DEFAULT_BRANCH_NAME}
 
+# Unstable arguments to pass to Kani
+unstable_args="-Z function-contracts -Z mem-predicates -Z float-lib -Z c-ffi -Z loop-contracts"
+
+# Variables used for parallel harness verification
+# When we say "parallel," we mean two dimensions of parallelization:
+# 1. Sharding verification across multiple workers. The Kani workflow that calls this script defines WORKER_INDEX and WORKER_TOTAL for this purpose:
+# we shard verification across WORKER_TOTAL workers, where each worker has a unique WORKER_INDEX that it uses to derive its share of ALL_HARNESSES to verify.
+# 2. Within a single worker, we parallelize verification between multiple cores by invoking kani with -j.
+
+# Array of all of the harnesses in the repository, set in get_harnesses()
+declare -a ALL_HARNESSES
+# Length of ALL_HARNESSES, set in get_harnesses()
+declare -i HARNESS_COUNT
+# `kani list` JSON FILE_VERSION that the parallel verification command expects
+EXPECTED_JSON_FILE_VERSION="0.1"
+
 # Function to read commit ID from TOML file
 read_commit_from_toml() {
     local file="$1"
@@ -151,10 +167,50 @@ get_kani_path() {
     echo "$(realpath "$build_dir/scripts/kani")"
 }
 
-run_kani_command() {
+# Run kani list with JSON format and process with jq to extract harness names and total number of harnesses.
+# Note: The code to extract ALL_HARNESSES is dependent on `kani list --format json` FILE_VERSION 0.1.
+# (The FILE_VERSION variable is defined in Kani in the list module's output code, current path kani-driver/src/list/output.rs)
+# If FILE_VERSION changes, first update the ALL_HARNESSES extraction logic to work with the new format, if necessary,
+# then update EXPECTED_JSON_FILE_VERSION.
+get_harnesses() {
     local kani_path="$1"
-    shift
-    "$kani_path" "$@"
+    "$kani_path" list -Z list $unstable_args ./library --std --format json
+    local json_file_version=$(jq -r '.["file-version"]' "$WORK_DIR/kani-list.json")
+    if [[ $json_file_version != $EXPECTED_JSON_FILE_VERSION ]]; then
+        echo "Error: The JSON file-version in kani-list.json does not equal $EXPECTED_JSON_FILE_VERSION"
+        exit 1
+    fi
+    # Extract the harnesses inside "standard-harnesses" and "contract-harnesses"
+    # into an array called ALL_HARNESSES and the length of that array into HARNESS_COUNT
+    ALL_HARNESSES=($(jq -r '
+        ([.["standard-harnesses"] | to_entries | .[] | .value[]] +
+         [.["contract-harnesses"] | to_entries | .[] | .value[]]) |
+        .[]
+    ' $WORK_DIR/kani-list.json))
+    HARNESS_COUNT=${#ALL_HARNESSES[@]}
+}
+
+# Given an array of harness names, run verification for those harnesses
+run_verification_subset() {
+    local kani_path="$1"
+    local harnesses=("${@:2}")  # All arguments after kani_path are harness names
+
+    # Build the --harness arguments
+    local harness_args=""
+    for harness in "${harnesses[@]}"; do
+        harness_args="$harness_args --harness $harness"
+    done
+
+    echo "Running verification for harnesses:"
+    printf '%s\n' "${harnesses[@]}"
+    "$kani_path" verify-std -Z unstable-options ./library \
+        $unstable_args \
+        $harness_args --exact \
+        -j \
+        --output-format=terse \
+        $command_args \
+        --enable-unstable \
+        --cbmc-args --object-bits 12
 }
 
 # Check if binary exists and is up to date
@@ -176,7 +232,6 @@ check_binary_exists() {
     return 1
 }
 
-
 main() {
     local build_dir="$WORK_DIR/kani_build"
 
@@ -209,19 +264,38 @@ main() {
     "$kani_path" --version
 
     if [[ "$run_command" == "verify-std" ]]; then
-        echo "Running Kani verify-std command..."
-        "$kani_path" verify-std -Z unstable-options ./library \
-            -Z function-contracts \
-            -Z mem-predicates \
-            -Z loop-contracts \
-            -Z float-lib \
-            -Z c-ffi \
-            $command_args \
-            --enable-unstable \
-            --cbmc-args --object-bits 12
+        if [[ -n "$WORKER_INDEX" && -n "$WORKER_TOTAL" ]]; then
+            echo "Running as parallel worker $WORKER_INDEX of $WORKER_TOTAL"
+            get_harnesses "$kani_path"
+
+            echo "All harnesses:"
+            printf '%s\n' "${ALL_HARNESSES[@]}"
+            echo "Total number of harnesses: $HARNESS_COUNT"
+
+            # Calculate this worker's portion (add WORKER_TOTAL - 1 to force ceiling division)
+            chunk_size=$(( (HARNESS_COUNT + WORKER_TOTAL - 1) / WORKER_TOTAL ))
+            echo "Number of harnesses this worker will run: $chunk_size"
+
+            start_idx=$(( (WORKER_INDEX - 1) * chunk_size ))
+            echo "Start index into ALL_HARNESSES is $start_idx"
+
+            # Extract this worker's harnesses
+            worker_harnesses=("${ALL_HARNESSES[@]:$start_idx:$chunk_size}")
+
+            # Run verification for this subset
+            run_verification_subset "$kani_path" "${worker_harnesses[@]}"
+        else
+            # Run verification for all harnesses (not in parallel)
+            echo "Running Kani verify-std command..."
+            "$kani_path" verify-std -Z unstable-options ./library \
+                $unstable_args \
+                $command_args \
+                --enable-unstable \
+                --cbmc-args --object-bits 12
+        fi
     elif [[ "$run_command" == "list" ]]; then
         echo "Running Kani list command..."
-        "$kani_path" list -Z list -Z function-contracts -Z mem-predicates -Z float-lib -Z c-ffi ./library --std --format markdown
+        "$kani_path" list -Z list $unstable_args ./library --std --format markdown
     fi
 }
 
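`run_verification_subset` builds its `--harness` flags by concatenating a string and then relying on word splitting when the string is expanded unquoted. As an alternative sketch, the same flags can be collected in a Bash array, which keeps each argument a separate word without depending on splitting. `build_harness_args` and `HARNESS_ARGS` are hypothetical names used only here; they do not appear in the commit.

```shell
#!/usr/bin/env bash
# Sketch only: build_harness_args is a hypothetical helper, not part of run-kani.sh.

# Fill the global array HARNESS_ARGS with "--harness NAME" pairs,
# followed by --exact when at least one harness was given.
build_harness_args() {
    HARNESS_ARGS=()
    local harness
    for harness in "$@"; do
        HARNESS_ARGS+=(--harness "$harness")
    done
    if (( $# > 0 )); then
        HARNESS_ARGS+=(--exact)
    fi
}

build_harness_args proof3 verify::proof2
# Print one argument per line to show the word boundaries
printf '%s\n' "${HARNESS_ARGS[@]}"
```

Under this approach the verification call would expand the array as `"${HARNESS_ARGS[@]}"` in place of `$harness_args --exact`. Whether the extra structure is worth it is a judgment call, since harness names as listed by `kani list` are not expected to contain whitespace.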