Skip to content

Commit cb1a606

Browse files
author
Yoshiki Takashima
authored
Script to run kani on top 100 crates (rust-lang#1327)
* Skeleton of top 100 tests done. No specific checks yet. Still need to implement core calls. * Added todo comments. * Implemented clone and kani call. Todo error printing. * Ubuntu issue also fixed. Implemented concurrent kani exec. Todo: finish analysis. * Fixed printing that was broken. * Made option to not print stdout, often repetition of stderr. * Added documentation. * Removed the manual indexes. Use awk to generate uuid from counter * Added license. * Fixed bad string comparison. * updated mode. * String comparison bug again. * Mitigated xargs -d not being available on OSX. * Changed to all spaces. * Moved list of target repos to a separate file. * Fixed error code counting * Implemented printing for error counts. * Changed script to use STDIN. * Renamed script * Updated documentation on many run script. * Moved to scripts to avoid formatting. * Moved list back to tests, excude with other method. * Exclude remote target list. * Renamed top 100 list. * Added list for top 1k. * Changed name of download directory to something more descriptive. * Implemented path arg. Moved recursion to exported function. * Fixed docs and messages. * Fixed zero target check. * Moved script to subdir. * Added warning. * Added filename logging to error message. * Added directory logging for error. * Updated docs on how to pick list. * Deleted file with 1k targets. * Linked with the Table of Contents. * Slight word change, force CI to run.
1 parent bcf7463 commit cb1a606

File tree

6 files changed

+324
-3
lines changed

6 files changed

+324
-3
lines changed

docs/src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,7 @@
2525
- [Testing](./testing.md)
2626
- [Regression testing](./regression-testing.md)
2727
- [Book runner](./bookrunner.md)
28+
- [(Experimental) Testing with a Large Number of Repositories](./repo-crawl.md)
2829

2930
- [Limitations](./limitations.md)
3031
- [Undefined behaviour](./undefined-behaviour.md)

docs/src/repo-crawl.md

Lines changed: 94 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,94 @@
1+
# (Experimental) Testing with a Large Number of Repositories
2+
3+
This section explains how to run Kani on a large number of crates
4+
downloaded from git forges. You may want to do this if you are going
5+
to test Kani's ability to handle Rust features found in projects out
6+
in the wild.
7+
8+
For the first half, we will explain how to use data from crates.io to
9+
pick targets. Second half will explain how to use a script to run on a
10+
list of selected repositories.
11+
12+
## Picking Repositories
13+
14+
In picking repositories, you may want to select by metrics like
15+
popularity or by the presence of certain features. In this section, we
16+
will explain how to select top ripostes by download count.
17+
18+
We will use the `db-dump` method of getting data from crates.io as it
19+
is zero cost to their website and gives us SQL access. To start, have
20+
the following programs set up on your computer.
21+
- docker
22+
- docker-compose.
23+
24+
1. Start PostgreSQL. Paste in the following yaml file as
25+
`docker-compose.yaml`. `version: '3.3'` may need to change.
26+
```yaml
27+
version: '3.3'
28+
services:
29+
db:
30+
image: postgres:latest
31+
restart: always
32+
environment:
33+
- POSTGRES_USER=postgres
34+
- POSTGRES_PASSWORD=postgres
35+
volumes:
36+
- crates-data:/var/lib/postgresql/data
37+
logging:
38+
driver: "json-file"
39+
options:
40+
max-size: "50m"
41+
volumes:
42+
crates-data:
43+
driver: local
44+
```
45+
Then, run the following to start the setup.
46+
```bash
47+
docker-compose up -d
48+
```
49+
50+
Once set up, run `docker ls` to figure out the container's name. We
51+
will refer to the name as `$CONTAINER_NAME` from now on.
52+
53+
2. Download actual data from crates.io. First, run the following
54+
command to get a shell in the container: `docker exec -it --user
55+
postgres $CONTAINER_NAME bash`. Now, run the following to grab and
56+
install the data into the repository. Please note that this may
57+
take a while.
58+
59+
```bash
60+
wget https://static.crates.io/db-dump.tar.gz
61+
tar -xf db-dump.tar.gz
62+
psql postgres -f */schema.sql
63+
psql postgres -f */import.sql
64+
```
65+
66+
3. Extract the data. In the same docker shell, run the following to
67+
extract the top 1k repositories. Other SQL queries may be used if
68+
you want another criteria
69+
70+
```sql
71+
\copy
72+
(SELECT name, repository, downloads FROM crates
73+
WHERE repository LIKE 'http%' ORDER BY DOWNLOADS DESC LIMIT 1000)
74+
to 'top-1k.csv' csv header;
75+
```
76+
77+
4. Clean the data. The above query will capture duplicates paths that
78+
are deeper than the repository. You can clean these out.
79+
- URL from CSV: `cat top-1k.csv | awk -F ',' '{ print $2 }' | grep -v 'http.*'`
80+
- Remove long paths: `sed 's/tree\/master.*$//g'`
81+
- Once processed, you can dedup with `sort | uniq --unique`
82+
83+
## Running the List of Repositories
84+
In this step we will download the list of repositories using a script
85+
[kani-run-on-repos.sh](../../scripts/exps/kani-run-on-repos.sh)
86+
87+
Make sure to have Kani ready to run. If not, compile with `cargo build
88+
--workspace`.
89+
90+
From the repository root, you can run the script with
91+
`./scripts/exps/kani-run-on-repos.sh $URL_LIST_FILE` where
92+
`$URL_LIST_FILE` points to a line-delimited list of URLs you want to
93+
run Kani on. Repositories that give warnings or errors can be grepping
94+
for with "STDERR Warnings" and "Error exit in" respectively.

docs/src/testing.md

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,8 @@ two very good reasons to do it:
1313
characteristics which are quantitative and countable. Metrics are
1414
particularly valuable for project management purposes.
1515

16-
We recommend reading our section on [Regression Testing](./regression-testing.md)
17-
if you're interested in Kani development. At present, we obtain metrics based
18-
on the [book runner](./bookrunner.md).
16+
We recommend reading our section on [Regression
17+
Testing](./regression-testing.md) if you're interested in Kani
18+
development. At present, we obtain metrics based on the [book
19+
runner](./bookrunner.md). To run kani on a large number of remotely
20+
hosted crates, please see [Repository Crawl](./repo-crawl.md).

scripts/ci/copyright-exclude

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,4 +15,5 @@ gitignore
1515
gitmodules
1616
ignore
1717
scripts/ci/copyright-exclude
18+
tests/remote-target-lists/.*
1819
tools/make-kani-release/license-notes.txt

scripts/exps/kani-run-on-repos.sh

Lines changed: 140 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,140 @@
1+
#!/bin/bash
2+
# Copyright Kani Contributors
3+
# SPDX-License-Identifier: Apache-2.0 OR MIT
4+
5+
6+
DOCUMENTATION=\
7+
'kani-run-on-repos.sh -- script to clone and compile many remote git repositories with Kani.
8+
9+
WARNING: Because this script clones repositories at the HEAD, the
10+
results may not be stable when the target code changes.
11+
12+
USAGE:
13+
./scripts/kani-run-on-repos.sh path/to/url-list
14+
15+
Download the top 100 crates and runs kani on them. Prints out the
16+
errors and warning when done. Xargs is required for this script to
17+
work.
18+
19+
url-list: A list of URLs to run Kani on. One per line.
20+
21+
ENV:
22+
- PRINT_STDOUT=1 forces this script to search for warning in
23+
STDOUT in addition to STDERR
24+
25+
EDITING:
26+
- To adjust the git clone or kani args, modify the function
27+
`clone_and_run_kani`.
28+
- To adjust the errors this script searches for, edit the function
29+
`print_errors_for_each_repo_result`
30+
31+
Copyright Kani Contributors
32+
SPDX-License-Identifier: Apache-2.0 OR MIT'
33+
34+
export SELF_SCRIPT=$0
35+
export SELF_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
36+
NPROC=$(nproc 2> /dev/null || sysctl -n hw.ncpu 2> /dev/null || echo 4) # Linux or Mac or hard-coded default of 4
37+
export WORK_DIRECTORY_PREFIX="$SELF_DIR/../../target/remote-repos"
38+
39+
40+
export STDOUT_SUFFIX='stdout.cargo-kani'
41+
export STDERR_SUFFIX='stderr.cargo-kani'
42+
export EXIT_CODE_SUFFIX='exit-code.cargo-kani'
43+
# worker function that clones target repos and runs kani over
44+
# them. This functions is called in parallel by
45+
# parallel_clone_and_run, and should not be run explicitly
46+
function clone_and_run_kani {
47+
WORK_NUMBER_ID=$(echo $1 | awk -F ',' '{ print $1}')
48+
REPOSITORY_URL=$(echo $1 | awk -F ',' '{ print $2}')
49+
REPO_DIRECTORY="$WORK_DIRECTORY_PREFIX/$WORK_NUMBER_ID"
50+
echo "work# $WORK_NUMBER_ID -- $REPOSITORY_URL"
51+
52+
# clone or update repository
53+
(git clone $REPOSITORY_URL $REPO_DIRECTORY 2> /dev/null || git -C $REPO_DIRECTORY pull)
54+
55+
# run cargo kani compile on repo. save results to file.
56+
PATH=$PATH:$(readlink -f $SELF_DIR/..)
57+
(cd $REPO_DIRECTORY; nice -n15 cargo kani --only-codegen) \
58+
1> $REPO_DIRECTORY/$STDOUT_SUFFIX \
59+
2> $REPO_DIRECTORY/$STDERR_SUFFIX
60+
echo $? > $REPO_DIRECTORY/$EXIT_CODE_SUFFIX
61+
}
62+
export -f clone_and_run_kani
63+
64+
OVERALL_EXIT_CODE='0'
65+
TARGET_ERROR_REGEX='warning:\sFound\sthe\sfollowing\sunsupported\sconstructs:\|WARN'
66+
# printing function that greps the error logs and exit code.
67+
function print_errors_for_each_repo_result {
68+
DIRECTORY=$1
69+
IS_FAIL='0'
70+
71+
error_code="$(cat $DIRECTORY/$EXIT_CODE_SUFFIX)"
72+
if [ "$error_code" != "0" ]; then
73+
echo -e "Error exit in $DIRECTORY: code $error_code\n"
74+
IS_FAIL='1'
75+
fi
76+
77+
STDERR_GREP=$(grep -A3 -n $TARGET_ERROR_REGEX $DIRECTORY/$STDERR_SUFFIX 2> /dev/null && echo 'STDERR has warnings')
78+
if [[ "$STDERR_GREP" =~ [a-zA-Z0-9] ]]; then
79+
echo -e "STDERR Warnings (Plus 3 lines after) $DIRECTORY/$STDERR_SUFFIX -----\n$STDERR_GREP"
80+
IS_FAIL='1'
81+
fi
82+
83+
STDOUT_GREP=$(grep -A3 -n $TARGET_ERROR_REGEX $DIRECTORY/$STDOUT_SUFFIX 2> /dev/null && echo 'STDOUT has warnings')
84+
if [[ "$STDOUT_GREP" =~ [a-zA-Z0-9] ]] && [ "$PRINT_STDOUT" = '1' ]; then
85+
echo -e "STDOUT Warnings (Plus 3 lines after) $DIRECTORY/$STDOUT_SUFFIX -----\n$STDOUT_GREP"
86+
IS_FAIL='1'
87+
fi
88+
89+
if [ "$IS_FAIL" -eq "0" ]; then
90+
echo 'Ok'
91+
fi
92+
}
93+
94+
if ! which xargs 1>&2 1> /dev/null; then
95+
echo "Need to have xargs installed. Please install with `apt-get install -y xargs`"
96+
exit -1
97+
elif [[ "$*" == *"--help"* ]]; then
98+
echo -e "$DOCUMENTATION"
99+
elif [ "$#" -eq "1" ]; then
100+
# top level logic that runs clone_and_run_kani in parallel with xargs.
101+
echo "Reading URLs from $1...";
102+
LIST_OF_CRATE_GIT_URLS=$(cat $1)
103+
if [[ -z "$(echo $LIST_OF_CRATE_GIT_URLS | sed 's/\s//g')" ]]; then
104+
echo 'No targets found.'
105+
exit -1
106+
fi
107+
108+
mkdir -p $WORK_DIRECTORY_PREFIX
109+
echo -e "$LIST_OF_CRATE_GIT_URLS" | \
110+
awk -F '\n' 'BEGIN{ a=0 }{ print a++ "," $1 }' | \
111+
xargs -n1 -I {} -P $NPROC bash -c "clone_and_run_kani {}"
112+
113+
# serially print out the ones that failed.
114+
num_failed="0"
115+
num_with_warning='0'
116+
for directory in $(ls $WORK_DIRECTORY_PREFIX); do
117+
REPOSITORY=$(git -C $WORK_DIRECTORY_PREFIX/$directory remote -v | awk '{ print $2 }' | head -1)
118+
echo "repository: $REPOSITORY"
119+
120+
ERROR_OUTPUTS=$(print_errors_for_each_repo_result $WORK_DIRECTORY_PREFIX/$directory)
121+
if [[ "$ERROR_OUTPUTS" =~ 'STDERR Warnings' ]]; then
122+
OVERALL_EXIT_CODE='1'
123+
num_with_warning=$(($num_with_warning + 1))
124+
fi
125+
if [[ "$ERROR_OUTPUTS" =~ 'Error exit in' ]]; then
126+
num_failed=$(($num_failed + 1))
127+
fi
128+
129+
echo -e "$ERROR_OUTPUTS" | sed 's/^/ /'
130+
done
131+
132+
echo -e '\n--- OVERALL STATS ---'
133+
echo "$num_failed crates failed to compile"
134+
echo "$num_with_warning crates had warning(s)"
135+
else
136+
echo -e 'Needs exactly 1 argument path/to/url-list.\n'
137+
echo -e "$DOCUMENTATION"
138+
fi
139+
140+
exit $OVERALL_EXIT_CODE
Lines changed: 83 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,83 @@
1+
https://github.com/Amanieu/parking_lot
2+
https://github.com/Amanieu/thread_local-rs
3+
https://github.com/BurntSushi/aho-corasick
4+
https://github.com/BurntSushi/byteorder
5+
https://github.com/BurntSushi/memchr
6+
https://github.com/BurntSushi/termcolor
7+
https://github.com/Frommi/miniz_oxide
8+
https://github.com/Gilnaa/memoffset
9+
https://github.com/Kimundi/rustc-version-rs
10+
https://github.com/RustCrypto/traits
11+
https://github.com/RustCrypto/utils
12+
https://github.com/SergioBenitez/version_check
13+
https://github.com/SimonSapin/rust-std-candidates
14+
https://github.com/alexcrichton/cc-rs
15+
https://github.com/alexcrichton/cfg-if
16+
https://github.com/alexcrichton/toml-rs
17+
https://github.com/bitflags/bitflags
18+
https://github.com/bluss/arrayvec
19+
https://github.com/bluss/either
20+
https://github.com/bluss/indexmap
21+
https://github.com/bluss/scopeguard
22+
https://github.com/chronotope/chrono
23+
https://github.com/clap-rs/clap
24+
https://github.com/contain-rs/vec-map
25+
https://github.com/crossbeam-rs/crossbeam
26+
https://github.com/cryptocorrosion/cryptocorrosion
27+
https://github.com/cuviper/autocfg
28+
https://github.com/dguo/strsim-rs
29+
https://github.com/dtolnay/anyhow
30+
https://github.com/dtolnay/itoa
31+
https://github.com/dtolnay/proc-macro-hack
32+
https://github.com/dtolnay/proc-macro2
33+
https://github.com/dtolnay/quote
34+
https://github.com/dtolnay/ryu
35+
https://github.com/dtolnay/semver
36+
https://github.com/dtolnay/syn
37+
https://github.com/dtolnay/thiserror
38+
https://github.com/env-logger-rs/env_logger
39+
https://github.com/fizyk20/generic-array.git
40+
https://github.com/hyperium/h2
41+
https://github.com/hyperium/http
42+
https://github.com/hyperium/hyper
43+
https://github.com/marshallpierce/rust-base64
44+
https://github.com/matklad/once_cell
45+
https://github.com/mgeisler/textwrap
46+
https://github.com/ogham/rust-ansi-term
47+
https://github.com/paholg/typenum
48+
https://github.com/retep998/winapi-rs
49+
https://github.com/rust-itertools/itertools
50+
https://github.com/rust-lang-nursery/lazy-static.rs
51+
https://github.com/rust-lang/backtrace-rs
52+
https://github.com/rust-lang/futures-rs
53+
https://github.com/rust-lang/hashbrown
54+
https://github.com/rust-lang/libc
55+
https://github.com/rust-lang/log
56+
https://github.com/rust-lang/pkg-config-rs
57+
https://github.com/rust-lang/regex
58+
https://github.com/rust-lang/socket2
59+
https://github.com/rust-num/num-integer
60+
https://github.com/rust-num/num-traits
61+
https://github.com/rust-random/getrandom
62+
https://github.com/rust-random/rand
63+
https://github.com/seanmonstar/httparse
64+
https://github.com/seanmonstar/num_cpus
65+
https://github.com/serde-rs/json
66+
https://github.com/serde-rs/serde
67+
https://github.com/servo/rust-fnv
68+
https://github.com/servo/rust-smallvec
69+
https://github.com/servo/rust-url
70+
https://github.com/servo/unicode-bidi
71+
https://github.com/softprops/atty
72+
https://github.com/steveklabnik/semver-parser
73+
https://github.com/taiki-e/pin-project-lite
74+
https://github.com/time-rs/time
75+
https://github.com/tokio-rs/bytes
76+
https://github.com/tokio-rs/mio
77+
https://github.com/tokio-rs/slab
78+
https://github.com/tokio-rs/tokio
79+
https://github.com/unicode-rs/unicode-normalization
80+
https://github.com/unicode-rs/unicode-segmentation
81+
https://github.com/unicode-rs/unicode-width
82+
https://github.com/unicode-rs/unicode-xid
83+
https://github.com/withoutboats/heck

0 commit comments

Comments
 (0)