|
| 1 | +# Challenge 14: Contracts and Tests for SIMD Intrinsics |
| 2 | + |
| 3 | +- **Status:** Open |
| 4 | +- **Reward:** |
| 5 | +- **Solution:** |
| 6 | +- **Tracking Issue:** https://github.com/model-checking/verify-rust-std/issues/173 |
| 7 | +- **Start date:** 2025/02/01 |
| 8 | +- **End date:** 2025/08/01 |
| 9 | + |
| 10 | +------------------- |
| 11 | + |
| 12 | + |
| 13 | +## Goal |
| 14 | + |
| 15 | +A number of Rust projects rely on the SIMD intrinsics provided by |
| 16 | +[core::arch](https://doc.rust-lang.org/beta/core/arch/) for |
| 17 | +performance. This includes libraries like [hashbrown]() that are used |
| 18 | +in |
| 19 | +[HashMap](https://doc.rust-lang.org/std/collections/struct.HashMap.html), |
| 20 | +as well as third-party cryptographic libraries like |
| 21 | +[libcrux](https://github.com/cryspen/libcrux) and |
| 22 | +[Dalek](https://github.com/dalek-cryptography/curve25519-dalek) that |
| 23 | +are used in mainstream software projects. |
| 24 | + |
| 25 | +The goal of this project is to provide testable formal specifications |
| 26 | +for the 100 most commonly used intrinsics for x86_64 and aarch64 |
| 27 | +platforms, chosen specifically to cover all the intrinsics used in |
| 28 | +hashbrown and in popular cryptographic libraries. |
| 29 | + |
| 30 | +For each intrinsic, the main goal is to provide contracts in the form |
| 31 | +of pre- and post-conditions, and to verify whether these contracts |
| 32 | +hold when the intrinsics are executed in Rust. A secondary goal is to |
| 33 | +use these contracts as formal specifications of the intrinsics API |
| 34 | +when doing proofs of Rust programs. |
| 35 | + |
| 36 | + |
| 37 | +## Motivation |
| 38 | + |
| 39 | +Rust is the language of choice for new security-critical and |
| 40 | +performance-sensitive projects, and consequently a number of new |
| 41 | +cryptographic projects use Rust to build their infrastructure and |
| 42 | +trusted computing base. However, the SIMD intrinsics in Rust lack |
| 43 | +documentation and are easy to misuse, and so even the best Rust programmers |
| 44 | +need to wade through Intel or Arm assembly documentation to understand |
| 45 | +the functional behavior of these intrinsics. |
| 46 | + |
| 47 | +Separately, when formally verifying cryptographic libraries, each |
| 48 | +project needs to define its own semantics for SIMD instructions. |
| 49 | +Indeed such SIMD specifications have currently been defined for |
| 50 | +cryptographic verification projects in [F*](https://github.com/hacl-star/hacl-star), |
| 51 | +[EasyCrypt](https://github.com/jasmin-lang/jasmin), and [HOL Light](https://github.com/awslabs/s2n-bignum/tree/main). |
| 52 | +This specification work is both time-consuming and error-prone, |
| 53 | +there is also no guarantee of consistency between the instruction |
| 54 | +semantics used in these different tools. |
| 55 | + |
| 56 | +Consequently, we believe there is a strong need for a consistent, |
| 57 | +formal, testable specification of the SIMD intrinsics that can aid |
| 58 | +Rust developers. Furthermore, we believe that this |
| 59 | +specification should written in a way that can be used to aid formal |
| 60 | +verification of Rust programs using various proof assistants. |
| 61 | + |
| 62 | +## Description |
| 63 | + |
| 64 | +Consider the function [`_mm_blend_epi16`](https://doc.rust-lang.org/beta/core/arch/x86_64/fn._mm_blend_epi16.html) |
| 65 | +in [core::arch::x86_64](https://doc.rust-lang.org/beta/core/arch/x86_64/index.html): |
| 66 | + |
| 67 | +``` |
| 68 | +pub unsafe fn _mm_blend_epi16( |
| 69 | + a: __m128i, |
| 70 | + b: __m128i, |
| 71 | + const IMM8: i32, |
| 72 | +) -> __m128i |
| 73 | +``` |
| 74 | + |
| 75 | +Its description says: |
| 76 | +``` |
| 77 | +Blend packed 16-bit integers from a and b using the mask IMM8. |
| 78 | +
|
| 79 | +The mask bits determine the selection. A clear bit selects the corresponding element of a, and a set bit the corresponding element of b. |
| 80 | +``` |
| 81 | + |
| 82 | +It then points to [Intel's documentation](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_blend_epi16) for the C intrinsic which provides the pseudocode: |
| 83 | +``` |
| 84 | +FOR j := 0 to 7 |
| 85 | +i := j*16 |
| 86 | +IF imm8[j] |
| 87 | +dst[i+15:i] := b[i+15:i] |
| 88 | +ELSE |
| 89 | +dst[i+15:i] := a[i+15:i] |
| 90 | +FI |
| 91 | +ENDFOR |
| 92 | +``` |
| 93 | + |
| 94 | +We propose to reflect the behavior of the semantics as described in |
| 95 | +Intel's documentation directly as pre- and post-conditions in Rust. |
| 96 | + |
| 97 | +``` |
| 98 | +#[requires(IMM8 >= 0 && IMM8 <= 255)] |
| 99 | +#[ensures(|result| |
| 100 | + forall (|j| implies(j >= 0 && j < 8, |
| 101 | + if get_bit(IMM8,j) then |
| 102 | + get_lane(result, j) == get_lane(b,j) |
| 103 | + else |
| 104 | + get_lane(result, j) == get_lane(a,j))))] |
| 105 | +pub unsafe fn _mm_blend_epi16( |
| 106 | + a: __m128i, |
| 107 | + b: __m128i, |
| 108 | + const IMM8: i32, |
| 109 | +) -> __m128i |
| 110 | +``` |
| 111 | + |
| 112 | +This contract can then be used to automatically generate tests |
| 113 | +for the intrinsic, which can be put in CI. |
| 114 | + |
| 115 | +As a second layer of assurance, these contracts can be compiled |
| 116 | +to some verification framework and proved to be sound against |
| 117 | +a hand-written model of the intrinsics functions. |
| 118 | + |
| 119 | +Finally, Rust verification toolchains can also rely on this contract |
| 120 | +to model the intrinsics library within their analyses. This would |
| 121 | +enable the verification of Rust applications that rely on SIMD intrinsics. |
| 122 | + |
| 123 | + |
| 124 | +### Assumptions |
| 125 | + |
| 126 | +The contracts we write for the SIMD intrinsics should be well tested |
| 127 | +but, in the end, are hand-written based on the documentation |
| 128 | +of the intrinsics provided by Intel and ARM. Consequently, the |
| 129 | +user must trust that these semantics are correctly written. |
| 130 | + |
| 131 | +When using the contracts within a formal verification project, |
| 132 | +the user will, as usual, have to trust that the verification |
| 133 | +tool correctly encodes the semantics of Rust and performs |
| 134 | +a sound analysis within a clearly documented model. |
| 135 | + |
| 136 | +### Success Criteria |
| 137 | + |
| 138 | +The goal is to annotate >= 100 intrinsics in `core::arch::x86_64` and |
| 139 | +`core::arch::aarch64` with contracts, and all these contracts will be |
| 140 | +tested comprehensively in Rust. These functions should include all the |
| 141 | +intrinsics currently used in standard libraries like |
| 142 | +[hashbrown](https://github.com/rust-lang/hashbrown) (the basis |
| 143 | +of the Rust [HashMap](https://doc.rust-lang.org/std/collections/struct.HashMap.html) implementation). |
| 144 | + |
| 145 | +An additional success criterion is to show that these contracts can be |
| 146 | +used by verification tools to prove properties about example code that |
| 147 | +uses them. Of particular interest is code used in cryptographic |
| 148 | +libraries, but even other standalone examples using SIMD intrinsics |
| 149 | +would be considered valuable. |
0 commit comments