- Status: Open
- Tracking Issue: #278
- Start date: 2025-03-07
- End date: 2025-10-17
- Reward: 10000 USD
The following str library functions are generic over the Pattern
trait (https://doc.rust-lang.org/std/str/pattern/trait.Pattern.html):
contains
starts_with
ends_with
find
rfind
split
split_inclusive
rsplit
split_terminator
rsplit_terminator
splitn
rsplitn
split_once
rsplit_once
rmatches
match_indices
rmatch_indices
trim_matches
trim_start_matches
strip_prefix
strip_suffix
trim_end_matches
These functions accept a pattern as input, then call into_searcher to create a Searcher for the pattern. They use thisSearcher
to perform their desired operations (split, find, etc.). Those functions are implemented in (library/core/src/str/mod.rs), but the core of them is the searching algorithms which are implemented in (library/core/src/str/pattern.rs).
Important note: for this challenge, you can assume:
- The safety and functional correctness of all functions in
slice
module. - That all functions in (library/core/src/str/validations.rs) are functionally correct (consistent with the UTF8 encoding description in https://en.wikipedia.org/wiki/UTF-8).
- That all the Searchers in (library/core/src/str/iter.rs) are created by the into_searcher(_, haystack) with haystack is a valid utf8 string (str). You can assume any utf8 string property of haystack.
Verify the safety of the functions in (library/core/src/str/pattern.rs) listed in the next section.
The safety properties we are targeting are:
- There is no UB happens when calling the functions after the Searcher is created.
- The impls of unsafe traits
Searcher
andReverseSearcher
satisfies the SAFETY condition stated in the file:
/// The trait is marked unsafe because the indices returned by the
/// [`next()`][Searcher::next] methods are required to lie on valid utf8
/// boundaries in the haystack. This enables consumers of this trait to
/// slice the haystack without additional runtime checks.
This property should hold for next_back() of ReverseSearcher
too.
Although this challenge appears similar to Challenge 20 and only requires proving safety for the StrSearcher
implementation,
it warrants being a separate challenge due to its complexity.
The StrSearcher
implementation relies on sophisticated algorithms - TwoWay-Search and SIMD-search - making the safety proof significantly more complex.
Verify the safety of the following StrSearcher
functions in (library/core/src/str/pattern.rs):
next
next_match
next_back
next_match_back
next_reject
next_back_reject
The verification is considered successful if you can specify a condition (a "type invariant") C
and prove that:
- If the
StrSearcher
is created from any valid UTF-8 haystack, it satisfiesC
. - If the
StrSearcher
satisfiesC
, it ensures the two safety properties mentioned in the previous section. - If the
StrSearcher
satisfiesC
, after it calls any function above and gets modified, it still satisfiesC
.
Furthermore, you must prove the absence of undefined behaviors listed in the next section.
The verification must be unbounded---it must hold for inputs of arbitrary size.
All proofs must automatically ensure the absence of the following undefined behaviors ref:
- Accessing (loading from or storing to) a place that is dangling or based on a misaligned pointer.
- Reading from uninitialized memory except for padding or unions.
- Mutating immutable bytes.
- Producing an invalid value
Note: All solutions to verification challenges need to satisfy the criteria established in the challenge book in addition to the ones listed above.