From 7a8e791b0d42c6e136345b8a238d19363e17e964 Mon Sep 17 00:00:00 2001 From: Langston Barrett Date: Wed, 15 Mar 2023 22:31:50 -0400 Subject: [PATCH 1/7] Add chapter on fuzzing --- src/SUMMARY.md | 1 + src/fuzzing.md | 105 +++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 106 insertions(+) create mode 100644 src/fuzzing.md diff --git a/src/SUMMARY.md b/src/SUMMARY.md index b558fb2b2..3cd6989a5 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -46,6 +46,7 @@ - [Stabilizing Features](./stabilization_guide.md) - [Feature Gates](./feature-gates.md) - [Coding conventions](./conventions.md) +- [Fuzzing](./fuzzing.md) - [Notification groups](notification-groups/about.md) - [ARM](notification-groups/arm.md) - [Cleanup Crew](notification-groups/cleanup-crew.md) diff --git a/src/fuzzing.md b/src/fuzzing.md new file mode 100644 index 000000000..061ed16ae --- /dev/null +++ b/src/fuzzing.md @@ -0,0 +1,105 @@ +# Fuzzing + + + +For the purposes of this guide, *fuzzing* is any testing methodology that +involves compiling a wide variety of programs in an attempt to uncover bugs in +rustc. Fuzzing is often used to find internal compiler errors (ICEs). Fuzzing +can be beneficial, because it can find bugs before users run into them and +provide small, self-contained programs that make the bug easier to track down. +However, some common mistakes can reduce the helpfulness of fuzzing and end up +making contributors' lives harder. To maximize your positive impact on the Rust +project, please read this guide before reporting fuzzer-generated bugs! 
+ +## Guidelines + +### In a nutshell + +*Please do:* + +- Ensure the bug is still present on the latest nightly rustc +- Include a reasonably minimal, standalone example along with any bug report +- Include all of the information requested in the bug report template +- Search for existing reports with the same message and query stack +- Format the test case with `rustfmt`, if it maintains the bug + +*Please don't:* + +- Report lots of bugs that use internal features, including but not limited to + `custom_mir`, `lang_items`, `no_std`, and `rustc_attrs`. +- Seed your fuzzer with inputs that are known to crash rustc (details below). + +### Discussion + +If you're not sure whether or not an ICE is a duplicate of one that's already +been reported, please go ahead and report it and link to issues you think might +be related. In general, ICEs on the same line but with different *query stacks* +are usually distinct bugs. + +## Building a corpus + +When building a corpus, be sure to avoid collecting tests that are already +known to crash rustc. A fuzzer that is seeded with such tests is more likely to +generate bugs with the same root cause, wasting everyone's time. The simplest +way to avoid this is to loop over each file in the corpus, see if it causes an +ICE, and remove it if so. + +To build a corpus, you may want to use: + +- The rustc/rust-analyzer/clippy test suites (or even source code) --- though avoid + tests that are already known to cause failures, which often begin with comments + like `// failure-status: 101` or `// known-bug: #NNN`. +- The already-fixed ICEs in [Glacier][glacier] --- though avoid the unfixed + ones in `ices/`! + +## Extra credit + +Here are a few things you can do to help the Rust project after filing an ICE. 
+ +- Add the minimal test case to [Glacier][glacier] +- [Bisect][bisect] the bug to figure out when it was introduced +- Fix unrelated problems with the test case (things like syntax errors or + borrow-checking errors) +- Minimize the test case (see below) + +[bisect]: https://github.com/rust-lang/cargo-bisect-rustc/blob/master/TUTORIAL.md + +## Minimization + +It can be helpful to *minimize* the fuzzer-generated input. When minimizing, be +careful to preserve the original error, and avoid introducing distracting +problems such as syntax, type-checking, or borrow-checking errors. + +There are some tools that can help with minimization. If you're not sure how +to avoid introducing syntax, type-, and borrow-checking errors while using +these tools, post both the complete and minimized test cases. Generally, +*syntax-aware* tools give the best results in the least amount of time. +[`treereduce-rust`][treereduce] and [picireny][picireny] are syntax-aware. +[`halfempty`][halfempty] is not, but is generally a high-quality tool. + +[halfempty]: https://github.com/googleprojectzero/halfempty +[picireny]: https://github.com/renatahodovan/picireny +[treereduce]: https://github.com/langston-barrett/treereduce + +## Effective fuzzing + +When fuzzing rustc, you may want to avoid generating code, since this is mostly +done by LLVM. Try `--emit=mir` instead. + +A variety of compiler flags can uncover different issues. + +If you're fuzzing a compiler you built, you may want to build it with `-C +target-cpu=native` to squeeze out a few more executions per second.
+ +## Existing projects + +- [fuzz-rustc][fuzz-rustc] demonstrates how to fuzz rustc with libfuzzer +- [icemaker][icemaker] runs rustc and other tools on a large number of source + files with a variety of flags to catch ICEs +- [tree-splicer][tree-splicer] generates new source files by combining existing + ones while maintaining correct syntax + +[glacier]: https://github.com/rust-lang/glacier +[fuzz-rustc]: https://github.com/dwrensha/fuzz-rustc +[icemaker]: https://github.com/matthiaskrgr/icemaker/ +[tree-splicer]: https://github.com/langston-barrett/tree-splicer/ \ No newline at end of file From 48864aba908eda1f21d9507ad09a83fdc6c27786 Mon Sep 17 00:00:00 2001 From: Langston Barrett Date: Thu, 16 Mar 2023 08:56:46 -0400 Subject: [PATCH 2/7] Address review comments --- src/fuzzing.md | 43 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 35 insertions(+), 8 deletions(-) diff --git a/src/fuzzing.md b/src/fuzzing.md index 061ed16ae..e081b354b 100644 --- a/src/fuzzing.md +++ b/src/fuzzing.md @@ -22,11 +22,12 @@ project, please read this guide before reporting fuzzer-generated bugs! - Include all of the information requested in the bug report template - Search for existing reports with the same message and query stack - Format the test case with `rustfmt`, if it maintains the bug +- Indicate that the bug was found by fuzzing *Please don't:* - Report lots of bugs that use internal features, including but not limited to - `custom_mir`, `lang_items`, `no_std`, and `rustc_attrs`. + `custom_mir`, `lang_items`, `no_core`, and `rustc_attrs`. - Seed your fuzzer with inputs that are known to crash rustc (details below). ### Discussion @@ -34,7 +35,30 @@ project, please read this guide before reporting fuzzer-generated bugs! If you're not sure whether or not an ICE is a duplicate of one that's already been reported, please go ahead and report it and link to issues you think might be related. 
In general, ICEs on the same line but with different *query stacks* -are usually distinct bugs. +are usually distinct bugs. For example, [#109020][#109020] and [#109129][#109129] +had similar error messages: + +``` +error: internal compiler error: compiler/rustc_middle/src/ty/normalize_erasing_regions.rs:195:90: Failed to normalize <[closure@src/main.rs:36:25: 36:28] as std::ops::FnOnce<(Emplacable<()>,)>>::Output, maybe try to call `try_normalize_erasing_regions` instead +``` +``` +error: internal compiler error: compiler/rustc_middle/src/ty/normalize_erasing_regions.rs:195:90: Failed to normalize <() as Project>::Assoc, maybe try to call `try_normalize_erasing_regions` instead +``` +but different query stacks: +``` +query stack during panic: +#0 [fn_abi_of_instance] computing call ABI of `<[closure@src/main.rs:36:25: 36:28] as core::ops::function::FnOnce<(Emplacable<()>,)>>::call_once - shim(vtable)` +end of query stack +``` +``` +query stack during panic: +#0 [check_mod_attrs] checking attributes in top-level module +#1 [analysis] running analysis passes on this crate +end of query stack +``` + +[#109020]: https://github.com/rust-lang/rust/issues/109020 +[#109129]: https://github.com/rust-lang/rust/issues/109129 ## Building a corpus @@ -56,19 +80,19 @@ To build a corpus, you may want to use: Here are a few things you can do to help the Rust project after filing an ICE. -- Add the minimal test case to [Glacier][glacier] - [Bisect][bisect] the bug to figure out when it was introduced - Fix unrelated problems with the test case (things like syntax errors or borrow-checking errors) - Minimize the test case (see below) +- Add the minimal test case to [Glacier][glacier] [bisect]: https://github.com/rust-lang/cargo-bisect-rustc/blob/master/TUTORIAL.md ## Minimization -It can be helpful to *minimize* the fuzzer-generated input.
When minimizing, be -careful to preserve the original error, and avoid introducing distracting -problems such as syntax, type-checking, or borrow-checking errors. +It is helpful to carefully *minimize* the fuzzer-generated input. When +minimizing, be careful to preserve the original error, and avoid introducing +distracting problems such as syntax, type-checking, or borrow-checking errors. There are some tools that can help with minimization. If you're not sure how to avoid introducing syntax, type-, and borrow-checking errors while using @@ -86,10 +110,13 @@ these tools, post both the complete and minimized test cases. Generally, When fuzzing rustc, you may want to avoid generating code, since this is mostly done by LLVM. Try `--emit=mir` instead. -A variety of compiler flags can uncover different issues. +A variety of compiler flags can uncover different issues. `-Zmir-opt=4` will +turn on MIR optimization passes that are not run by default, potentially +uncovering interesting bugs. If you're fuzzing a compiler you built, you may want to build it with `-C -target-cpu=native` to squeeze out a few more executions per second. +target-cpu=native` or even PGO/BOLT to squeeze out a few more executions per +second. ## Existing projects From b941d220bc9ecbb49dc24a275b995a486913b1ae Mon Sep 17 00:00:00 2001 From: Langston Barrett Date: Thu, 16 Mar 2023 11:13:36 -0400 Subject: [PATCH 3/7] mir-opt*-level*, not mir-opt --- src/fuzzing.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/fuzzing.md b/src/fuzzing.md index e081b354b..56d55f2b9 100644 --- a/src/fuzzing.md +++ b/src/fuzzing.md @@ -110,8 +110,8 @@ these tools, post both the complete and minimized test cases. Generally, When fuzzing rustc, you may want to avoid generating code, since this is mostly done by LLVM. Try `--emit=mir` instead. -A variety of compiler flags can uncover different issues. 
`-Zmir-opt=4` will -turn on MIR optimization passes that are not run by default, potentially +A variety of compiler flags can uncover different issues. `-Zmir-opt-level=4` +will turn on MIR optimization passes that are not run by default, potentially uncovering interesting bugs. If you're fuzzing a compiler you built, you may want to build it with `-C From 7e6c2eaacb150980e9684cdfb8862b68a8fd2bc1 Mon Sep 17 00:00:00 2001 From: Langston Barrett Date: Thu, 16 Mar 2023 16:56:35 -0400 Subject: [PATCH 4/7] Address review comments --- src/fuzzing.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/src/fuzzing.md b/src/fuzzing.md index 56d55f2b9..1c6dcfe3a 100644 --- a/src/fuzzing.md +++ b/src/fuzzing.md @@ -26,9 +26,10 @@ project, please read this guide before reporting fuzzer-generated bugs! *Please don't:* -- Report lots of bugs that use internal features, including but not limited to - `custom_mir`, `lang_items`, `no_core`, and `rustc_attrs`. -- Seed your fuzzer with inputs that are known to crash rustc (details below). +- Don't report lots of bugs that use internal features, including but not + limited to `custom_mir`, `lang_items`, `no_core`, and `rustc_attrs`. +- Don't seed your fuzzer with inputs that are known to crash rustc (details + below). ### Discussion @@ -107,16 +108,17 @@ these tools, post both the complete and minimized test cases. Generally, ## Effective fuzzing -When fuzzing rustc, you may want to avoid generating code, since this is mostly -done by LLVM. Try `--emit=mir` instead. +When fuzzing rustc, you may want to avoid generating machine code, since this +is mostly done by LLVM. Try `--emit=mir` instead. A variety of compiler flags can uncover different issues. `-Zmir-opt-level=4` will turn on MIR optimization passes that are not run by default, potentially -uncovering interesting bugs. +uncovering interesting bugs. `-Zvalidate-mir` can help uncover such bugs. 
If you're fuzzing a compiler you built, you may want to build it with `-C target-cpu=native` or even PGO/BOLT to squeeze out a few more executions per -second. +second. Of course, it's best to try multiple build configurations and see +what actually results in superior throughput. ## Existing projects From 80b04f90d5dfe37e821fc849af129ff9041dc857 Mon Sep 17 00:00:00 2001 From: Langston Barrett Date: Thu, 16 Mar 2023 17:36:38 -0400 Subject: [PATCH 5/7] Mention debug assertions --- src/fuzzing.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/src/fuzzing.md b/src/fuzzing.md index 1c6dcfe3a..f687fbc92 100644 --- a/src/fuzzing.md +++ b/src/fuzzing.md @@ -120,6 +120,16 @@ target-cpu=native` or even PGO/BOLT to squeeze out a few more executions per second. Of course, it's best to try multiple build configurations and see what actually results in superior throughput. +You may want to build rustc from source with debug assertions to find +additional bugs, though this is a trade-off: it can slow down fuzzing by +requiring extra work for every execution. To enable debug assertions, add this +to `config.toml` when compiling rustc: + +```toml +[rust] +debug-assertions = true +``` + ## Existing projects - [fuzz-rustc][fuzz-rustc] demonstrates how to fuzz rustc with libfuzzer From 19cafc02309ae6bf57b9a0836df6da4e3410fac3 Mon Sep 17 00:00:00 2001 From: Langston Barrett Date: Thu, 16 Mar 2023 17:38:54 -0400 Subject: [PATCH 6/7] Mention debug assertions label --- src/fuzzing.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/fuzzing.md b/src/fuzzing.md index f687fbc92..ed75eef67 100644 --- a/src/fuzzing.md +++ b/src/fuzzing.md @@ -130,6 +130,11 @@ to `config.toml` when compiling rustc: debug-assertions = true ``` +ICEs that require debug assertions to reproduce should be tagged +[`requires-debug-assertions`][requires-debug-assertions]. 
+ +[requires-debug-assertions]: https://github.com/rust-lang/rust/labels/requires-debug-assertions + ## Existing projects - [fuzz-rustc][fuzz-rustc] demonstrates how to fuzz rustc with libfuzzer From a38cd1027a30a94157b8ba919f4b465000cb02d2 Mon Sep 17 00:00:00 2001 From: Langston Barrett Date: Thu, 16 Mar 2023 17:51:11 -0400 Subject: [PATCH 7/7] Reword to include 'distractions' --- src/fuzzing.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/fuzzing.md b/src/fuzzing.md index ed75eef67..bc7f0a0a2 100644 --- a/src/fuzzing.md +++ b/src/fuzzing.md @@ -82,8 +82,8 @@ To build a corpus, you may want to use: Here are a few things you can do to help the Rust project after filing an ICE. - [Bisect][bisect] the bug to figure out when it was introduced -- Fix unrelated problems with the test case (things like syntax errors or - borrow-checking errors) +- Fix "distractions": problems with the test case that don't contribute to + triggering the ICE, such as syntax errors or borrow-checking errors - Minimize the test case (see below) - Add the minimal test case to [Glacier][glacier]
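
The "Building a corpus" section added by this series suggests looping over each file in the corpus, checking whether it causes an ICE, and removing it if so. A minimal Rust sketch of that loop might look like the following. It is an illustration under stated assumptions, not part of the patch: the names `looks_like_ice` and `prune_corpus` are hypothetical, `rustc` is assumed to be on `PATH`, the corpus is assumed to be a flat directory of `.rs` files, and "exit status 101 or an `internal compiler error` banner on stderr" is only a heuristic for spotting an ICE.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;

/// Heuristic (an assumption of this sketch): treat a run as an ICE if rustc
/// exited with status 101 or printed the ICE banner on stderr.
fn looks_like_ice(exit_code: Option<i32>, stderr: &str) -> bool {
    exit_code == Some(101) || stderr.contains("internal compiler error")
}

/// Runs `rustc --emit=mir` on every `.rs` file directly inside `corpus`,
/// writing outputs to the scratch directory `out_dir`, and deletes the files
/// that look like they crash the compiler. Returns how many were removed.
fn prune_corpus(corpus: &Path, out_dir: &Path) -> io::Result<usize> {
    let mut removed = 0;
    for entry in fs::read_dir(corpus)? {
        let path = entry?.path();
        // Skip anything that is not a Rust source file.
        if path.extension().and_then(|e| e.to_str()) != Some("rs") {
            continue;
        }
        // `--emit=mir` skips machine-code generation, as the guide suggests.
        let output = Command::new("rustc")
            .arg("--emit=mir")
            .arg("--out-dir")
            .arg(out_dir)
            .arg(&path)
            .output()?;
        let stderr = String::from_utf8_lossy(&output.stderr);
        if looks_like_ice(output.status.code(), &stderr) {
            fs::remove_file(&path)?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

Calling `prune_corpus(Path::new("corpus"), Path::new("scratch"))` from a `main` function (both directory names are placeholders) would prune the corpus in place. Ordinary compile errors are deliberately kept, since only inputs that already crash rustc need to be excluded from the seed set.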