Commit a8add66

Extend debugging llvm section (rust-lang#1290)
1 parent c190ae3 commit a8add66

1 file changed: +94 -17 lines changed

src/backend/debugging.md

@@ -11,6 +11,8 @@ project on its own that probably needs to have its own debugging document (not
1111
that I could find one). But here are some tips that are important in a rustc
1212
context:
1313

14+
### Minimize the example
15+
1416
As a general rule, compilers generate lots of information from analyzing code.
1517
Thus, a useful first step is usually to find a minimal example. One way to do
1618
this is to
@@ -24,6 +26,13 @@ everything relevant to the new crate
2426
3. further minimize the issue by making the code shorter (there are tools that
2527
help with this like `creduce`)
2628

29+
For more discussion on methodology for steps 2 and 3 above, there is an
30+
[epic blog post][mcve-blog] from pnkfelix specifically about Rust program minimization.
31+
32+
[mcve-blog]: https://blog.pnkfx.org/blog/2019/11/18/rust-bug-minimization-patterns/
33+
34+
### Enable LLVM internal checks
35+
2736
The official compilers (including nightlies) have LLVM assertions disabled,
2837
which means that LLVM assertion failures can show up as compiler crashes (not
2938
ICEs but "real" crashes) and other sorts of weird behavior. If you are
@@ -34,12 +43,29 @@ anything turns up.
3443

3544
The rustc build process builds the LLVM tools into
3645
`./build/<host-triple>/llvm/bin`. They can be called directly.
46+
These tools include:
47+
* [`llc`], which compiles bitcode (`.bc` files) to assembly or object code; this can be used to
48+
replicate LLVM backend bugs.
49+
* [`opt`], a bitcode transformer that runs LLVM optimization passes.
50+
* [`bugpoint`], which reduces large test cases to small, useful ones.
51+
* and many others, some of which are referenced in the text below.
52+
53+
[`llc`]: https://llvm.org/docs/CommandGuide/llc.html
54+
[`opt`]: https://llvm.org/docs/CommandGuide/opt.html
55+
[`bugpoint`]: https://llvm.org/docs/Bugpoint.html
56+
57+
By default, the Rust build system does not check for changes to the LLVM source code or
58+
its build configuration settings. So, if you need to rebuild the LLVM that is linked
59+
into `rustc`, first delete the file `llvm-finished-building`, which should be located
60+
in `build/<host-triple>/llvm/`.
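
As a minimal sketch of that step (not part of the build system itself), the stamp file could be removed with a small Python helper. The function name `remove_llvm_stamp` and the example host triple are assumptions made for illustration; only the file name and its location come from the text above.

```python
from pathlib import Path


def remove_llvm_stamp(build_dir: Path, host_triple: str) -> bool:
    """Delete build/<host-triple>/llvm/llvm-finished-building if present.

    Returns True if the file existed and was removed, False otherwise.
    """
    stamp = build_dir / host_triple / "llvm" / "llvm-finished-building"
    if stamp.is_file():
        stamp.unlink()
        return True
    return False


if __name__ == "__main__":
    # Assumption: run from the root of a rust checkout; the triple is only an example.
    removed = remove_llvm_stamp(Path("build"), "x86_64-unknown-linux-gnu")
    print("stamp removed" if removed else "no stamp file found")
```

After the file is gone, the next build should rebuild the LLVM that is linked into `rustc`.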
3761

3862
The default rustc compilation pipeline has multiple codegen units, which is
3963
hard to replicate manually and means that LLVM is called multiple times in
4064
parallel. If you can get away with it (i.e. if it doesn't make your bug
4165
disappear), passing `-C codegen-units=1` to rustc will make debugging easier.
4266

67+
### Get your hands on raw LLVM input
68+
4369
For rustc to generate LLVM IR, you need to pass the `--emit=llvm-ir` flag. If
4470
you are building via cargo, use the `RUSTFLAGS` environment variable (e.g.
4571
`RUSTFLAGS='--emit=llvm-ir'`). This causes rustc to spit out LLVM IR into the
@@ -52,24 +78,22 @@ other useful options. Also, debug info in LLVM IR can clutter the output a lot:
5278
`RUSTFLAGS="-C debuginfo=0"` is really useful.
5379

5480
`RUSTFLAGS="-C save-temps"` outputs LLVM bitcode (not the same as IR) at
55-
different stages during compilation, which is sometimes useful. One just needs
56-
to convert the bitcode files to `.ll` files using `llvm-dis` which should be in
57-
the target local compilation of rustc.
81+
different stages during compilation, which is sometimes useful. The output LLVM
82+
bitcode will be in `.bc` files in the compiler's output directory, set via the
83+
`--out-dir DIR` argument to `rustc`.
5884

59-
If you are seeing incorrect behavior due to an optimization pass, a very handy
60-
LLVM option is `-opt-bisect-limit`, which takes an integer denoting the index
61-
value of the highest pass to run. Index values for taken passes are stable
62-
from run to run; by coupling this with software that automates bisecting the
63-
search space based on the resulting program, an errant pass can be quickly
64-
determined. When an `-opt-bisect-limit` is specified, all runs are displayed
65-
to standard error, along with their index and output indicating if the
66-
pass was run or skipped. Setting the limit to an index of -1 (e.g.,
67-
`RUSTFLAGS="-C llvm-args=-opt-bisect-limit=-1"`) will show all passes and
68-
their corresponding index values.
85+
* If you are hitting an assertion failure or segmentation fault from the LLVM
86+
backend when invoking `rustc` itself, it is a good idea to try passing each
87+
of these `.bc` files to the `llc` command, and see if you get the same
88+
failure. (LLVM developers often prefer a bug reduced to a `.bc` file over one
89+
that uses a Rust crate for its minimized reproduction.)
6990

70-
If you want to play with the optimization pipeline, you can use the `opt` tool
71-
from `./build/<host-triple>/llvm/bin/` with the LLVM IR emitted by rustc. Note
72-
that rustc emits different IR depending on whether `-O` is enabled, even
91+
* To get human-readable versions of the LLVM bitcode, one just needs to convert
92+
the bitcode (`.bc`) files to `.ll` files using `llvm-dis`, which should be in
93+
your local build of rustc (the LLVM tools are built into `./build/<host-triple>/llvm/bin`).
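
The two bullets above can be sketched as a small Python driver that runs `llc` and `llvm-dis` over every `.bc` file that `-C save-temps` left in the output directory. The helper names (`commands_for_bitcode`, `process_out_dir`) and the choice of output file names are assumptions for illustration; the only tool options used are the input file and `-o`.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def commands_for_bitcode(bc_file: Path, llvm_bin: Path) -> List[List[str]]:
    """Build the `llc` and `llvm-dis` invocations for one `.bc` file."""
    return [
        # Compile the bitcode with llc to see whether the backend failure reproduces.
        [str(llvm_bin / "llc"), str(bc_file), "-o", str(bc_file.with_suffix(".s"))],
        # Disassemble the bitcode into human-readable LLVM IR.
        [str(llvm_bin / "llvm-dis"), str(bc_file), "-o", str(bc_file.with_suffix(".ll"))],
    ]


def process_out_dir(out_dir: Path, llvm_bin: Path) -> Dict[str, int]:
    """Run both tools on every `.bc` file in `out_dir`.

    Returns a mapping from each command line to its exit code.
    """
    results = {}
    for bc_file in sorted(out_dir.glob("*.bc")):
        for cmd in commands_for_bitcode(bc_file, llvm_bin):
            results[" ".join(cmd)] = subprocess.run(cmd).returncode
    return results
```

A nonzero exit code from an `llc` command suggests that the corresponding `.bc` file is a candidate for a standalone reproduction.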
94+
95+
96+
Note that rustc emits different IR depending on whether `-O` is enabled, even
7397
without LLVM's optimizations, so if you want to play with the IR rustc emits,
7498
you should:
7599

@@ -93,6 +117,18 @@ to some file. Also, if you are using neither `-filter-print-funcs` nor `-C
93117
codegen-units=1`, then, because the multiple codegen units run in parallel, the
94118
printouts will mix together and you won't be able to read anything.
95119

120+
* One caveat to the aforementioned methodology: the `-print` family of options
121+
to LLVM only prints the IR unit that the pass runs on (e.g., just a
122+
function), and does not include any referenced declarations, globals,
123+
metadata, etc. This means you cannot in general feed the output of `-print`
124+
into `llc` to reproduce a given problem.
125+
126+
* Within LLVM itself, calling `F.getParent()->dump()` at the beginning of
127+
`SafeStackLegacyPass::runOnFunction` will dump the whole module, which
128+
may provide a better basis for reproduction. (However, you
129+
should be able to get that same dump from the `.bc` files dumped by
130+
`-C save-temps`.)
131+
96132
If you want just the IR for a specific function (say, you want to see why it
97133
causes an assertion or doesn't optimize correctly), you can use `llvm-extract`,
98134
e.g.
@@ -105,6 +141,45 @@ $ ./build/$TRIPLE/llvm/bin/llvm-extract \
105141
> extracted.ll
106142
```
107143

144+
### Investigate LLVM optimization passes
145+
146+
If you are seeing incorrect behavior due to an optimization pass, a very handy
147+
LLVM option is `-opt-bisect-limit`, which takes an integer denoting the index
148+
value of the highest pass to run. Index values for taken passes are stable
149+
from run to run; by coupling this with software that automates bisecting the
150+
search space based on the resulting program, an errant pass can be quickly
151+
determined. When an `-opt-bisect-limit` is specified, all runs are displayed
152+
to standard error, along with their index and output indicating if the
153+
pass was run or skipped. Setting the limit to an index of -1 (e.g.,
154+
`RUSTFLAGS="-C llvm-args=-opt-bisect-limit=-1"`) will show all passes and
155+
their corresponding index values.
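
The "software that automates bisecting" mentioned above could look roughly like the following binary search. This is a sketch under stated assumptions: `is_bad` is a hypothetical callback that rebuilds with the given limit (for example via the `RUSTFLAGS` value from `rustflags_for_limit`) and reports whether the miscompilation still shows up, and `max_index` is the highest pass index seen in the `-opt-bisect-limit=-1` output. It also assumes the bug is absent at limit 0 and present at `max_index`.

```python
from typing import Callable


def rustflags_for_limit(limit: int) -> str:
    """RUSTFLAGS value that runs optimization passes only up to index `limit`."""
    return f"-C llvm-args=-opt-bisect-limit={limit}"


def first_bad_limit(is_bad: Callable[[int], bool], max_index: int) -> int:
    """Return the smallest limit for which `is_bad` reports the bug.

    The pass with that index is the first one whose execution makes the
    bug appear, assuming the result flips exactly once between 0 and max_index.
    """
    if is_bad(0):
        raise ValueError("bug reproduces with limit 0; bisection does not apply")
    if not is_bad(max_index):
        raise ValueError("bug does not reproduce with all passes enabled")
    lo, hi = 0, max_index  # invariant: is_bad(lo) is False, is_bad(hi) is True
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

Each probe costs one rebuild, so the number of builds grows only logarithmically with the number of passes.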
156+
157+
If you want to play with the optimization pipeline, you can use the [`opt`] tool
158+
from `./build/<host-triple>/llvm/bin/` with the LLVM IR emitted by rustc.
159+
160+
When investigating the implementation of LLVM itself, you should be
161+
aware of its [internal debug infrastructure][llvm-debug].
162+
This is provided in LLVM Debug builds, which you enable for rustc
163+
LLVM builds by changing these settings in `config.toml`:
164+
```
165+
[llvm]
166+
# Indicates whether the LLVM assertions are enabled or not
167+
assertions = true
168+
169+
# Indicates whether the LLVM build is a Release or Debug build
170+
optimize = false
171+
```
172+
The quick summary is:
173+
* Setting `assertions=true` enables coarse-grained debug messaging.
174+
* Beyond that, setting `optimize=false` enables fine-grained debug messaging.
175+
* `LLVM_DEBUG(dbgs() << msg)` in LLVM is like `debug!(msg)` in `rustc`.
176+
* The `-debug` option turns on all messaging; it is like setting the
177+
environment variable `RUSTC_LOG=debug` in `rustc`.
178+
* The `-debug-only=<pass1>,<pass2>` variant is more selective; it is like
179+
setting the environment variable `RUSTC_LOG=path1,path2` in `rustc`.
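
As a small illustration of the last two bullets, the following Python helper builds a `RUSTFLAGS` value that forwards `-debug` or `-debug-only=...` to LLVM through `-C llvm-args`. The helper name `llvm_debug_rustflags` is an assumption for illustration, and the names passed to it are placeholders; the options only take effect with an LLVM built as described above.

```python
from typing import Sequence


def llvm_debug_rustflags(debug_types: Sequence[str] = ()) -> str:
    """Build a RUSTFLAGS value enabling LLVM debug output.

    With no names, all debug messaging is turned on (`-debug`); otherwise
    only the listed ones are selected (`-debug-only=<pass1>,<pass2>`).
    """
    if debug_types:
        return "-C llvm-args=-debug-only=" + ",".join(debug_types)
    return "-C llvm-args=-debug"


if __name__ == "__main__":
    # "pass1" and "pass2" are placeholders, as in the text above.
    print(llvm_debug_rustflags(["pass1", "pass2"]))
```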
180+
181+
[llvm-debug]: https://llvm.org/docs/ProgrammersManual.html#the-llvm-debug-macro-and-debug-option
182+
108183
### Getting help and asking questions
109184

110185
If you have some questions, head over to the [rust-lang Zulip] and
@@ -164,7 +239,9 @@ create a minimal working example with Godbolt. Go to
164239
optimizations transform it.
165240

166241
5. Once you have a godbolt link demonstrating the issue, it is pretty easy to
167-
fill in an LLVM bug. Just visit [bugs.llvm.org](https://bugs.llvm.org/).
242+
file an LLVM bug. Just visit their [GitHub issues page][llvm-issues].
243+
244+
[llvm-issues]: https://github.com/llvm/llvm-project/issues
168245

169246
### Porting bug fixes from LLVM
170247
