Commit d2aea1f

Organize and finish debugging chapters
1 parent 10fb45e commit d2aea1f

6 files changed

+241
-210
lines changed

src/SUMMARY.md

+3
@@ -37,6 +37,7 @@
3737
- [Name resolution](./name-resolution.md)
3838
- [The HIR (High-level IR)](./hir.md)
3939
- [Lowering AST to HIR](./lowering.md)
40+
- [Debugging](./hir-debugging.md)
4041
- [The `ty` module: representing types](./ty.md)
4142
- [Kinds](./kinds.md)
4243
- [Type inference](./type-inference.md)
@@ -67,6 +68,7 @@
6768
- [MIR visitor and traversal](./mir/visitor.md)
6869
- [MIR passes: getting the MIR for a function](./mir/passes.md)
6970
- [MIR optimizations](./mir/optimizations.md)
71+
- [Debugging](./mir/debugging.md)
7072
- [The borrow checker](./borrow_check.md)
7173
- [Tracking moves and initialization](./borrow_check/moves_and_initialization.md)
7274
- [Move paths](./borrow_check/moves_and_initialization/move_paths.md)
@@ -77,6 +79,7 @@
7779
- [Parameter Environments](./param_env.md)
7880
- [Code Generation](./codegen.md)
7981
- [Updating LLVM](./codegen/updating-llvm.md)
82+
- [Debugging LLVM](./codegen/debugging.md)
8083
- [Emitting Diagnostics](./diag.md)
8184

8285
---

src/codegen/debugging.md

+122
@@ -0,0 +1,122 @@
1+
## Debugging LLVM
2+
3+
> NOTE: If you are looking for info about code generation, please see [this
4+
> chapter][codegen] instead.
5+
6+
[codegen]: codegen.html
7+
8+
This section is about debugging compiler bugs in code generation (e.g. why the
9+
compiler generated some piece of code or crashed in LLVM). LLVM is a big
10+
project on its own that probably needs to have its own debugging document (not
11+
that I could find one). But here are some tips that are important in a rustc
12+
context:
13+
14+
As a general rule, compilers generate lots of information from analyzing code.
15+
Thus, a useful first step is usually to find a minimal example. One way to do
16+
this is to
17+
18+
1. create a new crate that reproduces the issue (e.g. adding whatever crate is
19+
at fault as a dependency, and using it from there)
20+
21+
2. minimize the crate by removing external dependencies; that is, moving
22+
everything relevant to the new crate
23+
24+
3. further minimize the issue by making the code shorter (there are tools that
25+
help with this like `creduce`)
26+
27+
The official compilers (including nightlies) have LLVM assertions disabled,
28+
which means that LLVM assertion failures can show up as compiler crashes (not
29+
ICEs but "real" crashes) and other sorts of weird behavior. If you are
30+
encountering these, it is a good idea to try using a compiler with LLVM
31+
assertions enabled - either an "alt" nightly or a compiler you build yourself
32+
by setting `[llvm] assertions=true` in your config.toml - and see whether
33+
anything turns up.
34+
35+
The rustc build process builds the LLVM tools into
36+
`./build/<host-triple>/llvm/bin`. They can be called directly.
37+
38+
The default rustc compilation pipeline has multiple codegen units, which is
39+
hard to replicate manually and means that LLVM is called multiple times in
40+
parallel. If you can get away with it (i.e. if it doesn't make your bug
41+
disappear), passing `-C codegen-units=1` to rustc will make debugging easier.
42+
43+
To get rustc to generate LLVM IR, you need to pass the `--emit=llvm-ir` flag. If
44+
you are building via cargo, use the `RUSTFLAGS` environment variable (e.g.
45+
`RUSTFLAGS='--emit=llvm-ir'`). This causes rustc to spit out LLVM IR into the
46+
target directory.
47+
48+
`cargo llvm-ir [options] path` spits out the LLVM IR for a particular function
49+
at `path`. (`cargo install cargo-asm` installs `cargo asm` and `cargo
50+
llvm-ir`). `--build-type=debug` emits code for debug builds. There are also
51+
other useful options. Also, debug info in LLVM IR can clutter the output a lot:
52+
`RUSTFLAGS="-C debuginfo=0"` is really useful.
53+
54+
`RUSTFLAGS="-C save-temps"` outputs LLVM bitcode (not the same as IR) at
55+
different stages during compilation, which is sometimes useful. One just needs
56+
to convert the bitcode files to `.ll` files using `llvm-dis`, which should be in
57+
the LLVM `bin` directory of your local rustc build.
58+
59+
If you want to play with the optimization pipeline, you can use the `opt` tool
60+
from `./build/<host-triple>/llvm/bin/` with the LLVM IR emitted by rustc. Note
61+
that rustc emits different IR depending on whether `-O` is enabled, even
62+
without LLVM's optimizations, so if you want to play with the IR rustc emits,
63+
you should:
64+
65+
```bash
66+
$ rustc +local my-file.rs --emit=llvm-ir -O -C no-prepopulate-passes \
67+
-C codegen-units=1
68+
$ OPT=./build/$TRIPLE/llvm/bin/opt
69+
$ $OPT -S -O2 < my-file.ll > my
70+
```
71+
72+
If you just want to get the LLVM IR during the LLVM pipeline, to e.g. see which
73+
IR causes an optimization-time assertion to fail, or to see when LLVM performs
74+
a particular optimization, you can pass the rustc flag `-C
75+
llvm-args=-print-after-all`, and possibly add `-C
76+
llvm-args='-filter-print-funcs=EXACT_FUNCTION_NAME'` (e.g. `-C
77+
llvm-args='-filter-print-funcs=_ZN11collections3str21_$LT$impl$u20$str$GT$\
78+
7replace17hbe10ea2e7c809b0bE'`).
79+
80+
That produces a lot of output into standard error, so you'll want to pipe that
81+
to some file. Also, if you are using neither `-filter-print-funcs` nor `-C
82+
codegen-units=1`, then, because the multiple codegen units run in parallel, the
83+
printouts will mix together and you won't be able to read anything.
84+
85+
If you want just the IR for a specific function (say, you want to see why it
86+
causes an assertion or doesn't optimize correctly), you can use `llvm-extract`,
87+
e.g.
88+
89+
```bash
90+
$ ./build/$TRIPLE/llvm/bin/llvm-extract \
91+
-func='_ZN11collections3str21_$LT$impl$u20$str$GT$7replace17hbe10ea2e7c809b0bE' \
92+
-S \
93+
< unextracted.ll \
94+
> extracted.ll
95+
```
96+
97+
### Filing LLVM bug reports
98+
99+
When filing an LLVM bug report, you will probably want some sort of minimal
100+
working example that demonstrates the problem. The Godbolt compiler explorer is
101+
really helpful for this.
102+
103+
1. Once you have some LLVM IR for the problematic code (see above), you can
104+
create a minimal working example with Godbolt. Go to
105+
[gcc.godbolt.org](https://gcc.godbolt.org).
106+
107+
2. Choose `LLVM-IR` as programming language.
108+
109+
3. Use `llc` to compile the IR to a particular target as is:
110+
- There are some useful flags: `-mattr` enables target features, `-march=`
111+
selects the target, `-mcpu=` selects the CPU, etc.
112+
- Commands like `llc -march=help` output all architectures available, which
113+
is useful because sometimes the Rust arch names and the LLVM names do not
114+
match.
115+
- If you have compiled rustc yourself somewhere, in the target directory
116+
you have binaries for `llc`, `opt`, etc.
117+
118+
4. If you want to optimize the LLVM-IR, you can use `opt` to see how the LLVM
119+
optimizations transform it.
120+
121+
5. Once you have a godbolt link demonstrating the issue, it is pretty easy to
122+
file an LLVM bug report.

src/compiler-debugging.md

+27-134
@@ -1,7 +1,21 @@
11
# Debugging the compiler
22
[debugging]: #debugging
33

4-
Here are a few tips to debug the compiler:
4+
This chapter contains a few tips to debug the compiler. These tips aim to be
5+
useful no matter what you are working on. Some of the other chapters have
6+
advice about specific parts of the compiler (e.g. the [Queries Debugging and
7+
Testing
8+
chapter](./incrcomp-debugging.html) or
9+
the [LLVM Debugging chapter](./codegen/debugging.md)).
10+
11+
## `-Z` flags
12+
13+
The compiler has a bunch of `-Z` flags. These are unstable flags that are only
14+
enabled on nightly. Many of them are useful for debugging. To get a full listing
15+
of `-Z` flags, use `-Z help`.
16+
17+
One useful flag is `-Z verbose`, which generally enables printing more info that
18+
could be useful for debugging.
519

620
## Getting a backtrace
721
[getting-a-backtrace]: #getting-a-backtrace
@@ -135,6 +149,9 @@ These crates are used in compiler for logging:
135149
* [log]
136150
* [env-logger]: check the link to see the full `RUST_LOG` syntax
137151
152+
[log]: https://docs.rs/log/0.4.6/log/index.html
153+
[env-logger]: https://docs.rs/env_logger/0.4.3/env_logger/
154+
138155
The compiler has a lot of `debug!` calls, which print out logging information
139156
at many points. These are very useful to at least narrow down the location of
140157
a bug if not to find it entirely, or just to orient yourself as to why the
@@ -189,17 +206,20 @@ I also think that in some cases just setting it will not trigger a rebuild,
189206
so if you changed it and you already have a compiler built, you might
190207
want to call `x.py clean` to force one.
191208
192-
### Logging etiquette
209+
### Logging etiquette and conventions
193210
194211
Because calls to `debug!` are removed by default, in most cases, don't worry
195212
about adding "unnecessary" calls to `debug!` and leaving them in code you
196213
commit - they won't slow down the performance of what we ship, and if they
197214
helped you pin down a bug, they will probably help someone else with a
198215
different one.
199216
200-
However, there are still a few concerns that you might care about:
217+
A loosely followed convention is to use `debug!("foo(...)")` at the _start_ of
218+
a function `foo` and `debug!("foo: ...")` _within_ the function. Another
219+
loosely followed convention is to use the `{:?}` format specifier for debug
220+
logs.
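
As a hedged illustration of the two conventions above, here is a minimal sketch. The `debug!` macro below is a local stand-in that prints to stderr (the real one comes from the `log` crate), and `clamp_index` is a hypothetical function invented for this example.

```rust
// Stand-in for `log::debug!` (assumption: the real macro lives in the `log`
// crate and is filtered by `RUST_LOG`). This version just prints to stderr so
// the sketch has no external dependencies.
macro_rules! debug {
    ($($arg:tt)*) => { eprintln!($($arg)*) };
}

/// Hypothetical function, used only to show where the log calls go.
fn clamp_index(index: usize, len: usize) -> usize {
    // Start of the function: the function name followed by its arguments.
    debug!("clamp_index(index={:?}, len={:?})", index, len);

    let clamped = if len == 0 { 0 } else { index.min(len - 1) };

    // Within the function: the function name, a colon, then the detail.
    debug!("clamp_index: clamped={:?}", clamped);
    clamped
}
```

Calling `clamp_index(7, 3)` returns `2` and emits one entry line and one interior line, both formatted with `{:?}`.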
201221
202-
### Expensive operations in logs
222+
One thing to be **careful** of is **expensive** operations in logs.
203223
204224
If in the module `rustc::foo` you have a statement
205225
@@ -210,9 +230,9 @@ debug!("{:?}", random_operation(tcx));
210230
Then if someone runs a debug `rustc` with `RUST_LOG=rustc::bar`,
211231
`random_operation()` will still run, even though the statement lives in `rustc::foo`.
212232
213-
This means that you should not put anything too expensive or likely
214-
to crash there - that would annoy anyone who wants to use logging for their own
215-
module. No-one will know it until someone tries to use logging to find *another* bug.
233+
This means that you should not put anything too expensive or likely to crash
234+
there - that would annoy anyone who wants to use logging for their own module.
235+
No-one will know it until someone tries to use logging to find *another* bug.
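
The effect described above can be modelled with a toy logger. This is a simplified sketch, not the `log` crate's actual implementation. It assumes the macro first checks only whether logging is enabled for any module, then evaluates its arguments, and applies the per-module filter afterwards. `ToyLogger`, `toy_debug!`, `expensive_operation`, and `run` are all hypothetical names.

```rust
/// Toy logger: `enabled_module` plays the role of `RUST_LOG=<module>`.
struct ToyLogger {
    enabled_module: Option<&'static str>,
    lines: Vec<String>,
}

impl ToyLogger {
    /// Global check: is logging switched on for any module at all?
    fn any_enabled(&self) -> bool {
        self.enabled_module.is_some()
    }

    /// Per-module filter, applied only after the message was built.
    fn log(&mut self, module: &str, msg: String) {
        if self.enabled_module.map_or(false, |m| m == module) {
            self.lines.push(msg);
        }
    }
}

// Assumed model of a logging macro: arguments are evaluated as soon as the
// global check passes, before the per-module filter runs.
macro_rules! toy_debug {
    ($logger:expr, $module:expr, $($arg:tt)*) => {
        if $logger.any_enabled() {
            let msg = format!($($arg)*);
            $logger.log($module, msg);
        }
    };
}

/// Stand-in for an expensive call; it counts how often it is evaluated.
fn expensive_operation(calls: &mut u32) -> u32 {
    *calls += 1;
    42
}

/// Returns (number of evaluations of the argument, number of lines logged).
fn run(enabled_module: Option<&'static str>) -> (u32, usize) {
    let mut logger = ToyLogger { enabled_module, lines: Vec::new() };
    let mut calls = 0;
    // Imagine this statement sits in the module `rustc::foo`.
    toy_debug!(logger, "rustc::foo", "{:?}", expensive_operation(&mut calls));
    (calls, logger.lines.len())
}
```

Under this model, `run(Some("rustc::bar"))` evaluates the expensive argument once but records no line, while `run(None)` never evaluates it.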
216236
217237
## Formatting Graphviz output (.dot files)
218238
[formatting-graphviz-output]: #formatting-graphviz-output
@@ -229,133 +249,6 @@ $ dot -T pdf maybe_init_suffix.dot > maybe_init_suffix.pdf
229249
$ firefox maybe_init_suffix.pdf # Or your favorite pdf viewer
230250
```
231251
232-
## Debugging LLVM
233-
[debugging-llvm]: #debugging-llvm
234-
235-
> NOTE: If you are looking for info about code generation, please see [this
236-
> chapter][codegen] instead.
237-
238-
[codegen]: codegen.html
239-
240-
This section is about debugging compiler bugs in code generation (e.g. why the
241-
compiler generated some piece of code or crashed in LLVM). LLVM is a big
242-
project on its own that probably needs to have its own debugging document (not
243-
that I could find one). But here are some tips that are important in a rustc
244-
context:
245-
246-
As a general rule, compilers generate lots of information from analyzing code.
247-
Thus, a useful first step is usually to find a minimal example. One way to do
248-
this is to
249-
250-
1. create a new crate that reproduces the issue (e.g. adding whatever crate is
251-
at fault as a dependency, and using it from there)
252-
253-
2. minimize the crate by removing external dependencies; that is, moving
254-
everything relevant to the new crate
255-
256-
3. further minimize the issue by making the code shorter (there are tools that
257-
help with this like `creduce`)
258-
259-
The official compilers (including nightlies) have LLVM assertions disabled,
260-
which means that LLVM assertion failures can show up as compiler crashes (not
261-
ICEs but "real" crashes) and other sorts of weird behavior. If you are
262-
encountering these, it is a good idea to try using a compiler with LLVM
263-
assertions enabled - either an "alt" nightly or a compiler you build yourself
264-
by setting `[llvm] assertions=true` in your config.toml - and see whether
265-
anything turns up.
266-
267-
The rustc build process builds the LLVM tools into
268-
`./build/<host-triple>/llvm/bin`. They can be called directly.
269-
270-
The default rustc compilation pipeline has multiple codegen units, which is
271-
hard to replicate manually and means that LLVM is called multiple times in
272-
parallel. If you can get away with it (i.e. if it doesn't make your bug
273-
disappear), passing `-C codegen-units=1` to rustc will make debugging easier.
274-
275-
To rustc to generate LLVM IR, you need to pass the `--emit=llvm-ir` flag. If
276-
you are building via cargo, use the `RUSTFLAGS` environment variable (e.g.
277-
`RUSTFLAGS='--emit=llvm-ir'`). This causes rustc to spit out LLVM IR into the
278-
target directory.
279-
280-
`cargo llvm-ir [options] path` spits out the LLVM IR for a particular function
281-
at `path`. (`cargo install cargo-asm` installs `cargo asm` and `cargo
282-
llvm-ir`). `--build-type=debug` emits code for debug builds. There are also
283-
other useful options. Also, debug info in LLVM IR can clutter the output a lot:
284-
`RUSTFLAGS="-C debuginfo=0"` is really useful.
285-
286-
`RUSTFLAGS="-C save-temps"` outputs LLVM bitcode (not the same as IR) at
287-
different stages during compilation, which is sometimes useful. One just needs
288-
to convert the bitcode files to `.ll` files using `llvm-dis` which should be in
289-
the target local compilation of rustc.
290-
291-
If you want to play with the optimization pipeline, you can use the `opt` tool
292-
from `./build/<host-triple>/llvm/bin/` with the LLVM IR emitted by rustc. Note
293-
that rustc emits different IR depending on whether `-O` is enabled, even
294-
without LLVM's optimizations, so if you want to play with the IR rustc emits,
295-
you should:
296-
297-
```bash
298-
$ rustc +local my-file.rs --emit=llvm-ir -O -C no-prepopulate-passes \
299-
-C codegen-units=1
300-
$ OPT=./build/$TRIPLE/llvm/bin/opt
301-
$ $OPT -S -O2 < my-file.ll > my
302-
```
303-
304-
If you just want to get the LLVM IR during the LLVM pipeline, to e.g. see which
305-
IR causes an optimization-time assertion to fail, or to see when LLVM performs
306-
a particular optimization, you can pass the rustc flag `-C
307-
llvm-args=-print-after-all`, and possibly add `-C
308-
llvm-args='-filter-print-funcs=EXACT_FUNCTION_NAME` (e.g. `-C
309-
llvm-args='-filter-print-funcs=_ZN11collections3str21_$LT$impl$u20$str$GT$\
310-
7replace17hbe10ea2e7c809b0bE'`).
311-
312-
That produces a lot of output into standard error, so you'll want to pipe that
313-
to some file. Also, if you are using neither `-filter-print-funcs` nor `-C
314-
codegen-units=1`, then, because the multiple codegen units run in parallel, the
315-
printouts will mix together and you won't be able to read anything.
316-
317-
If you want just the IR for a specific function (say, you want to see why it
318-
causes an assertion or doesn't optimize correctly), you can use `llvm-extract`,
319-
e.g.
320-
321-
```bash
322-
$ ./build/$TRIPLE/llvm/bin/llvm-extract \
323-
-func='_ZN11collections3str21_$LT$impl$u20$str$GT$7replace17hbe10ea2e7c809b0bE' \
324-
-S \
325-
< unextracted.ll \
326-
> extracted.ll
327-
```
328-
329-
### Filing LLVM bug reports
330-
331-
When filing an LLVM bug report, you will probably want some sort of minimal
332-
working example that demonstrates the problem. The Godbolt compiler explorer is
333-
really helpful for this.
334-
335-
1. Once you have some LLVM IR for the problematic code (see above), you can
336-
create a minimal working example with Godbolt. Go to
337-
[gcc.godbolt.org](https://gcc.godbolt.org).
338-
339-
2. Choose `LLVM-IR` as programming language.
340-
341-
3. Use `llc` to compile the IR to a particular target as is:
342-
- There are some useful flags: `-mattr` enables target features, `-march=`
343-
selects the target, `-mcpu=` selects the CPU, etc.
344-
- Commands like `llc -march=help` output all architectures available, which
345-
is useful because sometimes the Rust arch names and the LLVM names do not
346-
match.
347-
- If you have compiled rustc yourself somewhere, in the target directory
348-
you have binaries for `llc`, `opt`, etc.
349-
350-
4. If you want to optimize the LLVM-IR, you can use `opt` to see how the LLVM
351-
optimizations transform it.
352-
353-
5. Once you have a godbolt link demonstrating the issue, it is pretty easy to
354-
fill in an LLVM bug.
355-
356-
[log]: https://docs.rs/log/0.4.6/log/index.html
357-
[env-logger]: https://docs.rs/env_logger/0.4.3/env_logger/
358-
359252
## Narrowing (Bisecting) Regressions
360253
361254
The [cargo-bisect-rustc][bisect] tool can be used as a quick and easy way to

src/hir-debugging.md

+8
@@ -0,0 +1,8 @@
1+
# HIR Debugging
2+
3+
The `-Zunpretty=hir-tree` flag will dump out the HIR.
4+
5+
If you are trying to correlate `NodeId`s or `DefId`s with source code, the
6+
`--pretty expanded,identified` flag may be useful.
7+
8+
TODO: anything else?
