@@ -11,6 +11,8 @@ project on its own that probably needs to have its own debugging document (not
that I could find one). But here are some tips that are important in a rustc
context:

+ ### Minimize the example
+
As a general rule, compilers generate lots of information from analyzing code.
Thus, a useful first step is usually to find a minimal example. One way to do
this is to
@@ -24,6 +26,13 @@ everything relevant to the new crate
3. further minimize the issue by making the code shorter (there are tools that
   help with this like `creduce`)

+ For more discussion on methodology for steps 2 and 3 above, there is an
+ [epic blog post][mcve-blog] from pnkfelix specifically about Rust program minimization.
+
+ [mcve-blog]: https://blog.pnkfx.org/blog/2019/11/18/rust-bug-minimization-patterns/
+
+ ### Enable LLVM internal checks
+
The official compilers (including nightlies) have LLVM assertions disabled,
which means that LLVM assertion failures can show up as compiler crashes (not
ICEs but "real" crashes) and other sorts of weird behavior. If you are
@@ -34,12 +43,29 @@ anything turns up.

The rustc build process builds the LLVM tools into
`./build/<host-triple>/llvm/bin`. They can be called directly.
+ These tools include:
+ * [`llc`], which compiles bitcode (`.bc` files) to executable code; this can be used to
+   replicate LLVM backend bugs.
+ * [`opt`], a bitcode transformer that runs LLVM optimization passes.
+ * [`bugpoint`], which reduces large test cases to small, useful ones.
+ * and many others, some of which are referenced in the text below.
+
+ [`llc`]: https://llvm.org/docs/CommandGuide/llc.html
+ [`opt`]: https://llvm.org/docs/CommandGuide/opt.html
+ [`bugpoint`]: https://llvm.org/docs/Bugpoint.html
+
+ By default, the Rust build system does not check for changes to the LLVM source code or
+ its build configuration settings. So, if you need to rebuild the LLVM that is linked
+ into `rustc`, first delete the file `llvm-finished-building`, which should be located
+ in `build/<host-triple>/llvm/`.

The default rustc compilation pipeline has multiple codegen units, which is
hard to replicate manually and means that LLVM is called multiple times in
parallel. If you can get away with it (i.e. if it doesn't make your bug
disappear), passing `-C codegen-units=1` to rustc will make debugging easier.

+ ### Get your hands on raw LLVM input
+
For rustc to generate LLVM IR, you need to pass the `--emit=llvm-ir` flag. If
you are building via cargo, use the `RUSTFLAGS` environment variable (e.g.
`RUSTFLAGS='--emit=llvm-ir'`). This causes rustc to spit out LLVM IR into the
@@ -52,24 +78,22 @@ other useful options. Also, debug info in LLVM IR can clutter the output a lot:
`RUSTFLAGS="-C debuginfo=0"` is really useful.

`RUSTFLAGS="-C save-temps"` outputs LLVM bitcode (not the same as IR) at
- different stages during compilation, which is sometimes useful. One just needs
- to convert the bitcode files to `.ll` files using `llvm-dis` which should be in
- the target local compilation of rustc.
+ different stages during compilation, which is sometimes useful. The output LLVM
+ bitcode will be in `.bc` files in the compiler's output directory, set via the
+ `--out-dir DIR` argument to `rustc`.

- If you are seeing incorrect behavior due to an optimization pass, a very handy
- LLVM option is `-opt-bisect-limit`, which takes an integer denoting the index
- value of the highest pass to run. Index values for taken passes are stable
- from run to run; by coupling this with software that automates bisecting the
- search space based on the resulting program, an errant pass can be quickly
- determined. When an `-opt-bisect-limit` is specified, all runs are displayed
- to standard error, along with their index and output indicating if the
- pass was run or skipped. Setting the limit to an index of -1 (e.g.,
- `RUSTFLAGS="-C llvm-args=-opt-bisect-limit=-1"`) will show all passes and
- their corresponding index values.
+ * If you are hitting an assertion failure or segmentation fault from the LLVM
+   backend when invoking `rustc` itself, it is a good idea to try passing each
+   of these `.bc` files to the `llc` command, and see if you get the same
+   failure. (LLVM developers often prefer a bug reduced to a `.bc` file over one
+   that uses a Rust crate for its minimized reproduction.)

- If you want to play with the optimization pipeline, you can use the `opt` tool
- from `./build/<host-triple>/llvm/bin/` with the LLVM IR emitted by rustc. Note
- that rustc emits different IR depending on whether `-O` is enabled, even
+ * To get human-readable versions of the LLVM bitcode, one just needs to convert
+   the bitcode (`.bc`) files to `.ll` files using `llvm-dis`, which should be in
+   the target local compilation of rustc.
+
+
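The two bullets above (try each `.bc` file with `llc`, and disassemble it with `llvm-dis`) can be automated. The following is a minimal sketch, not part of the rustc tooling: the function names `bitcode_commands` and `run_all` are hypothetical, and it assumes `out_dir` is the directory passed via `--out-dir` and `llvm_bin` is the `./build/<host-triple>/llvm/bin` directory mentioned earlier.

```python
import os
import subprocess
from pathlib import Path


def bitcode_commands(out_dir, llvm_bin):
    """Build an (llc, llvm-dis) command pair for every .bc file in out_dir.

    Both names are illustrative helpers, not part of rustc or LLVM.
    """
    commands = []
    for bc in sorted(Path(out_dir).glob("*.bc")):
        # Only llc's exit status matters when reproducing a crash,
        # so its generated code is discarded.
        llc = [str(Path(llvm_bin) / "llc"), str(bc), "-o", os.devnull]
        # llvm-dis turns the bitcode into a human-readable .ll file.
        dis = [str(Path(llvm_bin) / "llvm-dis"), str(bc),
               "-o", str(bc.with_suffix(".ll"))]
        commands.append((llc, dis))
    return commands


def run_all(out_dir, llvm_bin):
    """Run both tools on each bitcode file; return the files llc failed on."""
    failures = []
    for llc, dis in bitcode_commands(out_dir, llvm_bin):
        # A nonzero (or negative, i.e. signal) status means llc crashed or
        # rejected the input, which is the behavior being hunted for.
        if subprocess.run(llc).returncode != 0:
            failures.append(llc[1])
        subprocess.run(dis, check=False)
    return failures
```

Any file returned by `run_all` is a candidate for a standalone `.bc` reproduction, and the matching `.ll` file next to it is the readable form to inspect.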
+ Note that rustc emits different IR depending on whether `-O` is enabled, even
without LLVM's optimizations, so if you want to play with the IR rustc emits,
you should:

@@ -93,6 +117,18 @@ to some file. Also, if you are using neither `-filter-print-funcs` nor `-C
codegen-units=1`, then, because the multiple codegen units run in parallel, the
printouts will mix together and you won't be able to read anything.

+ * One caveat to the aforementioned methodology: the `-print` family of options
+   to LLVM only prints the IR unit that the pass runs on (e.g., just a
+   function), and does not include any referenced declarations, globals,
+   metadata, etc. This means you cannot in general feed the output of `-print`
+   into `llc` to reproduce a given problem.
+
+ * Within LLVM itself, calling `F.getParent()->dump()` at the beginning of
+   `SafeStackLegacyPass::runOnFunction` will dump the whole module, which
+   may provide a better basis for reproduction. (However, you
+   should be able to get that same dump from the `.bc` files dumped by
+   `-C save-temps`.)
+
If you want just the IR for a specific function (say, you want to see why it
causes an assertion or doesn't optimize correctly), you can use `llvm-extract`,
e.g.
@@ -105,6 +141,45 @@ $ ./build/$TRIPLE/llvm/bin/llvm-extract \
> extracted.ll
```

+ ### Investigate LLVM optimization passes
+
+ If you are seeing incorrect behavior due to an optimization pass, a very handy
+ LLVM option is `-opt-bisect-limit`, which takes an integer denoting the index
+ value of the highest pass to run. Index values for taken passes are stable
+ from run to run; by coupling this with software that automates bisecting the
+ search space based on the resulting program, an errant pass can be quickly
+ determined. When an `-opt-bisect-limit` is specified, all runs are displayed
+ to standard error, along with their index and output indicating if the
+ pass was run or skipped. Setting the limit to an index of -1 (e.g.,
+ `RUSTFLAGS="-C llvm-args=-opt-bisect-limit=-1"`) will show all passes and
+ their corresponding index values.
+
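The "software that automates bisecting" mentioned above can be as small as a binary search over the limit. The sketch below is one possible shape, under stated assumptions: `find_first_bad_limit` and `rustc_command` are hypothetical names, the caller supplies `is_bad` (build with the given limit, run the result, report whether the bug shows up), the bug is absent at limit 0 and present at `max_index`, and it stays present for every limit above the first bad one.

```python
def find_first_bad_limit(is_bad, max_index):
    """Return the smallest limit N in [1, max_index] for which is_bad(N) holds.

    Assumes is_bad(0) is False, is_bad(max_index) is True, and is_bad does not
    flip back to False once it has become True.
    """
    lo, hi = 0, max_index  # invariant: is_bad(lo) is False, is_bad(hi) is True
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi


def rustc_command(source, limit, output="bisect-test"):
    """Build a rustc invocation that stops optimizing after pass index `limit`.

    A single codegen unit is requested so that there is one module and one
    sequence of pass indices to bisect over.
    """
    return [
        "rustc", "-O",
        "-C", "codegen-units=1",
        "-C", f"llvm-args=-opt-bisect-limit={limit}",
        "-o", output,
        source,
    ]
```

A typical `is_bad` would run `rustc_command(...)` with `subprocess.run`, execute the resulting binary, and compare its behavior against the expected one. The returned index is the first pass whose inclusion makes the bug appear, which can then be looked up in the `-opt-bisect-limit=-1` listing.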
+ If you want to play with the optimization pipeline, you can use the [`opt`] tool
+ from `./build/<host-triple>/llvm/bin/` with the LLVM IR emitted by rustc.
+
+ When investigating the implementation of LLVM itself, you should be
+ aware of its [internal debug infrastructure][llvm-debug].
+ This is provided in LLVM Debug builds, which you enable for rustc
+ LLVM builds by changing these settings in the config.toml:
+ ```
+ [llvm]
+ # Indicates whether the LLVM assertions are enabled or not
+ assertions = true
+
+ # Indicates whether the LLVM build is a Release or Debug build
+ optimize = false
+ ```
+ The quick summary is:
+ * Setting `assertions=true` enables coarse-grained debug messaging.
+ * Beyond that, setting `optimize=false` enables fine-grained debug messaging.
+ * `LLVM_DEBUG(dbgs() << msg)` in LLVM is like `debug!(msg)` in `rustc`.
+ * The `-debug` option turns on all messaging; it is like setting the
+   environment variable `RUSTC_LOG=debug` in `rustc`.
+ * The `-debug-only=<pass1>,<pass2>` variant is more selective; it is like
+   setting the environment variable `RUSTC_LOG=path1,path2` in `rustc`.
+
+ [llvm-debug]: https://llvm.org/docs/ProgrammersManual.html#the-llvm-debug-macro-and-debug-option
+
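When LLVM is driven through `rustc` rather than called directly, the `-debug` and `-debug-only` options described above can be forwarded the same way as `-opt-bisect-limit`, through `-C llvm-args`. The helper below is a small sketch of that; `llvm_debug_flags` is a hypothetical name, the debug type names in the usage note are only illustrative, and the options are assumed to be available only in an LLVM built with the settings shown above.

```python
def llvm_debug_flags(debug_types=()):
    """Return rustc arguments that forward -debug or -debug-only to LLVM.

    With no debug types, all LLVM debug messaging is requested; otherwise only
    the listed debug types are enabled.
    """
    if debug_types:
        llvm_arg = "-debug-only=" + ",".join(debug_types)
    else:
        llvm_arg = "-debug"
    return ["-C", "llvm-args=" + llvm_arg]


def as_rustflags(debug_types=()):
    """Render the same arguments as a value for the RUSTFLAGS variable."""
    return " ".join(llvm_debug_flags(debug_types))
```

For example, `as_rustflags(["safe-stack", "isel"])` produces a string that can be assigned to `RUSTFLAGS` for a cargo build, while `llvm_debug_flags()` gives the argument list for a direct `rustc` invocation.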
### Getting help and asking questions

If you have some questions, head over to the [rust-lang Zulip] and
@@ -164,7 +239,9 @@ create a minimal working example with Godbolt. Go to
optimizations transform it.

5. Once you have a godbolt link demonstrating the issue, it is pretty easy to
-    fill in an LLVM bug. Just visit [bugs.llvm.org](https://bugs.llvm.org/).
+    fill in an LLVM bug. Just visit their [GitHub issues page][llvm-issues].
+
+ [llvm-issues]: https://github.com/llvm/llvm-project/issues

### Porting bug fixes from LLVM