Commit 76a7f10

amanjeev authored and mark-i-m committed
Added Rustc Debugger Support Chapter
1 parent f55e97c commit 76a7f10

2 files changed, +322 -0 lines changed

Diff for: src/SUMMARY.md

+1
@@ -84,6 +84,7 @@
8484
- [Updating LLVM](./codegen/updating-llvm.md)
8585
- [Debugging LLVM](./codegen/debugging.md)
8686
- [Profile-guided Optimization](./profile-guided-optimization.md)
87+
- [Debugging Support in Rust Compiler](./debugging-support-in-rustc.md)
8788

8889
---
8990

Diff for: src/debugging-support-in-rustc.md

+321
@@ -0,0 +1,321 @@
1+
# Debugging support in the Rust compiler
2+
3+
This document explains the state of debugging tools support in the Rust compiler (rustc).
4+
The document gives an overview of debugging tools such as GDB and LLDB, and of the infrastructure
5+
around the Rust compiler for debugging Rust code. If you want to learn how to debug the Rust compiler
6+
itself, see the [Debugging the Compiler] page.
7+
8+
The material is gathered from the YouTube video [Tom Tromey discusses debugging support in rustc].
9+
10+
## Preliminaries
11+
12+
### Debuggers
13+
14+
According to Wikipedia
15+
16+
> A [debugger or debugging tool] is a computer program that is used to test and debug
17+
> other programs (the "target" program).
18+
19+
Writing a debugger from scratch for a language requires a lot of work, especially if
20+
debuggers have to be supported on various platforms. GDB and LLDB, however, can be
21+
extended to support debugging a language. This is the path that Rust has chosen.
22+
This document's main goal is to document the support for these debuggers in the Rust compiler.
23+
24+
### DWARF
25+
26+
According to the [DWARF] standard website
27+
28+
> DWARF is a debugging file format used by many compilers and debuggers to support source level
29+
> debugging. It addresses the requirements of a number of procedural languages,
30+
> such as C, C++, and Fortran, and is designed to be extensible to other languages.
31+
> DWARF is architecture independent and applicable to any processor or operating system.
32+
> It is widely used on Unix, Linux and other operating systems,
33+
> as well as in stand-alone environments.
34+
35+
A DWARF reader is a program that consumes the DWARF format and creates debugger-compatible output.
36+
This program may live in the compiler itself. DWARF uses a data structure called
37+
Debugging Information Entry (DIE), which stores the information as "tags" to denote functions,
38+
variables, and so on, e.g. `DW_TAG_variable`, `DW_TAG_pointer_type`, and `DW_TAG_subprogram`.
39+
You can also invent your own tags and attributes.
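
To make the idea of tagged entries concrete, here is a minimal sketch of a DIE tree in Rust. The `Tag` enum, the `Die` struct, and the `sample_unit` helper are hypothetical names invented for this illustration. They are not the data layout of any real DWARF library, and real DIEs carry many more attributes.

```rust
// Hypothetical, simplified model of DWARF DIEs; for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tag {
    CompileUnit, // stands in for DW_TAG_compile_unit
    Subprogram,  // stands in for DW_TAG_subprogram
    Variable,    // stands in for DW_TAG_variable
    PointerType, // stands in for DW_TAG_pointer_type
}

#[derive(Debug)]
struct Die {
    tag: Tag,
    name: Option<String>,
    children: Vec<Die>,
}

impl Die {
    fn new(tag: Tag, name: Option<&str>) -> Self {
        Die { tag, name: name.map(str::to_owned), children: Vec::new() }
    }

    // Builder-style helper that nests one entry inside another.
    fn with_child(mut self, child: Die) -> Self {
        self.children.push(child);
        self
    }

    // Counts the entries with the given tag in this subtree.
    fn count(&self, tag: Tag) -> usize {
        let own = usize::from(self.tag == tag);
        own + self.children.iter().map(|c| c.count(tag)).sum::<usize>()
    }
}

// A tiny tree: one compile unit, one function with two variables, one pointer type.
fn sample_unit() -> Die {
    Die::new(Tag::CompileUnit, Some("main.rs"))
        .with_child(
            Die::new(Tag::Subprogram, Some("main"))
                .with_child(Die::new(Tag::Variable, Some("x")))
                .with_child(Die::new(Tag::Variable, Some("p"))),
        )
        .with_child(Die::new(Tag::PointerType, None))
}

fn main() {
    let unit = sample_unit();
    println!("unit {:?} has {} variables", unit.name, unit.count(Tag::Variable));
}
```

A debugger's DWARF reader walks a tree of this general shape to find the functions, variables, and types it needs.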
40+
41+
## Supported debuggers
42+
43+
### GDB
44+
45+
We have our own fork of GDB - [https://github.com/rust-dev-tools/gdb]
46+
47+
#### Rust expression parser
48+
49+
To be able to show debug output, we need an expression parser.
50+
GDB's expression parser is written in [Bison] and can parse only a subset
51+
of Rust expressions.
52+
It was written from scratch and has no relation to any other parser.
53+
For example, it is not related to rustc's parser.
54+
55+
GDB has Rust-like value and type output: it can print values and types in a way
56+
that looks like Rust syntax. Likewise, when you print a type with [ptype] in GDB,
57+
the output looks like Rust source code. Check out the documentation in the [manual for GDB/Rust].
58+
59+
#### Parser extensions
60+
61+
The expression parser has a couple of extensions to facilitate features that you cannot express
62+
in Rust. Some limitations are listed in the [manual for GDB/Rust]. There is some special
63+
code in the DWARF reader in GDB to support the extensions.
64+
65+
A couple of examples of the DWARF reader support needed are as follows:
66+
67+
1. Enum: Needed for support for enum types. Rustc writes the information about enums into
68+
DWARF, and GDB reads the DWARF to work out where the tag field is, whether there is a tag
69+
field at all, whether the tag slot is shared via the non-zero optimization, and so on.
70+
71+
2. Dissect trait objects: A DWARF extension where the trait object's description in the DWARF
72+
also points to a stub description of the corresponding vtable, which in turn points to the
73+
concrete type for which this trait object exists. This means that you can do a `print *object`
74+
for that trait object, and GDB will understand how to find the correct type of the payload in
75+
the trait object.
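
The layout facts that the debugger has to recover from DWARF can be observed from Rust itself. The following is a small sketch; `Shape`, `Square`, and `option_is_free` are hypothetical names made up for the example, and the two-pointer size of a trait object reference reflects what current rustc does rather than a language guarantee.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

// Hypothetical trait and type, used only to build a trait object.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// True when `Option<T>` is no larger than `T`, i.e. the tag shares the payload's storage.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // `NonZeroU32` can never be 0, so `None` can reuse that value and no separate tag is stored.
    assert!(option_is_free::<NonZeroU32>());
    // References are never null, so the same optimization applies.
    assert!(option_is_free::<&u8>());
    // Every bit pattern of `u32` is a valid value, so a separate tag field is needed.
    assert!(!option_is_free::<u32>());

    // Assumption: with current rustc, a trait object reference is a data pointer
    // plus a vtable pointer.
    assert_eq!(size_of::<&dyn Shape>(), 2 * size_of::<usize>());

    let object: &dyn Shape = &Square(2.0);
    println!("area = {}", object.area());
}
```

Because the tag may be separate, absent, or folded into the payload, the debugger cannot guess the layout and has to rely on what rustc writes into DWARF.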
76+
77+
**TODO**: Figure out if the following should be mentioned in the GDB-Rust document rather than
78+
this guide page so there is no duplication. This is regarding the following comments:
79+
80+
[This comment by Tom](https://github.com/rust-lang/rustc-guide/pull/316#discussion_r284027340)
81+
> gdb's Rust extensions and limitations are documented in the gdb manual:
82+
https://sourceware.org/gdb/onlinedocs/gdb/Rust.html -- however, this neglects to mention that
83+
gdb convenience variables and registers follow the gdb $ convention, and that the Rust parser
84+
implements the gdb @ extension.
85+
86+
[This question by Aman](https://github.com/rust-lang/rustc-guide/pull/316#discussion_r285401353)
87+
> @tromey do you think we should mention this part in the GDB-Rust document rather than this
88+
document so there is no duplication etc.?
89+
90+
#### Developer notes
91+
92+
* This work is now upstream. Bugs can be reported in [GDB Bugzilla].
93+
94+
### LLDB
95+
96+
We have our own fork of LLDB - [https://github.com/rust-lang/lldb]
97+
98+
Fork of LLVM project - [https://github.com/rust-lang/llvm-project]
99+
100+
LLDB currently only works on macOS because of a dependency issue. This issue was easier to
101+
solve for macOS than for Linux. However, Tom has a possible solution which could enable
102+
us to ship LLDB everywhere.
103+
104+
#### Rust expression parser
105+
106+
This expression parser is written in C++. It is a type of [Recursive Descent parser].
107+
It implements slightly less of the Rust language than GDB's parser. LLDB has Rust-like value and type output.
108+
109+
#### Parser extensions
110+
111+
There is some special code in the DWARF reader in LLDB to support the extensions.
112+
An example of the DWARF reader support needed is as follows:
113+
114+
1. Enum: Needed for support for enum types. Rustc writes the information about
115+
enums into DWARF, and LLDB reads the DWARF to work out where the tag field is, whether
116+
there is a tag field at all, whether the tag slot is shared via the non-zero optimization, and so on.
117+
In other words, LLDB has enum support as well.
118+
119+
#### Developer notes
120+
121+
* None of the LLDB work is upstream. This [rust-lang/lldb wiki page] explains a few details.
122+
* The reason for forking LLDB is that LLDB recently removed all the other language plugins
123+
due to lack of maintenance.
124+
* LLDB has a plugin architecture but that does not work for language support.
125+
* LLDB is available via Rust build (`rustup`).
126+
* GDB generally works better on Linux.
127+
128+
## DWARF and Rustc
129+
130+
[DWARF] is the standard way compilers generate debugging information that debuggers read.
131+
It is _the_ debugging format on macOS and Linux. It is a multi-language, extensible format
132+
and is mostly good enough for Rust's purposes. Hence, the current implementation reuses DWARF's
133+
concepts. This is true even if some of the concepts in DWARF do not align with Rust
134+
semantically because generally there can be some kind of mapping between the two.
135+
136+
We have some DWARF extensions that the Rust compiler emits and the debuggers understand that
137+
are _not_ in the DWARF standard.
138+
139+
* The Rust compiler emits DWARF for a virtual table, and this `vtable` object will have a
140+
`DW_AT_containing_type` that points to the real type. This lets debuggers dissect a trait object
141+
pointer to correctly find the payload. E.g., here's such a DIE, from a test case in the gdb
142+
repository:
143+
144+
```asm
145+
<1><1a9>: Abbrev Number: 3 (DW_TAG_structure_type)
146+
<1aa> DW_AT_containing_type: <0x1b4>
147+
<1ae> DW_AT_name : (indirect string, offset: 0x23d): vtable
148+
<1b2> DW_AT_byte_size : 0
149+
<1b3> DW_AT_alignment : 8
150+
```
151+
152+
* The other extension is that the Rust compiler can emit a tagless discriminated union.
153+
See [DWARF feature request] for this item.
154+
155+
### Current limitations of DWARF
156+
157+
* Traits: representing traits in DWARF requires a bigger change to DWARF than usual.
158+
* DWARF provides no way to differentiate between structs and tuples. The Rust compiler emits
159+
tuple fields with names such as `__0`, and debuggers look for a sequence of such names to overcome
160+
this limitation. Without further help, the debugger would access a field as `x.__0` instead of `x.0`.
161+
The Rust parser in the debugger resolves this, so you can now write `x.0`.
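
The tuple workaround described above can be sketched as follows. This is a minimal illustration of the naming heuristic, not the code used by GDB or LLDB; the function names `looks_like_tuple` and `emitted_field_name` are hypothetical.

```rust
/// Returns true when the field names are exactly `__0`, `__1`, ... in order,
/// which is the pattern a debugger can use to guess that a struct is a tuple.
fn looks_like_tuple(field_names: &[&str]) -> bool {
    !field_names.is_empty()
        && field_names
            .iter()
            .enumerate()
            .all(|(index, name)| *name == format!("__{}", index))
}

/// Maps a field as the user types it (`0` in `x.0`) to the name emitted
/// in the debug info (`__0`). Non-numeric names are returned unchanged.
fn emitted_field_name(user_field: &str) -> String {
    match user_field.parse::<usize>() {
        Ok(index) => format!("__{}", index),
        Err(_) => user_field.to_owned(),
    }
}

fn main() {
    println!("{}", looks_like_tuple(&["__0", "__1"]));
    println!("{}", emitted_field_name("0"));
}
```

With a mapping like this in the expression parser, the user can type the natural `x.0` while the lookup uses the emitted `__0` name.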
162+
163+
DWARF relies on debuggers to know some information about the platform ABI.
164+
Rust does not do that all the time.
165+
166+
## Developer notes
167+
168+
This section covers certain aspects of development that were discussed in the talk.
169+
170+
## What is missing
171+
172+
### Shipping GDB in Rustup
173+
174+
Tracking issue: [https://github.com/rust-lang/rust/issues/34457]
175+
176+
Shipping GDB requires changes to the Rustup delivery system. To manage Rustup build size and
177+
times, we need to build GDB separately and somehow provide the resulting artifacts
178+
for inclusion in the final build. However, if we can ship GDB with rustup, it will simplify
179+
the development process: the compiler can emit new debug info that the debugger can readily consume.
180+
181+
The main issue in achieving this is setting up dependencies. One such dependency is Python. That
182+
is why we have our own fork of GDB: one of the drivers is patched on Rust's side to
183+
check the correct version of Python (Python 2.7 in this case. *Note: Python3 is not chosen
184+
for this purpose because Python's stable ABI is limited and is not sufficient for GDB's needs.
185+
See [https://docs.python.org/3/c-api/stable.html]*).
186+
187+
The goal is to keep debugger updates as fast as possible as we change the debugging symbols;
188+
in essence, to ship the debugger as soon as new debugging info is added. GDB only releases
189+
every six months or so. However, the changes that are
190+
not related to Rust itself should ideally be merged upstream.
191+
192+
### Code signing for LLDB debug server on macOS
193+
194+
According to Wikipedia's article on [System Integrity Protection]:
195+
196+
> System Integrity Protection (SIP, sometimes referred to as rootless) is a security feature
197+
> of Apple's macOS operating system introduced in OS X El Capitan. It comprises a number of
198+
> mechanisms that are enforced by the kernel. A centerpiece is the protection of system-owned
199+
> files and directories against modifications by processes without a specific "entitlement",
200+
> even when executed by the root user or a user with root privileges (sudo).
201+
202+
It prevents processes from using the `ptrace` syscall. If a process wants to use `ptrace`, it has to be
203+
code signed. The certificate that signs it has to be trusted on your machine.
204+
205+
See [Apple developer documentation for System Integrity Protection].
206+
207+
We may need to sign up with Apple and get the keys to do this signing. Tom looked into whether
208+
Mozilla could do this; it cannot, because it is at the maximum number of
209+
keys it is allowed to sign. Tom does not know whether Mozilla could get more keys.
210+
211+
Alternatively, Tom suggests that maybe a Rust legal entity is needed to get the keys via Apple.
212+
This problem is not technical in nature. If we had such a key we could sign GDB as well and
213+
ship that.
214+
215+
### DWARF and Traits
216+
217+
Rust traits are not emitted into DWARF at all. As a result, calling a method as `x.method()`
218+
does not work as is, because the method is implemented by a trait rather than
219+
by the type. That information is not present, so the debugger cannot find trait methods.
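
The distinction between a method defined on a type and one provided by a trait looks like this in source code. This is a small sketch with hypothetical names (`Meters`, `Describe`); it only illustrates which kind of method the paragraph above refers to.

```rust
struct Meters(f64);

impl Meters {
    // Inherent method: defined directly on the type.
    fn value(&self) -> f64 {
        self.0
    }
}

trait Describe {
    fn describe(&self) -> String;
}

// Trait method: per the text above, the trait itself is not described in DWARF,
// so a debugger has no record linking `describe` to `Meters` as a method.
impl Describe for Meters {
    fn describe(&self) -> String {
        format!("{} m", self.0)
    }
}

fn main() {
    let x = Meters(1.5);
    println!("{}", x.value());
    println!("{}", x.describe());
}
```

In compiled Rust both calls use the same `x.method()` syntax, which is why the missing trait information is visible to users of the debugger.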
220+
221+
DWARF has a notion of interface types (possibly added for Java). Tom's idea was to use this
222+
interface type to represent traits.
223+
224+
DWARF only deals with concrete names, not the reference types. So, a given implementation of a
225+
trait for a type would be one of these interfaces (`DW_TAG_interface_type`). Also, the type for
226+
which it is implemented would describe all the interfaces this type implements. This requires a
227+
DWARF extension.
228+
229+
Issue on GitHub: [https://github.com/rust-lang/rust/issues/33014]
230+
231+
## Typical process for a Debug Info change (LLVM)
232+
233+
LLVM has Debug Info (DI) builders. This is the primary thing that Rust calls into.
234+
This is why we need to change LLVM first: rustc emits LLVM debug info metadata, not DWARF directly.
235+
This is a kind of metadata that you construct and hand off to LLVM. For the rustc/LLVM hand-off,
236+
some LLVM DI builder methods are called to construct a representation of a type.
237+
238+
The steps of this process are as follows:
239+
240+
1. LLVM needs changing.
241+
242+
LLVM does not emit Interface types at all, so this needs to be implemented in LLVM first.
243+
244+
Get sign-off from the LLVM maintainers that this is a good idea.
245+
246+
2. Change the DWARF extension.
247+
248+
3. Update the debuggers.
249+
250+
Update the DWARF readers and expression evaluators.
251+
252+
4. Update the Rust compiler.
253+
254+
Change it to emit this new information.
255+
256+
### Procedural macro stepping
257+
258+
A deeply profound question is: how do you actually debug a procedural macro?
259+
What is the location you emit for a macro expansion? Consider some of the following cases:
260+
261+
* You can emit the location of the invocation of the macro.
262+
* You can emit the location of the definition of the macro.
263+
* You can emit locations of the content of the macro.
264+
265+
RFC: [https://github.com/rust-lang/rfcs/pull/2117]
266+
267+
The focus is to let macros decide what to do. This can be achieved by having some kind of attribute
268+
that lets the macro tell the compiler where the line marker should be. This affects where you
269+
set breakpoints and what happens when you step through the code.
270+
271+
## Future work
272+
273+
#### Name mangling changes
274+
275+
* New demangler in `libiberty` (gcc source tree).
276+
* New demangler in LLVM or LLDB.
277+
278+
**TODO**: Check the location of the demangler source.
279+
[Question on Github](https://github.com/rust-lang/rustc-guide/pull/316#discussion_r283062536).
280+
281+
#### Reuse Rust compiler for expressions
282+
283+
This is an important idea because debuggers by and large do not try to implement type
284+
inference. You need to be much more explicit when you type into the debugger than in your
285+
actual source code. So, you cannot just copy and paste an expression from your source
286+
code into the debugger and expect the same answer, although this would be nice. Reusing the
287+
compiler could help here.
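
As an illustration of the gap, the two functions below compute the same value. The first relies on type inference, as ordinary Rust source often does. The second spells out every type, which is closer to what a debugger without inference needs. The function names are hypothetical, and the example does not show actual debugger syntax.

```rust
// Relies on inference: the element type of `bytes` is deduced from the later `push`.
fn inferred() -> Vec<u8> {
    let mut bytes = Vec::new();
    bytes.push(7u8);
    bytes
}

// Fully explicit: every type is written out, so nothing has to be inferred.
fn explicit() -> Vec<u8> {
    let mut bytes: Vec<u8> = Vec::<u8>::new();
    bytes.push(7u8);
    bytes
}

fn main() {
    println!("{:?} {:?}", inferred(), explicit());
}
```

An expression such as `Vec::new()` copied from the first function has no type of its own until the compiler has looked at the surrounding code, which is the information a debugger's expression evaluator lacks.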
288+
289+
It is certainly doable but it is a large project. You certainly need a bridge to the
290+
debugger because the debugger alone has access to the memory. Both GDB (gcc) and LLDB (clang)
291+
have this feature. LLDB uses Clang to compile code to JIT and GDB can do the same with GCC.
292+
293+
Both debuggers' expression evaluators implement both a superset and a subset of Rust.
294+
They implement just the expression language, but they also add some extensions; for example, GDB has
295+
convenience variables. Therefore, if you are taking this route, you not only need
296+
to build this bridge but may also have to add some mode that lets the compiler understand these extensions.
297+
298+
#### Windows debugging (PDB) is missing
299+
300+
This is a complete unknown.
301+
302+
[Tom Tromey discusses debugging support in rustc]: https://www.youtube.com/watch?v=elBxMRSNYr4
303+
[Debugging the Compiler]: compiler-debugging.md
304+
[debugger or debugging tool]: https://en.wikipedia.org/wiki/Debugger
305+
[Bison]: https://www.gnu.org/software/bison/
306+
[ptype]: https://ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_109.html
307+
[rust-lang/lldb wiki page]: https://github.com/rust-lang/lldb/wiki
308+
[DWARF]: http://dwarfstd.org
309+
[manual for GDB/Rust]: https://sourceware.org/gdb/onlinedocs/gdb/Rust.html
310+
[GDB Bugzilla]: https://sourceware.org/bugzilla/
311+
[Recursive Descent parser]: https://en.wikipedia.org/wiki/Recursive_descent_parser
312+
[System Integrity Protection]: https://en.wikipedia.org/wiki/System_Integrity_Protection
313+
[https://github.com/rust-dev-tools/gdb]: https://github.com/rust-dev-tools/gdb
314+
[DWARF feature request]: http://dwarfstd.org/ShowIssue.php?issue=180517.2
315+
[https://docs.python.org/3/c-api/stable.html]: https://docs.python.org/3/c-api/stable.html
316+
[https://github.com/rust-lang/rfcs/pull/2117]: https://github.com/rust-lang/rfcs/pull/2117
317+
[https://github.com/rust-lang/rust/issues/33014]: https://github.com/rust-lang/rust/issues/33014
318+
[https://github.com/rust-lang/rust/issues/34457]: https://github.com/rust-lang/rust/issues/34457
319+
[Apple developer documentation for System Integrity Protection]: https://developer.apple.com/library/archive/releasenotes/MacOSX/WhatsNewInOSX/Articles/MacOSX10_11.html#//apple_ref/doc/uid/TP40016227-SW11
320+
[https://github.com/rust-lang/lldb]: https://github.com/rust-lang/lldb
321+
[https://github.com/rust-lang/llvm-project]: https://github.com/rust-lang/llvm-project
