|
1 | 1 | # Memory Management in Rustc
|
2 | 2 |
|
3 |
| -Generally `rustc` tries to be pretty careful how it manages memory. The |
4 |
| -compiler allocates _a lot_ of data structures throughout compilation, and if we |
5 |
| -are not careful, it will take a lot of time and space to do so. |
| 3 | +Generally rustc tries to be pretty careful how it manages memory. |
| 4 | +The compiler allocates _a lot_ of data structures throughout compilation, |
| 5 | +and if we are not careful, it will take a lot of time and space to do so. |
6 | 6 |
|
7 |
| -One of the main way the compiler manages this is using [`arena`]s and [interning]. |
| 7 | +One of the main ways the compiler manages this is by using [arena]s and [interning]. |
8 | 8 |
|
9 |
| -[`arena`]: https://en.wikipedia.org/wiki/Region-based_memory_management |
| 9 | +[arena]: https://en.wikipedia.org/wiki/Region-based_memory_management |
10 | 10 | [interning]: https://en.wikipedia.org/wiki/String_interning
|
11 | 11 |
|
12 | 12 | ## Arenas and Interning
|
13 | 13 |
|
14 | 14 | Since A LOT of data structures are created during compilation, for performance
|
15 |
| -reasons, we allocate them from a global memory pool. Each are allocated once |
16 |
| -from a long-lived *`arena`*. This is called _arena allocation_. This system |
17 |
| -reduces allocations/deallocations of memory. It also allows for easy comparison |
18 |
| -of types (more on types [here](./ty.md)) for equality: for each interned |
19 |
| -type `X`, we implemented [`PartialEq` for X][peqimpl], so we can just compare |
20 |
| -pointers. The [`CtxtInterners`] type contains a bunch of maps of interned types |
21 |
| -and the `arena` itself. |
| 15 | +reasons, we allocate them from a global memory pool. |
| 16 | +Each is allocated once from a long-lived *arena*. |
| 17 | +This is called _arena allocation_. |
| 18 | +This system reduces allocations/deallocations of memory. |
| 19 | +It also allows for easy comparison of types (more on types [here](./ty.md)) for equality: |
| 20 | +for each interned type `X`, we implemented [`PartialEq` for X][peqimpl], |
| 21 | +so we can just compare pointers. |
| 22 | +The [`CtxtInterners`] type contains a bunch of maps of interned types and the arena itself. |
22 | 23 |
|
23 | 24 | [`CtxtInterners`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.CtxtInterners.html#structfield.arena
|
24 | 25 | [peqimpl]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Ty.html#implementations
|
25 | 26 |
|
26 | 27 | ### Example: `ty::TyKind`
|
27 | 28 |
|
28 |
| -Take [`ty::TyKind`] which represents a type in the compiler. Each time we want |
29 |
| -to construct a type, the compiler doesn’t naively allocate from the buffer. |
30 |
| -Instead, we check if that type was already constructed. If it was, we just get |
31 |
| -the same pointer we had before, otherwise we make a fresh pointer. With this |
32 |
| -schema if we want to know if two types are the same, all we need to do is |
33 |
| -compare pointers, which is efficient. [`TyKind`] should never be constructed on |
34 |
| -the stack, and it would be unusable if done so. You always allocate them from |
35 |
| -this `arena` and you always intern them so they are unique. |
36 |
| - |
37 |
| -At the beginning of the compilation we make a buffer and each time we need to |
38 |
| -allocate a type we use some of this memory buffer. If we run out of space we |
39 |
| -get another one. The lifetime of that buffer is `'tcx`. Our types are tied to |
40 |
| -that lifetime, so when compilation finishes all the memory related to that |
41 |
| -buffer is freed and our `'tcx` references are invalidated. |
42 |
| - |
43 |
| -In addition to types, there are a number of other `arena`-allocated data |
44 |
| -structures that you can allocate, and which are found in this module. Here are |
45 |
| -a few examples: |
| 29 | +Take the example of [`ty::TyKind`], which represents a type in the compiler (you |
| 30 | +can read more [here](./ty.md)). Each time we want to construct a type, the |
| 31 | +compiler doesn’t naively allocate from the buffer. Instead, we check if that |
| 32 | +type was already constructed. If it was, we just get the same pointer we had |
| 33 | +before; otherwise we make a fresh pointer. With this scheme, if we want to know |
| 34 | +if two types are the same, all we need to do is compare the pointers, which is |
| 35 | +efficient. [`TyKind`] should never be constructed on the stack, and it would be unusable |
| 36 | +if done so. |
| 37 | +You always allocate them from this arena and you always intern them so they are |
| 38 | +unique. |
| 39 | + |
| 40 | +At the beginning of the compilation we make a buffer and each time we need to allocate a type we use |
| 41 | +some of this memory buffer. If we run out of space we get another one. The lifetime of that buffer |
| 42 | +is `'tcx`. Our types are tied to that lifetime, so when compilation finishes all the memory related |
| 43 | +to that buffer is freed and our `'tcx` references would be invalid. |
| 44 | + |
| 45 | +In addition to types, there are a number of other arena-allocated data structures that you can |
| 46 | +allocate, and which are found in this module. Here are a few examples: |
46 | 47 |
|
47 | 48 | - [`GenericArgs`], allocated with [`mk_args`] – this will intern a slice of types, often used
|
48 |
| - to specify the values to be substituted for generics args (e.g. `HashMap<i32, u32>` would be |
49 |
| - represented as a slice `&'tcx [tcx.types.i32, tcx.types.u32]`). |
50 |
| -- [`TraitRef`], typically passed by value – a **trait reference** consists of a |
51 |
| - reference to a trait along with its various type parameters (including |
52 |
| - `Self`), like `i32: Display` (here, the def-id would reference the `Display` |
53 |
| - trait, and the args would contain `i32`). Note that [`def-id`] is defined and |
54 |
| - discussed in depth in the [`AdtDef` and `DefId`] section. |
| 49 | +  to specify the values to be substituted for generics args (e.g. `HashMap<i32, u32>` would be |
| 50 | +  represented as a slice `&'tcx [tcx.types.i32, tcx.types.u32]`). |
| 51 | +- [`TraitRef`], typically passed by value – a **trait reference** consists of a reference to a trait |
| 52 | + along with its various type parameters (including `Self`), like `i32: Display` (here, the def-id |
| 53 | + would reference the `Display` trait, and the args would contain `i32`). Note that `def-id` is |
| 54 | + defined and discussed in depth in the [`AdtDef` and `DefId`] section. |
55 | 55 | - [`Predicate`] defines something the trait system has to prove (see [traits] module).
|
56 | 56 |
|
57 | 57 | [`AdtDef` and `DefId`]: ./ty.md#adts-representation
|
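The interning scheme described in this hunk (check whether a type was already constructed, hand back the existing pointer if so, and compare types by pointer) can be illustrated with a small standalone sketch. This is not rustc's implementation: `Interner`, the toy `TyKind` variants, and the use of `Box::leak` as a stand-in for a long-lived arena are all assumptions made for illustration. In the compiler the allocations are tied to `'tcx` rather than `'static`.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::ptr;

/// Toy stand-in for the compiler's `TyKind`; these variants are hypothetical.
#[derive(Debug, PartialEq, Eq, Hash)]
enum TyKind {
    Int,
    Bool,
    Ref(Ty),
}

/// An interned type: a thin wrapper around a reference to the unique allocation.
#[derive(Debug, Clone, Copy)]
struct Ty(&'static TyKind);

// Interned values are unique, so equality and hashing only look at the address.
impl PartialEq for Ty {
    fn eq(&self, other: &Self) -> bool {
        ptr::eq(self.0, other.0)
    }
}
impl Eq for Ty {}
impl Hash for Ty {
    fn hash<H: Hasher>(&self, state: &mut H) {
        ptr::hash(self.0, state)
    }
}

/// Hypothetical interner: a set of every `TyKind` allocated so far.
#[derive(Default)]
struct Interner {
    set: HashSet<&'static TyKind>,
}

impl Interner {
    fn intern(&mut self, kind: TyKind) -> Ty {
        // Already constructed? Return the same pointer as before.
        if let Some(&existing) = self.set.get(&kind) {
            return Ty(existing);
        }
        // Otherwise allocate once. `Box::leak` plays the role of the arena here
        // (assumption: the memory is simply never freed in this sketch).
        let fresh: &'static TyKind = Box::leak(Box::new(kind));
        self.set.insert(fresh);
        Ty(fresh)
    }
}

fn main() {
    let mut interner = Interner::default();
    let a = interner.intern(TyKind::Int);
    let b = interner.intern(TyKind::Int);
    let c = interner.intern(TyKind::Bool);
    assert!(a == b); // same allocation, so pointer comparison succeeds
    assert!(a != c);

    // Nested types are deduplicated too, because `Ref` hashes its payload by address.
    let ra = interner.intern(TyKind::Ref(a));
    let rb = interner.intern(TyKind::Ref(b));
    assert_eq!(ra, rb);
    assert_eq!(interner.set.len(), 3);
}
```

Because every distinct `TyKind` is allocated exactly once, comparing two `Ty` values never needs to walk the type structure; that only holds as long as every `Ty` is created through `intern`, which mirrors the rule that `TyKind` is never constructed on the stack.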
@@ -86,7 +86,7 @@ the arenas, anyhow).
|
86 | 86 | ### A Note On Lifetimes
|
87 | 87 |
|
88 | 88 | The Rust compiler is a fairly large program containing lots of big data
|
89 |
| -structures (e.g. the [Abstract Syntax Tree (`AST`)][ast], [High-Level Intermediate |
| 89 | +structures (e.g. the [Abstract Syntax Tree (AST)][ast], [High-Level Intermediate |
90 | 90 | Representation (`HIR`)][hir], and the type system) and as such, arenas and
|
91 | 91 | references are heavily relied upon to minimize unnecessary memory use. This
|
92 | 92 | manifests itself in the way people can plug into the compiler (i.e. the
|