Commit 761119e

move over the ty README
1 parent dfa328f commit 761119e

2 files changed

+166
-2
lines changed

src/SUMMARY.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 - [Macro expansion](./macro-expansion.md)
 - [Name resolution](./name-resolution.md)
 - [HIR lowering](./hir-lowering.md)
-- [Representing types (`ty` module in depth)](./ty.md)
+- [The `ty` module: representing types](./ty.md)
 - [Type inference](./type-inference.md)
 - [Trait resolution](./trait-resolution.md)
 - [Type checking](./type-checking.md)

src/ty.md

Lines changed: 165 additions & 1 deletion
@@ -1 +1,165 @@
1-
# Representing types (`ty` module in depth)
1+
# The `ty` module: representing types
2+
3+
The `ty` module defines how the Rust compiler represents types
4+
internally. It also defines the *typing context* (`tcx` or `TyCtxt`),
5+
which is the central data structure in the compiler.
6+
7+
## The tcx and how it uses lifetimes
8+
9+
The `tcx` ("typing context") is the central data structure in the
10+
compiler. It is the context that you use to perform all manner of
11+
queries. The struct `TyCtxt` defines a reference to this shared context:
12+
13+
```rust
14+
tcx: TyCtxt<'a, 'gcx, 'tcx>
15+
// -- ---- ----
16+
// | | |
17+
// | | innermost arena lifetime (if any)
18+
// | "global arena" lifetime
19+
// lifetime of this reference
20+
```
21+
22+
As you can see, the `TyCtxt` type takes three lifetime parameters.
23+
These lifetimes are perhaps the most complex thing to understand about
24+
the tcx. During Rust compilation, we allocate most of our memory in
25+
**arenas**, which are basically pools of memory that get freed all at
26+
once. When you see a reference with a lifetime like `'tcx` or `'gcx`,
27+
you know that it refers to arena-allocated data (or data that lives as
28+
long as the arenas, anyhow).
29+
30+
We use two distinct levels of arenas. The outer level is the "global
31+
arena". This arena lasts for the entire compilation: so anything you
32+
allocate in there is only freed once compilation is basically over
33+
(actually, when we shift to executing LLVM).
34+
35+
To reduce peak memory usage, when we do type inference, we also use an
36+
inner level of arena. These arenas get thrown away once type inference
37+
is over. This is done because type inference generates a lot of
38+
"throw-away" types that are not particularly interesting after type
39+
inference completes, so keeping around those allocations would be
40+
wasteful.
41+
42+
Often, we wish to write code that explicitly asserts that it is not
43+
taking place during inference. In that case, there is no "local"
44+
arena, and all the types that you can access are allocated in the
45+
global arena. To express this, the idea is to use the same lifetime
46+
for the `'gcx` and `'tcx` parameters of `TyCtxt`. Just to be a touch
47+
confusing, we tend to use the name `'tcx` in such contexts. Here is an
48+
example:
49+
50+
```rust
51+
fn not_in_inference<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, def_id: DefId) {
52+
// ---- ----
53+
// Using the same lifetime here asserts
54+
// that the innermost arena accessible through
55+
// this reference *is* the global arena.
56+
}
57+
```
58+
59+
In contrast, if we want to write code that can be used during type inference, then we
60+
need to declare distinct `'gcx` and `'tcx` lifetime parameters:
61+
62+
```rust
63+
fn maybe_in_inference<'a, 'gcx, 'tcx>(tcx: TyCtxt<'a, 'gcx, 'tcx>, def_id: DefId) {
64+
// ---- ----
65+
// Using different lifetimes here means that
66+
// the innermost arena *may* be distinct
67+
// from the global arena (but doesn't have to be).
68+
}
69+
```
70+
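The two signature shapes above can be imitated outside the compiler. The following is a toy sketch, not rustc code: `ToyCtxt`, its fields, and `demo_lifetimes` are hypothetical names, and plain slices stand in for the arenas.

```rust
/// Toy stand-in for `TyCtxt` (hypothetical; not rustc's definition).
/// `'gcx` is the lifetime of the "global" data, `'tcx` the lifetime of the
/// innermost data reachable through this context.
#[derive(Clone, Copy)]
pub struct ToyCtxt<'gcx, 'tcx> {
    pub global: &'gcx [u32],
    pub local: &'tcx [u32],
}

/// Same lifetime in both positions: the innermost data *is* the global data.
/// Panics if `global` is empty.
pub fn not_in_inference<'tcx>(cx: ToyCtxt<'tcx, 'tcx>) -> &'tcx u32 {
    &cx.global[0]
}

/// Distinct lifetimes: the innermost data *may* be shorter-lived than the
/// global data. Panics if `local` is empty.
pub fn maybe_in_inference<'gcx, 'tcx>(cx: ToyCtxt<'gcx, 'tcx>) -> &'tcx u32 {
    &cx.local[0]
}

/// Builds one long-lived buffer and one short-lived buffer, then calls both
/// functions with a suitable context.
pub fn demo_lifetimes() -> (u32, u32) {
    // Lives for the whole "compilation".
    let global: Vec<u32> = vec![1, 2, 3];

    // Outside "inference": both fields point at the global data.
    let outside = *not_in_inference(ToyCtxt { global: &global, local: &global });

    // Inside "inference": `local` is dropped at the end of this block.
    let inside = {
        let local: Vec<u32> = vec![10, 20];
        let value = *maybe_in_inference(ToyCtxt { global: &global, local: &local });
        value
    };

    (outside, inside)
}
```

One caveat about this sketch: `ToyCtxt` is covariant in both lifetimes, so the borrow checker would also accept a shorter-lived `local` in `not_in_inference` by shrinking the shared lifetime. It shows how the signatures are spelled, not how the compiler enforces the distinction.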
71+
### Allocating and working with types
72+
73+
Rust types are represented using the `Ty<'tcx>` defined in the `ty`
74+
module (not to be confused with the `Ty` struct from [the HIR]). This
75+
is in fact a simple type alias for a reference with `'tcx` lifetime:
76+
77+
```rust
78+
pub type Ty<'tcx> = &'tcx TyS<'tcx>;
79+
```
80+
81+
[the HIR]: ../hir/README.md
82+
83+
You can mostly ignore the `TyS` struct -- you will almost never
84+
access it explicitly. We always pass it by reference using the
85+
`Ty<'tcx>` alias -- the only exception I think is to define inherent
86+
methods on types. Instances of `TyS` are only ever allocated in one of
87+
the rustc arenas (never e.g. on the stack).
88+
89+
One common operation on types is to **match** and see what kinds of
90+
types they are. This is done by doing `match ty.sty`, sort of like this:
91+
92+
```rust
93+
fn test_type<'tcx>(ty: Ty<'tcx>) {
94+
match ty.sty {
95+
ty::TyArray(elem_ty, len) => { ... }
96+
...
97+
}
98+
}
99+
```
100+
101+
The `sty` field (the origin of this name is unclear to me; perhaps
102+
structural type?) is of type `TypeVariants<'tcx>`, which is an enum
103+
defining all of the different kinds of types in the compiler.
104+
105+
> NB: inspecting the `sty` field on types during type inference can be
106+
> risky, as there may be inference variables and other things to
107+
> consider, or sometimes types are not yet known that will become
108+
> known later.
109+
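To make the matching pattern concrete, here is a self-contained toy version. It is a sketch under assumptions: `ToyTyS`, `ToyKind`, `ToyTy`, and the variants shown are hypothetical stand-ins, not the compiler's actual definitions, and the values live on the stack rather than in an arena. The `Infer` arm illustrates why inspecting a type during inference needs care: the type may simply not be known yet.

```rust
/// Toy stand-in for the enum of type kinds (hypothetical variants).
#[derive(Clone, Copy)]
pub enum ToyKind<'tcx> {
    Int,
    Bool,
    /// Element type and length.
    Array(ToyTy<'tcx>, u64),
    /// An inference variable that has not been resolved yet.
    Infer(u32),
}

/// Toy stand-in for the struct that wraps the kind.
pub struct ToyTyS<'tcx> {
    pub sty: ToyKind<'tcx>,
}

/// Types are always handled through a reference, mirroring the alias above.
pub type ToyTy<'tcx> = &'tcx ToyTyS<'tcx>;

/// Matches on the kind and renders a short description of the type.
pub fn describe(ty: ToyTy<'_>) -> String {
    match ty.sty {
        ToyKind::Int => "int".to_string(),
        ToyKind::Bool => "bool".to_string(),
        ToyKind::Array(elem_ty, len) => format!("[{}; {}]", describe(elem_ty), len),
        // Nothing structural to inspect: the type is still unknown.
        ToyKind::Infer(id) => format!("?{}", id),
    }
}

/// Builds `[int; 4]` and describes it.
pub fn demo_match() -> String {
    let int = ToyTyS { sty: ToyKind::Int };
    let array = ToyTyS { sty: ToyKind::Array(&int, 4) };
    describe(&array)
}
```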
110+
To allocate a new type, you can use the various `mk_` methods defined
111+
on the `tcx`. These have names that correspond mostly to the various kinds
112+
of type variants. For example:
113+
114+
```rust
115+
let array_ty = tcx.mk_array(elem_ty, len * 2);
116+
```
117+
118+
These methods all return a `Ty<'tcx>` -- note that the lifetime you
119+
get back is the lifetime of the innermost arena that this `tcx` has
120+
access to. In fact, types are always canonicalized and interned (so we
121+
never allocate exactly the same type twice) and are always allocated
122+
in the outermost arena where they can be (so, if they do not contain
123+
any inference variables or other "temporary" types, they will be
124+
allocated in the global arena). However, the lifetime `'tcx` is always
125+
a safe approximation, so that is what you get back.
126+
127+
> NB. Because types are interned, it is possible to compare them for
128+
> equality efficiently using `==` -- however, this is almost never what
129+
> you want to do unless you happen to be hashing and looking for
130+
> duplicates. This is because often in Rust there are multiple ways to
131+
> represent the same type, particularly once inference is involved. If
132+
> you are going to be testing for type equality, you probably need to
133+
> start looking into the inference code to do it right.
134+
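The idea of interning can be sketched in a few lines. This is an illustration only: `ToyInterner`, `ToyKind`, and `mk` are hypothetical names, and the sketch swaps arena references for `Rc` so that it stays in safe, standalone Rust. Building the same type twice yields the same allocation, so comparing pointers is enough to detect structurally identical types.

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// Toy type kinds (hypothetical; not the compiler's enum).
#[derive(Debug, PartialEq, Eq, Hash)]
pub enum ToyKind {
    Int,
    Bool,
    /// Element type and length.
    Array(Rc<ToyKind>, u64),
}

/// Remembers every kind it has handed out so far.
#[derive(Default)]
pub struct ToyInterner {
    seen: HashSet<Rc<ToyKind>>,
}

impl ToyInterner {
    /// Returns the existing allocation for `kind` if there is one,
    /// and allocates a new one otherwise.
    pub fn mk(&mut self, kind: ToyKind) -> Rc<ToyKind> {
        if let Some(existing) = self.seen.get(&kind) {
            return Rc::clone(existing);
        }
        let fresh = Rc::new(kind);
        self.seen.insert(Rc::clone(&fresh));
        fresh
    }
}

/// Returns whether two identical arrays share an allocation, and whether
/// two different arrays do.
pub fn demo_interning() -> (bool, bool) {
    let mut interner = ToyInterner::default();
    let int = interner.mk(ToyKind::Int);
    let a = interner.mk(ToyKind::Array(Rc::clone(&int), 4));
    let b = interner.mk(ToyKind::Array(Rc::clone(&int), 4));
    let c = interner.mk(ToyKind::Array(Rc::clone(&int), 8));
    (Rc::ptr_eq(&a, &b), Rc::ptr_eq(&a, &c))
}
```

As the note above warns, pointer equality of this sort only tells you that two types were built identically. It says nothing about types that are written differently but mean the same thing.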
135+
You can also find various common types in the `tcx` itself by accessing
136+
`tcx.types.bool`, `tcx.types.char`, etc. (see `CommonTypes` for more).
137+
138+
### Beyond types: Other kinds of arena-allocated data structures
139+
140+
In addition to types, there are a number of other arena-allocated data
141+
structures that you can allocate, and which are found in this
142+
module. Here are a few examples:
143+
144+
- `Substs`, allocated with `mk_substs` -- this will intern a slice of types, often used to
145+
specify the values to be substituted for generics (e.g., `HashMap<i32, u32>`
146+
would be represented as a slice `&'tcx [tcx.types.i32, tcx.types.u32]`).
147+
- `TraitRef`, typically passed by value -- a **trait reference**
148+
consists of a reference to a trait along with its various type
149+
parameters (including `Self`), like `i32: Display` (here, the def-id
150+
would reference the `Display` trait, and the substs would contain
151+
`i32`).
152+
- `Predicate` defines something the trait system has to prove (see `traits` module).
153+
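The role of a substitution list can be shown with a small model. This is a sketch under assumptions, not the compiler's representation: `ToyTy`, its variants, and `subst` are hypothetical, and owned values replace interned slices. A generic type mentions its parameters by index, and the substs supply the concrete type for each index.

```rust
/// Toy types (hypothetical; not the compiler's representation).
#[derive(Clone, Debug, PartialEq)]
pub enum ToyTy {
    I32,
    U32,
    /// The generic parameter at this index of the enclosing item.
    Param(usize),
    /// A named generic type applied to arguments, e.g. `HashMap<K, V>`.
    Adt(&'static str, Vec<ToyTy>),
}

/// Replaces every `Param(i)` in `ty` with `substs[i]`.
/// Panics if a parameter index is out of range for `substs`.
pub fn subst(ty: &ToyTy, substs: &[ToyTy]) -> ToyTy {
    match ty {
        ToyTy::Param(index) => substs[*index].clone(),
        ToyTy::Adt(name, args) => {
            ToyTy::Adt(*name, args.iter().map(|arg| subst(arg, substs)).collect())
        }
        other => other.clone(),
    }
}

/// Instantiates `HashMap<K, V>` with the substs `[i32, u32]`.
pub fn demo_subst() -> ToyTy {
    let generic = ToyTy::Adt("HashMap", vec![ToyTy::Param(0), ToyTy::Param(1)]);
    subst(&generic, &[ToyTy::I32, ToyTy::U32])
}
```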
154+
### Import conventions
155+
156+
Although there is no hard and fast rule, the `ty` module tends to be used like so:
157+
158+
```rust
159+
use ty::{self, Ty, TyCtxt};
160+
```
161+
162+
In particular, since they are so common, the `Ty` and `TyCtxt` types
163+
are imported directly. Other types are often referenced with an
164+
explicit `ty::` prefix (e.g., `ty::TraitRef<'tcx>`). But some modules
165+
choose to import a larger or smaller set of names explicitly.
