Commit 066a32c

The first approximation of name resolution (#22)
The first attempt to write something useful about name resolution. As the TODO section says, this is not a finished thing, but it might hopefully be useful to someone already.
1 parent 688d1b0 commit 066a32c

1 file changed

+106
-0
lines changed

Diff for: src/name-resolution.md

@@ -1 +1,107 @@
# Name resolution

Name resolution is a separate pass in the compiler. Its input is the syntax
tree produced by parsing the input files. It produces links from all the names
in the source to the places where each name was introduced. It also generates
helpful error messages, such as typo suggestions or traits to import.

Name resolution lives in the `librustc_resolve` crate, with the meat in
`lib.rs` and some helpers or symbol-type specific logic in the other modules.

## Namespaces

Different kinds of symbols live in different namespaces, e.g. types don't
clash with variables. Such a clash rarely comes up in practice, because
variables start with a lower-case letter while types start with an upper-case
one, but this is only a convention. This is legal Rust code that'll compile
(with warnings):

```rust
type x = u32;
let x: x = 1;
let y: x = 2; // See? x is still a type here.
```

To cope with this, and with slightly different scoping rules for these
namespaces, the resolver keeps them separated and builds separate structures for
them.

In other words, when the code talks about namespaces, it doesn't mean the module
hierarchy, it's types vs. values vs. macros.
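As a rough illustration of "separate structures per namespace", here is a minimal sketch. The names `Namespace`, `Tables`, `define` and `resolve` are made up for this example and are not taken from `librustc_resolve`; the point is only that the same identifier can be stored once per namespace without a clash.

```rust
use std::collections::HashMap;

/// The three namespaces mentioned in the text (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Namespace {
    Type,
    Value,
    Macro,
}

/// One independent name table per namespace (hypothetical layout).
#[derive(Default)]
pub struct Tables {
    types: HashMap<String, String>,
    values: HashMap<String, String>,
    macros: HashMap<String, String>,
}

impl Tables {
    fn table(&self, ns: Namespace) -> &HashMap<String, String> {
        match ns {
            Namespace::Type => &self.types,
            Namespace::Value => &self.values,
            Namespace::Macro => &self.macros,
        }
    }

    fn table_mut(&mut self, ns: Namespace) -> &mut HashMap<String, String> {
        match ns {
            Namespace::Type => &mut self.types,
            Namespace::Value => &mut self.values,
            Namespace::Macro => &mut self.macros,
        }
    }

    /// Record that `name` means `what` in the given namespace.
    pub fn define(&mut self, ns: Namespace, name: &str, what: &str) {
        self.table_mut(ns).insert(name.to_string(), what.to_string());
    }

    /// Look `name` up in one namespace only.
    pub fn resolve(&self, ns: Namespace, name: &str) -> Option<&str> {
        self.table(ns).get(name).map(String::as_str)
    }
}

/// Mirrors the `type x` / `let x` snippet: both definitions coexist.
pub fn namespaces_demo() -> Tables {
    let mut tables = Tables::default();
    tables.define(Namespace::Type, "x", "type alias for u32");
    tables.define(Namespace::Value, "x", "local variable");
    tables
}
```

Which table is consulted depends on the position of the name: in `let y: x = 2;` the `x` stands where a type is expected, so only the type table would be searched.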

## Scopes and ribs

A name is visible only in a certain area of the source code. This forms a
hierarchical structure, but not necessarily a simple one: if one scope is part
of another, it doesn't mean the name visible in the outer one is also visible in
the inner one, or that it refers to the same thing.

To cope with that, the compiler introduces the concept of Ribs. A rib is an
abstraction of a scope. Every time the set of visible names potentially changes,
a new rib is pushed onto a stack. The places where this can happen include, for
example:

* The obvious places: curly braces enclosing a block, function boundaries,
  modules.
* Introducing a let binding, since it can shadow another binding with the same
  name.
* A macro expansion border, to cope with macro hygiene.

When searching for a name, the stack of ribs is traversed from the innermost
outwards. This helps to find the closest meaning of the name (the one not
shadowed by anything else). The transition to an outer rib may also change the
rules about which names are usable: if there are nested functions (not
closures), the inner one can't access parameters and local bindings of the outer
one, even though they should be visible by ordinary scoping rules. An example:

```rust
fn do_something<T: Default>(val: T) { // <- New rib in both types and values (1)
    // `val` is accessible, as is the helper function
    // `T` is accessible
    let helper = || { // New rib on `helper` (2) and another on the block (3)
        // `val` is accessible here
    }; // End of (3)
    // `val` is accessible, `helper` variable shadows `helper` function
    fn helper() { // <- New rib in both types and values (4)
        // `val` is not accessible here, (4) is not transparent for locals
        // `T` is not accessible here
    } // End of (4)
    let val = T::default(); // New rib (5)
    // `val` is the variable, not the parameter here
} // End of (5), (2) and (1)
```

Because the rules for different namespaces are a bit different, each namespace
has its own independent rib stack that is constructed in parallel to the others.
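The lookup described above (innermost rib first, with function boundaries hiding outer locals) can be sketched roughly as follows, for a single namespace. This is an illustrative model only: `RibKind`, `Binding`, `Rib` and `RibStack` are names chosen for this example, the numeric ids are arbitrary, and the real resolver reports an error where this sketch simply returns `None`.

```rust
use std::collections::HashMap;

/// What kind of boundary a rib represents (hypothetical, simplified).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RibKind {
    /// Block, closure or `let`: locals of outer ribs stay usable.
    Normal,
    /// Body of a `fn` item: locals of outer ribs become unusable.
    FnItem,
}

/// What a name refers to; the numbers are arbitrary ids for the demo.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Binding {
    Local(u32), // parameter or `let` binding
    Item(u32),  // function or other item
}

pub struct Rib {
    kind: RibKind,
    names: HashMap<String, Binding>,
}

#[derive(Default)]
pub struct RibStack {
    ribs: Vec<Rib>,
}

impl RibStack {
    pub fn push(&mut self, kind: RibKind) {
        self.ribs.push(Rib { kind, names: HashMap::new() });
    }

    pub fn pop(&mut self) {
        self.ribs.pop();
    }

    /// Add a name to the innermost rib.
    pub fn define(&mut self, name: &str, binding: Binding) {
        if let Some(rib) = self.ribs.last_mut() {
            rib.names.insert(name.to_string(), binding);
        }
    }

    /// Walk the ribs from the innermost outwards; the first hit wins.
    pub fn resolve(&self, name: &str) -> Option<Binding> {
        let mut crossed_fn = false;
        for rib in self.ribs.iter().rev() {
            if let Some(&binding) = rib.names.get(name) {
                return match binding {
                    // A local behind a function boundary is not usable.
                    Binding::Local(_) if crossed_fn => None,
                    _ => Some(binding),
                };
            }
            if rib.kind == RibKind::FnItem {
                crossed_fn = true;
            }
        }
        None
    }
}

/// Replays a simplified version of the `do_something` example.
pub fn rib_demo() -> [Option<Binding>; 3] {
    let mut stack = RibStack::default();
    stack.push(RibKind::Normal); // module level
    stack.define("do_something", Binding::Item(0));
    stack.push(RibKind::FnItem); // (1) body of `do_something`
    stack.define("val", Binding::Local(1)); // the parameter
    stack.define("helper", Binding::Item(2)); // the nested `fn helper`
    stack.push(RibKind::Normal); // (2) `let helper = ...`
    stack.define("helper", Binding::Local(3));
    let shadowing = stack.resolve("helper"); // the variable wins
    stack.push(RibKind::FnItem); // (4) body of `fn helper`
    let hidden = stack.resolve("val"); // local behind a fn boundary
    let item = stack.resolve("do_something"); // items stay reachable
    [shadowing, hidden, item]
}
```

Under these assumptions a full resolver would keep one such stack per namespace and push or pop the ribs in step while walking the syntax tree.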

## Overall strategy

To perform the name resolution of the whole crate, the syntax tree is traversed
top-down and every encountered name is resolved. This works for most kinds of
names, because at the point of use of a name it is already introduced in the Rib
hierarchy.

There are some exceptions to this. Items are a bit tricky, because they can be
used even before they are encountered. Therefore every block needs to be scanned
for items first, to fill in its Rib.
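The item pre-scan described above can be sketched on a toy block. The `Stmt` type and `unresolved_uses` function are hypothetical and exist only for this illustration: a first pass registers every item of the block, a second pass walks the statements in order, so a function may be used before its definition while a `let` binding may not.

```rust
use std::collections::HashSet;

/// A toy statement list standing in for a block (hypothetical AST).
pub enum Stmt {
    /// `fn name() {}`: an item, usable anywhere in the block.
    Fn(&'static str),
    /// `let name = ...;`: visible only after this statement.
    Let(&'static str),
    /// Some expression that mentions `name`.
    Use(&'static str),
}

/// Returns the names that are used before they are visible.
pub fn unresolved_uses(block: &[Stmt]) -> Vec<&'static str> {
    let mut visible: HashSet<&'static str> = HashSet::new();

    // Pass 1: scan the whole block for items and fill in the rib.
    for stmt in block {
        if let Stmt::Fn(name) = stmt {
            visible.insert(*name);
        }
    }

    // Pass 2: walk top-down; `let` bindings appear as they are reached.
    let mut errors = Vec::new();
    for stmt in block {
        match stmt {
            Stmt::Fn(_) => {}
            Stmt::Let(name) => {
                visible.insert(*name);
            }
            Stmt::Use(name) => {
                if !visible.contains(name) {
                    errors.push(*name);
                }
            }
        }
    }
    errors
}

/// `helper` is used before its definition (fine), `x` before its `let` (not fine).
pub fn prescan_demo() -> Vec<&'static str> {
    let block = [
        Stmt::Use("helper"),
        Stmt::Fn("helper"),
        Stmt::Use("x"),
        Stmt::Let("x"),
        Stmt::Use("x"),
    ];
    unresolved_uses(&block)
}
```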

Other, even more problematic ones are imports, which need recursive fixed-point
resolution, and macros, which need to be resolved and expanded before the rest
of the code can be processed.

Therefore, the resolution is performed in multiple stages.
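To make "fixed-point resolution" of imports more concrete, here is a small sketch under heavy simplifying assumptions: modules are flat sets of names, an import copies a single name from one module into another, and there are no globs, visibility rules or macros. `Import`, `Modules` and `resolve_imports` are invented for this example. The loop keeps retrying the pending imports until a whole round makes no progress.

```rust
use std::collections::{HashMap, HashSet};

/// `use source_module::name;` written inside `into_module` (hypothetical shape).
pub struct Import {
    pub into_module: &'static str,
    pub source_module: &'static str,
    pub name: &'static str,
}

/// Module name -> names currently known to be defined in it.
pub type Modules = HashMap<&'static str, HashSet<&'static str>>;

/// Resolves what it can and returns the indices of imports that never resolved.
pub fn resolve_imports(modules: &mut Modules, imports: &[Import]) -> Vec<usize> {
    let mut pending: Vec<usize> = (0..imports.len()).collect();
    loop {
        let before = pending.len();
        pending.retain(|&i| {
            let imp = &imports[i];
            let found = modules
                .get(&imp.source_module)
                .map_or(false, |names| names.contains(&imp.name));
            if found {
                // The imported name becomes visible in the importing module,
                // which may unblock other imports in a later round.
                modules.entry(imp.into_module).or_default().insert(imp.name);
            }
            !found // keep only the imports that are still unresolved
        });
        if pending.len() == before {
            // Fixed point: a full round resolved nothing new.
            return pending;
        }
    }
}

/// `c` imports `foo` from `b`, which itself only gets `foo` by importing it from `a`.
pub fn imports_demo() -> (Modules, Vec<usize>) {
    let mut modules: Modules = HashMap::new();
    modules.entry("a").or_default().insert("foo");
    let imports = [
        Import { into_module: "c", source_module: "b", name: "foo" },
        Import { into_module: "b", source_module: "a", name: "foo" },
        Import { into_module: "c", source_module: "a", name: "missing" },
    ];
    let unresolved = resolve_imports(&mut modules, &imports);
    (modules, unresolved)
}
```

In the demo the first import cannot be resolved in the first round, because `b` does not contain `foo` yet; it succeeds in the second round, and the import of `missing` is left over as an error candidate.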

## TODO:

This is a result of the first pass of learning the code. It is definitely
incomplete and not detailed enough. It also might be inaccurate in places.
Still, it probably provides a useful first guidepost to what happens in there.

* What exactly does it link to and how is that published and consumed by
  following stages of compilation?
* Who calls it and how is it actually used?
* Is it a pass and then the result is only used, or can it be computed
  incrementally (e.g. for RLS)?
* The overall strategy description is a bit vague.
* Where does the name `Rib` come from?
* Does this thing have its own tests, or is it tested only as part of some e2e
  testing?