|
1 | 1 | # Name resolution
|
| 2 | + |
| 3 | +Name resolution is a separate pass in the compiler. Its input is the syntax |
| 4 | +tree, produced by parsing input files. It produces links from all the names in |
| 5 | +the source to the places where those names were introduced. It also generates |
| 6 | +helpful error messages, like typo suggestions or traits to import. |
| 7 | + |
| 8 | +Name resolution lives in the `librustc_resolve` crate, with most of the logic in |
| 9 | +`lib.rs` and helpers or symbol-type-specific logic in the other modules. |
| 10 | + |
| 11 | +## Namespaces |
| 12 | + |
| 13 | +Different kinds of symbols live in different namespaces, e.g. types don't |
| 14 | +clash with variables. A clash rarely comes up in practice, because variables start |
| 15 | +with a lower-case letter while types start with an upper-case one, but this is only a |
| 16 | +convention. The following is legal Rust code that compiles (with warnings): |
| 17 | + |
| 18 | +```rust |
| 19 | +type x = u32; |
| 20 | +let x: x = 1; |
| 21 | +let y: x = 2; // See? x is still a type here. |
| 22 | +``` |
| 23 | + |
| 24 | +To cope with this, and with slightly different scoping rules for these |
| 25 | +namespaces, the resolver keeps them separated and builds separate structures for |
| 26 | +them. |
| 27 | + |
| 28 | +In other words, when the code talks about namespaces, it doesn't mean the module |
| 29 | +hierarchy, it's types vs. values vs. macros. |
| 30 | + |
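As a further illustration of the three namespaces, here is a minimal sketch in which one identifier is introduced once as a type, once as a value and once as a macro. The name `thing` is made up for the example, and the `#[allow]` attribute only silences the naming-convention lint.

```rust
// Type namespace: a struct with named fields occupies only the type namespace.
#[allow(non_camel_case_types)]
struct thing {
    n: u32,
}

// Value namespace: a function with the same name does not clash with the struct.
fn thing() -> thing {
    // `thing { .. }` is looked up in the type namespace.
    thing { n: 1 }
}

// Macro namespace: again the same identifier, again no clash.
macro_rules! thing {
    () => {
        2u32
    };
}

fn main() {
    let a: thing = thing(); // first `thing` is the type, second is the function
    let b: u32 = thing!(); // the macro
    println!("{} {}", a.n, b);
}
```

Which definition a use of `thing` refers to is decided purely by the syntactic position of that use, which is why the resolver can keep a separate structure per namespace.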
| 31 | +## Scopes and ribs |
| 32 | + |
| 33 | +A name is visible only in a certain area of the source code. This forms a |
| 34 | +hierarchical structure, but not necessarily a simple one ‒ if one scope is part |
| 35 | +of another, it doesn't mean the name visible in the outer one is also visible in |
| 36 | +the inner one, or that it refers to the same thing. |
| 37 | + |
| 38 | +To cope with that, the compiler introduces the concept of Ribs. A rib is an |
| 39 | +abstraction of a scope. Every time the set of visible names potentially changes, |
| 40 | +a new rib is pushed onto a stack. The places where this can happen include, for |
| 41 | +example: |
| 42 | + |
| 43 | +* The obvious places ‒ curly braces enclosing a block, function boundaries, |
| 44 | + modules. |
| 45 | +* Introducing a let binding ‒ this can shadow another binding with the same |
| 46 | + name. |
| 47 | +* Macro expansion border ‒ to cope with macro hygiene. |
| 48 | + |
| 49 | +When searching for a name, the stack of ribs is traversed from the innermost |
| 50 | +outwards. This helps to find the closest meaning of the name (the one not |
| 51 | +shadowed by anything else). The transition to an outer rib may also change the |
| 52 | +rules for which names are usable ‒ if there are nested functions (not closures), the |
| 53 | +inner one can't access parameters and local bindings of the outer one, even |
| 54 | +though they should be visible by ordinary scoping rules. An example: |
| 55 | + |
| 56 | +```rust |
| 57 | +fn do_something<T: Default>(val: T) { // <- New rib in both types and values (1) |
| 58 | +    // `val` is accessible, as is the helper function |
| 59 | +    // `T` is accessible |
| 60 | +    let helper = || { // New rib on `helper` (2) and another on the block (3) |
| 61 | +        // `val` is accessible here |
| 62 | +    }; // End of (3) |
| 63 | +    // `val` is accessible, `helper` variable shadows `helper` function |
| 64 | +    fn helper() { // <- New rib in both types and values (4) |
| 65 | +        // `val` is not accessible here ((4) is not transparent for locals) |
| 66 | +        // `T` is not accessible here |
| 67 | +    } // End of (4) |
| 68 | +    let val = T::default(); // New rib (5) |
| 69 | +    // `val` is the variable, not the parameter here |
| 70 | +} // End of (5), (2) and (1) |
| 71 | +``` |
| 72 | + |
| 73 | +Because the rules for different namespaces are a bit different, each namespace |
| 74 | +has its own independent rib stack that is constructed in parallel to the others. |
| 75 | + |
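The lookup described in this section can be sketched with a toy model of a single rib stack. Everything here (`Rib`, `RibKind`, `Lookup`, `rib`, `resolve`) is a hypothetical simplification written for this guide, not the actual types in `librustc_resolve`. It models only local bindings of the value namespace and ignores items, which remain visible across function boundaries.

```rust
use std::collections::HashMap;

/// What kind of boundary a rib represents (toy model, hypothetical names).
#[derive(Clone, Copy, PartialEq, Debug)]
enum RibKind {
    /// Block, closure or `let`: locals of outer ribs stay usable.
    Normal,
    /// Nested `fn` item: locals of outer ribs are no longer usable.
    FnItem,
}

struct Rib {
    kind: RibKind,
    /// Name -> id of the binding it refers to.
    bindings: HashMap<String, u32>,
}

#[derive(Debug, PartialEq)]
enum Lookup {
    Found(u32),
    /// The name exists further out, but a function boundary hides it.
    Inaccessible,
    NotFound,
}

/// Convenience constructor for the example.
fn rib(kind: RibKind, names: &[(&str, u32)]) -> Rib {
    Rib {
        kind,
        bindings: names.iter().map(|&(n, id)| (n.to_string(), id)).collect(),
    }
}

/// Walk the stack from the innermost rib outwards.
fn resolve(ribs: &[Rib], name: &str) -> Lookup {
    let mut crossed_fn = false;
    for rib in ribs.iter().rev() {
        if let Some(&id) = rib.bindings.get(name) {
            return if crossed_fn { Lookup::Inaccessible } else { Lookup::Found(id) };
        }
        // Leaving a function rib: anything found further out is off limits.
        if rib.kind == RibKind::FnItem {
            crossed_fn = true;
        }
    }
    Lookup::NotFound
}

fn main() {
    // Mirrors `do_something` above: the parameter `val` lives in the function rib.
    let mut ribs = vec![rib(RibKind::FnItem, &[("val", 1)])];
    ribs.push(rib(RibKind::Normal, &[("helper", 2)])); // `let helper = ...`
    assert_eq!(resolve(&ribs, "val"), Lookup::Found(1));

    ribs.push(rib(RibKind::FnItem, &[])); // inside nested `fn helper()`
    assert_eq!(resolve(&ribs, "val"), Lookup::Inaccessible);
    ribs.pop();

    ribs.push(rib(RibKind::Normal, &[("val", 3)])); // `let val = ...` shadows the parameter
    assert_eq!(resolve(&ribs, "val"), Lookup::Found(3));
}
```

Storing the kind on the rib itself means that the shadowing rule and the "not transparent for locals" rule both fall out of the same innermost-to-outermost walk.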
| 76 | +## Overall strategy |
| 77 | + |
| 78 | +To perform the name resolution of the whole crate, the syntax tree is traversed |
| 79 | +top-down and every encountered name is resolved. This works for most kinds of |
| 80 | +names, because at the point of use a name has already been introduced in the Rib |
| 81 | +hierarchy. |
| 82 | + |
| 83 | +There are some exceptions to this. Items are a bit tricky, because they can be |
| 84 | +used even before they are encountered ‒ therefore every block needs to be first |
| 85 | +scanned for items to fill in its Rib. |
| 86 | + |
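The following small program shows why that pre-scan is needed. An item declared later in a block is already usable at the top of the block, while a `let` binding only becomes visible after its statement. The function names are invented for the example.

```rust
fn uses_item_before_definition() -> u32 {
    // `double` is an item, so it can be called before its definition appears.
    let four = double(2);

    fn double(x: u32) -> u32 {
        x * 2
    }

    // A `let` binding behaves differently. This would not compile:
    //     let y = z;
    //     let z = 1;
    four
}

fn main() {
    println!("{}", uses_item_before_definition());
}
```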
| 87 | +Others, even more problematic, are imports, which need recursive fixed-point |
| 88 | +resolution, and macros, which need to be resolved and expanded before the rest of |
| 89 | +the code can be processed. |
| 90 | + |
| 91 | +Therefore, the resolution is performed in multiple stages. |
| 92 | + |
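To give a feel for what fixed-point resolution of imports means, here is a deliberately simplified sketch. It is not how the compiler implements it. The function `resolve_imports`, the use of flat path strings and the numeric ids are all assumptions made for illustration, and real imports involve considerably more than this models. The idea is to keep retrying the unresolved imports until a whole pass makes no progress.

```rust
use std::collections::HashMap;

/// Toy fixed-point import resolution (hypothetical, for illustration only).
/// `defs` maps a full path to a definition id. Each import is a pair
/// `(new_path, target_path)`, meaning `new_path` re-exports `target_path`.
/// Returns the resolved paths and the imports that could not be resolved.
fn resolve_imports(
    defs: &HashMap<&str, u32>,
    imports: &[(&str, &str)],
) -> (HashMap<String, u32>, Vec<String>) {
    let mut resolved: HashMap<String, u32> =
        defs.iter().map(|(&path, &id)| (path.to_string(), id)).collect();
    let mut pending: Vec<(&str, &str)> = imports.to_vec();

    loop {
        let before = pending.len();
        // Keep only the imports whose target is still unknown.
        pending.retain(|&(alias, target)| {
            let found = resolved.get(target).copied();
            match found {
                Some(id) => {
                    resolved.insert(alias.to_string(), id);
                    false
                }
                None => true,
            }
        });
        // Fixed point reached: a full pass resolved nothing new.
        if pending.len() == before {
            break;
        }
    }

    let unresolved = pending.iter().map(|&(alias, _)| alias.to_string()).collect();
    (resolved, unresolved)
}

fn main() {
    let defs: HashMap<&str, u32> = [("a::x", 1)].into_iter().collect();
    // `c::x` depends on `b::x`, which is listed later, so it needs a second pass.
    let imports = [("c::x", "b::x"), ("b::x", "a::x"), ("d::y", "e::y")];
    let (resolved, unresolved) = resolve_imports(&defs, &imports);
    println!("c::x -> {:?}, unresolved: {:?}", resolved.get("c::x"), unresolved);
}
```

Whatever is still pending when the loop stops either points at a missing name or is part of a cycle, which is where an error would be reported.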
| 93 | +## TODO: |
| 94 | + |
| 95 | +This is a result of the first pass of learning the code. It is definitely |
| 96 | +incomplete and not detailed enough. It also might be inaccurate in places. |
| 97 | +Still, it probably provides a useful first guidepost to what happens in there. |
| 98 | + |
| 99 | +* What exactly does it link to and how is that published and consumed by |
| 100 | + following stages of compilation? |
| 101 | +* Who calls it and how is it actually used? |
| 102 | +* Is it a pass and then the result is only used, or can it be computed |
| 103 | + incrementally (e.g. for RLS)? |
| 104 | +* The overall strategy description is a bit vague. |
| 105 | +* Where does the name `Rib` come from? |
| 106 | +* Does this thing have its own tests, or is it tested only as part of some e2e |
| 107 | + testing? |