diff --git a/src/SUMMARY.md b/src/SUMMARY.md index bb3aaf177..4272cfe71 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -67,7 +67,8 @@ # High-level Compiler Architecture - [Prologue](./part-2-intro.md) -- [Overview of the compiler](./overview.md) +- [Overview of the compiler](./overview/overview.md) + - [Compiling a Program: A Walkthrough](./overview/walkthrough.md) - [The compiler source code](./compiler-src.md) - [Queries: demand-driven compilation](./query.md) - [The Query Evaluation Model in Detail](./queries/query-evaluation-model-in-detail.md) diff --git a/src/overview/img/cat-meme.jpg b/src/overview/img/cat-meme.jpg new file mode 100644 index 000000000..feb7093e0 Binary files /dev/null and b/src/overview/img/cat-meme.jpg differ diff --git a/src/overview/img/high-level-overview.svg b/src/overview/img/high-level-overview.svg new file mode 100644 index 000000000..ec92e2019 --- /dev/null +++ b/src/overview/img/high-level-overview.svg @@ -0,0 +1,355 @@ + + +rustc_areas + + +cluster_querified + +Querified + + +cluster_trait_solving_dependent + +Trait-solving dependent + + +cluster_non_querified + +Non-querified + + +cluster_legend + +Legend + + + +Executable + +Executables, cdylibs, staticlibs + + + +LLVM + +LLVM + + + +Executable->LLVM + + + + +GCC + +GCC + + + +Executable->GCC + + + + +Cranelift + +Cranelift + + + +Executable->Cranelift + + + + +Libraries + +rlibs, dylibs + + + +Libraries->LLVM + + + + +Libraries->GCC + + + + +Libraries->Cranelift + + + + +Metadata Encoding + +Metadata Encoding + + + +Libraries->Metadata Encoding + + + + +Source Code + +Source Code + + + +SSA Code Generation + +SSA Code Generation + + + +LLVM->SSA Code Generation + + + + +GCC->SSA Code Generation + + + + +Monomorphization Collection + +Monomorphization Collection + + + +Cranelift->Monomorphization Collection + + + + +SSA Code Generation->Monomorphization Collection + + + + + +Trait System + +Trait System + + + +SSA Code Generation->Trait System + + + + + +MIR Building, Transformations and Analysis + +MIR Building, Transformations and Analysis + + + +Monomorphization Collection->MIR Building, Transformations and Analysis + + + + + +Metadata Encoding->MIR Building, Transformations and Analysis + + + + + +THIR Building and Analysis + +THIR Building and Analysis + + + +MIR Building, Transformations and Analysis->THIR Building and Analysis + + + + + +HIR Body Type-checking + +HIR Body Type-checking + + + +THIR Building and Analysis->HIR Body Type-checking + + + + + +HIR Well-formedness Checking + +HIR Well-formedness Checking + + + +HIR Body Type-checking->HIR Well-formedness Checking + + + + + +HIR Type Lowering + +HIR Type Lowering + + + +HIR Well-formedness Checking->HIR Type Lowering + + + + + +Trait System->MIR Building, Transformations and Analysis + + + + + +Trait System->THIR Building and Analysis + + + + + +Trait System->HIR Body Type-checking + + + + + +Trait System->HIR Well-formedness Checking + + + + + +HIR Building + +HIR Building + + + +HIR Type Lowering->HIR Building + + + + + +Late Name Resolution + +Late Name Resolution + + + +HIR Building->Late Name Resolution + + + + + +AST Validation + +AST Validation + + + +Late Name Resolution->AST Validation + + + + + +Macro Expansion, Metadata Loading and Early Name Resolution + +Macro Expansion, Metadata Loading and Early Name Resolution + + + +AST Validation->Macro Expansion, Metadata Loading and Early Name Resolution + + + + + +Parsing + +Parsing + + + +Macro Expansion, Metadata Loading and Early Name Resolution->Parsing + + + + + +Parsing->Macro Expansion, Metadata Loading and Early Name Resolution + + + + + +Tokenization + +Tokenization + + + +Parsing->Tokenization + + + + + +Tokenization->Source Code + + + + + +key +Pull-based Dependency  +Push-based Dependency  +Code Generation Pathway  + + + +key2 +  +  +  + + + +key:e->key2:w + + + + + +key:e->key2:w + + + + + +key:e->key2:w + + + + + \ No newline at end of file diff --git a/src/overview/img/mir-dialects-phases.svg b/src/overview/img/mir-dialects-phases.svg new file mode 100644 index 000000000..9bf96889a --- /dev/null +++ b/src/overview/img/mir-dialects-phases.svg @@ -0,0 +1,110 @@ + + +mir_transformations + +Mid-level Intermediate Representation (MIR) Dialects and Phases + +cluster_built + +Built Dialect + + +cluster_analysis + +Analysis Dialect + + +cluster_runtime + +Runtime Dialect + + +cluster_legend + +Legend + + + +built + +No Distinct Phases + + + +analysis_initial + +Initial + + + +built->analysis_initial + + + + + +analysis_post_cleanup + +Post Cleanup + + + +analysis_initial->analysis_post_cleanup + + + + + +runtime_initial + +Initial + + + +analysis_post_cleanup->runtime_initial + + + + + +runtime_post_cleanup + +Post Cleanup + + + +runtime_initial->runtime_post_cleanup + + + + + +runtime_optimized + +Optimized + + + +runtime_post_cleanup->runtime_optimized + + + + + +key +Transformation  + + + +key2 +  + + + +key:e->key2:w + + + + + \ No newline at end of file diff --git a/src/overview/overview.md b/src/overview/overview.md new file mode 100644 index 000000000..522f15056 --- /dev/null +++ b/src/overview/overview.md @@ -0,0 +1,629 @@ +# Overview of the compiler + +In this chapter, we aim to describe `rustc` at a very high-level to show how the +various areas work together. We introduce the _demand-driven architecture_ and +the organization of `rustc` areas by their information dependency into a graph. +For each area, we describe their purpose and motivation, and provide a small +summary with links to chapters that go into more detail. + +> **UNDER CONSTRUCTION** +> +> - This section lacks motivation and reasoning for the query architecture. +> +> **UNRESOLVED QUESTIONS** How do we describe the relationship between trait +> solving and the areas that it intertwines with? How do we describe error +> handling and `rustc_interface`? Where/how do we describe distinction between +> early lints and late lints? Do we talk about them briefly in this section +> already? + +Conventional compiler texts often describe traditional compilers as being +organized into linear _passes_, where a pass eagerly produces information for a +subsequent pass to consume. `rustc` is organized a bit differently. `rustc` for +the most part adopts an "demand-driven" system, where one area provides +information for another area that demands it via lazily-computed _queries_. If +we view the compiler's knowledge about a crate as a database, then queries are +functions which asks the database for information. Initially, this database of +information is empty. It can be populated on-demand by area requests for +information by queries. Queries can demand information from other sub-queries, +which in turn can demand information from area providers. This information +dependency flow forms a directed acyclic graph (DAG) at the item granularity +whose nodes are queries and leaves are areas which provide the demanded +information. + +> **UNDER CONSTRUCTION** Insert a query diagram to illustrate the above point. + +Ideally, the entire compiler would be organized as queries. However, as of the +time of writing (March 2024), `rustc` still has several areas which are not yet +querified and work like conventional passes. The following diagram illustrates +this distinction as well as the information dependency between areas. + +> **FIXME** What do we want the arrows to **actually** mean? + +![High-level overview](./img/high-level-overview.svg) + + + +To produce the final executable, we take one of the several code generation +pathways. For example, to produce an executable following the LLVM pathway, we +rely on the synthesis of LLVM IR from SSA code generation. Code generation in +turn relies on information produced by Monomorphization Collection, and so on. +We keep following the dependencies for the querified areas until we reach HIR +Lowering, which depends on information from non-querified areas of the compiler. +At that boundary, we make a distinction between the pull-based information +dependency of querified areas versus push-based information dependency of the +non-querified areas. + +To pass information between areas, the compiler uses various _Intermediate +Representations_ (IRs), which provide common interfaces to allow areas to +communicate with each other. These include but are not limited to: + +- Tokens +- Abstract Syntax Tree (AST) +- High-Level Intermediate Representation (HIR) +- Mid-Level Intermediate Representation (MIR) + +These various IRs gradually transform surface Rust syntax by simplications of +language constructs and enriches the information available to lower-level areas +by many analyses. + +We will explain each area in more detail in the next section. + +## Brief Summaries of the Areas + +### The Compiler Driver and the Interface + +The compiler driver (`rustc_driver`) is responsible for exercising the unstable +interface exposed by the compiler interface `rustc_interface` to orchestrate the +compilation phases and queries in the correct order. Splitting the driver and +the interface allows third parties to use rustc's internals as a library (such +as `rustdoc`) through the interface. + +### Non-Querified Areas + +#### Tokenization + +The compiler takes the input, a stream of Unicode characters, and transforms it +into a sequence of +[`Token`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/token/index.html)s +called a +[`TokenStream`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html). +This has multiple benefits: + +- Separation of concerns: the lexer can be concerned with lower-level details + such as interning identifiers, collecting numeric literals, identifying + keywords and operators, associating source locations with the token, and + generate useful error reports for these cases. This simplifies the job of the + parser as it can then work with tokens, which is a higher level construct than + Unicode characters. +- Simplifying the grammar: with a separate lexer, the parser can focus on + accepting a valid program according to a grammar defined in terms of _tokens_ + instead of individual Unicode characters. + +See [Lexing and Parsing](https://rustc-dev-guide.rust-lang.org/the-parser.html) +for more details. + +#### Parsing + +Given a token stream, the compiler builds a Abstract Syntax Tree (AST) that has +a tree structure to enable tree-based reasoning and effectively represent the +syntactical structure of the program. Rust has a macro system, which means that +macros will need to be expanded in order to construct the full AST. Initially, a +pre-expansion AST is constructed with placeholders for macros that are pending +expansion. + +See [Lexing and Parsing](https://rustc-dev-guide.rust-lang.org/the-parser.html) +for more details. + +#### Macro Expansion, Metadata Loading, Early Name Resolution and Late Name Resolution + +> **TODO**: this section needs a lot of work because I don't understand this +> area too well so far + +After the pre-expansion AST is constructed, the compiler tries to build the full +AST for a crate which has all macros expanded and all modules inlined. +Unresolved macros are iteratively expanded, which requires resolving imports and +macro names early (i.e. Early Name Resolution), but not other names yet. + +For import suggestions, the compiler may try to perform speculative crate +loading via reading crate metadata that does not produce diagnostics for the +speculatively loaded crate for import suggestion candidates. + +See [Macro +Expansion](https://rustc-dev-guide.rust-lang.org/macro-expansion.html) and [Name +Resolution](https://rustc-dev-guide.rust-lang.org/name-resolution.html) for more +details. + +#### AST Validation + +Some basic sanity checks are performed on the AST, such as not having more than +`u16::MAX` parameters, to catch some obvious user errors as well as errors in +AST construction / macro expansion. + +See [AST Validation](https://rustc-dev-guide.rust-lang.org/ast-validation.html) +for more details. + +### Querified Areas + +#### HIR Building + +High-level Intermediate Representation (HIR) is a lower-level IR than AST. When +compared to AST, HIR has desugared constructs (such as desugaring `for` loops +into `loop`s) to make analysis easier. When compared to MIR, HIR resembles the +surface Rust syntax more closely. + +HIR building is the lowering from Abstract Syntax Tree (AST) into HIR. HIR is +further desugared from AST to simplify analysis. There are sanity checks on the +lowered HIR to catch obvious lowering mistakes. + +See [AST Lowering](https://rustc-dev-guide.rust-lang.org/ast-lowering.html) for +more details. + +#### HIR Type Lowering + +HIR Type Lowering converts the more surface-syntax-like HIR type system entities +(types, lifetimes/regions, constants in type positions) into +[`rustc_middle::ty`](https://rustc-dev-guide.rust-lang.org/ty.html) +representations to express the semantics of the type. + +See the [The `ty` module: representing +types](https://rustc-dev-guide.rust-lang.org/ty.html) for more information. + +#### HIR Well-formedness Checking + +A type is _well-formed_ (WF) when it satisfies a set of requirements which are +fundamental to the type's validity. The requirements are type-specific, for +example: + +- For a tuple to be WF, all elements except for the last needs to be `Sized`. +- For an algebraic data type (struct/enum/union) to be WF, its generics need to + satisfy the `where` clauses on the ADT. + +The notion of type well-formedness can be inductively extended for entities that +contain types, such as trait predicates. A trait predicate `T0: Trait` +is well-formed if `T0` and `P1, ...` are well-formed in addition to `Trait`'s +`where` clauses being satisfied. + +HIR well-formedness checking is an early collection of checks on mostly surface +Rust syntax (such as parameters, local variables and return types) which are run +before further type-checking and inference, for two purposes: + +1. For better error-reporting: a program which is not WF is more likely to give + hard-to-understand or spurious errors. +2. To enable making simplifying assumptions in later HIR typeck: such as not + having to check if a tuple's fields are `Sized` when type checking a + `tuple.0` expression. + +The HIR is _well-formed_ (WF) when it satisfies a set of requirements that +include but are not limited to: + +- well-formedness of types in type signatures +- object safety is respected +- function call ABI requirements (e.g. for "rust-call") are satisfied +- receiver type can be used as a self type +- variances are compatible +- and many more + +Note that HIR WF checking are also not the only place where WF checks take +place: there are further WF checks in THIR/MIR to complement HIR WF checks. + +#### HIR Body Type-checking + +> 🚧 **TODO** 🚧 (remove the cat meme when complete with draft) + + + +The goal of HIR body typechecking is to catch mistakes that violate the rules of +the type system, and infer types and other information necessary for THIR/MIR +building. HIR body type-checking recursively walks and checks each expression. +This process produces information such as the type of each expression. During +the process, type inference, implicit cocercions, trait bound checking and +method resolution is performed. Notably, + +- Method resolution is not performed in (early/late) Name Resolution because + type information is required. +- Type inference is performed on HIR and not MIR because HIR maps closer to the + surface Rust syntax (and retains the tree-based structure), which enables more + descriptive errors and suggestions to be produced. +- MIR building requires method resolution information from HIR type-checking to + guide the lowering. + +When compared to MIR typeck where all types are specified, HIR typeck has to +infer types. HIR typeck also checks all closures together with their parent body +while MIR typeck/borrowck checks them separately. + +See [Type checking](https://rustc-dev-guide.rust-lang.org/type-checking.html) +for more details. + +#### Typed High-level Intermediate Representation (THIR) Building and Analysis + +Typed High-level Intermediate Representation (THIR) is a further desugared +version of High-level Intermediate Representation (HIR) with full type +information embedded. THIR is the last IR where lexical structure is meaningful +for analysis in the chain of lowering from HIR to THIR to MIR and beyond, where +lower-level IRs transition from being lexically-aware to being dataflow-aware. +For example, unsafeck is performed on THIR and not MIR because we care about the +lexical structure -- whether something exists inside an `unsafe` block +lexically. To further simplify tree-based reasoning, THIR has several +simplifications over HIR: + +- Explicit auto-deref/auto-ref and coercions. +- Method calls are desugared into regular function calls. +- Overloaded operators like `+` are lowered into function calls or builtin + operations. +- Match ergonomics are desugared. + +With the aforementioned properties, the construction of THIR synthesizes enough +information for MIR building, and makes lowering to MIR easier and less +error-prone than if HIR was attempted to be directly lowered to MIR. + +See [The Typed High-level Intermediate +Representation](https://rustc-dev-guide.rust-lang.org/thir.html) for more +details. + +#### Mid-level Intermediate Representation (MIR) Building, Transformations and Analysis + +Mid-level Intermediate Representation (MIR) is an Intermediate Representation +that is lowered from Typed High-level Intermediate Representation (THIR). +Compared to its precursor THIR, MIR has three key properties that are distinct: + +1. _Simplified constructs_: MIR simplifies a lot of the surface language + syntactical constructs. For example, matches are lowered into a series of + [`SwitchInt`]s and branches. +2. _Explicit semantic constructs_: MIR makes various semantic aspects of Rust + that does not appear in the surface syntax explicit. This includes things + like drops, drop flags, unwinding, codegen intrinsics retags, and others. +3. _Dataflow-aware_: MIR is designed to be dataflow-aware, and removes lexical + information still present in THIR to enable dataflow-based analysis. + +A series of transformations are performed on MIR to prepare it for use by +codegen, Compile-Time Function Evaluation (CTFE), and other analyses. MIR has +three dialects which are distinguished by semantic changes. Within each dialect, +MIR has phases which represent additional constraints on the well-formedness of +MIR within a particular dialect. Dialects and phases are made explicit so that +different consumers of MIR can be provided with a flavor of MIR that has +desirable properties suitable for each of them. Some of the analyses include +borrow-checking, drop elaboration, coroutine transformations, const promotion, +further well-formedness checking and some late type-checking. + +![MIR Dialects and Phases](./img/mir-dialects-phases.svg) + + + +MIR optimizations generally are opportunistic: because we're before +monomorphization, there are e.g. constants related to generic parameters whose +value we cannot know. When an opportunity does arise, however, MIR optimizations +try to achieve one or both of the following objectives: + +1. to reduce the amount of work for the codegen backends, and +2. to produce simpler or more optimizable backend output. + +See [Mid-level Intermediate +Representation](https://rustc-dev-guide.rust-lang.org/mir/index.html) for more +details. + +#### Metadata Encoding + +Rust libraries are compiled to archives which consists of object code and +_metadata_. Metadata is used by downstream crates to understand the interface of +upstream crates: including (but are not limited to) exported items, function +signatures, type definitions and MIR for generic functions[^proc-macro]. In this +sense, metadata can be compared to precompiled C headers. Metadata Encoding is +the area responsible for serializing this information into a common binary +format that can be understood by both upstream and downstream crates. + +Metadata is serialized via types which implement [`Encodable`], some of which +are derived while others are hand-written implementations. Symmetrically, +metadata can be read back by [`Decodable`] derives and manual deserialization +implementations. + +[^proc-macro]: proc macro metadata has a trimmed down version which _only_ + contains the exported macros without anything else defined inside the proc + macro. + +See [Libraries and +Metadata](https://rustc-dev-guide.rust-lang.org/backend/libs-and-metadata.html) +for more details. + +#### Monomorphization Collection + +Monomorphization collection is the area responsible for collecting _Mono Items_ +which contribute to code generation: functions, methods, closures, statics, drop +glue, constants, VTables and object shims. Functions need to be _monomorphized_ +-- they might have generic type parameters, in which case the compiler has to +instantiate the functions with the provided type arguments. + +See +[Monomorphization](https://rustc-dev-guide.rust-lang.org/backend/monomorph.html) +for more details. + +#### The Trait System + +> 🚧 **TODO** 🚧 I don't know how to describe the Trait System at all here. + +#### Static-Single Assignment (SSA) Code Generation + +Static Single Assignment (SSA) code generation is responsible for lowering +Mid-Level Intermediate Representation (MIR) into SSA form (another kind of IR) +which has the property that each variable is assigned to exactly once and is +defined before use. Lowering into SSA form enables more optimizations since it +simplifies the properties of variables. It also makes it easier for code +generation backends to handle. This area abstracts the common code generation +logic and interfaces which code generation backends will need to implement. + +In order to lower MIR to SSA, we depend on Monomohprization Collection to +collect _Mono Items_ such as functions, methods and closures which contribute to +code generation. + +See [Code +Generation](https://rustc-dev-guide.rust-lang.org/backend/codegen.html) for more +details. + +#### Codegen backends: LLVM, GCC and Cranelift + +> 🚧 **TODO** 🚧 need more information. + +## Additional References and Resources + +> **UNDER CONSTRUCTION** This section lacks additional references + +### The Query System + +- Detailed description of the query system: + . +- Design document of On-Demand Compilation and Incremental Compilation: + . + +### Areas + +> **UNDER CONSTRUCTION** This should retain _key entry points_ from the previous +> overview version. Can be very useful. We should expand it a bit. This section +> lacks additional references. +> +> **FIXME**: audit these references and links, they are likely very outdated + +- Command line parsing + + - Guide: [The Rustc Driver and Interface](../rustc-driver.md) + - Driver definition: + [`rustc_driver`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_driver/) + - Main entry point: + [`rustc_session::config::build_session_options`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_session/config/fn.build_session_options.html) + +- Lexical Analysis: Lex the user program to a stream of tokens + + - Guide: [Lexing and Parsing](../the-parser.md) + - Lexer definition: + [`rustc_lexer`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html) + - Main entry point: + [`rustc_lexer::cursor::Cursor::advance_token`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/cursor/struct.Cursor.html#method.advance_token) + +- Parsing: Parse the stream of tokens to an Abstract Syntax Tree (AST) + + - Guide: [Lexing and Parsing](../the-parser.md) + - Guide: [Macro Expansion](../macro-expansion.md) + - Guide: [Name Resolution](../name-resolution.md) + - Parser definition: + [`rustc_parse`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html) + - Main entry points: + + - [Entry point for first file in + crate](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_interface/passes/fn.parse.html) + - [Entry point for outline module + parsing](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/module/fn.parse_external_mod.html) + - [Entry point for macro fragments][parse_nonterminal] + + - `AST` definition: + [`rustc_ast`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/index.html) + - Feature gating: **TODO** + - Early linting: **TODO** + +- The High Level Intermediate Representation (HIR) + + - Guide: [The HIR](../hir.md) + - Guide: [Identifiers in the HIR](../hir.md#identifiers-in-the-hir) + - Guide: [The `HIR` Map](../hir.md#the-hir-map) + - Guide: [Lowering `AST` to `HIR`](../ast-lowering.md) + - How to view `HIR` representation for your code `cargo rustc -- -Z + unpretty=hir-tree` + - Rustc `HIR` definition: + [`rustc_hir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html) + - Main entry point: **TODO** + - Late linting: **TODO** + +- Type Inference + + - Guide: [Type Inference](../type-inference.md) + - Guide: [The ty Module: Representing Types](../ty.md) (semantics) + - Main entry point (type inference): + [`InferCtxtBuilder::enter`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/struct.InferCtxtBuilder.html#method.enter) + - Main entry point (type checking bodies): [the `typeck` + query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html#method.typeck) + + - These two functions can't be decoupled. + +- The Mid Level Intermediate Representation (MIR) + + - Guide: [The `MIR` (Mid level IR)](../mir/index.md) + - Definition: + [`rustc_middle/src/mir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/index.html) + - Definition of sources that manipulates the MIR: + [`rustc_mir_build`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/index.html), + [`rustc_mir_dataflow`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_dataflow/index.html), + [`rustc_mir_transform`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/index.html) + +- The Borrow Checker + + - Guide: [MIR Borrow Check](../borrow_check.md) + - Definition: + [`rustc_borrowck`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/index.html) + - Main entry point: [`mir_borrowck` + query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/fn.mir_borrowck.html) + +- `MIR` Optimizations + + - Guide: [MIR Optimizations](../mir/optimizations.md) + - Definition: + [`rustc_mir_transform`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/index.html) + - Main entry point: [`optimized_mir` + query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/fn.optimized_mir.html) + +- Code Generation + + - Guide: [Code Generation](backend/codegen.md) + - Generating Machine Code from `LLVM-IR` with LLVM - **TODO: reference?** + - Main entry point: + [`rustc_codegen_ssa::base::codegen_crate`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html) + + - This monomorphizes and produces `LLVM-IR` for one codegen unit. It then + starts a background thread to run LLVM, which must be joined later. + - Monomorphization happens lazily via + [`FunctionCx::monomorphize`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/struct.FunctionCx.html#method.monomorphize) + and + [`rustc_codegen_ssa::base::codegen_instance`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_instance.html) + +[`Decodable`]: + https://doc.rust-lang.org/nightly/nightly-rustc/rustc_serialize/trait.Decodable.html +[`Encodable`]: + https://doc.rust-lang.org/nightly/nightly-rustc/rustc_serialize/trait.Encodable.html +[`SwitchInt`]: + https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.TerminatorKind.html#variant.SwitchInt diff --git a/src/overview.md b/src/overview/walkthrough.md similarity index 81% rename from src/overview.md rename to src/overview/walkthrough.md index 36de212d6..132a61520 100644 --- a/src/overview.md +++ b/src/overview/walkthrough.md @@ -1,6 +1,4 @@ -# Overview of the compiler - - +# Compiling a Program: A Walkthrough This chapter is about the overall process of compiling a program -- how everything fits together. @@ -131,9 +129,9 @@ with additional low-level types and annotations added (e.g. an ELF object or `WASM`). The different libraries/binaries are then linked together to produce the final binary. -[*trait solving*]: traits/resolution.md -[*type checking*]: type-checking.md -[*type inference*]: type-inference.md +[*trait solving*]: ../traits/resolution.md +[*type checking*]: ../type-checking.md +[*type inference*]: ../type-inference.md [`bump`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.bump [`check`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.check [`Crate`]: https://doc.rust-lang.org/beta/nightly-rustc/rustc_ast/ast/struct.Crate.html @@ -146,7 +144,7 @@ the final binary. [`Parser`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html [`Pat`]: https://doc.rust-lang.org/beta/nightly-rustc/rustc_ast/ast/struct.Pat.html [`rustc_ast::ast`]: https://doc.rust-lang.org/beta/nightly-rustc/rustc_ast/index.html -[`rustc_driver`]: rustc-driver.md +[`rustc_driver`]: ../rustc-driver.md [`rustc_interface::Config`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_interface/interface/struct.Config.html [`rustc_lexer`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html [`rustc_parse::lexer`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/index.html @@ -155,12 +153,12 @@ the final binary. [`simplify_try`]: https://github.com/rust-lang/rust/pull/66282 [`StringReader`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/struct.StringReader.html [`Ty<'tcx>`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Ty.html -[borrow checking]: borrow_check.md -[codegen]: backend/codegen.md +[borrow checking]: ../borrow_check.md +[codegen]: ../backend/codegen.md [hir]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html -[lex]: the-parser.md -[mir-opt]: mir/optimizations.md -[mir]: mir/index.md +[lex]: ../the-parser.md +[mir-opt]: ../mir/optimizations.md +[mir]: ../mir/index.md [parse_crate_mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_crate_mod [parse_external_mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/module/fn.parse_external_mod.html [parse_mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_mod @@ -370,7 +368,7 @@ For more details on bootstrapping, see [the bootstrapping section of the guide][rustc-bootstrap]. [_bootstrapping_]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) -[rustc-bootstrap]: building/bootstrapping/intro.md +[rustc-bootstrap]: ../building/bootstrapping/intro.md - -# References - -- Command line parsing - - Guide: [The Rustc Driver and Interface](rustc-driver.md) - - Driver definition: [`rustc_driver`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_driver/) - - Main entry point: [`rustc_session::config::build_session_options`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_session/config/fn.build_session_options.html) -- Lexical Analysis: Lex the user program to a stream of tokens - - Guide: [Lexing and Parsing](the-parser.md) - - Lexer definition: [`rustc_lexer`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html) - - Main entry point: [`rustc_lexer::cursor::Cursor::advance_token`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/cursor/struct.Cursor.html#method.advance_token) -- Parsing: Parse the stream of tokens to an Abstract Syntax Tree (AST) - - Guide: [Lexing and Parsing](the-parser.md) - - Guide: [Macro Expansion](macro-expansion.md) - - Guide: [Name Resolution](name-resolution.md) - - Parser definition: [`rustc_parse`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html) - - Main entry points: - - [Entry point for first file in crate](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_interface/passes/fn.parse.html) - - [Entry point for outline module parsing](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/module/fn.parse_external_mod.html) - - [Entry point for macro fragments][parse_nonterminal] - - `AST` definition: [`rustc_ast`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/index.html) - - Feature gating: **TODO** - - Early linting: **TODO** -- The High Level Intermediate Representation (HIR) - - Guide: [The HIR](hir.md) - - Guide: [Identifiers in the HIR](hir.md#identifiers-in-the-hir) - - Guide: [The `HIR` Map](hir.md#the-hir-map) - - Guide: [Lowering `AST` to `HIR`](ast-lowering.md) - - How to view `HIR` representation for your code `cargo rustc -- -Z unpretty=hir-tree` - - Rustc `HIR` definition: [`rustc_hir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html) - - Main entry point: **TODO** - - Late linting: **TODO** -- Type Inference - - Guide: [Type Inference](type-inference.md) - - Guide: [The ty Module: Representing Types](ty.md) (semantics) - - Main entry point (type inference): [`InferCtxtBuilder::enter`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/struct.InferCtxtBuilder.html#method.enter) - - Main entry point (type checking bodies): [the `typeck` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html#method.typeck) - - These two functions can't be decoupled. -- The Mid Level Intermediate Representation (MIR) - - Guide: [The `MIR` (Mid level IR)](mir/index.md) - - Definition: [`rustc_middle/src/mir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/index.html) - - Definition of sources that manipulates the MIR: [`rustc_mir_build`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/index.html), [`rustc_mir_dataflow`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_dataflow/index.html), [`rustc_mir_transform`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/index.html) -- The Borrow Checker - - Guide: [MIR Borrow Check](borrow_check.md) - - Definition: [`rustc_borrowck`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/index.html) - - Main entry point: [`mir_borrowck` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/fn.mir_borrowck.html) -- `MIR` Optimizations - - Guide: [MIR Optimizations](mir/optimizations.md) - - Definition: [`rustc_mir_transform`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/index.html) - - Main entry point: [`optimized_mir` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/fn.optimized_mir.html) -- Code Generation - - Guide: [Code Generation](backend/codegen.md) - - Generating Machine Code from `LLVM-IR` with LLVM - **TODO: reference?** - - Main entry point: [`rustc_codegen_ssa::base::codegen_crate`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html) - - This monomorphizes and produces `LLVM-IR` for one codegen unit. It then - starts a background thread to run LLVM, which must be joined later. - - Monomorphization happens lazily via [`FunctionCx::monomorphize`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/struct.FunctionCx.html#method.monomorphize) and [`rustc_codegen_ssa::base::codegen_instance `](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_instance.html)