Commit b349cf7

bors-servo authored

Auto merge of #988 - fitzgen:overview, r=emilio

Add an architectural overview of `bindgen` to CONTRIBUTING.md

This should help new contributors who are coming to the code base for the first time get up and running.

r? @emilio
2 parents 37af44d + 4089b3a commit b349cf7

1 file changed: +66 -0 lines changed

CONTRIBUTING.md

Lines changed: 66 additions & 0 deletions
@@ -19,6 +19,7 @@ out to us in a GitHub issue, or stop by
- [Testing a Single Header's Bindings Generation and Compiling its Bindings](#testing-a-single-headers-bindings-generation-and-compiling-its-bindings)
- [Authoring New Tests](#authoring-new-tests)
- [Test Expectations and `libclang` Versions](#test-expectations-and-libclang-versions)
- [Code Overview](#code-overview)
- [Pull Requests and Code Reviews](#pull-requests-and-code-reviews)
- [Generating Graphviz Dot Files](#generating-graphviz-dot-files)
- [Debug Logging](#debug-logging)
@@ -192,6 +193,71 @@ Where `$VERSION` is one of:

depending on which version of `libclang` you have installed.

## Code Overview

`bindgen` takes C and C++ header files as input and generates corresponding Rust
`#[repr(C)]` type definitions and `extern` foreign function declarations.
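
To make the input/output relationship concrete, here is a hand-written sketch of the kind of Rust that corresponds to a small C header. The C header in the comment, the `Point` struct, and the `manhattan_length` helper are hypothetical and chosen only for illustration; this is not actual `bindgen` output. The block is written with `unsafe extern`, which recent Rust toolchains accept; older toolchains spell it as a plain `extern "C"` block.

```rust
use std::os::raw::c_int;

// Hypothetical C input, for illustration only:
//
//     struct Point { int x; int y; };
//     int abs(int value);   /* declared in <stdlib.h> */

/// A `#[repr(C)]` definition with the same field order and layout as the
/// C `struct Point` above.
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Point {
    pub x: c_int,
    pub y: c_int,
}

// Foreign function declaration for the C library's `abs`.
unsafe extern "C" {
    pub fn abs(value: c_int) -> c_int;
}

/// Hypothetical safe wrapper showing how the declarations are used.
pub fn manhattan_length(p: Point) -> c_int {
    // SAFETY: `abs` is only undefined for `INT_MIN`; callers of this
    // sketch are assumed not to pass that value.
    unsafe { abs(p.x) + abs(p.y) }
}
```

Calling `manhattan_length(Point { x: -3, y: 4 })` goes through the C library's `abs` for each field and sums the results.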

First, we use `libclang` to parse the input headers. See `src/clang.rs` for our
Rust-y wrappers over the raw C `libclang` API that the `clang-sys` crate
exposes. We walk over `libclang`'s AST and construct our own internal
representation (IR). The `ir` module and its submodules (`src/ir/*`) contain the
IR type definitions and the code that parses the `libclang` AST into IR.

The umbrella IR type is the `Item`. It contains various nested `enum`s that let
us drill down and get more specific about the kind of construct that we're
looking at. Here is a summary of the IR types and their relationships:

* `Item` contains:
    * An `ItemId` to uniquely identify it.
    * An `ItemKind`, which is one of:
        * A `Module`, which is originally a C++ namespace and becomes a Rust
          module. It contains the set of `ItemId`s of `Item`s that are defined
          within it.
        * A `Type`, which contains:
            * A `Layout`, describing the type's size and alignment.
            * A `TypeKind`, which is one of:
                * Some integer type.
                * Some float type.
                * A `Pointer` to another type.
                * A function pointer type, with `ItemId`s of its parameter types
                  and return type.
                * An `Alias` to another type (`typedef` or `using X = ...`).
                * A fixed size `Array` of `n` elements of another type.
                * A `Comp` compound type, which is either a `struct`, `class`,
                  or `union`. This is potentially a template definition.
                * A `TemplateInstantiation` referencing some template definition
                  and a set of template argument types.
                * Etc...
        * A `Function`, which contains:
            * An ABI
            * A mangled name
            * A `FunctionKind`, which describes whether this function is a plain
              function, method, static method, constructor, destructor, etc.
            * The `ItemId` of its function pointer type.
        * A `Var` representing a static variable or `#define` constant, which
          contains:
            * Its type's `ItemId`
            * Optionally, a mangled name
            * Optionally, a value
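
The shape of this hierarchy can be sketched as a handful of Rust types. This is a heavily simplified illustration, not the real definitions in `src/ir/*`: the field names, the reduced set of variants, and the `resolve_alias` and `example_items` helpers are assumptions made for the example.

```rust
/// Index into the item table; stands in for the real `ItemId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ItemId(pub usize);

/// Size and alignment of a type, in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Layout {
    pub size: usize,
    pub align: usize,
}

/// A reduced set of type kinds; the real enum has many more variants.
#[derive(Debug)]
pub enum TypeKind {
    Int,
    Float,
    Pointer(ItemId),
    Alias(ItemId),
    Array(ItemId, usize),
}

#[derive(Debug)]
pub enum ItemKind {
    Module(Vec<ItemId>),
    Type { layout: Layout, kind: TypeKind },
    Function { mangled_name: String, signature: ItemId },
    Var { ty: ItemId, value: Option<i64> },
}

#[derive(Debug)]
pub struct Item {
    pub id: ItemId,
    pub kind: ItemKind,
}

/// "Drill down" through `Alias` types until a non-alias item is reached.
/// Assumes the alias chain contains no cycles.
pub fn resolve_alias(items: &[Item], mut id: ItemId) -> ItemId {
    while let ItemKind::Type { kind: TypeKind::Alias(target), .. } = &items[id.0].kind {
        id = *target;
    }
    id
}

/// Hypothetical IR for `typedef int a; typedef a b; b *p;` in one module.
pub fn example_items() -> Vec<Item> {
    let int_layout = Layout { size: 4, align: 4 };
    let ptr_layout = Layout { size: 8, align: 8 };
    vec![
        Item { id: ItemId(0), kind: ItemKind::Type { layout: int_layout, kind: TypeKind::Int } },
        Item { id: ItemId(1), kind: ItemKind::Type { layout: int_layout, kind: TypeKind::Alias(ItemId(0)) } },
        Item { id: ItemId(2), kind: ItemKind::Type { layout: int_layout, kind: TypeKind::Alias(ItemId(1)) } },
        Item { id: ItemId(3), kind: ItemKind::Type { layout: ptr_layout, kind: TypeKind::Pointer(ItemId(2)) } },
        Item { id: ItemId(4), kind: ItemKind::Module(vec![ItemId(0), ItemId(1), ItemId(2), ItemId(3)]) },
    ]
}
```

Because items refer to each other by `ItemId` rather than by ownership, the same type can be referenced from many places without duplicating it.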

The IR forms a graph of interconnected and inter-referencing types and
functions. The `ir::traversal` module provides IR graph traversal
infrastructure: edge kind definitions (base member vs. field type vs. function
parameter, etc.), the `Trace` trait for enumerating an IR item's outgoing edges,
and various traversal types.
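
As a rough illustration of that design, the sketch below defines a toy `Trace` trait and a traversal built on top of it. The trait signature, the `EdgeKind` variants, the `Node` type, and the `reachable` function are assumptions for this example and do not mirror the actual API in `ir::traversal`.

```rust
use std::collections::{BTreeSet, HashMap};

/// Stand-in for the real `ItemId`.
pub type ItemId = usize;

/// A few illustrative edge kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeKind {
    BaseMember,
    Field,
    FunctionParameter,
}

/// Enumerate the outgoing edges of an IR item by calling `visit` on each.
pub trait Trace {
    fn trace(&self, visit: &mut dyn FnMut(ItemId, EdgeKind));
}

/// Toy IR node that just stores its outgoing edges.
pub struct Node {
    pub edges: Vec<(ItemId, EdgeKind)>,
}

impl Trace for Node {
    fn trace(&self, visit: &mut dyn FnMut(ItemId, EdgeKind)) {
        for &(to, kind) in &self.edges {
            visit(to, kind);
        }
    }
}

/// Depth-first traversal: every item reachable from `root`, including `root`.
pub fn reachable(graph: &HashMap<ItemId, Node>, root: ItemId) -> BTreeSet<ItemId> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root];
    while let Some(id) = stack.pop() {
        // Skip items we have already visited so cycles terminate.
        if !seen.insert(id) {
            continue;
        }
        if let Some(node) = graph.get(&id) {
            node.trace(&mut |to, _kind| stack.push(to));
        }
    }
    seen
}
```

Keeping edge enumeration behind a trait means a traversal does not need to know the shape of each IR type, and a traversal that cares about edge kinds can filter on the `EdgeKind` argument instead of ignoring it.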

After constructing the IR, we run a series of analyses on it. These analyses do
everything from allocating logical bitfields into physical units, to computing
which types we can `#[derive(Debug)]` for, to determining which implicit
template parameters a given type uses. The analyses are defined in
`src/ir/analysis/*`. They are implemented as fixed-point algorithms, using the
`ir::analysis::MonotoneFramework` trait.
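
The fixed-point idea can be sketched with a toy worklist algorithm. The example below is only loosely inspired by the "can we derive `Debug`?" analysis: the `cannot_derive_debug` function, its inputs, and the propagation rule are hypothetical and do not use the real `MonotoneFramework` trait.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for the real `ItemId`.
pub type ItemId = usize;

/// Toy monotone analysis: compute the set of items that cannot derive `Debug`.
///
/// `uses[a]` lists the items that `a` contains. `seeds` are items known up
/// front to be non-derivable. An item is non-derivable if it is a seed or if
/// it contains a non-derivable item.
pub fn cannot_derive_debug(
    uses: &HashMap<ItemId, Vec<ItemId>>,
    seeds: &[ItemId],
) -> HashSet<ItemId> {
    // Reverse the edges so we can ask "who depends on this item?".
    let mut dependents: HashMap<ItemId, Vec<ItemId>> = HashMap::new();
    for (&user, used) in uses {
        for &u in used {
            dependents.entry(u).or_default().push(user);
        }
    }

    let mut result: HashSet<ItemId> = HashSet::new();
    let mut worklist: Vec<ItemId> = seeds.to_vec();

    // The result set only ever grows (monotonicity) and is bounded by the
    // number of items, so the loop reaches a fixed point and terminates.
    while let Some(id) = worklist.pop() {
        if !result.insert(id) {
            continue; // Nothing changed, so there is nothing new to propagate.
        }
        if let Some(deps) = dependents.get(&id) {
            worklist.extend(deps.iter().copied());
        }
    }
    result
}
```

The key property is that a node is re-examined only when something it depends on has changed, which is what makes a worklist cheaper than re-running the whole analysis until nothing changes.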

The final phase is generating Rust source text from the analyzed IR, and it is
defined in `src/codegen/*`. We use the `quote` crate, which provides the `quote!
{ ... }` macro for quasi-quoting Rust forms.

## Pull Requests and Code Reviews

Ensure that each commit stands alone, and passes tests. This enables better `git
