Commit 8eb0010

Author: bors-servo
Auto merge of #99 - emilio:docs-for-real, r=fitzgen
ir: A bit more documentation and cleanup in parts of the `ir` module. r? @fitzgen or @nox
2 parents 2d359fa + f423873 commit 8eb0010

4 files changed: +245 -89 lines changed


src/ir/item.rs

Lines changed: 145 additions & 14 deletions
@@ -42,7 +42,7 @@ pub trait ItemCanonicalPath {
 /// A single identifier for an item.
 ///
-/// TODO: Build stronger abstractions on top of this, like TypeId(ItemId), ...
+/// TODO: Build stronger abstractions on top of this, like TypeId(ItemId)?
 #[derive(Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
 pub struct ItemId(usize);

@@ -78,6 +78,20 @@ impl ItemCanonicalPath for ItemId {
 }
 }

+/// An item is the base of the bindgen representation; it can be either a
+/// module, a type, a function, or a variable (see `ItemKind` for more
+/// information).
+///
+/// Items form a tree, and each item only stores the id of the parent.
+///
+/// The root of this tree is the "root module", a meta-item used to hold all the
+/// top-level items.
+///
+/// An item may have a comment, and annotations (see the `annotations` module).
+///
+/// Note that even though we parse all the types of annotations in comments, not
+/// all of them apply to every item. Those rules are described in the
+/// `annotations` module.
 #[derive(Debug)]
 pub struct Item {
 /// This item's id.
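
Reviewer note: the "items form a tree, and each item only stores the id of the parent" design can be sketched as follows. This is a minimal illustration, not bindgen's actual code: the `Items` arena, its `add` and `ancestors` methods, and the convention that the root module is its own parent are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for bindgen's `ItemId`.
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
pub struct ItemId(pub usize);

/// Hypothetical, stripped-down item: a name plus the id of its parent.
#[derive(Debug)]
pub struct Item {
    pub id: ItemId,
    pub parent_id: ItemId,
    pub name: String,
}

/// Hypothetical arena that owns every item. In this sketch the root
/// module is its own parent (an assumption, not taken from the diff).
pub struct Items {
    pub root: ItemId,
    pub map: HashMap<ItemId, Item>,
}

impl Items {
    /// Creates an arena that contains only the root module.
    pub fn new() -> Self {
        let root = ItemId(0);
        let mut map = HashMap::new();
        map.insert(root, Item { id: root, parent_id: root, name: "root".to_string() });
        Items { root, map }
    }

    /// Adds an item under `parent_id` and returns the new item's id.
    pub fn add(&mut self, name: &str, parent_id: ItemId) -> ItemId {
        let id = ItemId(self.map.len());
        self.map.insert(id, Item { id, parent_id, name: name.to_string() });
        id
    }

    /// Walks parent ids from `id` up to, and including, the root module.
    pub fn ancestors(&self, id: ItemId) -> Vec<ItemId> {
        let mut out = Vec::new();
        let mut current = id;
        while current != self.root {
            current = self.map[&current].parent_id;
            out.push(current);
        }
        out
    }
}
```

Because children are never stored on the parent, any question about an item's context (its namespace, its enclosing template) becomes a walk up the parent chain through the arena.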
@@ -133,6 +147,24 @@ impl Item {
 &mut self.kind
 }

+/// Returns whether this item is a top-level item, from the point of view of
+/// bindgen.
+///
+/// This point of view changes depending on whether namespaces are enabled
+/// or not. That way, in the following example:
+///
+/// ```c++
+/// namespace foo {
+/// static int var;
+/// }
+/// ```
+///
+/// `var` would be a top-level item if namespaces are disabled, but not if
+/// they are enabled.
+///
+/// This function is used to determine when the codegen phase should call
+/// `codegen` on an item, since it's assumed that any item that is not
+/// top-level will be generated by its parent.
 pub fn is_toplevel(&self, ctx: &BindgenContext) -> bool {
 // FIXME: Workaround for some types falling behind when parsing weird
 // stl classes, for example.
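
Reviewer note: one way to read the rule documented on `is_toplevel` is that namespaces become "transparent" when namespace support is disabled. The sketch below encodes that reading only; the `Kind` enum, the index-based `parent` field, and the decision to return `false` for the root module itself are assumptions of the example, and the real method has extra workarounds (see the FIXME in the hunk).

```rust
/// Hypothetical, simplified item kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    RootModule,
    Namespace,
    Var,
}

/// Hypothetical item: its kind and the index of its parent, if any.
pub struct Item {
    pub kind: Kind,
    pub parent: Option<usize>,
}

/// Returns whether `items[idx]` is top-level under the simplified rule:
/// an item is top-level when its parent is the root module, and when
/// namespaces are disabled, namespace parents are skipped first.
pub fn is_toplevel(items: &[Item], idx: usize, namespaces_enabled: bool) -> bool {
    let mut parent = match items[idx].parent {
        Some(p) => p,
        // Sketch's choice: the root module itself is not "top-level".
        None => return false,
    };
    if !namespaces_enabled {
        // Namespaces are transparent: climb until a non-namespace parent.
        while items[parent].kind == Kind::Namespace {
            match items[parent].parent {
                Some(p) => parent = p,
                None => break,
            }
        }
    }
    items[parent].kind == Kind::RootModule
}
```

With `namespace foo { static int var; }` modeled as root, namespace, and variable, `var` is top-level only when `namespaces_enabled` is `false`, which matches the doc comment's example.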
@@ -167,12 +199,55 @@ impl Item {
 self.kind().expect_function()
 }

-// This check is needed because even though the type might not contain the
-// applicable template args itself, they might apply transitively via, for
-// example, the parent.
-//
-// It's kind of unfortunate (in the sense that it's a sort of complex
-// process, but I think it gets all the cases).
+/// Checks whether an item contains in its "type signature" some named type.
+///
+/// This function is used to avoid unused template parameter errors in Rust
+/// when generating typedef declarations, and also to know whether we need
+/// to generate a PhantomData member for a template parameter.
+///
+/// For example, in code like the following:
+///
+/// ```c++
+/// template<typename T, typename U>
+/// struct Foo {
+/// T bar;
+///
+/// struct Baz {
+/// U bas;
+/// };
+/// };
+/// ```
+///
+/// Both `Foo` and `Baz` contain both `T` and `U` template parameters in their
+/// signature:
+///
+/// * `Foo<T, U>`
+/// * `Baz<T, U>`
+///
+/// But the structure for `Foo` would look like:
+///
+/// ```rust
+/// struct Foo<T, U> {
+/// bar: T,
+/// _phantom0: ::std::marker::PhantomData<U>,
+/// }
+/// ```
+///
+/// because none of its member fields contains the `U` type in the
+/// signature. Similarly, `Baz` would contain a `PhantomData<T>` type, for
+/// the same reason.
+///
+/// Note that this is somewhat similar to `applicable_template_args`, but
+/// this also takes into account other kinds of types, like arrays
+/// (`[T; 40]`), pointers (`*mut T`), etc.
+///
+/// Normally we could do this check just in the `Type` kind, but we also
+/// need to check the `applicable_template_args` more generally, since we
+/// could need a type transitively from our parent, see the test added in
+/// <https://github.com/servo/rust-bindgen/pull/85/commits/2a3f93074dd2898669dbbce6e97e5cc4405d7cb1>
+///
+/// It's kind of unfortunate (in the sense that it's a sort of complex
+/// process), but I think it should get all the cases.
 fn signature_contains_named_type(&self, ctx: &BindgenContext, ty: &Type) -> bool {
 debug_assert!(ty.is_named());
 self.expect_type().signature_contains_named_type(ctx, ty) ||
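
Reviewer note: the Rust-side reason for the `PhantomData` member is that rustc rejects a struct whose type parameter is never used (error E0392). The sketch below shows the shape the doc comment describes for the `Foo`/`Baz` example; the name `FooBaz` for the nested struct is a hypothetical choice for this example, not necessarily what bindgen emits.

```rust
use std::marker::PhantomData;

/// Shape described in the doc comment for `Foo<T, U>`: `U` does not
/// appear in any field, so a zero-sized `PhantomData<U>` is added.
/// Dropping `_phantom0` would fail with E0392 (unused type parameter).
pub struct Foo<T, U> {
    pub bar: T,
    pub _phantom0: PhantomData<U>,
}

/// Hypothetical name for the nested `Baz`: it uses `U` but not `T`,
/// so the phantom member is for `T` instead.
pub struct FooBaz<T, U> {
    pub bas: U,
    pub _phantom0: PhantomData<T>,
}
```

`PhantomData` is zero-sized, so the extra member only marks the parameter as used and does not change the struct's size.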
@@ -181,6 +256,39 @@ impl Item {
 })
 }

+/// Returns the template arguments that apply to a struct. This is a concept
+/// needed because of type declarations inside templates, for example:
+///
+/// ```c++
+/// template<typename T>
+/// class Foo {
+/// typedef T element_type;
+/// typedef int Bar;
+///
+/// template<typename U>
+/// class Baz {
+/// };
+/// };
+/// ```
+///
+/// In this case, the applicable template arguments for the different types
+/// would be:
+///
+/// * `Foo`: [`T`]
+/// * `Foo::element_type`: [`T`]
+/// * `Foo::Bar`: [`T`]
+/// * `Foo::Baz`: [`T`, `U`]
+///
+/// You might notice that we can't generate something like:
+///
+/// ```rust,ignore
+/// type Foo_Bar<T> = ::std::os::raw::c_int;
+/// ```
+///
+/// since that would be invalid Rust. Still, conceptually, `Bar` *could* use
+/// the template parameter type `T`, and that's exactly what this method
+/// represents. The unused template parameters get stripped in the
+/// `signature_contains_named_type` check.
 pub fn applicable_template_args(&self, ctx: &BindgenContext) -> Vec<ItemId> {
 let ty = match *self.kind() {
 ItemKind::Type(ref ty) => ty,
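
Reviewer note: the list in the doc comment (`Foo`: [`T`], `Foo::Baz`: [`T`, `U`]) follows from "inherit the parent's applicable arguments, then append your own". A minimal sketch of that rule, assuming a hypothetical `TypeDecl` record with an index-based parent link (the real method works on `Item`s and a `BindgenContext` and returns `ItemId`s):

```rust
/// Hypothetical declaration record: a name, an optional parent index,
/// and the template parameters declared directly on this type.
pub struct TypeDecl {
    pub name: &'static str,
    pub parent: Option<usize>,
    pub own_params: Vec<&'static str>,
}

/// Collects the template parameters that apply to `decls[idx]`:
/// everything applicable to the parent, followed by the type's own.
pub fn applicable_template_args(decls: &[TypeDecl], idx: usize) -> Vec<&'static str> {
    let mut args = match decls[idx].parent {
        Some(p) => applicable_template_args(decls, p),
        None => Vec::new(),
    };
    args.extend(decls[idx].own_params.iter().copied());
    args
}
```

Note that `Foo::Bar` still reports `T` even though `typedef int Bar` never uses it; per the doc comment, pruning unused parameters is left to the `signature_contains_named_type` check.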
@@ -275,6 +383,15 @@ impl Item {
 /// Get the canonical name without taking into account the replaces
 /// annotation.
+///
+/// This is the base logic used to implement hiding and replacing via
+/// annotations, and also to implement proper name mangling.
+///
+/// The idea is that each generated type in the same "level" (read: module
+/// or namespace) has a unique canonical name.
+///
+/// This name should be derived from the immutable state contained in the
+/// type and the parent chain, since it should be consistent.
 fn real_canonical_name(&self,
 ctx: &BindgenContext,
 count_namespaces: bool,
@@ -424,12 +541,6 @@ impl ClangItemParser for Item {
 let comment = cursor.raw_comment();
 let annotations = Annotations::new(&cursor);

-// FIXME: The current_module logic is not really accurate. We should be
-// able to index modules by their Cursor, and locate the proper module
-// for a given item.
-//
-// We don't support modules properly though, so there's no rush for
-// this.
 let current_module = context.current_module();
 macro_rules! try_parse {
 ($what:ident) => {
@@ -486,7 +597,8 @@ impl ClangItemParser for Item {
 if cursor.kind() == clangll::CXCursor_UnexposedDecl {
 Err(ParseError::Recurse)
 } else {
-error!("Unhandled cursor kind: {} ({})", ::clang::kind_to_str(cursor.kind()), cursor.kind());
+error!("Unhandled cursor kind: {} ({})",
+::clang::kind_to_str(cursor.kind()), cursor.kind());
 Err(ParseError::Continue)
 }
 }
@@ -498,6 +610,17 @@ impl ClangItemParser for Item {
 Self::from_ty_or_ref_with_id(ItemId::next(), ty, location, parent_id, context)
 }

+/// Parse a type, if we know it beforehand, or otherwise store it as an
+/// `UnresolvedTypeRef`, which means something like "a reference to a type
+/// we still don't know".
+///
+/// This logic is needed to avoid parsing items with the incorrect parent
+/// and it's sort of complex to explain, so I'll just point to
+/// `tests/headers/typeref.hpp` to see the kind of constructs that forced
+/// this.
+///
+/// Typerefs are resolved once parsing is completely done, see
+/// `BindgenContext::resolve_typerefs`.
 fn from_ty_or_ref_with_id(potential_id: ItemId,
 ty: clang::Type,
 location: Option<clang::Cursor>,
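
Reviewer note: the "store a placeholder now, resolve once parsing is done" idea can be sketched as a two-phase pass. This is an illustration only: the simplified `TypeKind` enum, the name-keyed lookup table, and the free function `resolve_typerefs` are assumptions of the example and are not the real `BindgenContext::resolve_typerefs`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified type kinds.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TypeKind {
    Int,
    /// A type that was referenced before it was parsed, kept by name.
    UnresolvedTypeRef(String),
    /// A reference to an already-parsed type, by index.
    ResolvedTypeRef(usize),
}

/// Second phase: once every type has been parsed, replace each
/// placeholder whose target is now known with a resolved reference.
/// Placeholders with no known target are left untouched.
pub fn resolve_typerefs(types: &mut [TypeKind], by_name: &HashMap<String, usize>) {
    for ty in types.iter_mut() {
        let target = match &*ty {
            TypeKind::UnresolvedTypeRef(name) => by_name.get(name).copied(),
            _ => None,
        };
        if let Some(idx) = target {
            *ty = TypeKind::ResolvedTypeRef(idx);
        }
    }
}
```

Deferring resolution this way means a type seen first as a mere reference never has to be parsed from the wrong place, which is the "incorrect parent" problem the doc comment mentions.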
@@ -537,6 +660,14 @@ impl ClangItemParser for Item {
 Self::from_ty_with_id(ItemId::next(), ty, location, parent_id, context)
 }

+/// This is one of the trickiest methods you'll find (probably along with
+/// some of the ones that handle templates in `BindgenContext`).
+///
+/// This method parses a type, given the potential id of that type (if
+/// parsing it was correct), an optional location we're scanning, which is
+/// sometimes critical to obtain information, an optional parent item id,
+/// that will, if it's `None`, become the current module id, and the
+/// context.
 fn from_ty_with_id(id: ItemId,
 ty: &clang::Type,
 location: Option<clang::Cursor>,

src/ir/item_kind.rs

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ pub enum ItemKind {
 /// A function or method declaration.
 Function(Function),
+
 /// A variable declaration, most likely a static.
 Var(Var),
 }

src/ir/mod.rs

Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
+//! The module where the intermediate representation that bindgen uses, and the
+//! parsing code that generates it, live.
+
 pub mod annotations;
 pub mod comp;
 pub mod context;
