Added representation of pointer types. #51

Merged
merged 10 commits into from
Dec 20, 2018
53 changes: 53 additions & 0 deletions reference/src/representation/pointers.md
@@ -0,0 +1,53 @@
# Representation of reference and pointer types

### Terminology

Reference types are types of the form `&T`, `&mut T` or `&dyn T`.

Raw pointer types are types of the form `*const T` or `*mut T`.

### Representation

The alignments of `&T`, `&mut T`, `*const T` and `*mut T` are the same,
and are at least the word size.

* If `T` is a trait, then the alignment of `&dyn T` is the word size.
* If `T` is a sized type then the alignment of `&T` is the word size.
* The alignment of `&[T]` is the word size.
* The alignment of `&str` is the word size.
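
The alignment claims above can be spot-checked on a given target with `std::mem::align_of`. This is a minimal sketch, assuming that "word" means `usize`; the trait `Shape` is hypothetical and exists only so that a `&dyn` type can be named.

```rust
use std::mem::align_of;

// Hypothetical trait, introduced only to form the type `&dyn Shape`.
trait Shape {}

/// Alignments, in bytes, of several reference and raw pointer types.
fn pointer_alignments() -> [usize; 7] {
    [
        align_of::<&u32>(),
        align_of::<&mut u32>(),
        align_of::<*const u32>(),
        align_of::<*mut u32>(),
        align_of::<&[u32]>(),
        align_of::<&str>(),
        align_of::<&dyn Shape>(),
    ]
}

fn main() {
    // Assumption: the "word size" of the text is the alignment of `usize`.
    let word = align_of::<usize>();
    for a in pointer_alignments().iter() {
        assert_eq!(*a, word);
    }
    println!("checked {} pointer types", pointer_alignments().len());
}
```

A passing run only confirms the behaviour of one compiler on one target; it is not a substitute for the guarantee itself.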

The sizes of `&T`, `&mut T`, `*const T` and `*mut T` are the same,
and are at least one word.

* If `T` is a trait, then the size of `&dyn T` is two words.
* If `T` is a sized type then the size of `&T` is one word.
* The size of `&[T]` is two words.
* The size of `&str` is two words.
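
The size claims above can be checked in the same spirit with `std::mem::size_of`. This is a sketch under the assumption that one word is `size_of::<usize>()` bytes; `size_in_words` and the trait `Shape` are hypothetical names used only for this illustration.

```rust
use std::mem::size_of;

// Hypothetical trait, introduced only to form the type `&dyn Shape`.
trait Shape {}

/// Size of the pointer-like type `P`, measured in machine words.
fn size_in_words<P>() -> usize {
    size_of::<P>() / size_of::<usize>()
}

fn main() {
    // Pointers to sized types occupy one word.
    assert_eq!(size_in_words::<&u64>(), 1);
    assert_eq!(size_in_words::<&mut u64>(), 1);
    assert_eq!(size_in_words::<*const u64>(), 1);
    assert_eq!(size_in_words::<*mut u64>(), 1);

    // Slice, string and single-trait object references occupy two words.
    assert_eq!(size_in_words::<&[u64]>(), 2);
    assert_eq!(size_in_words::<&str>(), 2);
    assert_eq!(size_in_words::<&dyn Shape>(), 2);
}
```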

### Notes

The representations of `&T` and `&mut T` are the same.

We do not make any guarantees about the representation of
multi-trait objects `&(dyn T + U)` or references to other dynamically sized types,
other than that they are at least word-aligned, and have size at least one word.

The representation of `&dyn T` when `T` is a trait is the same as that of:
```rust
#[repr(C)]
struct DynObject {
    data: &u8,
    vtable: &u8,
}
```
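
One way to observe the data component of a trait object without relying on field order is to cast the reference to a thin raw pointer, which discards the vtable component. The sketch below assumes a hypothetical trait `Speak` and type `Dog`; it checks that the data pointer is the address of the underlying value and that the reference is two words wide. It does not inspect the vtable.

```rust
use std::mem::size_of;

// Hypothetical trait and implementor, introduced only for this illustration.
trait Speak {
    fn speak(&self) -> &'static str;
}

struct Dog {
    age: u8,
}

impl Speak for Dog {
    fn speak(&self) -> &'static str {
        "woof"
    }
}

/// Returns the data-pointer component of a trait-object reference.
/// Casting a wide pointer to a thin pointer drops the vtable component.
fn data_pointer(obj: &dyn Speak) -> *const () {
    obj as *const dyn Speak as *const ()
}

fn main() {
    let dog = Dog { age: 3 };
    let obj: &dyn Speak = &dog;

    // Two words: one for the data pointer, one for the vtable pointer.
    assert_eq!(size_of::<&dyn Speak>(), 2 * size_of::<usize>());

    // The data component is the address of the concrete value.
    assert_eq!(data_pointer(obj), &dog as *const Dog as *const ());

    // Method calls are dispatched through the vtable component.
    assert_eq!(obj.speak(), "woof");
    assert_eq!(dog.age, 3);
}
```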

The representation of `&[T]` is the same as that of:
```rust
#[repr(C)]
struct Slice<T> {
    ptr: &T,
Contributor

Question: if the length is zero, then what is `ptr`? Perhaps `*const T` is a better choice; it will certainly fit the "validity invariants" better (e.g., I don't think there is any requirement that it be "dereferenceable" when the length is zero). However, `from_raw_parts` does state:

> data must be non-null and aligned, even for zero-length slices. One reason for this is that enum layout optimizations may rely on references (including slices of any length) being aligned and non-null to distinguish them from other data. You can obtain a pointer that is usable as data for zero-length slices using NonNull::dangling().

It seems worth linking to slice::from_raw_parts and reproducing some of that text (although arguably those are things best left for the next discussion?).

Contributor

@gnzlbg gnzlbg Dec 13, 2018

While I don't like the use of references here much [0], the guarantee here is that "the representation of &[T] is the same as that of Slice<T>"; that is, the Slice<T> struct only applies to the representation / layout of &[T] and not to its validity.

Ideally, we would use the same examples to show validity and representation, but we haven't settled the validity of these yet (it might well be that we need different "examples" for the &[T] and the *[T] cases). I think it is worth it to wait until the validity of these is settled before trying to "unify" all the examples.


[0] Personally, I don't like the use of &T here much: it has too many connotations, e.g., w.r.t. validity and safety, and we don't really care about them when just talking about the representation. It might be worth it to just say:

The representation of &[T] is the same as that of:

struct Slice {
    ptr: *const (),
    len: usize,
}

AFAICT there is no need to make Slice<T> generic since T: Sized, but maybe there is something else going on here?

Author

Like I said above, we could also use

struct Slice<T> {
  ptr: &[T;N],
  len: usize,
}

and in practice, N and len are the same.

All these definitions are equivalent; it's just a question of which is the most human-readable.

Author

I added a comment that *T means *const T or *mut T when the mutability is unimportant, and used *T rather than &T. On Zulip, @gnzlbg was okay with this. Have we finally closed the last issue?

    len: usize,
}
```
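
The pointer-plus-length view of `&[T]` can be exercised through the standard library without assuming any field order: `as_ptr` and `len` extract the two components, and `std::slice::from_raw_parts` rebuilds a slice from them. The sketch below uses a hypothetical helper `into_parts`. It also covers the zero-length case raised in the review, using `NonNull::dangling()` as the `from_raw_parts` documentation suggests.

```rust
use std::ptr::NonNull;
use std::slice;

/// Splits a slice reference into its (data pointer, length) components.
/// Hypothetical helper, introduced only for this illustration.
fn into_parts<T>(s: &[T]) -> (*const T, usize) {
    (s.as_ptr(), s.len())
}

fn main() {
    let data = [10u32, 20, 30];
    let (ptr, len) = into_parts(&data);
    assert_eq!(len, 3);

    // SAFETY: `ptr` and `len` come from a live slice borrowed from `data`,
    // which outlives `rebuilt`.
    let rebuilt: &[u32] = unsafe { slice::from_raw_parts(ptr, len) };
    assert_eq!(rebuilt, &data[..]);

    // Zero-length case: the pointer must still be non-null and aligned,
    // but it does not need to point at real memory.
    let dangling: *const u32 = NonNull::<u32>::dangling().as_ptr();
    // SAFETY: a dangling but aligned, non-null pointer is valid for length 0.
    let empty: &[u32] = unsafe { slice::from_raw_parts(dangling, 0) };
    assert!(empty.is_empty());
    assert!(!empty.as_ptr().is_null());
}
```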

The representation of `&str` is the same as that of `&[u8]`.
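
As a quick illustration, `str::as_bytes` returns a `&[u8]` view of the same memory, so both references should report the same pointer and length. This is a sketch using only standard library calls; `same_parts` is a hypothetical helper name.

```rust
use std::mem::size_of;

/// Checks that a `&str` and its `&[u8]` view share pointer and length.
/// Hypothetical helper, introduced only for this illustration.
fn same_parts(s: &str) -> bool {
    let bytes: &[u8] = s.as_bytes();
    s.as_ptr() == bytes.as_ptr() && s.len() == bytes.len()
}

fn main() {
    // Both reference types have the same size.
    assert_eq!(size_of::<&str>(), size_of::<&[u8]>());

    // The length of a `&str` counts bytes, not characters.
    let s = "héllo";
    assert!(same_parts(s));
    assert_eq!(s.len(), 6);
    assert_eq!(s.chars().count(), 5);
}
```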