
Be more explicit about the layout guarantees of integer and floating-point types #98


Merged
merged 11 commits into from
Apr 11, 2019
4 changes: 3 additions & 1 deletion reference/src/SUMMARY.md
@@ -4,9 +4,11 @@

- [Data layout](./layout.md)
- [Structs and tuples](./layout/structs-and-tuples.md)
- [Integers and Floating Points](./layout/integers-floatingpoint.md)
- [Scalars](./layout/scalars.md)
- [Enums](./layout/enums.md)
- [Unions](./layout/unions.md)
- [Pointers](./layout/pointers.md)
- [Function pointers](./layout/function-pointers.md)
- [Arrays and Slices](./layout/arrays-and-slices.md)
- [Packed SIMD vectors](./layout/packed-simd-vectors.md)
- [Optimizations](./optimizations.md)
1 change: 0 additions & 1 deletion reference/src/layout.md

This file was deleted.

10 changes: 6 additions & 4 deletions reference/src/layout/function-pointers.md
@@ -81,6 +81,11 @@ bool for_all(struct Cons const *self, bool (*func)(int, void *), void *thunk);
```

```rust
# use std::{
# ffi::c_void,
# os::raw::c_int,
# };
#
pub struct Cons {
data: c_int,
next: Option<Box<Cons>>,
@@ -117,9 +122,6 @@ pub extern "C" fn for_all(
}
it = node.next.as_ref().map(|x| &**x);
}
true
}
```

### Unresolved Questions

- dunno
61 changes: 0 additions & 61 deletions reference/src/layout/integers-floatingpoint.md

This file was deleted.

4 changes: 2 additions & 2 deletions reference/src/layout/pointers.md
@@ -36,7 +36,7 @@ multi-trait objects `&(dyn T + U)` or references to other dynamically sized type
other than that they are at least word-aligned, and have size at least one word.

The layout of `&dyn T` when `T` is a trait is the same as that of:
```rust
```rust,ignore
#[repr(C)]
struct DynObject {
data: *u8,
@@ -45,7 +45,7 @@ struct DynObject {
```

The layout of `&[T]` is the same as that of:
```rust
```rust,ignore
#[repr(C)]
struct Slice<T> {
ptr: *T,
114 changes: 114 additions & 0 deletions reference/src/layout/scalars.md
@@ -0,0 +1,114 @@
# Layout of scalar types

This chapter represents the consensus from issue [#9]. It documents the memory
layout and considerations for `bool`, `char`, floating point types (`f{32, 64}`), and integral types (`{i,u}{8,16,32,64,128,size}`).

These types are all scalar types, representing a single value, and have no
layout `#[repr()]` flags.

[#9]: https://github.com/rust-rfcs/unsafe-code-guidelines/issues/9

## `bool`

Rust's `bool` has the same layout as C17's `_Bool`, that is, its size and
alignment are implementation-defined. Any `bool` can be cast into an integer,
taking on the values 1 (`true`) or 0 (`false`).

> **Note**: on all platforms that Rust currently supports, its size and
> alignment are 1, and its ABI class is `INTEGER` - see [Rust Layout and ABIs].

[Rust Layout and ABIs]: https://gankro.github.io/blah/rust-layouts-and-abis/#the-layoutsabis-of-builtins
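
As a minimal sketch of the statements above, the following checks the `bool`-to-integer cast and the size and alignment observed on currently supported platforms. The helper name `bool_to_int` is hypothetical and only used for illustration; the size and alignment assertions reflect the note above, not a guarantee.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: casts a `bool` to an integer.
fn bool_to_int(b: bool) -> u8 {
    b as u8
}

fn main() {
    // Casting yields exactly 1 for `true` and 0 for `false`.
    assert_eq!(bool_to_int(true), 1);
    assert_eq!(bool_to_int(false), 0);

    // Holds on all platforms Rust currently supports (see the note above);
    // the size and alignment are otherwise implementation-defined.
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(align_of::<bool>(), 1);
}
```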

## `char`

Rust's `char` is 32 bits wide and represents a [Unicode scalar value]. The
alignment of `char` is _implementation-defined_.

[unicode scalar value]: http://www.unicode.org/glossary/#unicode_scalar_value
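
The following sketch illustrates both properties: `char` occupies 4 bytes, and only Unicode scalar values are valid `char`s, so surrogate code points and values above `0x10FFFF` are rejected. The helper `is_unicode_scalar_value` is a hypothetical name introduced here for illustration.

```rust
use std::mem::size_of;

/// Hypothetical helper: true if `v` is a Unicode scalar value,
/// i.e. if it can be converted into a `char`.
fn is_unicode_scalar_value(v: u32) -> bool {
    char::from_u32(v).is_some()
}

fn main() {
    // `char` is 32 bits wide.
    assert_eq!(size_of::<char>(), 4);

    // Ordinary code points are valid.
    assert!(is_unicode_scalar_value(0x41)); // 'A'
    assert!(is_unicode_scalar_value(0x10_FFFF)); // largest scalar value

    // Surrogates and out-of-range values are not Unicode scalar values.
    assert!(!is_unicode_scalar_value(0xD800));
    assert!(!is_unicode_scalar_value(0x11_0000));
}
```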

> **Note**: Rust's `char` type is not layout-compatible with the C / C++ `char`
> types. The C / C++ `char` types correspond to either Rust's `i8` or `u8` types
> on all currently supported platforms, depending on their signedness. Rust does
> not support C platforms on which C `char` is not 8 bits wide.

## `isize` and `usize`

The `isize` and `usize` types are pointer-sized signed and unsigned integers.
They have the same layout as the [pointer types] for which the pointee is
`Sized`, and are layout-compatible with C's `intptr_t` and `uintptr_t` types,
respectively.
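
A short sketch of what "pointer-sized" means in practice: the size and alignment of `usize` and `isize` match those of thin pointers and references to `Sized` types. The helper `pointer_width_bits` is a hypothetical name used only for this illustration.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: the width of `usize` in bits on this target.
fn pointer_width_bits() -> usize {
    size_of::<usize>() * 8
}

fn main() {
    // Same size as raw pointers and references to `Sized` pointees.
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());
    assert_eq!(size_of::<isize>(), size_of::<&u64>());

    // Same alignment as well.
    assert_eq!(align_of::<usize>(), align_of::<*const u8>());
    assert_eq!(align_of::<isize>(), align_of::<*const u8>());

    println!("pointer width: {} bits", pointer_width_bits());
}
```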

> **Note**: Rust's `usize` and C's `unsigned` types are **not** equivalent. C's
> `unsigned` is at least as large as a `short` and is allowed to have padding
> bits, but it is not necessarily pointer-sized.

> **Note**: in the current Rust implementation, the layouts of `isize` and
> `usize` determine the following:
>
> * the maximum size of Rust _allocations_ is limited to `isize::max_value()`.
> The LLVM `getelementptr` instruction uses signed-integer field offsets. Rust
> calls `getelementptr` with the `inbounds` flag, which assumes that field
> offsets do not overflow,
>
> * the maximum number of elements in an array is `usize::max_value()` (`[T; N:
> usize]`). In practice, probably only ZST arrays can be this large; non-ZST
> arrays are bound by the maximum size of Rust values,
>
> * the maximum value by which a pointer can be offset using `ptr.add(count:
> usize)` is `usize::max_value()`.
>
> These limits have not gone through the RFC process and are not guaranteed to
> hold.

[pointer types]: ./pointers.md
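
The limits described in the note can be observed with a small sketch, assuming a current Rust toolchain. It is not a guarantee, since the note itself says these limits are not guaranteed to hold. The helper `reservation_fails` is a hypothetical name introduced for illustration.

```rust
use std::mem::size_of;

/// Hypothetical helper: true if asking a `Vec<u8>` to reserve `bytes`
/// bytes is rejected instead of succeeding.
fn reservation_fails(bytes: usize) -> bool {
    let mut v: Vec<u8> = Vec::new();
    v.try_reserve(bytes).is_err()
}

fn main() {
    // An array of zero-sized elements can have `usize::MAX` elements,
    // because the array itself still has size zero.
    assert_eq!(size_of::<[(); usize::MAX]>(), 0);

    // A request for more than `isize::MAX` bytes is refused.
    assert!(reservation_fails(isize::MAX as usize + 1));

    // A small request is expected to succeed.
    assert!(!reservation_fails(16));
}
```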

## Fixed-width integer types

Rust's signed and unsigned fixed-width integer types `{i,u}{8,16,32,64}` have
the same layout as the C fixed-width integer types from the `<stdint.h>` header
`{u,}int{8,16,32,64}_t`. That is:

* these types have no padding bits,
* their size exactly matches their bit-width,
* negative values of signed integer types are represented using 2's complement.

These properties also hold for Rust's 128-bit wide `{i,u}128` integer types, but
C does not expose equivalent types in `<stdint.h>`.

Rust fixed-width integer types are therefore safe to use directly in C FFI where
the corresponding C fixed-width integer types are expected.
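
The three properties listed above can be checked with a short sketch. The helper `reinterpret_i8` is a hypothetical name; it only shows how a signed value's bit pattern looks when viewed as unsigned.

```rust
use std::mem::size_of;

/// Hypothetical helper: views the bit pattern of an `i8` as a `u8`.
fn reinterpret_i8(x: i8) -> u8 {
    x as u8
}

fn main() {
    // Size exactly matches the bit-width.
    assert_eq!(size_of::<i8>(), 1);
    assert_eq!(size_of::<u16>(), 2);
    assert_eq!(size_of::<i32>(), 4);
    assert_eq!(size_of::<u64>(), 8);
    assert_eq!(size_of::<i128>(), 16);

    // No padding bits: every bit of the storage is a value bit.
    assert_eq!(u16::MAX.count_ones(), 16);

    // Negative values use two's complement.
    assert_eq!(reinterpret_i8(-1), 0xFF);
    assert_eq!(reinterpret_i8(i8::MIN), 0x80);
    assert_eq!(i32::MIN as u32, 0x8000_0000);
}
```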

### Layout compatibility with C native integer types

The specification of native C integer types, `char`, `short`, `int`, `long`,
... as well as their `unsigned` variants, guarantees a lower bound on their size,
e.g., `short` is _at least_ 16 bits wide and _at least_ as wide as `char`.

Their exact sizes are _implementation-defined_.

Libraries like `libc` use knowledge of this _implementation-defined_ behavior on
each platform to select a layout-compatible Rust fixed-width integer type when
interfacing with native C integer types (e.g. `libc::c_int`).
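
As a sketch of this idea that avoids an external dependency, the standard library's `std::os::raw` module offers aliases analogous to the `libc` ones (using it here instead of `libc` is a substitution made for this example). The assertions only check the lower bounds mentioned above, since exact sizes vary by platform. The helper `bits` is a hypothetical name.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int, c_long, c_short};

/// Hypothetical helper: width of `T` in bits.
fn bits<T>() -> usize {
    size_of::<T>() * 8
}

fn main() {
    // Rust only supports platforms where C `char` is 8 bits wide.
    assert_eq!(bits::<c_char>(), 8);

    // Lower bounds: exact sizes are implementation-defined.
    assert!(bits::<c_short>() >= 16);
    assert!(bits::<c_short>() >= bits::<c_char>());
    assert!(bits::<c_int>() >= bits::<c_short>());
    assert!(bits::<c_long>() >= 32);

    println!("c_int is {} bits, c_long is {} bits", bits::<c_int>(), bits::<c_long>());
}
```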

> **Note**: Rust does not support C platforms on which the C native integer
> types are not compatible with any of Rust's fixed-width integer types (e.g.
> because of padding bits, lack of 2's complement, etc.).

## Fixed-width floating point types

Rust's `f32` and `f64` single (32-bit) and double (64-bit) precision
floating-point types have [IEEE-754] `binary32` and `binary64` floating-point
layouts, respectively.
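
The `binary32` and `binary64` layouts can be observed through the bit patterns of a few values. In this sketch, `decompose_f32` is a hypothetical helper that splits an `f32` into its sign, biased exponent, and fraction fields.

```rust
use std::mem::size_of;

/// Hypothetical helper: splits an `f32` into its IEEE-754 `binary32`
/// fields as (sign, biased exponent, fraction).
fn decompose_f32(x: f32) -> (u32, u32, u32) {
    let bits = x.to_bits();
    (bits >> 31, (bits >> 23) & 0xFF, bits & 0x007F_FFFF)
}

fn main() {
    assert_eq!(size_of::<f32>(), 4);
    assert_eq!(size_of::<f64>(), 8);

    // 1.0 has a zero sign bit, a biased exponent equal to the bias,
    // and a zero fraction.
    assert_eq!(1.0f32.to_bits(), 0x3F80_0000);
    assert_eq!(1.0f64.to_bits(), 0x3FF0_0000_0000_0000);

    // -2.0: sign 1, biased exponent 127 + 1, zero fraction.
    assert_eq!(decompose_f32(-2.0), (1, 128, 0));

    // All-ones exponent with a zero fraction encodes infinity.
    assert_eq!(f32::from_bits(0x7F80_0000), f32::INFINITY);
}
```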

When the platform's `"math.h"` header defines the `__STDC_IEC_559__` macro,
Rust's floating-point types are safe to use directly in C FFI where the
appropriate C types are expected (`f32` for `float`, `f64` for `double`).

If the C platform's `"math.h"` header does not define the `__STDC_IEC_559__`
macro, whether `f32` and `f64` are safe to use in C FFI, and for which C types,
is _implementation-defined_.

> **Note**: the `libc` crate uses knowledge of each platform's
> _implementation-defined_ behavior to provide portable `libc::c_float` and
> `libc::c_double` types that can be used to safely interface with C via FFI.

[IEEE-754]: https://en.wikipedia.org/wiki/IEEE_754
1 change: 0 additions & 1 deletion reference/src/optimizations.md

This file was deleted.