|
1 | 1 | # Layout of Boolean, Floating Point, and Integral Types
|
2 |
| -This chapter represents the consensus from issue [#9]. It documents the memory layout and considerations for `bool`, `usize`, `isize`, floating point types, and integral types. |
| 2 | + |
| 3 | +This chapter represents the consensus from issue [#9]. It documents the memory |
| 4 | +layout and considerations for `bool`, floating point types (`f{32, 64}`), and |
| 5 | +integral types (`{i,u}{8,16,32,64,128,size}`). |
| 6 | + |
| 7 | +These types are all scalar types, representing a single value, and have no |
| 8 | +layout `#[repr()]` flags. |
3 | 9 |
|
4 | 10 | [#9]: https://github.com/rust-rfcs/unsafe-code-guidelines/issues/9
|
5 | 11 |
|
6 |
| -## Overview |
7 |
| -These are all scalar types, representing a single value. These types have no layout variants (no `#[repr(C)]` or `#[repr(Rust)]`). Their size is fixed and well-defined across FFI boundaries and map to their corresponding integral types in the C ABI. |
8 |
| -- `bool`: 1 byte |
9 |
| - - any `bool` can be cast into an integer, taking on the values 1 (true) or 0 (false) |
10 |
| -- `usize`, `isize`: pointer-sized unsigned/signed integer type |
11 |
| -- `u8` .. `u128`, `i8` .. `i128` |
12 |
| - - {8, 16, 32, 64, 128}-bit unsigned integer |
13 |
| - - {8, 16, 32, 64, 128}-bit signed integer |
14 |
| -- `f32`, `f64` |
15 |
| - - IEEE floats |
16 |
| - - 32-bit or 64-bit |
17 |
| -- `char` |
18 |
| - - C++ char: equivalent to either `i8`/`u8` |
19 |
| - - Rust char: 32-bit |
20 |
| - - not ABI compatible |
21 |
| - - represents [Unicode scalar value](http://www.unicode.org/glossary/#unicode_scalar_value) |
| 12 | +## `bool` |
| 13 | + |
| 14 | +Rust's `bool` has the same layout as C17's `_Bool`, that is, its size and |
| 15 | +alignment are implementation-defined. Any `bool` can be cast into an integer, |
| 16 | +taking on the values 1 (`true`) or 0 (`false`). |
| 17 | + |
| 18 | +> **Note**: on all platforms that Rust currently supports, its size and |
| 19 | +> alignment are 1, and its ABI class is `INTEGER` - see [Rust Layout and ABIs]. |
| 20 | +
|
| 21 | +[Rust Layout and ABIs]: https://gankro.github.io/blah/rust-layouts-and-abis/#the-layoutsabis-of-builtins |
| 22 | + |
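The `bool` guarantees above can be checked directly. The following is a minimal sketch, assuming a currently supported platform (where size and alignment are 1); `bool_to_u8` is a hypothetical helper introduced only for illustration.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: converts a `bool` to its integer value with an `as` cast.
fn bool_to_u8(b: bool) -> u8 {
    b as u8
}

fn main() {
    // Expected to hold on every platform Rust currently supports.
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(align_of::<bool>(), 1);

    // Casting yields exactly 1 for `true` and 0 for `false`.
    assert_eq!(bool_to_u8(true), 1);
    assert_eq!(bool_to_u8(false), 0);
}
```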
| 23 | +## `char` |
| 24 | + |
| 25 | +Rust's `char` is 32 bits wide and represents a [Unicode scalar value](http://www.unicode.org/glossary/#unicode_scalar_value). |
| 26 | + |
| 27 | +> **Note**: Rust's `char` type is not layout compatible with C / C++ `char` types. |
| 28 | +> The C / C++ `char` types correspond to either Rust's `i8` or `u8` types on all |
| 29 | +> currently supported platforms, depending on their signedness. Rust does not |
| 30 | +> support C platforms in which C `char` is not 8-bit wide. |
22 | 31 |
|
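As a rough illustration of the `char` description above, the sketch below checks the 32-bit size and shows that only Unicode scalar values are accepted. `is_scalar_value` is a hypothetical helper name, not part of the standard library.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns `true` if `code` is a Unicode scalar value,
/// i.e. a value that a Rust `char` is allowed to hold.
fn is_scalar_value(code: u32) -> bool {
    char::from_u32(code).is_some()
}

fn main() {
    // A Rust `char` occupies 4 bytes, unlike a C `char`.
    assert_eq!(size_of::<char>(), 4);

    assert!(is_scalar_value(0x41)); // 'A'
    assert!(is_scalar_value(0x10FFFF)); // largest scalar value
    assert!(!is_scalar_value(0xD800)); // surrogate code point, not a scalar value
    assert!(!is_scalar_value(0x11_0000)); // beyond the Unicode range
}
```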
23 | 32 | ## `usize`/`isize`
|
24 |
| -Types `usize` and `isize` are committed to having the same size as a native pointer on the platform. The layout of `usize` determines the following: |
25 |
| -- how much a pointer of a certain type can be offseted, |
26 |
| -- the maximum size of Rust objects (because size_of/size_of_val return `usize`), |
27 |
| -- the maximum number of elements in an array (`[T; N: usize]`), |
28 |
| -- `usize`/`isize` in C FFI are compatible with C's `uintptr_t` / `intptr_t` (and have the same size and alignment). |
29 | 33 |
|
30 |
| -The maximum size of any single value must fit within `usize` to [ensure that pointer diff is representable](https://github.com/rust-rfcs/unsafe-code-guidelines/pull/5#discussion_r212703192). |
| 34 | +The `usize` and `isize` types are pointer-sized unsigned and signed integers. |
| 35 | +They have the same layout as the [pointer types] for which the pointee is |
| 36 | +`Sized`, and are layout compatible with C's `uintptr_t` and `intptr_t` types. |
| 37 | + |
| 38 | +> **Note**: Rust's `usize` and C's `unsigned` types are **not** equivalent. C's |
| 39 | +> `unsigned` is at least as large as a short, allowed to have padding bits, etc. |
| 40 | +> but it is not necessarily pointer-sized. |
31 | 41 |
|
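The relationship between `usize`/`isize` and pointers to `Sized` pointees can be sketched as a size and alignment comparison. `matches_thin_pointer` is a hypothetical helper, and `*const u8` is just one arbitrary choice of pointer to a `Sized` type.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: checks whether `T` has the same size and alignment
/// as a raw pointer to a `Sized` pointee (`*const u8` is used as the sample).
fn matches_thin_pointer<T>() -> bool {
    size_of::<T>() == size_of::<*const u8>() && align_of::<T>() == align_of::<*const u8>()
}

fn main() {
    assert!(matches_thin_pointer::<usize>());
    assert!(matches_thin_pointer::<isize>());
    // The pointee type does not matter as long as it is `Sized`.
    assert!(matches_thin_pointer::<*mut [u64; 4]>());
}
```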
32 |
| -`usize` and C’s `unsized` are *not* equivalent. |
| 42 | +The layout of `usize` determines the following: |
33 | 43 |
|
34 |
| -## Booleans |
35 |
| -Rust's `bool` has the same layout as C17's` _Bool`, that is, its size and alignment are implementation-defined. |
| 44 | +- the maximum size of Rust objects (`size_of` and `size_of_val` return `usize`), |
| 45 | +- the maximum number of elements in an array (`[T; N: usize]`), |
| 46 | +- how far a pointer of a certain type can be offset (limited by `usize::max_value()`). |
| 47 | + |
| 48 | +> **FIXME**: Pointer `add` operates on `usize`, but pointer `offset` operates on |
| 49 | +> `isize`, so unless by "offset" we mean something different from `ptr.offset` |
| 50 | +> above, `usize::max_value()` does not determine how far a pointer can be |
| 51 | +> "offset". We should probably be more specific here and call out `ptr.add` |
| 52 | +> and `ptr.offset` explicitly. |
| 53 | +
|
| 54 | +The maximum size of any single value must fit within `usize` to [ensure that |
| 55 | +pointer diff is |
| 56 | +representable](https://github.com/rust-rfcs/unsafe-code-guidelines/pull/5#discussion_r212703192). |
36 | 57 |
|
37 |
| -Note: on all platforms that Rust's currently supports, the size and alignment of bool are 1, and its ABI class is INTEGER. |
| 58 | +> **FIXME**: This does not make sense. We state that the layout of `usize` |
| 59 | +> determines the maximum size of an object, and then argue that this is to |
| 60 | +> ensure that pointer diff is representable, which won't be the case if the size |
| 61 | +> of an object is `usize::max_value()`. The link cited actually states that, right |
| 62 | +> now, the largest size of a Rust object is limited by `isize::max_value()`. |
38 | 63 |
|
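The distinction raised in the two FIXME notes can be made concrete. The sketch below shows that `ptr.add` takes a `usize` while `ptr.offset` takes an `isize`, and that, on recent toolchains (an assumption about the toolchain in use), `std::alloc::Layout` rejects sizes above `isize::MAX`. All three function names are hypothetical helpers.

```rust
use std::alloc::Layout;

/// Hypothetical helper: reads element `i` through `ptr.add`, which takes a `usize`.
fn read_forward(data: &[u32], i: usize) -> u32 {
    assert!(i < data.len());
    // SAFETY: `i` is in bounds, so the pointer stays inside `data`.
    unsafe { *data.as_ptr().add(i) }
}

/// Hypothetical helper: reads the element `delta` positions away from `start`
/// through `ptr.offset`, which takes a signed `isize`.
fn read_relative(data: &[u32], start: usize, delta: isize) -> u32 {
    let target = start as isize + delta;
    assert!(start < data.len() && target >= 0 && (target as usize) < data.len());
    // SAFETY: both `start` and `start + delta` are in bounds.
    unsafe { *data.as_ptr().add(start).offset(delta) }
}

/// Hypothetical helper: whether `bytes` is accepted as the size of a layout.
fn is_valid_size(bytes: usize) -> bool {
    Layout::from_size_align(bytes, 1).is_ok()
}

fn main() {
    let data = [10u32, 20, 30, 40];
    assert_eq!(read_forward(&data, 3), 40);
    assert_eq!(read_relative(&data, 3, -2), 20);

    // Sizes up to `isize::MAX` are accepted; anything larger is rejected.
    assert!(is_valid_size(isize::MAX as usize));
    assert!(!is_valid_size(isize::MAX as usize + 1));
}
```

Only the layout is constructed here; nothing is allocated, so the check is cheap even for the largest accepted size.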
39 |
| -For full ABI compatibility details, see [Gankro’s post](https://gankro.github.io/blah/rust-layouts-and-abis/#the-layoutsabis-of-builtins). |
| 64 | +[pointer types]: ./pointers.md |
40 | 65 |
|
41 | 66 | ## Fixed-width integer types
|
42 | 67 |
|
43 |
| -Rust's fixed-width integer types `{i,u}{8,16,32,64}` have the same layout as the |
44 |
| -C fixed-width integer types from the `<stdint.h>` header |
| 68 | +Rust's signed and unsigned fixed-width integer types `{i,u}{8,16,32,64}` have |
| 69 | +the same layout as the C fixed-width integer types from the `<stdint.h>` header |
45 | 70 | `{u,}int{8,16,32,64}_t`. That is:
|
46 | 71 |
|
47 | 72 | * these types have no padding bits,
|
48 | 73 | * their size exactly matches their bit-width,
|
49 |
| -* negative values of signed integer types are represented using `2`'s complement. |
| 74 | +* negative values of signed integer types are represented using 2's complement. |
| 75 | + |
| 76 | +These properties also hold for Rust's 128-bit wide `{i,u}128` integer types, but |
| 77 | +C does not expose equivalent types in `<stdint.h>`. |
| 78 | + |
| 79 | +Rust fixed-width integer types are therefore safe to use directly in C FFI where |
| 80 | +the corresponding C fixed-width integer types are expected. |
50 | 82 |
|
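The three properties listed for fixed-width integers (no padding bits, size equal to bit-width, 2's complement) can be spot-checked as follows. This is only an illustrative sketch; `bits_of` is a hypothetical helper.

```rust
use std::mem::size_of;

/// Hypothetical helper: exposes the bit pattern of an `i8` as a `u8`.
/// An `as` cast between integers of the same width keeps the bits unchanged.
fn bits_of(x: i8) -> u8 {
    x as u8
}

fn main() {
    // Size in bytes exactly matches the bit-width, including the 128-bit types.
    assert_eq!(size_of::<u8>(), 1);
    assert_eq!(size_of::<i16>(), 2);
    assert_eq!(size_of::<u32>(), 4);
    assert_eq!(size_of::<i64>(), 8);
    assert_eq!(size_of::<u128>(), 16);

    // 2's complement: -1 is all ones, and the minimum value has only the top bit set.
    assert_eq!(bits_of(-1), 0xFF);
    assert_eq!(bits_of(i8::MIN), 0x80);
    assert_eq!((-1i64).to_ne_bytes(), [0xFF; 8]);
}
```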
51 |
| -Therefore these integer types are safe to use directly in C FFI where the |
52 |
| -fixed-width integer types are expected. |
| 83 | +### Layout compatibility with C native integer types |
| 84 | + |
| 85 | +The specification of native C integer types, `char`, `short`, `int`, `long`, |
| 86 | +... as well as their `unsigned` variants, has a lower bound on their size, |
| 87 | +e.g., `short` is _at least_ 16-bit wide and _at least_ as wide as `char`. |
| 88 | +Their actual exact sizes are _implementation-defined_. |
| 89 | + |
| 90 | +Libraries like `libc` use knowledge of this _implementation-defined_ behavior on |
| 91 | +each platform to select a layout-compatible Rust fixed-width integer type when |
| 92 | +interfacing with native C integer types. |
| 93 | + |
| 94 | +> **Note**: Rust does not support C platforms on which the C native integer types |
| 95 | +> are not compatible with any of Rust's fixed-width integer types (e.g. because |
| 96 | +> of padding bits, lack of 2's complement, etc.). |
53 | 97 |
|
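To illustrate the lower bounds described above without pulling in the external `libc` crate, the sketch below uses the standard library's `std::os::raw` aliases instead, which play a similar role. `bit_width` is a hypothetical helper, and only the minimum widths are asserted because the exact sizes are implementation-defined.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int, c_long, c_short};

/// Hypothetical helper: width of `T` in bits.
fn bit_width<T>() -> usize {
    size_of::<T>() * 8
}

fn main() {
    // Rust only supports platforms where C `char` is 8 bits wide.
    assert_eq!(bit_width::<c_char>(), 8);

    // The remaining types only have lower bounds.
    assert!(bit_width::<c_short>() >= 16);
    assert!(bit_width::<c_int>() >= bit_width::<c_short>());
    assert!(bit_width::<c_long>() >= 32);
    assert!(bit_width::<c_long>() >= bit_width::<c_int>());

    // The actual widths depend on the target platform.
    println!("c_int is {} bits, c_long is {} bits", bit_width::<c_int>(), bit_width::<c_long>());
}
```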
54 | 98 | ## Fixed-width floating point types
|
55 | 99 |
|
56 |
| -Rust's `f32` and `f64` types have the same layout as C's `float` and `double` |
57 |
| -types, respectively. Therefore these floating-point types are safe to use |
58 |
| -directly in C FFI where the appropriate C types are expected. |
59 |
| - |
60 |
| -## Relationship to C integer hierarchy |
61 |
| -C integers: |
62 |
| -- char: at least 8 bits |
63 |
| -- short: at least 16 bits (also at least a char) |
64 |
| -- int: at least a short (intended to be a native integer size) |
65 |
| -- long: at least 32 bits (also at least an int) |
66 |
| -- long long: at least 64 bits (also at least a long) |
67 |
| -The C integer types specify a minimum size, but not the exact size. For this reason, Rust integer types are not necessarily compatible with the “corresponding” C integer type. Instead, use the corresponding fixed size data types (e.g. `i64` in Rust would correspond to `int64_t` in C). |
68 |
| - |
69 |
| -## Controversies |
70 |
| -There has been some debate about what to pick as the "official" behavior for bool: |
71 |
| -* Rust does what C does (this is what the lang team decided) |
72 |
| - * and in all cases you care about, that is 1 byte that is 0 or 1 |
73 |
| -or |
74 |
| -* Rust makes it 1 byte with values 0 or 1 |
75 |
| - * and in all cases you care about, this is what C does |
76 |
| - |
77 |
| -Related discussions: [document the size of bool](https://github.com/rust-lang/rust/pull/46156), [bool== _Bool?](https://github.com/rust-rfcs/unsafe-code-guidelines/issues/53#issuecomment-447050232), [bool ABI](https://github.com/rust-lang/rust/pull/46176#issuecomment-359593446) |
78 |
| - |
| 100 | +Rust's `f32` and `f64` single (32-bit) and double (64-bit) precision |
| 101 | +floating-point types have [IEEE-754] `binary32` and `binary64` floating-point |
| 102 | +layouts, respectively. |
| 103 | + |
| 104 | +When the platform's `"math.h"` header defines the `__STDC_IEC_559__` macro, |
| 105 | +Rust's floating-point types are safe to use directly in C FFI where the |
| 106 | +appropriate C types are expected (`f32` for `float`, `f64` for `double`). |
| 107 | + |
| 108 | +If the C platform's `"math.h"` header does not define the `__STDC_IEC_559__` |
| 109 | +macro, whether `f32` and `f64` are safe to use in C FFI, and for which C types, |
| 110 | +is _implementation-defined_. |
79 | 111 |
|
| 112 | +> **Note**: the `libc` crate uses knowledge of each platform's |
| 113 | +> _implementation-defined_ behavior to provide portable `libc::c_float` and |
| 114 | +> `libc::c_double` types that can be used to safely interface with C via FFI. |
80 | 115 |
|
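The `binary32` layout mentioned above (1 sign bit, 8 exponent bits, 23 fraction bits) can be inspected with `to_bits`. The following is a small sketch; `f32_fields` is a hypothetical helper introduced for illustration.

```rust
/// Hypothetical helper: splits an `f32` into its IEEE-754 `binary32` fields,
/// returned as (sign, biased exponent, fraction).
fn f32_fields(x: f32) -> (u32, u32, u32) {
    let bits = x.to_bits();
    (bits >> 31, (bits >> 23) & 0xFF, bits & 0x7F_FFFF)
}

fn main() {
    // 1.0 has a biased exponent of 127 and an empty fraction.
    assert_eq!(1.0f32.to_bits(), 0x3F80_0000);
    assert_eq!(f32_fields(1.0), (0, 127, 0));

    // -2.0 sets the sign bit and has a biased exponent of 128.
    assert_eq!(f32_fields(-2.0), (1, 128, 0));

    // The same encoding scheme applies to `binary64`, with 11 exponent bits.
    assert_eq!(1.0f64.to_bits(), 0x3FF0_0000_0000_0000);
}
```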
| 116 | +[IEEE-754]: https://en.wikipedia.org/wiki/IEEE_754 |