Commit b6d30c5

Re-word the whole document; add two FIXME's
1 parent 02652ba commit b6d30c5

1 file changed

+92
-56
lines changed

@@ -1,80 +1,116 @@
11
# Layout of Boolean, Floating Point, and Integral Types
2-
This chapter represents the consensus from issue [#9]. It documents the memory layout and considerations for `bool`, `usize`, `isize`, floating point types, and integral types.
2+
3+
This chapter represents the consensus from issue [#9]. It documents the memory
4+
layout and considerations for `bool`, floating point types (`f{32, 64}`), and
5+
integral types (`{i,u}{8,16,32,64,128,size}`).
6+
7+
These types are all scalar types, representing a single value, and have no
8+
layout `#[repr()]` flags.
39

410
[#9]: https://github.com/rust-rfcs/unsafe-code-guidelines/issues/9
511

6-
## Overview
7-
These are all scalar types, representing a single value. These types have no layout variants (no `#[repr(C)]` or `#[repr(Rust)]`). Their size is fixed and well-defined across FFI boundaries and map to their corresponding integral types in the C ABI.
8-
- `bool`: 1 byte
9-
- any `bool` can be cast into an integer, taking on the values 1 (true) or 0 (false)
10-
- `usize`, `isize`: pointer-sized unsigned/signed integer type
11-
- `u8` .. `u128`, `i8` .. `i128`
12-
- {8, 16, 32, 64, 128}-bit unsigned integer
13-
- {8, 16, 32, 64, 128}-bit signed integer
14-
- `f32`, `f64`
15-
- IEEE floats
16-
- 32-bit or 64-bit
17-
- `char`
18-
- C++ char: equivalent to either `i8`/`u8`
19-
- Rust char: 32-bit
20-
- not ABI compatible
21-
- represents [Unicode scalar value](http://www.unicode.org/glossary/#unicode_scalar_value)
12+
## `bool`
13+
14+
Rust's `bool` has the same layout as C17's `_Bool`, that is, its size and
15+
alignment are implementation-defined. Any `bool` can be cast into an integer,
16+
taking on the values 1 (`true`) or 0 (`false`).
17+
18+
> **Note**: on all platforms that Rust currently supports, its size and
19+
> alignment are 1, and its ABI class is `INTEGER` - see [Rust Layout and ABIs].
20+
21+
[Rust Layout and ABIs]: https://gankro.github.io/blah/rust-layouts-and-abis/#the-layoutsabis-of-builtins
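The size, alignment, and integer values described for `bool` can be checked directly. The following is a minimal sketch, not part of the original chapter; the helper `bool_to_u8` is a hypothetical name introduced for illustration, and the size and alignment assertions only reflect the platforms Rust currently supports.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: converts a `bool` to an integer with an `as` cast.
fn bool_to_u8(b: bool) -> u8 {
    b as u8
}

fn main() {
    // Size and alignment are 1 on every platform Rust currently supports.
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(align_of::<bool>(), 1);

    // Casting yields exactly 1 for `true` and 0 for `false`.
    assert_eq!(bool_to_u8(true), 1);
    assert_eq!(bool_to_u8(false), 0);

    // `From<bool>` gives the same result without a cast.
    assert_eq!(u8::from(true), 1);
}
```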
22+
23+
## `char`
24+
25+
Rust's `char` is 32 bits wide and represents a [Unicode scalar value](http://www.unicode.org/glossary/#unicode_scalar_value).
26+
27+
> **Note**: Rust's `char` type is not layout compatible with C / C++ `char` types.
28+
> The C / C++ `char` types correspond to either Rust's `i8` or `u8` types on all
29+
> currently supported platforms, depending on their signedness. Rust does not
30+
> support C platforms in which C `char` is not 8-bit wide.
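As a rough illustration of the difference between Rust's `char` and C's `char`, the sketch below compares their sizes and shows that not every 32-bit value is a Unicode scalar value. It assumes the standard library alias `std::ffi::c_char` as the stand-in for C's `char`; `is_scalar_value` is a hypothetical helper.

```rust
use std::ffi::c_char;
use std::mem::size_of;

/// Hypothetical helper: `true` if `value` is a Unicode scalar value,
/// i.e. if it can be represented as a Rust `char`.
fn is_scalar_value(value: u32) -> bool {
    char::from_u32(value).is_some()
}

fn main() {
    // Rust's `char` occupies 4 bytes; C's `char` occupies 1.
    assert_eq!(size_of::<char>(), 4);
    assert_eq!(size_of::<c_char>(), 1);

    assert!(is_scalar_value(0x1F980)); // U+1F980 is a valid scalar value
    assert!(!is_scalar_value(0xD800)); // surrogate code points are excluded
    assert!(!is_scalar_value(0x11_0000)); // beyond U+10FFFF
}
```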
2231
2332
## `usize`/`isize`
24-
Types `usize` and `isize` are committed to having the same size as a native pointer on the platform. The layout of `usize` determines the following:
25-
- how much a pointer of a certain type can be offseted,
26-
- the maximum size of Rust objects (because size_of/size_of_val return `usize`),
27-
- the maximum number of elements in an array (`[T; N: usize]`),
28-
- `usize`/`isize` in C FFI are compatible with C's `uintptr_t` / `intptr_t` (and have the same size and alignment).
2933

30-
The maximum size of any single value must fit within `usize` to [ensure that pointer diff is representable](https://github.com/rust-rfcs/unsafe-code-guidelines/pull/5#discussion_r212703192).
34+
The `usize` and `isize` types are pointer-sized unsigned and signed integers, respectively.
35+
They have the same layout as the [pointer types] for which the pointee is
36+
`Sized`, and are layout compatible with C's `uintptr_t` and `intptr_t` types.
37+
38+
> **Note**: Rust's `usize` and C's `unsigned` types are **not** equivalent. C's
39+
> `unsigned` is at least as large as a short, allowed to have padding bits, etc.
40+
> but it is not necessarily pointer-sized.
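The relationship between `usize` and pointers to `Sized` pointees can be sketched as follows. This is an illustration only: `thin_pointer_matches_usize` is a hypothetical helper, and `std::ffi::c_uint` is assumed here as the Rust-side alias for C's `unsigned`. The size of `c_uint` is printed rather than asserted because it is platform-dependent.

```rust
use std::ffi::c_uint;
use std::mem::{align_of, size_of};

/// Hypothetical helper: checks that a pointer to a `Sized` pointee has the
/// same size and alignment as `usize`.
fn thin_pointer_matches_usize<T>() -> bool {
    size_of::<*const T>() == size_of::<usize>()
        && align_of::<*const T>() == align_of::<usize>()
}

fn main() {
    assert!(thin_pointer_matches_usize::<u8>());
    assert!(thin_pointer_matches_usize::<[u64; 16]>());
    assert_eq!(size_of::<isize>(), size_of::<usize>());

    // C's `unsigned` need not be pointer-sized; on many 64-bit targets it is 4 bytes.
    println!(
        "usize: {} bytes, c_uint: {} bytes",
        size_of::<usize>(),
        size_of::<c_uint>()
    );
}
```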
3141
32-
`usize` and C’s `unsized` are *not* equivalent.
42+
The layout of `usize` determines the following:
3343

34-
## Booleans
35-
Rust's `bool` has the same layout as C17's` _Bool`, that is, its size and alignment are implementation-defined.
44+
- the maximum size of Rust objects (`size_of` and `size_of_val` return `usize`),
45+
- the maximum number of elements in an array (`[T; N: usize]`),
46+
- how much a pointer of a certain type can be offseted (limited by `usize::max_value()`).
47+
48+
> **FIXME**: Pointer `add` operates on `usize`, but pointer `offset` operates on
49+
> `isize`, so unless by "offseted" we mean something different from `ptr.offset`
50+
> above, `usize::max_value()` does not determine how much a pointer can be
51+
> "offseted". We should probably be more specific here and call out `ptr.add`
52+
> and `ptr.offset` explicitly.
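To make the distinction raised in this FIXME concrete, the sketch below uses both methods on a small array: `add` takes a `usize` count and can only move forward, while `offset` takes an `isize` count and can move in either direction. The helpers `read_forward` and `read_relative` are hypothetical names introduced for illustration.

```rust
/// Hypothetical helper: reads the byte `count` elements after `base`.
///
/// # Safety
/// `base.add(count)` must stay within the same allocation and be readable.
unsafe fn read_forward(base: *const u8, count: usize) -> u8 {
    unsafe { *base.add(count) }
}

/// Hypothetical helper: reads the byte `count` elements away from `base`,
/// where `count` may be negative.
///
/// # Safety
/// `base.offset(count)` must stay within the same allocation and be readable.
unsafe fn read_relative(base: *const u8, count: isize) -> u8 {
    unsafe { *base.offset(count) }
}

fn main() {
    let data = [10u8, 20, 30, 40];
    let start = data.as_ptr();
    unsafe {
        // `add` takes an unsigned count.
        assert_eq!(read_forward(start, 2), 30);

        // `offset` takes a signed count, so it can also step backwards.
        let third = start.add(2);
        assert_eq!(read_relative(third, -2), 10);
        assert_eq!(read_relative(third, 1), 40);
    }
}
```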
53+
54+
The maximum size of any single value must fit within `usize` to [ensure that
55+
pointer diff is
56+
representable](https://github.com/rust-rfcs/unsafe-code-guidelines/pull/5#discussion_r212703192).
3657

37-
Note: on all platforms that Rust's currently supports, the size and alignment of bool are 1, and its ABI class is INTEGER.
58+
> **FIXME**: This does not make sense. We state that the layout of `usize`
59+
> determines the maximum size of an object, and then argue that this is to
60+
> ensure that pointer diff is representable, which won't be the case if the size
61+
> of an object is `usize::max_value()`. The link cited actually states that, right
62+
> now, the largest size of a Rust object is limited by `isize::max_value()`.
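One place where the `isize::max_value()` limit mentioned in this FIXME can be observed is `std::alloc::Layout`. The sketch below assumes a recent toolchain, on which `Layout::from_size_align` rejects sizes above `isize::MAX`; older toolchains may behave differently. `is_valid_byte_size` is a hypothetical helper.

```rust
use std::alloc::Layout;

/// Hypothetical helper: `true` if a byte-aligned object of `size` bytes
/// has a valid `Layout`.
fn is_valid_byte_size(size: usize) -> bool {
    Layout::from_size_align(size, 1).is_ok()
}

fn main() {
    let limit = isize::MAX as usize;

    // Sizes up to `isize::MAX` are accepted (assuming a recent toolchain).
    assert!(is_valid_byte_size(limit));

    // Anything larger is rejected, even though it still fits in a `usize`.
    assert!(!is_valid_byte_size(limit + 1));
    assert!(!is_valid_byte_size(usize::MAX));
}
```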
3863
39-
For full ABI compatibility details, see [Gankro’s post](https://gankro.github.io/blah/rust-layouts-and-abis/#the-layoutsabis-of-builtins).
64+
[pointer types]: ./pointers.md
4065

4166
## Fixed-width integer types
4267

43-
Rust's fixed-width integer types `{i,u}{8,16,32,64}` have the same layout as the
44-
C fixed-width integer types from the `<stdint.h>` header
68+
Rust's signed and unsigned fixed-width integer types `{i,u}{8,16,32,64}` have
69+
the same layout as the C fixed-width integer types from the `<stdint.h>` header
4570
`{u,}int{8,16,32,64}_t`. That is:
4671

4772
* these types have no padding bits,
4873
* their size exactly matches their bit-width,
49-
* negative values of signed integer types are represented using `2`'s complement.
74+
* negative values of signed integer types are represented using 2's complement.
75+
76+
These properties also hold for Rust's 128-bit wide `{i,u}128` integer types, but
77+
C does not expose equivalent types in `<stdint.h>`.
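The listed properties can be spot-checked from Rust. The sketch below is illustrative rather than exhaustive; `twos_complement_bits` is a hypothetical helper that reinterprets an `i8` as its underlying bit pattern.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns the bit pattern of an `i8` as a `u8`.
fn twos_complement_bits(value: i8) -> u8 {
    value as u8
}

fn main() {
    // The size in bytes exactly matches the bit-width: no padding bits.
    assert_eq!(size_of::<i8>() * 8, i8::BITS as usize);
    assert_eq!(size_of::<u16>() * 8, u16::BITS as usize);
    assert_eq!(size_of::<i32>() * 8, i32::BITS as usize);
    assert_eq!(size_of::<u64>() * 8, u64::BITS as usize);
    assert_eq!(size_of::<i128>() * 8, i128::BITS as usize);

    // Negative values use 2's complement.
    assert_eq!(twos_complement_bits(-1), 0xFF);
    assert_eq!(twos_complement_bits(-2), 0xFE);
    assert_eq!(twos_complement_bits(i8::MIN), 0x80);
    assert_eq!((-1i128) as u128, u128::MAX);
}
```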
78+
79+
Rust fixed-width integer types are therefore safe to use directly in C FFI where
80+
the corresponding C fixed-width integer types are expected.
5082

51-
Therefore these integer types are safe to use directly in C FFI where the
52-
fixed-width integer types are expected.
83+
### Layout compatibility with C native integer types
84+
85+
The specification of native C integer types, `char`, `short`, `int`, `long`,
86+
... as well as their `unsigned` variants, has a lower bound on their size,
87+
e.g., `short` is _at least_ 16-bit wide and _at least_ as wide as `char`.
88+
Their actual exact sizes are _implementation-defined_.
89+
90+
Libraries like `libc` use knowledge of this _implementation-defined_ behavior on
91+
each platform to select a layout-compatible Rust fixed-width integer type when
92+
interfacing with native C integer types.
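The lower bounds that C guarantees can be checked against Rust's per-platform aliases. This sketch uses the standard library's `std::ffi::c_*` aliases, which are assumed here to play the same role as the ones in the `libc` crate; `c_long_bits` is a hypothetical helper. The exact sizes are printed rather than asserted because they differ between platforms.

```rust
use std::ffi::{c_int, c_long, c_longlong, c_short};
use std::mem::size_of;

/// Hypothetical helper: width of C's `long` in bits on the current target.
fn c_long_bits() -> usize {
    size_of::<c_long>() * 8
}

fn main() {
    // Only the lower bounds are portable.
    assert!(size_of::<c_short>() >= 2);
    assert!(size_of::<c_int>() >= size_of::<c_short>());
    assert!(c_long_bits() >= 32);
    assert!(size_of::<c_long>() >= size_of::<c_int>());
    assert!(size_of::<c_longlong>() >= 8);

    // The exact widths are implementation-defined, e.g. `long` is 32-bit on
    // some 64-bit targets and 64-bit on others.
    println!(
        "int: {} bits, long: {} bits",
        size_of::<c_int>() * 8,
        c_long_bits()
    );
}
```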
93+
94+
> **Note**: Rust does not support C platforms on which the C native integer types
95+
> are not compatible with any of Rust's fixed-width integer types (e.g. because
96+
> of padding-bits, lack of 2's complement, etc.).
5397
5498
## Fixed-width floating point types
5599

56-
Rust's `f32` and `f64` types have the same layout as C's `float` and `double`
57-
types, respectively. Therefore these floating-point types are safe to use
58-
directly in C FFI where the appropriate C types are expected.
59-
60-
## Relationship to C integer hierarchy
61-
C integers:
62-
- char: at least 8 bits
63-
- short: at least 16 bits (also at least a char)
64-
- int: at least a short (intended to be a native integer size)
65-
- long: at least 32 bits (also at least an int)
66-
- long long: at least 64 bits (also at least a long)
67-
The C integer types specify a minimum size, but not the exact size. For this reason, Rust integer types are not necessarily compatible with the “corresponding” C integer type. Instead, use the corresponding fixed size data types (e.g. `i64` in Rust would correspond to `int64_t` in C).
68-
69-
## Controversies
70-
There has been some debate about what to pick as the "official" behavior for bool:
71-
* Rust does what C does (this is what the lang team decided)
72-
* and in all cases you care about, that is 1 byte that is 0 or 1
73-
or
74-
* Rust makes it 1 byte with values 0 or 1
75-
* and in all cases you care about, this is what C does
76-
77-
Related discussions: [document the size of bool](https://github.com/rust-lang/rust/pull/46156), [bool== _Bool?](https://github.com/rust-rfcs/unsafe-code-guidelines/issues/53#issuecomment-447050232), [bool ABI](https://github.com/rust-lang/rust/pull/46176#issuecomment-359593446)
78-
100+
Rust's `f32` and `f64` single (32-bit) and double (64-bit) precision
101+
floating-point types have [IEEE-754] `binary32` and `binary64` floating-point
102+
layouts, respectively.
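The `binary32` layout (1 sign bit, 8 exponent bits, 23 fraction bits) can be inspected with `to_bits`. The following is a small sketch; `f32_fields` is a hypothetical helper that splits the bit pattern into those three fields.

```rust
/// Hypothetical helper: splits an `f32` into its IEEE-754 `binary32`
/// fields as (sign, biased exponent, fraction).
fn f32_fields(x: f32) -> (u32, u32, u32) {
    let bits = x.to_bits();
    (bits >> 31, (bits >> 23) & 0xFF, bits & 0x7F_FFFF)
}

fn main() {
    // 1.0 = +1.0 * 2^0, so the biased exponent is 127 and the fraction is 0.
    assert_eq!(1.0f32.to_bits(), 0x3F80_0000);
    assert_eq!(f32_fields(1.0), (0, 127, 0));

    // -2.0 = -1.0 * 2^1: sign bit set, biased exponent 128.
    assert_eq!(f32_fields(-2.0), (1, 128, 0));

    // 1.5 = +1.1b * 2^0: the top fraction bit is set.
    assert_eq!(f32_fields(1.5), (0, 127, 0x40_0000));

    // `binary64` uses 11 exponent bits with a bias of 1023.
    assert_eq!(1.0f64.to_bits(), 0x3FF0_0000_0000_0000);
}
```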
103+
104+
When the platform's `"math.h"` header defines the `__STDC_IEC_559__` macro,
105+
Rust's floating-point types are safe to use directly in C FFI where the
106+
appropriate C types are expected (`f32` for `float`, `f64` for `double`).
107+
108+
If the C platform's `"math.h"` header does not define the `__STDC_IEC_559__`
109+
macro, whether using `f32` and `f64` in C FFI is safe, and for which C types, is
110+
_implementation-defined_.
79111

112+
> **Note**: the `libc` crate uses knowledge of each platform's
113+
> _implementation-defined_ behavior to provide portable `libc::c_float` and
114+
> `libc::c_double` types that can be used to safely interface with C via FFI.
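A minimal sketch of writing FFI-facing signatures in terms of these aliases follows. It uses `std::ffi::c_float` and `std::ffi::c_double` from the standard library as an assumed stand-in for the `libc` crate's types of the same name; `to_c_double` is a hypothetical helper.

```rust
use std::ffi::{c_double, c_float};
use std::mem::size_of;

/// Hypothetical helper: passes a Rust `f64` where a C `double` is expected.
/// This compiles without a cast because `c_double` is an alias of `f64`.
fn to_c_double(x: f64) -> c_double {
    x
}

fn main() {
    assert_eq!(size_of::<c_float>(), size_of::<f32>());
    assert_eq!(size_of::<c_double>(), size_of::<f64>());
    assert_eq!(to_c_double(0.25), 0.25);
}
```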
80115
116+
[IEEE-754]: https://en.wikipedia.org/wiki/IEEE_754
