Commit 5789106

many many pnkfelix fixes
1 parent 05bb1db commit 5789106

7 files changed: +200 -126 lines changed

src/doc/tarpl/exotic-sizes.md

Lines changed: 49 additions & 30 deletions
@@ -9,19 +9,24 @@ is not always the case, however.

 # Dynamically Sized Types (DSTs)

-Rust also supports types without a statically known size. On the surface, this
-is a bit nonsensical: Rust *must* know the size of something in order to work
-with it! DSTs are generally produced as views, or through type-erasure of types
-that *do* have a known size. Due to their lack of a statically known size, these
-types can only exist *behind* some kind of pointer. They consequently produce a
-*fat* pointer consisting of the pointer and the information that *completes*
-them.
-
-For instance, the slice type, `[T]`, is some statically unknown number of
-elements stored contiguously. `&[T]` consequently consists of a `(&T, usize)`
-pair that specifies where the slice starts, and how many elements it contains.
-Similarly, Trait Objects support interface-oriented type erasure through a
-`(data_ptr, vtable_ptr)` pair.
+Rust in fact supports Dynamically Sized Types (DSTs): types without a statically
+known size or alignment. On the surface, this is a bit nonsensical: Rust *must*
+know the size and alignment of something in order to correctly work with it! In
+this regard, DSTs are not normal types. Due to their lack of a statically known
+size, these types can only exist behind some kind of pointer. Any pointer to a
+DST consequently becomes a *fat* pointer consisting of the pointer and the
+information that "completes" them (more on this below).
+
+There are two major DSTs exposed by the language: trait objects, and slices.
+
+A trait object represents some type that implements the traits it specifies.
+The exact original type is *erased* in favour of runtime reflection
+with a vtable containing all the information necessary to use the type.
+This is the information that completes a trait object: a pointer to its vtable.
+
+A slice is simply a view into some contiguous storage -- typically an array or
+`Vec`. The information that completes a slice is just the number of elements
+it points to.
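
Note on the added text above: the "fat pointer" claim can be checked directly. The following is a minimal sketch, assuming a current stable toolchain (the `dyn` syntax postdates this commit); `pointer_sizes` is a name made up for illustration, and the two-word sizes are what today's compiler produces rather than something this text guarantees.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Returns the size in bytes of a thin reference, a slice reference,
/// and a trait object reference, in that order.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u8>(),        // thin: just an address
        size_of::<&[u8]>(),      // fat: address + element count
        size_of::<&dyn Debug>(), // fat: address + vtable pointer
    )
}

fn main() {
    let (thin, slice, object) = pointer_sizes();
    println!("thin: {}, slice: {}, trait object: {}", thin, slice, object);

    // The extra word carried by a slice reference is its length.
    let data = [1u8, 2, 3, 4];
    let view: &[u8] = &data[1..3];
    assert_eq!(view.len(), 2);
}
```

On common targets both fat references come out at twice the size of the thin one.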

 Structs can actually store a single DST directly as their last field, but this
 makes them a DST as well:
@@ -34,8 +39,8 @@ struct Foo {
 }
 ```

-**NOTE: As of Rust 1.0 struct DSTs are broken if the last field has
-a variable position based on its alignment.**
+**NOTE: [As of Rust 1.0 struct DSTs are broken if the last field has
+a variable position based on its alignment][dst-issue].**

@@ -56,22 +61,32 @@ struct Baz {
 }
 ```

-On their own, ZSTs are, for obvious reasons, pretty useless. However as with
-many curious layout choices in Rust, their potential is realized in a generic
-context.
-
-Rust largely understands that any operation that produces or stores a ZST can be
-reduced to a no-op. For instance, a `HashSet<T>` can be effeciently implemented
-as a thin wrapper around `HashMap<T, ()>` because all the operations `HashMap`
-normally does to store and retrieve values will be completely stripped in
-monomorphization.
-
-Similarly `Result<(), ()>` and `Option<()>` are effectively just fancy `bool`s.
+On their own, Zero Sized Types (ZSTs) are, for obvious reasons, pretty useless.
+However as with many curious layout choices in Rust, their potential is realized
+in a generic context: Rust largely understands that any operation that produces
+or stores a ZST can be reduced to a no-op. First off, storing it doesn't even
+make sense -- it doesn't occupy any space. Also there's only one value of that
+type, so anything that loads it can just produce it from the aether -- which is
+also a no-op since it doesn't occupy any space.
+
+One of the most extreme example's of this is Sets and Maps. Given a
+`Map<Key, Value>`, it is common to implement a `Set<Key>` as just a thin wrapper
+around `Map<Key, UselessJunk>`. In many languages, this would necessitate
+allocating space for UselessJunk and doing work to store and load UselessJunk
+only to discard it. Proving this unnecessary would be a difficult analysis for
+the compiler.
+
+However in Rust, we can just say that `Set<Key> = Map<Key, ()>`. Now Rust
+statically knows that every load and store is useless, and no allocation has any
+size. The result is that the monomorphized code is basically a custom
+implementation of a HashSet with none of the overhead that HashMap would have to
+support values.
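
Note on the added text above: the `Set<Key> = Map<Key, ()>` idea can be sketched in a few lines. This is an illustration only, not the standard library's `HashSet`; the `UnitSet` type and its methods are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::mem::size_of;

/// Hypothetical set type: a thin wrapper over `HashMap<K, ()>`.
struct UnitSet<K> {
    map: HashMap<K, ()>,
}

impl<K: Eq + Hash> UnitSet<K> {
    fn new() -> Self {
        UnitSet { map: HashMap::new() }
    }

    /// Returns `true` if the key was not already present.
    fn insert(&mut self, key: K) -> bool {
        self.map.insert(key, ()).is_none()
    }

    fn contains(&self, key: &K) -> bool {
        self.map.contains_key(key)
    }
}

fn main() {
    // `()` occupies no space, and neither does an array of them.
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(size_of::<[(); 1024]>(), 0);
    // A key paired with `()` is no bigger than the key alone.
    assert_eq!(size_of::<(u64, ())>(), size_of::<u64>());

    let mut set = UnitSet::new();
    assert!(set.insert("a"));
    assert!(!set.insert("a"));
    assert!(set.contains(&"a"));
}
```

Because the value type is `()`, the wrapper carries no per-entry payload beyond the key itself.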

 Safe code need not worry about ZSTs, but *unsafe* code must be careful about the
 consequence of types with no size. In particular, pointer offsets are no-ops,
-and standard allocators (including jemalloc, the one used by Rust) generally
-consider passing in `0` as Undefined Behaviour.
+and standard allocators (including jemalloc, the one used by default in Rust)
+generally consider passing in `0` for the size of an allocation as Undefined
+Behaviour.
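
Note on the paragraph above: the "pointer offsets are no-ops" point is easy to observe without any `unsafe`. This is a small sketch, assuming `wrapping_add` (which only computes an address and never dereferences); `address_delta` is a helper invented for this example.

```rust
use std::mem::size_of;

/// Advances `ptr` by `count` elements and reports how far the address
/// moved, in bytes. Nothing is dereferenced.
fn address_delta<T>(ptr: *const T, count: usize) -> usize {
    let moved = ptr.wrapping_add(count);
    (moved as usize).wrapping_sub(ptr as usize)
}

fn main() {
    let unit = ();
    let number = 7u32;

    // Offsetting a pointer to a zero-sized type does not move it at all.
    assert_eq!(address_delta(&unit as *const (), 10), 0);

    // Offsetting a pointer to a sized type moves it by count * size_of::<T>().
    assert_eq!(address_delta(&number as *const u32, 10), 10 * size_of::<u32>());
}
```

Unsafe code that iterates by comparing a start and end pointer therefore needs a separate strategy for ZSTs, since the two pointers never differ.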
@@ -93,11 +108,12 @@ return a Result in general, but a specific case actually is infallible. It's
 actually possible to communicate this at the type level by returning a
 `Result<T, Void>`. Consumers of the API can confidently unwrap such a Result
 knowing that it's *statically impossible* for this value to be an `Err`, as
-this would require providing a value of type Void.
+this would require providing a value of type `Void`.
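
Note on the paragraph above: here is a minimal sketch of consuming a `Result<T, Void>` with no panic path at all. `parse_always` and `into_ok` are hypothetical names for illustration; the empty `match` is ordinary stable Rust and does not depend on the speculative optimizations the text goes on to discuss.

```rust
/// An enum with no variants: no value of this type can ever be constructed.
enum Void {}

/// Hypothetical API that must return a `Result` but can never actually fail.
fn parse_always(input: &str) -> Result<usize, Void> {
    Ok(input.len())
}

/// Extracts the `Ok` value without any possibility of panicking.
fn into_ok<T>(result: Result<T, Void>) -> T {
    match result {
        Ok(value) => value,
        // A match with zero arms is exhaustive for a type with zero variants.
        Err(never) => match never {},
    }
}

fn main() {
    assert_eq!(into_ok(parse_always("abc")), 3);
}
```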

 In principle, Rust can do some interesting analyses and optimizations based
 on this fact. For instance, `Result<T, Void>` could be represented as just `T`,
-because the Err case doesn't actually exist. The following *could* also compile:
+because the `Err` case doesn't actually exist. The following *could* also
+compile:

 ```rust,ignore
 enum Void {}
@@ -116,3 +132,6 @@ actually valid to construct, but dereferencing them is Undefined Behaviour
 because that doesn't actually make sense. That is, you could model C's `void *`
 type with `*const Void`, but this doesn't necessarily gain anything over using
 e.g. `*const ()`, which *is* safe to randomly dereference.
+
+
+[dst-issue]: https://github.com/rust-lang/rust/issues/26403

src/doc/tarpl/meet-safe-and-unsafe.md

Lines changed: 18 additions & 18 deletions
@@ -2,7 +2,7 @@

 Programmers in safe "high-level" languages face a fundamental dilemma. On one
 hand, it would be *really* great to just say what you want and not worry about
-how it's done. On the other hand, that can lead to some *really* poor
+how it's done. On the other hand, that can lead to unacceptably poor
 performance. It may be necessary to drop down to less clear or idiomatic
 practices to get the performance characteristics you want. Or maybe you just
 throw up your hands in disgust and decide to shell out to an implementation in
@@ -12,21 +12,22 @@ Worse, when you want to talk directly to the operating system, you *have* to
 talk to an unsafe language: *C*. C is ever-present and unavoidable. It's the
 lingua-franca of the programming world.
 Even other safe languages generally expose C interfaces for the world at large!
-Regardless of *why* you're doing it, as soon as your program starts talking to
+Regardless of why you're doing it, as soon as your program starts talking to
 C it stops being safe.

 With that said, Rust is *totally* a safe programming language.

 Well, Rust *has* a safe programming language. Let's step back a bit.

-Rust can be thought of as being composed of two
-programming languages: *Safe* and *Unsafe*. Safe is For Reals Totally Safe.
-Unsafe, unsurprisingly, is *not* For Reals Totally Safe. In fact, Unsafe lets
-you do some really crazy unsafe things.
+Rust can be thought of as being composed of two programming languages: *Safe
+Rust* and *Unsafe Rust*. Safe Rust is For Reals Totally Safe. Unsafe Rust,
+unsurprisingly, is *not* For Reals Totally Safe. In fact, Unsafe Rust lets you
+do some really crazy unsafe things.

-Safe is *the* Rust programming language. If all you do is write Safe Rust,
-you will never have to worry about type-safety or memory-safety. You will never
-endure a null or dangling pointer, or any of that Undefined Behaviour nonsense.
+Safe Rust is the *true* Rust programming language. If all you do is write Safe
+Rust, you will never have to worry about type-safety or memory-safety. You will
+never endure a null or dangling pointer, or any of that Undefined Behaviour
+nonsense.

 *That's totally awesome*.

@@ -69,17 +70,16 @@ language cares about is preventing the following things:
 * A non-utf8 `str`
 * Unwinding into another language
 * Causing a [data race][race]
-* Double-dropping a value

-That's it. That's all the Undefined Behaviour baked into Rust. Of course, unsafe
-functions and traits are free to declare arbitrary other constraints that a
-program must maintain to avoid Undefined Behaviour. However these are generally
-just things that will transitively lead to one of the above problems. Some
-additional constraints may also derive from compiler intrinsics that make special
-assumptions about how code can be optimized.
+That's it. That's all the causes of Undefined Behaviour baked into Rust. Of
+course, unsafe functions and traits are free to declare arbitrary other
+constraints that a program must maintain to avoid Undefined Behaviour. However,
+generally violations of these constraints will just transitively lead to one of
+the above problems. Some additional constraints may also derive from compiler
+intrinsics that make special assumptions about how code can be optimized.

-Rust is otherwise quite permissive with respect to other dubious operations. Rust
-considers it "safe" to:
+Rust is otherwise quite permissive with respect to other dubious operations.
+Rust considers it "safe" to:

 * Deadlock
 * Have a [race condition][race]

src/doc/tarpl/other-reprs.md

Lines changed: 11 additions & 8 deletions
@@ -12,21 +12,21 @@ The order, size, and alignment of fields is exactly what you would expect from C
 or C++. Any type you expect to pass through an FFI boundary should have
 `repr(C)`, as C is the lingua-franca of the programming world. This is also
 necessary to soundly do more elaborate tricks with data layout such as
-reintepretting values as a different type.
+reinterpreting values as a different type.

 However, the interaction with Rust's more exotic data layout features must be
 kept in mind. Due to its dual purpose as "for FFI" and "for layout control",
 `repr(C)` can be applied to types that will be nonsensical or problematic if
 passed through the FFI boundary.

-* ZSTs are still zero-sized, even though this is not a standard behaviour in
+* ZSTs are still zero-sized, even though this is not a standard behaviour in
 C, and is explicitly contrary to the behaviour of an empty type in C++, which
 still consumes a byte of space.

 * DSTs, tuples, and tagged unions are not a concept in C and as such are never
 FFI safe.

-* **The [drop flag][] will still be added**
+* **If the type would have any [drop flags][], they will still be added**

 * This is equivalent to one of `repr(u*)` (see the next section) for enums. The
 chosen size is the default enum size for the target platform's C ABI. Note that
@@ -39,10 +39,10 @@ compiled with certain flags.
 # repr(u8), repr(u16), repr(u32), repr(u64)

 These specify the size to make a C-like enum. If the discriminant overflows the
-integer it has to fit in, it will be an error. You can manually ask Rust to
-allow this by setting the overflowing element to explicitly be 0. However Rust
-will not allow you to create an enum where two variants have the same
-discriminant.
+integer it has to fit in, it will produce a compile-time error. You can manually
+ask Rust to allow this by setting the overflowing element to explicitly be 0.
+However Rust will not allow you to create an enum where two variants have the
+same discriminant.
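
Note on the paragraph above: a short sketch of what `repr(u8)` and `repr(u32)` do to a C-like enum's size. The `Level` and `Wide` enums and the `discriminant` helper are made up for this example; it does not attempt to demonstrate the "set the overflowing element to 0" remark.

```rust
use std::mem::size_of;

#[repr(u8)]
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Level {
    Low = 1,
    Mid = 2,
    High = 255, // the largest discriminant that fits in a `u8`
}

#[repr(u32)]
#[allow(dead_code)]
enum Wide {
    Only = 1,
}

/// Reads the discriminant back out as the underlying integer type.
fn discriminant(level: Level) -> u8 {
    level as u8
}

fn main() {
    // The repr fixes the size of the enum to that of the chosen integer.
    assert_eq!(size_of::<Level>(), 1);
    assert_eq!(size_of::<Wide>(), 4);
    assert_eq!(discriminant(Level::High), 255);
    // Writing `High = 256` above would be rejected at compile time,
    // as would giving two variants the same discriminant.
}
```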

 On non-C-like enums, this will inhibit certain optimizations like the null-
 pointer optimization.
@@ -65,9 +65,12 @@ compiler might be able to paper over alignment issues with shifts and masks.
 However if you take a reference to a packed field, it's unlikely that the
 compiler will be able to emit code to avoid an unaligned load.

+**[As of Rust 1.0 this can cause undefined behaviour.][ub loads]**
+
 `repr(packed)` is not to be used lightly. Unless you have extreme requirements,
 this should not be used.
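
Note on the paragraph above: a sketch of what packing changes and how to read a packed field without forming a reference to it. The `Padded` and `Packed` structs and `read_value` are illustrative names, and the sizes assume a typical target where `u32` has 4-byte alignment. As far as I know, recent compilers reject `&p.value` on a packed struct outright rather than allowing the undefined behaviour described here, but treat that as an assumption about toolchains newer than this commit.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[allow(dead_code)]
struct Padded {
    tag: u8,
    value: u32, // preceded by 3 bytes of padding
}

#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u32, // stored immediately after `tag`, so usually misaligned
}

/// Reads the possibly-unaligned field by value rather than through a reference.
fn read_value(p: &Packed) -> u32 {
    p.value // copies the field out with an unaligned load
}

fn main() {
    assert_eq!(size_of::<Padded>(), 8);
    assert_eq!(size_of::<Packed>(), 5);
    assert_eq!(align_of::<Packed>(), 1);

    let p = Packed { tag: 1, value: 0xDEAD_BEEF };
    let tag = p.tag; // copy out before comparing, to avoid borrowing a field
    assert_eq!(tag, 1);
    assert_eq!(read_value(&p), 0xDEAD_BEEF);
}
```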

 This repr is a modifier on `repr(C)` and `repr(rust)`.

-[drop flag]: drop-flags.html
+[drop flags]: drop-flags.html
+[ub loads]: https://github.com/rust-lang/rust/issues/27060

src/doc/tarpl/ownership.md

Lines changed: 8 additions & 7 deletions
@@ -5,16 +5,17 @@ memory-safe and efficient, while avoiding garbage collection. Before getting
 into the ownership system in detail, we will consider the motivation of this
 design.

-We will assume that you accept that garbage collection is not always an optimal
-solution, and that it is desirable to manually manage memory to some extent.
-If you do not accept this, might I interest you in a different language?
+We will assume that you accept that garbage collection (GC) is not always an
+optimal solution, and that it is desirable to manually manage memory in some
+contexts. If you do not accept this, might I interest you in a different
+language?

 Regardless of your feelings on GC, it is pretty clearly a *massive* boon to
 making code safe. You never have to worry about things going away *too soon*
 (although whether you still *wanted* to be pointing at that thing is a different
-issue...). This is a pervasive problem that C and C++ need to deal with.
-Consider this simple mistake that all of us who have used a non-GC'd language
-have made at one point:
+issue...). This is a pervasive problem that C and C++ programs need to deal
+with. Consider this simple mistake that all of us who have used a non-GC'd
+language have made at one point:

 ```rust,ignore
 fn as_str(data: &u32) -> &str {
@@ -40,7 +41,7 @@ be forced to accept your program on the assumption that it is correct.
 This will never happen to Rust. It's up to the programmer to prove to the
 compiler that everything is sound.

-Of course, rust's story around ownership is much more complicated than just
+Of course, Rust's story around ownership is much more complicated than just
 verifying that references don't escape the scope of their referent. That's
 because ensuring pointers are always valid is much more complicated than this.
 For instance in this code,
