Commit 074b6e0

Auto merge of rust-lang#117329 - RalfJung:offset-by-zero, r=oli-obk,scottmcm
offset: allow zero-byte offset on arbitrary pointers

As per prior `@rust-lang/opsem` [discussion](rust-lang/opsem-team#10) and [FCP](rust-lang/unsafe-code-guidelines#472 (comment)):

- Zero-sized reads and writes are allowed on all sufficiently aligned pointers, including the null pointer
- Inbounds-offset-by-zero is allowed on all pointers, including the null pointer
- `offset_from` on two pointers derived from the same allocation is always allowed when they have the same address

This removes surprising UB (in particular, even C++ allows "nullptr + 0", which we currently disallow), and it brings us one step closer to an important theoretical property for our semantics ("provenance monotonicity": if operations are valid on bytes without provenance, then adding provenance can't make them invalid).

The minimum LLVM we require (v17) includes https://reviews.llvm.org/D154051, so we can finally implement this.

The `offset_from` change is needed to maintain the equivalence with `offset`: if `let ptr2 = ptr1.offset(N)` is well-defined, then `ptr2.offset_from(ptr1)` should be well-defined and return N. Now consider the case where N is 0 and `ptr1` dangles: we want to still allow `offset_from` here. I think we should change `offset_from` further, but that's a separate discussion.

Fixes rust-lang#65108

[Tracking issue](rust-lang#117945) | [T-lang summary](rust-lang#117329 (comment))

Cc `@nikic`
2 parents 725b38e + 376bf96 commit 074b6e0
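
The "inbounds-offset-by-zero" rule from the commit message can be illustrated with a minimal sketch. The helper name `offset_by_zero` is hypothetical and not part of the commit; the sketch assumes a toolchain that already includes this change, since the same code was documented as undefined behavior before it.

```rust
use std::ptr;

/// Hypothetical helper: offsets `p` by zero elements.
/// Under the rules described in this commit, a zero-byte offset is
/// well-defined for every pointer, including null and dangling ones.
fn offset_by_zero<T>(p: *const T) -> *const T {
    // SAFETY (assumes the relaxed rules): the computed offset is zero bytes,
    // so no in-bounds requirement applies to `p`.
    unsafe { p.add(0) }
}

fn main() {
    // Null pointer: the Rust analogue of C++'s `nullptr + 0`.
    let null: *const u32 = ptr::null();
    assert!(offset_by_zero(null).is_null());

    // Dangling but well-aligned pointer that refers to no allocation.
    let dangling = ptr::NonNull::<u32>::dangling().as_ptr() as *const u32;
    assert_eq!(offset_by_zero(dangling), dangling);
}
```

Any non-zero offset still requires both the starting and resulting pointer to be in bounds or at the end of the same allocated object.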

4 files changed, +34 -33 lines changed

core/src/intrinsics.rs

+5-5
@@ -1483,10 +1483,10 @@ extern "rust-intrinsic" {
 ///
 /// # Safety
 ///
-/// Both the starting and resulting pointer must be either in bounds or one
-/// byte past the end of an allocated object. If either pointer is out of
-/// bounds or arithmetic overflow occurs then any further use of the
-/// returned value will result in undefined behavior.
+/// If the computed offset is non-zero, then both the starting and resulting pointer must be
+/// either in bounds or at the end of an allocated object. If either pointer is out
+/// of bounds or arithmetic overflow occurs then any further use of the returned value will
+/// result in undefined behavior.
 ///
 /// The stabilized version of this intrinsic is [`pointer::offset`].
 #[must_use = "returns a new pointer rather than modifying its argument"]
@@ -1502,7 +1502,7 @@ extern "rust-intrinsic" {
 /// # Safety
 ///
 /// Unlike the `offset` intrinsic, this intrinsic does not restrict the
-/// resulting pointer to point into or one byte past the end of an allocated
+/// resulting pointer to point into or at the end of an allocated
 /// object, and it wraps with two's complement arithmetic. The resulting
 /// value is not necessarily valid to be used to actually access memory.
 ///

core/src/ptr/const_ptr.rs

+13-10
@@ -465,8 +465,9 @@ impl<T: ?Sized> *const T {
 /// If any of the following conditions are violated, the result is Undefined
 /// Behavior:
 ///
-/// * Both the starting and resulting pointer must be either in bounds or one
-/// byte past the end of the same [allocated object].
+/// * If the computed offset, **in bytes**, is non-zero, then both the starting and resulting
+/// pointer must be either in bounds or at the end of the same [allocated object].
+/// (If it is zero, then the function is always well-defined.)
 ///
 /// * The computed offset, **in bytes**, cannot overflow an `isize`.
 ///
@@ -676,11 +677,11 @@ impl<T: ?Sized> *const T {
 /// If any of the following conditions are violated, the result is Undefined
 /// Behavior:
 ///
-/// * Both `self` and `origin` must be either in bounds or one
-/// byte past the end of the same [allocated object].
+/// * `self` and `origin` must either
 ///
-/// * Both pointers must be *derived from* a pointer to the same object.
-/// (See below for an example.)
+/// * both be *derived from* a pointer to the same [allocated object], and the memory range between
+/// the two pointers must be either empty or in bounds of that object. (See below for an example.)
+/// * or both be derived from an integer literal/constant, and point to the same address.
 ///
 /// * The distance between the pointers, in bytes, must be an exact multiple
 /// of the size of `T`.
@@ -951,8 +952,9 @@ impl<T: ?Sized> *const T {
 /// If any of the following conditions are violated, the result is Undefined
 /// Behavior:
 ///
-/// * Both the starting and resulting pointer must be either in bounds or one
-/// byte past the end of the same [allocated object].
+/// * If the computed offset, **in bytes**, is non-zero, then both the starting and resulting
+/// pointer must be either in bounds or at the end of the same [allocated object].
+/// (If it is zero, then the function is always well-defined.)
 ///
 /// * The computed offset, **in bytes**, cannot overflow an `isize`.
 ///
@@ -1035,8 +1037,9 @@ impl<T: ?Sized> *const T {
 /// If any of the following conditions are violated, the result is Undefined
 /// Behavior:
 ///
-/// * Both the starting and resulting pointer must be either in bounds or one
-/// byte past the end of the same [allocated object].
+/// * If the computed offset, **in bytes**, is non-zero, then both the starting and resulting
+/// pointer must be either in bounds or at the end of the same [allocated object].
+/// (If it is zero, then the function is always well-defined.)
 ///
 /// * The computed offset cannot exceed `isize::MAX` **bytes**.
 ///
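
The relaxed `offset_from` condition in this file keeps `offset` and `offset_from` consistent: if `p.offset(N)` is well-defined, measuring the distance back to `p` should return `N`, including for `N == 0` on a dangling pointer. The following is a rough sketch of that equivalence, assuming a toolchain that includes this change; `zero_round_trip` is a hypothetical helper, not something the commit adds.

```rust
use std::ptr::NonNull;

/// Hypothetical helper: offsets `p` by zero elements, then measures the
/// distance back to `p`. Under the relaxed rules this is well-defined even
/// when `p` does not point into any allocated object.
fn zero_round_trip(p: *const u32) -> isize {
    // SAFETY (assumes the relaxed rules): the offset is zero bytes, and both
    // pointers passed to `offset_from` have the same address and origin.
    unsafe {
        let q = p.offset(0);
        q.offset_from(p)
    }
}

fn main() {
    let data = [1u32, 2, 3];
    let start = data.as_ptr();

    // Classic case: both pointers are in bounds of (or at the end of) `data`.
    // SAFETY: `start + 3` is exactly one past the end of the array.
    let end = unsafe { start.add(data.len()) };
    assert_eq!(unsafe { end.offset_from(start) }, 3);

    // Zero-distance case on a live allocation.
    assert_eq!(zero_round_trip(start), 0);

    // Zero-distance case on a dangling, well-aligned pointer.
    let dangling = NonNull::<u32>::dangling().as_ptr() as *const u32;
    assert_eq!(zero_round_trip(dangling), 0);
}
```

The distance is counted in elements of `T`, so `T` must not be zero-sized; the sketch uses `u32` for that reason.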

core/src/ptr/mod.rs

+3-8
@@ -15,18 +15,13 @@
 //! The precise rules for validity are not determined yet. The guarantees that are
 //! provided at this point are very minimal:
 //!
-//! * A [null] pointer is *never* valid, not even for accesses of [size zero][zst].
+//! * For operations of [size zero][zst], *every* pointer is valid, including the [null] pointer.
+//! The following points are only concerned with non-zero-sized accesses.
+//! * A [null] pointer is *never* valid.
 //! * For a pointer to be valid, it is necessary, but not always sufficient, that the pointer
 //! be *dereferenceable*: the memory range of the given size starting at the pointer must all be
 //! within the bounds of a single allocated object. Note that in Rust,
 //! every (stack-allocated) variable is considered a separate allocated object.
-//! * Even for operations of [size zero][zst], the pointer must not be pointing to deallocated
-//! memory, i.e., deallocation makes pointers invalid even for zero-sized operations. However,
-//! casting any non-zero integer *literal* to a pointer is valid for zero-sized accesses, even if
-//! some memory happens to exist at that address and gets deallocated. This corresponds to writing
-//! your own allocator: allocating zero-sized objects is not very hard. The canonical way to
-//! obtain a pointer that is valid for zero-sized accesses is [`NonNull::dangling`].
-//FIXME: mention `ptr::dangling` above, once it is stable.
 //! * All accesses performed by functions in this module are *non-atomic* in the sense
 //! of [atomic operations] used to synchronize between threads. This means it is
 //! undefined behavior to perform two concurrent accesses to the same location from different
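
One consequence of the removed bullet is that a zero-sized access through a pointer to deallocated memory is no longer called out as invalid. The sketch below illustrates that case, assuming a toolchain that includes this change (under the previous documentation the same code was undefined behavior). `Marker` and `zero_sized_access_after_free` are hypothetical names used only for illustration.

```rust
use std::ptr;

/// Hypothetical zero-sized type used only for this illustration.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Marker;

/// Performs a zero-sized write and read through a pointer whose allocation
/// has already been freed.
fn zero_sized_access_after_free() -> Marker {
    let raw: *mut u32 = Box::into_raw(Box::new(7u32));
    // SAFETY: `raw` came from `Box::into_raw` and is freed exactly once.
    unsafe { drop(Box::from_raw(raw)) };

    // `raw` now dangles. `Marker` has size 0 and alignment 1, so the cast
    // pointer is sufficiently aligned.
    let zst = raw.cast::<Marker>();

    // SAFETY (assumes the relaxed rules): for operations of size zero,
    // every sufficiently aligned pointer is valid.
    unsafe {
        ptr::write(zst, Marker);
        ptr::read(zst)
    }
}

fn main() {
    assert_eq!(zero_sized_access_after_free(), Marker);
}
```

Accesses of non-zero size through `raw` after the `drop` remain undefined behavior; only the zero-sized case is affected.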

core/src/ptr/mut_ptr.rs

+13-10
@@ -480,8 +480,9 @@ impl<T: ?Sized> *mut T {
 /// If any of the following conditions are violated, the result is Undefined
 /// Behavior:
 ///
-/// * Both the starting and resulting pointer must be either in bounds or one
-/// byte past the end of the same [allocated object].
+/// * If the computed offset, **in bytes**, is non-zero, then both the starting and resulting
+/// pointer must be either in bounds or at the end of the same [allocated object].
+/// (If it is zero, then the function is always well-defined.)
 ///
 /// * The computed offset, **in bytes**, cannot overflow an `isize`.
 ///
@@ -904,11 +905,11 @@ impl<T: ?Sized> *mut T {
 /// If any of the following conditions are violated, the result is Undefined
 /// Behavior:
 ///
-/// * Both `self` and `origin` must be either in bounds or one
-/// byte past the end of the same [allocated object].
+/// * `self` and `origin` must either
 ///
-/// * Both pointers must be *derived from* a pointer to the same object.
-/// (See below for an example.)
+/// * both be *derived from* a pointer to the same [allocated object], and the memory range between
+/// the two pointers must be either empty or in bounds of that object. (See below for an example.)
+/// * or both be derived from an integer literal/constant, and point to the same address.
 ///
 /// * The distance between the pointers, in bytes, must be an exact multiple
 /// of the size of `T`.
@@ -1095,8 +1096,9 @@ impl<T: ?Sized> *mut T {
 /// If any of the following conditions are violated, the result is Undefined
 /// Behavior:
 ///
-/// * Both the starting and resulting pointer must be either in bounds or one
-/// byte past the end of the same [allocated object].
+/// * If the computed offset, **in bytes**, is non-zero, then both the starting and resulting
+/// pointer must be either in bounds or at the end of the same [allocated object].
+/// (If it is zero, then the function is always well-defined.)
 ///
 /// * The computed offset, **in bytes**, cannot overflow an `isize`.
 ///
@@ -1179,8 +1181,9 @@ impl<T: ?Sized> *mut T {
 /// If any of the following conditions are violated, the result is Undefined
 /// Behavior:
 ///
-/// * Both the starting and resulting pointer must be either in bounds or one
-/// byte past the end of the same [allocated object].
+/// * If the computed offset, **in bytes**, is non-zero, then both the starting and resulting
+/// pointer must be either in bounds or at the end of the same [allocated object].
+/// (If it is zero, then the function is always well-defined.)
 ///
 /// * The computed offset cannot exceed `isize::MAX` **bytes**.
 ///
