This design meeting will cover problems around `&` and the `dereferenceable` attribute, such as rust-lang/rust#55005 and MMIO.
- Arc::drop has a (potentially) dangling shared ref #55005
- Ref and RefCell lifetimes are a “lie” rust-lang/unsafe-code-guidelines#125
- Japaric’s embedded RFC rust-embedded/wg#387
- MMIO and volatile memory conflicts rust-embedded/volatile-register#10
- Zulip conversation about dereferenceable attributes and LLVM (link)
  - LLVM “introduce `dereference_globally`” https://reviews.llvm.org/D61652
  - LLVM “add/infer a `nofree` function attribute” https://reviews.llvm.org/D49165
- Current behavior:
  - The meaning of the “dereferenceable” attribute in LLVM
    - we have this attribute, using it to model C++ references, but realized that it is not what people need
    - the intended semantics was vaguely “this thing will be dereferenceable through the function call”
    - but the C++ spec doesn’t say that: the function can deallocate the memory backing a `T&` during the course of the function, it just has to be “dereferenceable initially”
    - they intend to fix this
      - there has been discussion of how best to fix, so that you can express that something is dereferenceable initially
  - What changes are being considered in the future from LLVM’s perspective?
    - a “nofree” function indicates that it doesn’t free anything
      - so it maintains dereferenceability (if you ignore concurrency)
      - with concurrency, you have more problems (cf. Arc)
    - a “nosync” function doesn’t “interact with other threads” (no atomic ops, no loads/stores, no synchronizes-with edges introduced)
    - “nofree” can also be put on parameters, perhaps, to indicate that data behind a particular pointer is not freed
    - `dereferenceable` will change to have the weaker semantics, for (retcon) compatibility
    - `dereference_globally` means it will never be deallocated, ever? maybe useful, but not for Rust
  - How do we justify “dereferenceable” in stacked borrows?
    - Context: We intend to justify emitting dereferenceable because of UB caused under stacked borrows. If we were going to try to stop emitting dereferenceable, we’d presumably also have to adjust stacked borrows to make the pattern not UB. Right?
    - Answer: we justify by introducing these “protectors” on entry/exit from function
      - when you push something onto the stack, there’s an optional flag to say “this is protected by that function item”
      - when you pop an item that is “protected by” a fn that is ongoing, that is UB
      - this is “per item”, i.e., per element on the “stacked borrows stack”
      - a side effect of protectors is that having two `&mut` arguments that wind up being given the same value by the caller is UB, due to adding the protectors
    - Stacked Borrows - An Aliasing Model for Rust, https://www.youtube.com/watch?v=h9Fh4jRDGLo
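
The protector rule described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a single stack, no permissions, and an `Err` standing in for UB. The names `ToyStack`, `Item`, `CallId` and `protector_demo` are hypothetical and do not correspond to the actual Stacked Borrows or Miri implementation.

```rust
use std::collections::HashSet;

/// Hypothetical identifier for one function call (not a real Miri type).
type CallId = u32;

/// One element of the toy borrow stack: a tag plus an optional protector.
struct Item {
    tag: u32,
    protector: Option<CallId>,
}

/// Toy model only: a single stack and the set of calls that are still running.
#[derive(Default)]
struct ToyStack {
    items: Vec<Item>,
    active_calls: HashSet<CallId>,
}

impl ToyStack {
    /// Pushes an item; pushing with a protector marks that call as ongoing.
    fn push(&mut self, tag: u32, protector: Option<CallId>) {
        if let Some(call) = protector {
            self.active_calls.insert(call);
        }
        self.items.push(Item { tag, protector });
    }

    /// Records that a call has returned, so its protectors no longer apply.
    fn end_call(&mut self, call: CallId) {
        self.active_calls.remove(&call);
    }

    /// Pops the top item. `Err` stands in for undefined behavior.
    fn pop(&mut self) -> Result<u32, &'static str> {
        let protector = match self.items.last() {
            Some(item) => item.protector,
            None => return Err("stack is empty"),
        };
        if let Some(call) = protector {
            if self.active_calls.contains(&call) {
                return Err("popped an item protected by an ongoing call");
            }
        }
        Ok(self.items.pop().expect("non-empty, checked above").tag)
    }
}

/// Tries to pop a protected item while its call is running, then after it returns.
fn protector_demo() -> (Result<u32, &'static str>, Result<u32, &'static str>) {
    let mut stack = ToyStack::default();
    stack.push(1, None); // unprotected item
    stack.push(2, Some(7)); // reference passed as an argument to call 7
    let during = stack.pop(); // call 7 is still running: rejected
    stack.end_call(7);
    let after = stack.pop(); // call 7 has returned: allowed
    (during, after)
}
```

In this model the first `pop` is rejected because call 7 is still ongoing, and the second succeeds once the call has ended. That is the property used to justify `dereferenceable` for the duration of the call.
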
- What kinds of bugs do we run into
  - [Arc::drop](https://github.com/rust-lang/rust/issues/55005) #55005
    - thread A has an `AtomicUsize`, it does a decrement (and ref count reaches 1)
    - thread B does a decrement (and ref count reaches 0) and
    - Stacked Borrows issue: rust-lang/unsafe-code-guidelines#88
  - MMIO and volatile memory conflicts rust-embedded/volatile-register#10

    Quoting Ralf: “They want to define structs like…

    ```rust
    struct MyRegisters {
        reg1: ReadOnly,
        reg2: WriteOnly,
        reg3: ReadWrite
    }
    ```

    …and then hand around references to such structs as their encoding of MMIO registers. But any reference currently permits spurious accesses, which obviously they cannot have.”

    However, in this case, `ReadOnly` etc are presumably wrappers around a `VolatileCell`, which in turn raises the question of how `VolatileCell` is to be implemented.
    The RFC proposes various alternatives:

    - “ZST all the way down”: use a unique type for every hardware set of registers
      - each such type implements `read() → u8` or whatever, which is implemented by something like `ptr::read(0xXXXXX as *const u8)`
    - “VolAddress”: create a `VolAddress` type whose value is the address to read from; the `read` op is then `ptr::read(self.address as *const u8)`
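
A minimal sketch of the “VolAddress” alternative follows, to show why it sidesteps the reference problem: the value being handed around is a plain address, so no `&` to the register ever exists. The notes write `ptr::read`; the sketch assumes `ptr::read_volatile` instead, since a plain read may be elided or duplicated by the compiler. The `new` constructor, the `write` method and the `vol_address_demo` function are hypothetical, and a local variable stands in for a real hardware register so the code can run anywhere.

```rust
use std::ptr;

/// Sketch of the "VolAddress" idea: the value is the address itself.
#[derive(Clone, Copy)]
struct VolAddress {
    address: usize,
}

impl VolAddress {
    /// # Safety
    /// `address` must be valid for volatile one-byte reads and writes
    /// for as long as the returned value is used.
    unsafe fn new(address: usize) -> Self {
        VolAddress { address }
    }

    /// Volatile read of the byte at `address`.
    fn read(self) -> u8 {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { ptr::read_volatile(self.address as *const u8) }
    }

    /// Volatile write of the byte at `address`.
    fn write(self, value: u8) {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { ptr::write_volatile(self.address as *mut u8, value) }
    }
}

/// Uses a local byte as a stand-in for a hardware register.
fn vol_address_demo() -> u8 {
    let mut fake_register: u8 = 0x2a;
    // SAFETY: the address points at a live local for the whole function.
    let reg = unsafe { VolAddress::new(&mut fake_register as *mut u8 as usize) };
    reg.write(reg.read() + 1);
    reg.read()
}
```

Because `VolAddress` is `Copy` and holds no reference, passing it to a function does not attach `dereferenceable` to anything, so the compiler has no licence to insert spurious accesses.
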
- Proposal 1 (addresses MMIO, not Arc):
  - can we make raw pointers sufficiently ergonomic?
    - so you are passing around `*MyRegisters`
  - ergonomic hits from raw pointers:
    - integration with methods and traits (`*const self`)
    - some form of postfix deref (or even auto-deref?), so that you can do
    - first-class `&x` form that is nicer than `&raw const x`? autoref?
  - concern: “viral”
    - you can’t have a “safe struct” that contains
- Proposal: remove `dereferenceable` for `&UnsafeCell` (comment)
  - What does this mean in terms of stacked borrows?
    - make adding protectors dependent on the type?
    - how infectious is it? if you have a `(u8, UnsafeCell<T>)`, does it affect the `u8`?
      - the effect of an unsafe cell is already “expanded” to cover enum variants
      - but not presently other struct fields
    - what about generic contexts (it interacts with optimizations potentially)
    - is this just about references? what if you newtype a reference and pass it to a struct, or pass in a `struct BunchOfRefs<'a> { .. }`?
      - the drain iterator of a vector is a struct that contains a `&[T]`, and it (potentially) deallocates memory that is referenced by that `&`
      - two more instances: `Ref` and `RefMut`, the RefCell handles (rust-lang/unsafe-code-guidelines#125)
  - What does this help, and what doesn’t it help?
    - Ralf mentioned “this helps some cases…”, which cases doesn’t it help?

      ```rust
      // a pattern like this wouldn't necessarily be helped
      // (pseudocode: `decrement` and `free` are placeholders)
      fn foo(x: &AtomicUsize, y: &Bar) {
          // in particular if the `AtomicUsize` is not in the `y`
          //
          // imagine say that all `Bar` are handles to some global resource
          // with a single atomic counter
          if x.decrement() == 0 {
              free(y);
          }
      }
      ```

    - doesn’t help with “the drain problem”, where drain internally has an `&` without `UnsafeCell` to “the things still to be drained”, but there is some function you can call that will cause the
    - `RefMut` has a `&mut` reference, so it does not help there either

      ```rust
      fn foo(x: RefMut<'a>, y: &RefCell<..>) {
          // the `&mut` inside `x` gets a protector on entry here
          drop(x); // the "lock" is released
          y.borrow_mut(); // this will pop the protector, leading to UB
      }
      ```

    - problem is not confined to `&UnsafeCell<T>`:
      - the “disconnected” Arc example
      - drain is `&T` without `UnsafeCell`
      - `RefMut` has `&mut`
      - interestingly, the `RefCell` test suite never hits the bad pattern here :(
  - One observation is that we would lose some amount of optimization, do we have any idea how much?
    - We might enable an “opt back in” for types like `Cell`
  - Related to, but distinct from, the decision to forbid niches within `UnsafeCell<T>`
    - Like with niches: part of this is that `UnsafeCell` presently implies multiple threads potentially accessing the data
    - Unlike with niches: the violations of dereferenceable are not inherently caused by the presence of multiple threads, but more the things that threads commonly want to do (e.g., maintain ref counts)
    - Also: there are other patterns unrelated to `&UnsafeCell`, as noted above
- One actionable thing is to try to make raw pointers more ergonomic, so that the rule of thumb
- One thing these have in common is that you are “lying” about a lifetime somewhere
  - they’re all cases where you want to have a pointer that is potentially dangling
  - we have `NonNull<T>` today, but it is very unergonomic
- boats has been pondering the idea of a pointer in between raw and ref
  - `&unsafe` is the working title; it would also be an operator: `let x = &unsafe y`
  - “can dangle” but can’t be null, may not be aligned, not safe to read
  - `Option<&unsafe T>` would get the niche optimization

    ```rust
    &unsafe T // non-null, non-aligned, but no lifetime (unsafe to deref)
              // coerces to *const T etc
    &unsafe mut T
    ```
  - impact “virality” for safe abstractions:
    - it feels weird that `&unsafe RegisterBlock` is the abstraction
    - you would make `struct RegisterBlockRef { &unsafe RegisterBlock }`
      - this has limitations, but is analogous to `Ref` (and hence has things like `Ref::map`)
      - maybe we could do something that makes that more natural, some kind of “safe projection” mechanism
      - unsafe references makes it easier to build,
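
Since `&unsafe` does not exist, the `RegisterBlockRef` wrapper can only be approximated today. The sketch below uses `NonNull<RegisterBlock>` as a stand-in for `&unsafe RegisterBlock`, which is an assumption: `NonNull` is non-null and lifetime-free like the proposed pointer, but lacks its ergonomics. The register layout (`status`, `data`), the accessor methods and `register_block_demo` are hypothetical, and an ordinary stack value plays the role of the hardware block.

```rust
use std::ptr::{self, NonNull};

/// Hypothetical register layout, for illustration only.
#[repr(C)]
struct RegisterBlock {
    status: u8,
    data: u8,
}

/// Safe handle wrapping a possibly-dangling pointer; `NonNull` stands in
/// for the proposed `&unsafe RegisterBlock`.
#[derive(Clone, Copy)]
struct RegisterBlockRef {
    raw: NonNull<RegisterBlock>,
}

impl RegisterBlockRef {
    /// # Safety
    /// `raw` must stay valid for volatile access while the handle is in use.
    unsafe fn new(raw: NonNull<RegisterBlock>) -> Self {
        RegisterBlockRef { raw }
    }

    /// Volatile read of `status` without ever forming a `&RegisterBlock`.
    fn read_status(self) -> u8 {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { ptr::read_volatile(ptr::addr_of!((*self.raw.as_ptr()).status)) }
    }

    /// Volatile read of `data`.
    fn read_data(self) -> u8 {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { ptr::read_volatile(ptr::addr_of!((*self.raw.as_ptr()).data)) }
    }

    /// Volatile write of `data`.
    fn write_data(self, value: u8) {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { ptr::write_volatile(ptr::addr_of_mut!((*self.raw.as_ptr()).data), value) }
    }
}

/// Uses a stack value as a stand-in for the memory-mapped block.
fn register_block_demo() -> (u8, u8) {
    let mut block = RegisterBlock { status: 1, data: 0 };
    // SAFETY: `block` outlives every use of `regs` in this function.
    let regs = unsafe { RegisterBlockRef::new(NonNull::from(&mut block)) };
    regs.write_data(9);
    (regs.read_status(), regs.read_data())
}
```

As with the proposed `Option<&unsafe T>`, `Option<RegisterBlockRef>` occupies a single pointer-sized word because `NonNull` provides the null niche. The cost noted above is visible too: every field access needs its own hand-written projection method.
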
- one thing we could do:
  - emit ‘dereferenceable-on-entry’, which solves most of the above problems
    - would still mean that `fn helper(&self)` (which never uses)
  - but not MMIO (which are never dereferenceable)
  - also gives up most optimization potential
- connection to editions:
  - we might want to reserve some syntactic space here?
  - but `&unsafe` already dodges most of the problems, `&unsafe {` is the only “overlap” today

    ```rust
    &unsafe { 0 }
    ```

  - You can use 1 token look-ahead to parse `&unsafe { … }` in the way it is parsed today.
  - and would we maybe want to reserve `raw`
    - stdlib uses it, as do a number of libraries (e.g., C API wrappers may put the extern API in a `raw` module)
- Niko felt like they came in wanting to remove `dereferenceable` from `&UnsafeCell` but now are less sure that is a good choice
  - but if we “just” added `&unsafe` would that help to address the `Arc` problems? The APIs on `AtomicUsize` etc exist, and they take `&self`…?
    - presumably, if we made no other changes, either `fetch_sub` etc have to be modified to take `&unsafe self` (breaking change? maybe propagates to things like `AtomicCell` from crossbeam too..?)
    - or else we add new methods that take `&unsafe self`
    - and we have to propagate that back to things invoked from within the destructor
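
To make the `&self` concern concrete, here is a minimal single-threaded sketch of the reference-count release path. `Inner`, `release` and `arc_demo` are hypothetical names, not the standard library's `Arc` internals. The point is only where the shared reference lives: `fetch_sub` takes `&self`, so a `&AtomicUsize` into the allocation exists for the whole call. With two threads, another thread could free the allocation after the decrement lands but before `fetch_sub` returns.

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};

/// Hypothetical stand-in for the heap allocation behind an `Arc`.
struct Inner {
    count: AtomicUsize,
    payload: u32,
}

/// Drops one reference; returns `true` if this call freed the allocation.
///
/// # Safety
/// `inner` must come from `Box::into_raw`, and the caller must own one
/// of the outstanding references.
unsafe fn release(inner: *mut Inner) -> bool {
    // `fetch_sub` takes `&self`: this shared reference points into the
    // allocation and is live for the whole duration of the call below.
    let count: &AtomicUsize = unsafe { &(*inner).count };
    if count.fetch_sub(1, Ordering::Release) == 1 {
        fence(Ordering::Acquire);
        // Last reference: free the allocation.
        drop(unsafe { Box::from_raw(inner) });
        true
    } else {
        false
    }
}

/// Single-threaded walk through two releases of a count-2 allocation.
fn arc_demo() -> (u32, bool, bool) {
    let inner = Box::into_raw(Box::new(Inner {
        count: AtomicUsize::new(2),
        payload: 7,
    }));
    // SAFETY: the allocation is live until the second `release`.
    let payload = unsafe { (*inner).payload };
    let first = unsafe { release(inner) }; // count 2 -> 1, not freed
    let second = unsafe { release(inner) }; // count 1 -> 0, freed
    (payload, first, second)
}
```

Run on one thread, nothing dangles. The issue arises only when the first `release` is still inside `fetch_sub` while a second thread completes its own `release` and frees `Inner`.
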
- two main ways to address existing `&self` APIs:
  - “remove dereferenceable attribute entirely from `&UnsafeCell<T>`”
    - open design question: maybe still mark the parts outside `UnsafeCell` “dereferenceable”? Or make it fully infectious?
    - pros
      - addresses MMIO and existing Arc APIs
        - a partially infectious solution would still break in a simple variant of `Arc` that has a helper method `fn decrement(&ArcInner)`
      - maintains a lot of optimization potential: the no-longer-deref pointers are anyway escaped and do not have `noalias`, so they do not get strong optimizations
    - cons
      - `RefMut` and `Drain` are unfixed (would have to check for `Ref`)
        - would need to use raw pointers
        - similarly trying to separate ref count from memory to be freed
      - more complex, type-dependent behavior
        - may interact with trying to do optimization pre-monomorphization (though not more than what we already do with `noalias`)
      - have to figure out “how infectious” `UnsafeCell` is
        - but, this is already somewhat true, since `UnsafeCell` has an impact on `&T`
  - “downgrade all dereferenceable attributes to dereferenceable-on-entry”
    - pros
      - simple and uniform
      - fixes everything except for MMIO
    - cons
      - we lose out on the more advanced, protector-based optimizations described in the stacked borrows paper (the ones that extend the lifetime of a pointer, moving an access down across a call where previously the pointer was not used after the call)
      - doesn’t address MMIO
        - would need to use a newtype-wrapped `&unsafe` ptr or so
  - “downgrade attributes on `&UnsafeCell` only”: a mix of the two above