"Tootsie Pop" model #21
The issue I have with the "tootsie pop" model is that you then need to understand 2, if not 3 (if you want fast code), models. I still like it just about as much as I did when you first proposed it. Giving up optimization opportunities to make the model easier to understand is, imho, a Good Thing; we don't need insane optimizations: our users are already writing procedural code, and if they need it faster, they can write it faster. User optimization (when based on evidence) will always win out over compiler optimization (because user optimization has compiler optimization to back it up ;P), and hurting the ability of the user to optimize makes me very uncomfortable (see: TBAA and signed integer overflow in C).
How about defaulting to the high-optimization environment, but when people want to do crazy things that they aren't sure are safe, they can put a `#![unsafe_module]` attribute on that module? We then just teach people that "if you're writing unsafe code which interacts in potentially unsafe ways with safe code in the same module, use #![unsafe_module] to make sure that you don't trigger UB". So the model would be: aggressive optimization by default, with the attribute opting a module out. The unsafe_module attribute could also take arguments which describe what types of optimizations to inhibit. I don't super like this idea, because it could be confusing, but it does make the lexical unsafety boundary explicitly visible, and gives people power to move it.
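For concreteness, here is a minimal Rust sketch of where such an attribute might sit. The `#![unsafe_module]` attribute is hypothetical (it exists only in the proposal above, not in Rust today), so it appears only as a comment; the module contents are my own illustration of safe and unsafe code coexisting behind one invariant.

```rust
mod ring_buffer {
    // #![unsafe_module]  // hypothetical attribute from the proposal above:
    //                    // "inhibit invariant-based optimizations in this module"

    pub struct RingBuffer {
        data: Vec<u8>,
        head: usize, // invariant: head < data.len()
    }

    impl RingBuffer {
        pub fn new(data: Vec<u8>) -> RingBuffer {
            assert!(!data.is_empty());
            RingBuffer { data, head: 0 }
        }

        // Safe code in the same module: under the proposal it would be
        // compiled conservatively, because it interacts with the unsafe
        // code below through the shared `head < data.len()` invariant.
        pub fn advance(&mut self) {
            self.head = (self.head + 1) % self.data.len();
        }

        // Unsafe code that trusts the invariant to skip the bounds check.
        pub fn peek_unchecked(&self) -> u8 {
            unsafe { *self.data.get_unchecked(self.head) }
        }
    }
}
```

The point of the attribute would be that the lexical extent of the conservative region is spelled out in the source rather than inferred from the mere presence of unsafe blocks.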
This phrasing, to me, illustrates what I think is one of the biggest problems. Namely, it's incorrect about what circumstances one would need to use #![unsafe_module]. In particular, even in the "unchecked […]" case […]. The real difference is that […]. With that framing, […]. Of course, […].
This is a concern of mine as well. Shortly after posting the original TPM post, I was going to write a follow-up basically describing this scheme -- the notion of narrowing the "unsafe abstraction" region. But in writing up that post I realized that I myself had two distinct notions of what the unsafe abstraction region ought to mean, and I hadn't even fully realized it. One of them is the "logical" abstraction region, which aligns with privacy. And the other is the "type trusting" region -- just as you describe. This gave me pause and made me feel that perhaps this is indeed barking up the wrong tree. Perhaps there is a simpler way to frame things that winds up feeling less subtle. I've since reconsidered and am now in the middle of the road again. =) I very much want to pursue other avenues, but I think that maybe talking explicitly about being able to designate the boundary where […] might work out ok, but I still hope to find an alternative.
A couple of questions about the tootsie pop...

When you exit an unsafe boundary, are you required to restore the Rust memory safety invariants for all memory, or are you allowed to have memory that is only reachable via your module (e.g. via a private field)?

When an unsafe module calls a safe module, does that count as crossing a safety boundary, so the memory safety invariants need to be restored? If yes, then how does unsafe code do anything (e.g. use a logger)? If no, then do we need to compile every module twice, once as a safe module and once as an unsafe one?
Just my 2 cents: I would argue that the safety invariants of the part of memory reachable by the safe module need to be restored. Essentially, that's all global variables and all arguments, and everything transitively reachable from them. However, things that are private to the unsafe module should be allowed to stay "tainted".
@RalfJung yes, I'd been thinking something in terms of safe reachability. We could try something like saying that the safe roots from a module are the values that escape from it, either by being returned or by being passed as a callback argument. Then the safely reachable heap is the subset of the heap that contains the safe roots and is closed under dereferencing public &T pointers. Ditto for the safely mutable heap. Each module is responsible for ensuring that the safely reachable heap maintains the Rust memory invariants. Something like this would answer both of my questions. It would also address some of the concurrency issues, since we could ask for unsafe code to always maintain safety of the safely reachable heap, not just at function call/return boundaries.
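To make the "tainted private memory" idea concrete, here is a small Rust sketch (my own example, not from the thread): the private buffer never satisfies the invariant that the unsafe code relies on for the whole allocation, but everything that escapes to safe code does satisfy Rust's invariants.

```rust
mod ascii_buf {
    // Invariant relied on by the unsafe code below: bytes[..valid] is ASCII.
    // Bytes at index >= valid may be arbitrary garbage -- "tainted" in the
    // sense discussed above -- but they are reachable only through this
    // module's private field, never through anything handed to safe code.
    pub struct AsciiBuf {
        bytes: Vec<u8>,
        valid: usize,
    }

    impl AsciiBuf {
        pub fn new() -> AsciiBuf {
            // Deliberately pre-fill with non-UTF-8 garbage.
            AsciiBuf { bytes: vec![0xFF; 16], valid: 0 }
        }

        pub fn push(&mut self, c: u8) {
            assert!(c.is_ascii());
            if self.valid == self.bytes.len() {
                self.bytes.push(0xFF);
            }
            self.bytes[self.valid] = c;
            self.valid += 1;
        }

        // The only value that escapes to safe code is this &str, and it always
        // satisfies the &str invariant (valid UTF-8), so the "safely reachable
        // heap" is in good shape even though the buffer's tail never is.
        pub fn as_str(&self) -> &str {
            unsafe { std::str::from_utf8_unchecked(&self.bytes[..self.valid]) }
        }
    }
}
```

In the terms above, the `&str` returned by `as_str` is a safe root; the garbage tail of `bytes` is not safely reachable, so it is allowed to stay tainted.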
The Tootsie Pop model leverages unsafe declarations to simultaneously permit aggressive optimization in safe code while being very accepting of unsafe code patterns. The high-level summary is roughly:

- Code that lies outside of any unsafe abstraction is optimized aggressively, on the assumption that all of Rust's type and aliasing invariants hold.
- Code that lies within an unsafe abstraction (for example, a module that contains unsafe blocks) is compiled conservatively, forgoing optimizations that rely on those invariants, so that unsafe code which temporarily violates them still behaves as written.
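For a sense of what "aggressive optimization in safe code" means here, consider a standard aliasing example (mine, not from the issue):

```rust
// In a purely safe module, the compiler may assume that a `&mut i32` and a
// `&i32` passed to the same function never alias, so it can keep `*y` in a
// register across the writes to `*x`.
pub fn sum_twice(x: &mut i32, y: &i32) -> i32 {
    *x += *y;
    *x += *y; // `*y` need not be reloaded if the references cannot alias
    *x
}

// Under the Tootsie Pop model, the same function compiled inside an unsafe
// abstraction would forgo that assumption, so unsafe callers that construct
// aliasing references are not miscompiled.
```

If `x` and `y` did alias, reloading `*y` would give a different result than reusing the cached value, which is exactly the divergence the conservative mode is meant to avoid.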
This has the advantage of being very permissive -- if we pick a suitable scope for the unsafe abstraction, I suspect that most any unsafe code that is out there in the wild which is sort of "remotely correct" will work out fine. But its Achilles heel is that it can inhibit quite a lot of optimization. Somewhat annoyingly, it seems to interact poorly with both simple and complex cases of unsafe code:

- In simple cases -- e.g., a module containing a single, well-understood unsafe block -- the surrounding safe code loses optimizations it could almost certainly have kept (see the sketch below).
- In complex cases, authors who do understand the rules in depth have no way to opt back in to full optimization within the unsafe abstraction.
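As a hypothetical illustration of the "simple case" problem (my own example): one tiny, well-understood unsafe block turns the whole module into an unsafe abstraction, so even the plainly safe function next to it would be compiled conservatively.

```rust
pub mod bytes_util {
    // The module's only unsafe code: skip a bounds check that the `is_empty`
    // test already makes unnecessary.
    pub fn first_byte(s: &str) -> Option<u8> {
        if s.is_empty() {
            None
        } else {
            Some(unsafe { *s.as_bytes().get_unchecked(0) })
        }
    }

    // Entirely safe, but under the Tootsie Pop model it lives in the same
    // unsafe abstraction, so it too loses aliasing-based optimizations
    // (e.g., `*dst` may have to be re-read and re-written on every iteration).
    pub fn sum_into(dst: &mut u64, xs: &[u64]) {
        for &x in xs {
            *dst += x;
        }
    }
}
```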
Where the Tootsie Pop model does really well is the "middle" cases -- unsafe code that manipulates pointers and so forth, but where the author is not familiar with the unsafe code guidelines in depth. (The appeal of the Tootsie Pop model is thus dependent, to some extent, on how complex it is to understand and use the more advanced models.)
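A typical "middle" case might look like the standard split_at_mut pattern (adapted here as an illustration, not taken from the issue): raw-pointer manipulation whose author reasons informally that the two halves don't overlap, without appealing to any precise aliasing model.

```rust
use std::slice;

// Split one mutable slice into two non-overlapping mutable slices.
pub fn split_at_mut(xs: &mut [u32], mid: usize) -> (&mut [u32], &mut [u32]) {
    assert!(mid <= xs.len());
    let len = xs.len();
    let ptr = xs.as_mut_ptr();
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

A forgiving model accepts this code on the informal non-overlap argument alone, which is the audience described above.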
It's worth noting that even if we adopted the Tootsie Pop model, we'd likely still want to hammer out a more advanced model to cover the more advanced use cases.