-
Notifications
You must be signed in to change notification settings - Fork 9
valtrees and padding #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Potential ways I can think of to fix this would be:
I currently think that the second option is the best one of these three. |
The behaviour only differs in as far as |
I assume it is because due to the way const equality works, we can probably end up with trait solving only evaluating one of the two constants, so while probably not unsound in an exploitable way, it would definitely be bad for this to compile and then break due to some unrelated change in the compiler or library 😓 |
oh... yea, since we compare generic consts by their body, we can indeed only evaluate one of them. But, whatever equivalence algorithm we come up with knows there is a valtree conversion in there, and we can check that the types that undergo a valtree conversion have no padding. Concepts like Although we also have problems around relocations if we ever expose something like There's probably other things, too, like if someone feeds two unrelated pointers into an unsafe function that expects them to point into the same allocation. Basically all UB is a problem here, as it could be fine without valtrees and only be UB after a valtree roundtrip. So, in summary:
I think I am overthinking the situation. The safe solution is to make sure all constants are evaluated, even if they get deduplicated, just like we do with MIR optimizations. That's fairly annoying, but seems like the safe route. Another one is to always pick one specific constant (the one with the valtree roundtrip), which is kind of your (3) solution, but only if two constants aren't perfectly the same. This basically makes the valtree-rounddrip an explicit part of comparing generic constants. We could even go as far as making comparison of two generic constants ( |
This seems to be the crux here: the new constraint that is imposed by examples like this is that two CTFE values, if they are equal after creating valtrees, must be fully observationally equivalent. That is a lot stronger than the requirement that I thought would be necessary. I think in a first step it would probably be better to not do this kind of unification in these cases. ("These cases" being the ones where valtree construction is lossy, which AFAIK is only the case for types that have padding.) Longer-term, could partially solve this by fixing CTFE: indeed I consider it to be a bug that the value in padding is preserved when a value gets copied around; this does not represent the actual intended semantics of Rust. But indeed in your example the padding is behind an indirection, and then there really is no UB here: it is perfectly legitimate for unsafe code to look at the padding byte of a
The thing is, there is no UB in the non-valtree-roundtrip version of this code. In fact I think this is at the very least rather surprising even if we disregard the unification problem: I might have (The original example here is even more tricky since it starts not even with one Allocation identityAs @oli-obk mentioned, another concern besides padding here could be pointer identity. But there we already have the stance that
What would be an example of that? I think this would be a bug in guaranteed_eq. That function needs to be carefully written such that a valtree roundtrip (or other changes in alloocation identity that we consider permissible) does not change the behavior of guaranteed_eq.
Besides ptr equality, the only way to observe this is to mutate the memory, and valtree construction must error out if it ever reads from mutable memory. So this, too, should be excluded. |
the same issue also without padding, but "oversized" allocations. Is that intended to be UB? #![feature(generic_const_exprs, adt_const_params)]
const fn shrink<const N: u64>() -> &'static u32 {
unsafe {
&*(&N as *const u64).cast::<u32>()
}
}
const fn grow(v: &'static u32) -> u64 {
unsafe {
*(v as *const u32).cast::<u64>()
}
}
fn direct_use<const N: u64>() -> [u8; grow(shrink::<N>()) as usize] {
inner::<{ shrink::<N>() }>()
}
fn inner<const FOO: &'static u32>() -> [u8; grow(FOO) as usize] {
[0; grow(FOO) as usize]
}
fn main() {
println!("{:x}", direct_use::<0x0000bb00>().len());
} |
Ah, good catch. Inside a single CTFE execution this is clearly entirely fine. Generally it seems super hard to try and argue that a single execution can be equivalently split into two separate executions with just a valtree passed from the first to the second -- unsafe Rust code makes a lot of things observable, and intentionally so since it is made for low-level programming. |
So, essentially the problem here depends on the type of But this also means that we should be really suspicious of any place where a valtree has to be converted back to a Miri value. Which... seems like a problem for the entire valtree scheme? |
I think
is enough for this to be correct. We would have to additionally verify that the valtree conversion works at each potential "split", but that doesn't seem too bad. or probably even "all initialized bytes in the value have to be used during the conversion", i.e. the valtree conversion has to be a bijective function. If unused initialized bytes happen frequently, e.g. when changing enum variants to a smaller one or something? We could erase them at all potential "split"s. Generally a bit surprised that we didn't realize this earlier, but I feel like there are good enough solutions ^^ |
It seems pretty bad TBH -- we would have to do a valtree check basically each time a "typed copy" is made, or at least each time values pass across function boundaries. It will be truly horrendous for performance. This is also very fragile due to trying to enforce things that go beyond what the Rust spec is about, so I expect a permanent struggle of fixing other "leaks" that we discover.
Well, I had so far basically refused to even think about the unification problem, since I felt we had enough issues on our plate without considering "partial executions". And I think this issue proves me right to be worried about further dragons awaiting in those unexplored parts of this dungeon. Doing this for integer values (or anything without a pointer) seems fine, but I am now honestly very skeptical about ever doing this for values that contain pointers. This goes totally against how Rust has been designed, and there is a fundamental conflict between allowing unsafe code during CTFE and doing this kind of high-level reasoning. |
I don't intend to look inside of functions for unification so e.g. the constant
"partial executions" are needed for other reasons as well, e.g. we want the following snippet to compile struct Foo<const N: Option<usize>>;
impl<const N: usize> Foo<{ Some(N) }> {}
We still have a lot of time to figure all of this out, and to do a lot of cleanup of our code so that it's clear whether there's something that could go wrong. It should obviously be a prerequisite that we have a firm understanding of what could go wrong before we start to work on stabilizing stuff here. |
It doesn't really matter what the typical const generic constants are though, we'd have to do this kind of roundtrip for every function call since it might be part of a const generic. It's also odd (IMO) to special-case function calls in the semantics, and it is very surprising that values are adjusted at all in a function call -- that's just not how Rust programs typically behave.
I am not sure where there is a partial execution in that example? |
rust-lang/rust#99798 (comment) made me realize that this is already at least partially implemented?!? I would have expected that to be done a bit more carefully TBH.^^ So, how do I trigger these "abstract" consts to be created? I tried this but there does not seem to be a valtree roundtrip there. |
we intend to transition const generics over to valtrees, this representation erases padding bytes.
We also intend to allow the following example to compile:
for this to compile, we have to unify the constant
other_operation(some_operation::<N>())
with the constantother_operation(M)
using the generic argumentsome_operation::<N>()
as argument forM
. This is sound as long as const eval doesn't have side effects.A concern is the interaction between padding bytes and valtrees. We intend to not remember padding bytes in constants used in the type system, but do remember them during const eval and can soundly get their value (afaik). This means that putting values into the type system in the middle of const eval can have observable effects, e.g. consider
direct_use
must not compile with valtrees as the behavior ofreads_padding
differs between the two constants.cc @RalfJung @oli-obk
The text was updated successfully, but these errors were encountered: