lvalues are called places in Rust/MIR #40
I still like "value" and "place"; I do think that the "place value" thing doesn't make a lot of sense. They're called "addresses" or "reference values", or "values of reference type". |
@@ -8,7 +8,7 @@ | |||
* *niche* | |||
* *layout* | |||
* *tag* | |||
* *rvalue* | |||
* *lvalue* | |||
* *place* (or *lvalue* in C/C++ speak) |
These are called "lvalues" in C, and "glvalues" in C++.
* *rvalue* | ||
* *lvalue* | ||
* *place* (or *lvalue* in C/C++ speak) | ||
* *rvalue* (maybe we can come up with a Rust term for this as well?) |
These are called "values of the expression" in C, "prvalue" in C++.
(and we should just call them "values")
Isn't there a term to distinguish expressions like "3+2" from expressions like "x", where the latter can be used as lvalues but the former cannot?
Yes, the latter is a (Rust) place expression, (C) lvalue expression, or (C++) glvalue expression, whereas the former is a (Rust) value expression, (C++) prvalue expression, and C doesn't have a special name for it.
In Rust, there's an automatic coercion from value expressions to place expressions in some specific places; and in C++, there's an automatic coercion from prvalue expressions to xvalue expressions (a type of glvalue expression).
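To make the place/value split and the Rust coercion concrete, here is a minimal sketch. The function names are made up for illustration; the behaviour shown (a temporary being created when a value expression appears where a place is needed) is standard Rust.

```rust
/// `x` is a place expression, so it can appear on the left of an assignment.
/// `3 + 2` is a value expression and can only appear on the right.
fn assign_through_place() -> i32 {
    let mut x = 0;
    x += 3 + 2;
    // (3 + 2) = x; // rejected by the compiler: the left side is not a place
    x
}

/// Borrowing needs a place. `2 + 3` is a value expression, so the compiler
/// stores the result in a temporary place and borrows that.
fn borrow_value_expression() -> i32 {
    let r: &i32 = &(2 + 3);
    *r
}

/// A `&self` method call on a value expression goes through the same
/// coercion: the `String` lives in a temporary place while `len` borrows it.
fn method_on_value_expression() -> usize {
    String::from("place").len()
}
```

Calling `assign_through_place()` and `borrow_value_expression()` both yield 5, since each evaluates the value expression once and reads it back through a place.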
If you want to convert from a place expression to a value expression it depends on what the kind of the place expression is:
- in C it doesn't matter, you can always convert an lvalue expression to the value of the expression.
- in Rust
  - if the type of the place expression is immovable (!Move), you cannot convert, in general, to a value expression.
  - otherwise, if the type of the place expression is !Copy, then the place expression must be owning in order to convert to a value expression
  - otherwise, the type is Copy, and you can always go from the place expression to the value expression.
- in C++, a glvalue expression is either an lvalue or an xvalue
  - if the type is trivially copyable, then the value is just copied out of the glvalue expression
  - otherwise, if the expression is an lvalue, then the type must be copyable, and the copy constructor will be called; otherwise, compilation will fail
  - otherwise, if the expression is an xvalue, and the move constructor exists, then that is called
  - otherwise, the expression is an xvalue and the move constructor does not exist. If the copy constructor exists and takes a type const&, then that is called; otherwise, compilation fails.
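The Rust cases in the list above can be sketched as follows. `NotCopy` and the function names are hypothetical, and the `!Move` case is left out because stable Rust has no such trait to demonstrate it with.

```rust
/// Hypothetical non-Copy type used only for this illustration.
#[derive(Debug, PartialEq)]
struct NotCopy(String);

/// Copy type: reading the place copies the value out, and the place stays usable.
fn copy_out_of_place() -> (i32, i32) {
    let a = 7;
    let b = a;
    (a, b)
}

/// Non-Copy type in an owning place: the value is moved out of the place.
fn move_out_of_owning_place() -> NotCopy {
    let owner = NotCopy(String::from("data"));
    let moved = owner;
    // println!("{:?}", owner); // rejected by the compiler: use of moved value
    moved
}

/// Non-Copy type behind a shared reference: the place is not owning, so the
/// conversion to a value expression is rejected.
fn cannot_move_out_of_borrow(r: &NotCopy) -> NotCopy {
    // let stolen = *r; // rejected by the compiler: cannot move out of `*r`
    NotCopy(r.0.clone()) // an explicit clone builds a fresh value instead
}
```

The commented-out lines are the ones the compiler refuses; uncommenting either one is a quick way to see the corresponding error.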
A
These certainly do not work, there is a distinction between a Both places and what you call "values" have the distinction between "what the programmer writes down" and "what this eventually denotes/evaluates to", so we should use consistent terminology. "Compile-time place" vs "run-time place" would also work, but I think using the well-established expression vs value terminology from functional languages makes most sense here. And either way we should not use "value" in another way, that will just be confusing to the part of our community coming from functional languages where that term has a very clear meaning already. It might be better to avoid "value" entirely. |
@RalfJung oh, I missed what you meant by "place value", just ignore that part. |
The best term I could come up with so far for For inspiration: Synonyms of object. I kind of like "matter" because it is so far away from everything it will likely not conflict with anything. ;) |
@RalfJung why not "Operand"? |
FWIW, I am going to call them "robjects" for now just so that I can consistently refer to them some way; that term should be sufficiently bad that nobody should like it enough to grow attached to it. ;) "Operand" is already used in MIR to refer to the LHS and RHS of |
So, I'm going to recommend we call them immediates. It's a reasonable way of referring to them, and evokes their ephemeral nature. @RalfJung didn't like calling them values, as a value is the result of evaluating an expression. We will categorize the result (or value) of an expression into either an lvalue (place) or an rvalue (immediate). |
For the record "immediate" is already used for other compiler-y things
We're not going to find any fitting term that's completely unused, but personally I have to deal with the CPU architecture kind of immediate multiple times per day on average so I wouldn't be super happy if it became overloaded like that. |
@rkruppe There's not a lot of words that can reasonably be used without overloading somebody. |
Sure, I said as much myself. I just want to spell out for each option who that is going to be. For "immediate" it's going to be everyone who interacts with assembly and (unless that name gets changed again) everyone who works on miri or on certain parts of rustc. |
I still personally think overloading |
@ubsan Even then we still lack a term for "non-place expressions". Expressing that as a negative (non-place) is far from optimal. I consider "Operand" to be the best option so far. |
@RalfJung this is what we're discussing. The options so far for "result of expression":
The options so far for "non-place result of expression":
if you want to discuss an expression, then you can add the category of the result to "expression", i.e.,
this is contrasting to "place" and "place expression", respectively. |
That's not what I intended "operand", "robject" or so to mean. I was originally trying to ask for (and I understand now I wasn't clear enough, so here you go):
The constraint I hope to satisfy is that we can say "place expression" and " For example:
This is my current favorite (after scratching "object"). Or: place expression, rvalue expression, place value, rvalue value. I find this really awkward, "rvalue value" is just not a good term. Or: place expression, operand expression, place result, operand result. I guess I could live with this. Or: place expression, rvalue expression, place result, rvalue result. I would dislike this because it goes against the common (but admittedly not omnipresent) notion that a "value" is some kind of normal form, so "rvalue expression" is both saying it is already in normal form and yet it needs evaluating (being an expression)...? As for "immediate", I agree with @rkruppe. I think "operand" is likely going to cause the least confusion. It is certainly my favorite from @ubsan's list. If we accept that I would propose we rename:
No renaming would be necessary in miri as it already uses the "operand" terminology. |
I think, personally, that it will cause the least confusion if we use
operand makes me think too much of "argument to a function/operator". |
I am somewhat surprised that you want to make that distinction. It does not arise in surface Rust, does it? Everything you can use on the RHS of an assignment, you can also use as a function argument. The distinction does arise in MIR, and it is precisely the difference between what I proposed to be called "atomic operand" ("primitive operand", "simple operand") and a general operand. Also, I thought we had agreed to not use terminology that will confuse the heck out of a sizeable fraction of people caring about Rust. :/ In particular, the whole point of I begrudgingly accepted to drop "object" because you said it would be very confusing to people that are deep into C/C++ terminology. I am still wondering how this clashes with C's use of the term. "The content of a place is an object" seems perfectly consistent, and there also is a notion of "temporary object" that seems to fit well with what miri calls "immediate operands" (would then be "immediate objects"), namely constants like in Another term that just came to my mind is "intermediate" (the noun). The idea being that And to mention yet another alternative, we could call |
For what it's worth, having some background in C++ terminology, I do not think using object in this way is an issue. I am not as deep in as @ubsan and I agree with them that
Could you show the same table using object ? Independently of the terms we choose (object, result, operand, ...), I think this type of table representation is much easier to understand than the trees used to represent "value" categories in C++. |
Sure, it is literally
Or with "result":
In both columns, I assume we'd sometimes just talk of "places" or "objects/operands/..." when it is clear from context whether we are talking about an expression or a value/result.
Note that "result" was only ever proposed for |
EDIT: see below for the transposed table. So @ubsan , is this what you meant?
|
@gnzlbg You flipped the table at the diagonal.^^ Makes it rather confusing to compare them... |
Sorry, I just copied @ubsan lines (that confused me a bit) :D Here it is unflipped:
|
@gnzlbg |
So could we nail this down (and summarize this) to a couple of concrete proposals that can be discussed in the next meeting ? |
Before I over-bikeshed the rvalue/non-places question, I'd like to clarify that I am a huge fan of using "place" for lvalues in Rust. I also like using "expression" vs "result" for whether or not we mean the thing pre- or post-evaluation, and agree with the premise that we'd like to have a 2x2 table where each single word refers collectively to two of the four cells and each cell is referred to by two words. Now the bikeshedding...
Maybe I'm missing something, but it also seems like nonsense to say that a "place expression" is a form of a "place"? Surely "X expression" is an expression that evaluates to a X, not a form of an X, right? Though I'm not a huge fan of "value" for this anyway since it's not at all intuitive that "places" don't qualify as "values" (if nothing else, C++'s "lvalue" set a pretty strong precedent that places are a kind of value). In general, I think "value" is just far too overloaded with incompatible meanings to be helpful in any new attempts at precise definitions (even more so than all of the other options). I don't think using "object" for this purpose would be confusing just because C++ objects are a totally different thing; everyone knows Rust doesn't have traditional OOP classes. But there is another reason why I'm not a fan of "object" in particular: C++ gives "object" two completely different meanings, effectively one for normal programmers and one for language lawyers, which makes any use of it in a formal/spec-y/UB-defining context come across as an obfuscating term that immediately makes the document gibberish to anyone that isn't already familiar with what definition is implied and all the unstated implications of that definition (I still have no idea which primitive types are and are not "lawyer objects" in C++). In other words, if "object" was used, it'd give me the impression that, just like the C and C++ specs, being comprehensible for non-wizards/non-lawyers is a non-goal of these documents. Which is probably completely unfair to you guys, but that is the connotation C++ has given it for me. @ubsan I've never heard of "object" having a defined meaning in Rust, unless it's supposed to be short for "trait object". What meaning are you talking about? Using "operand" for this would also be super weird because it already has the "normal people meaning" of things you pass to operators in Rust and pretty much every other language. 
So giving that term a double meaning no one would ever expect also seems like it'd create the "wizards only" connotation. In particular, the popular oversimplification that a C++ lvalue is "anything you can put on the left of the assignment operator" immediately implies that places are operands for assignment operators, so if we're willing to rule anything out for being extra bizarre to C++ users, I'd think "operand" qualifies. On the constructive side, I don't think we've exhausted all the candidates yet. I can think of at least two:
They aren't amazing options, but I do think "data expression" causes nowhere near as many problems as "object/value/operand expression" would. Thoughts? |
Agreed! These expressions refer to the data stored in a place, or data created by a computation, so -- yeah, I think I quite like that proposal, at least on a gut-reaction level. |
Bah. I wanted to write a big opus here but I've not had time to re-read the comments and really form an opinion. One thing I did want to say though: I've generally found that "place expressions" is a bit more awkward in practice than I would like. I've found myself gravitating more and more to the term path to represent an "lvalue" -- basically, an expression that leads one to some place in memory (i.e., (I believe there is some precedent for using the term path to refer to expressions like |
So you think My main issue with that is that the fact that "place" and "path" are the dynamic and static representation of each other is not at all obvious in the terms.
AFAIK this does not usually include derefs though. Actually, my understanding was that |
As part of our ongoing efforts to formalize a language that's reasonably close to MIR (and maybe extends a bit to slightly higher levels of abstractions, we'll see), I came to realize one thing: If I now had to define "rvalue", I'd say it is "something that can be put into a place". Putting it into a place is the only "elimination-style" operation that is ever performed on rvalues. The candidate model we are considering here is to literally model an rvalue as a function (closure) that takes a place, and then has the effect of storing itself into that place. Miri's "operands" (which are either a place or an immediate constant) cover two special cases of "things you can copy into a place". I wonder if there is a good name one could derive from this. |
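The "rvalue as a closure that stores itself into a place" idea can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the alias `Rvalue` and the helpers `constant`, `add`, `copy_from` and `store` are hypothetical names, not taken from the formalization or from Miri.

```rust
/// Hypothetical model: an rvalue is "something that can be put into a place",
/// i.e. a one-shot function that writes itself into the place it is given.
type Rvalue<'a, T> = Box<dyn FnOnce(&mut T) + 'a>;

/// An rvalue that stores an already computed constant.
fn constant<'a, T: 'a>(v: T) -> Rvalue<'a, T> {
    Box::new(move |place: &mut T| *place = v)
}

/// An rvalue that performs a computation and stores the result.
fn add<'a>(lhs: i32, rhs: i32) -> Rvalue<'a, i32> {
    Box::new(move |place: &mut i32| *place = lhs + rhs)
}

/// An rvalue that copies the contents of another place.
fn copy_from<'a, T: Copy>(src: &'a T) -> Rvalue<'a, T> {
    Box::new(move |place: &mut T| *place = *src)
}

/// The only elimination form: put the rvalue into a place.
fn store<T>(place: &mut T, rvalue: Rvalue<'_, T>) {
    rvalue(place)
}
```

For example, `store(&mut x, add(2, 3))` leaves 5 in `x`. In this sketch `constant` and `copy_from` play the role of the two special cases mentioned above (an immediate constant and a place), while `add` stands in for a general rvalue that actually computes something.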
I think I like:
where we just use "expression" to denote places and values that are not completely evaluated, and just "place" and "value" to mean evaluated ones. If the context requires disambiguation or being extremely pedantic, one can use "evaluated" to emphasize that, resulting in "evaluated place" and "evaluated value". |
Like @RalfJung (#40 (comment)), I think it is natural to say that "An object can be put into a place." which fits with #40 (comment). However, as @Ixrec noted it might not work well so datum seems fine to me.
When you don't care about the expression / result distinction you can just drop that part and just talk about "place" and "datum". Also, we can say:
|
I don't think there's a reasonable argument against I have taught the intuition of this stuff to many people across the experience spectrum, and |
The argument against using "value" for this purpose seems to be that it could instead be used as specific term for the result that place/value expressions evaluate to, to explicitly distinguish them from the expressions. I for one can understand why some people might want that, but there's multiple reasonable alternatives to that, and the usefulness for teaching is significant -- @ubsan is right, value is by and large the predominant and known term for the things we put into places. As far as I can tell, the conflict here is not that anyone disputes that So I'd like to raise the banner again for simply having place expressions and value expressions result in places and values respectively. That is, demarcating the evaluation result from expressions by simply not including "expression", and not caring that informal discussion will sometimes use the shorter terms to include the expression forms as well. This is what the reference does right now and personally, despite having some formal/PL background and often nit picking about the difference between expressions and evaluation results, I've never had the slightest problem with it. One other position I find interesting is @nikomatsakis saying "place expressions" is a bit awkward in practice. I can see that, though the alternative "path" massively collides with the existing use in the module/namespace system (including e.g. corresponding macro_rules fragments -- I could totally see declarative macros wanting to accept place expressions specifically). |
Are you referring to the entire right column in my table or just the bottom-right corner? If you include the top-right corner, then... let me just say that I'd be fine with you putting different weight on things, but outright calling my arguments unreasonable is not a productive way to communicate.
That would be the bottom-right corner, though? We don't put "2+2" into a place, but "4". I even found myself tempted to call the things in the bottom-right corner "values". I could live with that, provided we find a reasonable term for the top-right corner. "value expression" is... odd.
"Place Expressions and Value Expressions", oh wow. I had no idea. FWIW, in an ongoing formalization where "value" would heavily conflict with existing terminology, we call these things "temporaries" since they only exist ephemerally during an assignment (in our model). Once stored in a place, they become a sequence of "things that can be stored in a heap cell", which we call "immediates" (this matches Miri's use of the term "immediate"; our heap cells are just a bit weird -- in a more realistic model this would be just "byte"). In a more C-style model with an idea of "abstract values", the term "temporary" admittedly works less well. |
It's not perfect IMO but "a foo expression evaluates to a foo" is a nice mnemonic and I don't see a better overall alternative. re: temporaries, it would be a fine term on its own, but besides the downside you mention it also collides with the idea of compiler-generated temporary places that occur in expressions such as |
This needs rebasing. |
I'd like to propose revisiting this in the next meeting (next thursday) with the goal of making progress here. Ideally, we would agree on the "least bad" option, temporarily pick it, and open an issue to keep discussing this further. Picking the "least bad" option and start using it would give us experience with its downsides, which might allow us to come up with better terminology in the future. |
I won't be able to join next Thursday, can we do this in two weeks? |
Sure! |
Btw, turns out the reference used to speak about lvalues and rvalues but was changed in December 2017 to talk about "place expressions" and "value expressions" instead, without much of a discussion. Still almost one year earlier than this discussion started. |
It's unclear whether we will tackle this on the next meeting. @RalfJung do you think you would be able to write a summary comment here with the alternatives that got the most traction, and maybe include at the end a non-binding poll to see where everyone stands ? |
Closing in favor of #175, where I concede to the majority opinion that the 4 terms in question are "value", "place", "value expression" and "place expression". |
@gnzlbg sorry, I saw that too late. But with @rkruppe, @nikomatsakis, @gnzlbg and @ubsan in agreement, I think this is pretty much settled. We adapted this terminology for our Coq formalization of Stacked Borrows. We also introduced the term "result" to be able to talk about "evaluated place or evaluated value", i.e., "something that does not compute further" (something that we call "value" everywhere else). It's been a bit confusing at first, and there is friction because all our other developments use different terminology, but it'll work. |
At least for anyone with a background in functional programming, calling these a "*value" is very confusing. Values are the result of a computation, e.g. "5" is a value. "2+3" is not a value, it is an expression that evaluates/computes to the value "5".
The lvalue/rvalue distinction is on a different axis though, I think. Composing the two terminologies, we have "place expressions" like x[3].foo, and "place values" like (memory address) 0x00803240 -- where again "place values" are the thing that "place expressions" evaluate to. This all makes a lot of sense. If we'd stick to C(++) terminology, we would have to distinguish "lvalue expressions" and "lvalue values", and that seems just absurd.
This distinction, btw, manifests in the codebase as well: rustc::mir::Place is a place expression, rustc_mir::interpret::Place is a place value.
I think we should find matching terminology to replace "rvalue". Some "X" referring to the result of a computation like 2 + 3 or *foo.bar or &(*x).field. The result can be pretty much any sequence of bytes, the key factor being that you can read them (copy them into a place), but you cannot change them.
Right now, their expression form is rustc::mir::Rvalue, and their value form is rustc_mir::interpret::Operand. We could call these things "operand expression/value", if rustc::mir::Operand wasn't already taken to be something else ("operand expressions" are a subset of "rvalue expressions", namely they are constants or a place expression, they don't actually compute anything on the data). Any suggestions for a good noun to use here?
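The "place expression" versus "place value" distinction can be illustrated with a toy evaluator. Everything here is hypothetical and heavily simplified: `PlaceExpr`, `PlaceValue` and `eval_place` are made-up names and do not reflect the actual definitions of rustc::mir::Place or rustc_mir::interpret::Place.

```rust
/// Hypothetical place expression: syntax that still has to be evaluated.
enum PlaceExpr {
    /// A local variable, identified by its index.
    Local(usize),
    /// `base.field`, with the field's byte offset inside the base.
    Field(Box<PlaceExpr>, usize),
    /// `base[i]`, with the index and the element size in bytes.
    Index(Box<PlaceExpr>, usize, usize),
}

/// Hypothetical place value: the memory address a place expression evaluates to.
#[derive(Debug, PartialEq, Clone, Copy)]
struct PlaceValue(usize);

/// Evaluate a place expression, given the address of every local.
fn eval_place(expr: &PlaceExpr, local_addrs: &[usize]) -> PlaceValue {
    match expr {
        PlaceExpr::Local(i) => PlaceValue(local_addrs[*i]),
        PlaceExpr::Field(base, offset) => {
            PlaceValue(eval_place(base, local_addrs).0 + offset)
        }
        PlaceExpr::Index(base, idx, size) => {
            PlaceValue(eval_place(base, local_addrs).0 + idx * size)
        }
    }
}
```

With a local `x` at the assumed address 0x1000, 8-byte elements, and `foo` at byte offset 4, the place expression `x[3].foo` becomes `Field(Index(Local(0), 3, 8), 4)`. It evaluates to the place value 0x101C.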