Get rid of `= _` in variable definitions #11225
Comments
Isn't

As long as we're not targeting 3.0 with this change ... because this is not triggered by any feedback coming from the community, and we've already said that further language changes would only be done in response to community feedback. (Or blocking issues, which this is not since it's already an obscure use case, and as is, it works.)
I think it's generally not achievable without complicating things due to aliasing --- it's the same as enforcing a typestate protocol in a language. On the other hand, if the object can be aliased, the usage of

```diff
- var x: T = _
+ var x: Option[T] = None
```
@Katrix The documentation of
Yes, but the use case
Another choice would be to always use
But then the danger is that we will make use of the initial value since we went through the trouble of defining it:
Very tempting but wrong, since
It seems to be, at least in Dotty: In performance-sensitive applications, users can always use a cast to avoid runtime checks.

The example of

I am not sure I agree. The point is, we want to communicate "this variable is defined but not initialized". That's a common idiom in all imperative languages. Making up a nonsensical initializer and then casting obscures the logic.

We already have to do that for local

Maybe
Unless we can have a check that a variable is always initialized before accessing it (which seems hard to implement), leaving a variable uninitialized is a fundamentally unsafe way of programming, where the programmer basically tells the compiler "trust me on this". So an unsafe cast seems warranted for this:

```scala
var hd: A = null.asInstanceOf[A]
```

Such a cast is similar in nature to the ones we need in the low-level implementations of things like

Having a special feature like
```scala
var x: T = notInitialized
```

That's pretty clear, no? And it would get rid of the obscure use of

One can argue that
I know I don't bring much value with this, but still: As far as I know, (not only) in the Scala community, unsafe things are usually named
In fact, there is! We might not have a value for a field, like here:

```scala
class Memo[A](x: => A):
  private var cached: A = notInitialized
  private var known: Boolean = false
  def force =
    if !known then
      known = true
      cached = x
    cached
```

But in a local context we always have a value with which we could initialize a variable. Otherwise, why define the variable at all?
```scala
def lastSuchThat[A](xs: List[A])(p: A => Boolean): A = {
  var found: Boolean = false
  var result: A = null.asInstanceOf[A] // no initial value here
  var rest = xs
  while (!rest.isEmpty) {
    if (p(rest.head)) {
      found = true
      result = rest.head
    }
    rest = rest.tail
  }
  if (!found)
    throw new NoSuchElementException
  result
}
```

Of course we could use an
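The sentence above is cut off; presumably it continues with the `Option`-based alternative suggested earlier in the thread, which for this example would look roughly like this (a sketch, not anyone's verbatim proposal):

```scala
def lastSuchThat[A](xs: List[A])(p: A => Boolean): A = {
  var result: Option[A] = None // no cast, at the price of boxing every match
  var rest = xs
  while (!rest.isEmpty) {
    if (p(rest.head)) result = Some(rest.head)
    rest = rest.tail
  }
  result.getOrElse(throw new NoSuchElementException)
}
```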
If the feature

```scala
class Memo[A](x: => A):
  @embed
  private var cached: UnsafeOpt[A] = UnsafeOpt.Empty
  def force: A =
    if cached.isEmpty then
      cached.set(x)
    cached.get
```

The annotation

```scala
class Memo[A](x: => A):
  private var cached$value: A = _
  private var cached$init: Boolean = false
  def force =
    if !cached$init then
      cached$init = true
      cached$value = x
    cached$value
```

Of course, the implementation of
I believe that an erased

@sjrd Yes, that makes sense. In fact a
Apologies for this pedantic comment, but this issue is currently titled: *Get rid of `_ =` in variable definitions*

Shouldn't this read: *Get rid of `= _` in variable definitions* instead?

Indeed. Fixed.
So I still need to read through this thread and catch up, but I want to underscore that the current meaning of this construct, which results in constructors which have no assignment bytecode for those slots, is crucial for performance-sensitive areas. Simply taking null and casting it does produce the same result, but at the cost of an added instruction. This mistakes the purpose of the construct: it isn't to obtain a default value, it is to obtain a variable which has an undefined value for all intents and purposes and which is half as costly.

@djspiewak I guess the cast is removed by the erasure phase. Do you have evidence suggesting that initializing a variable with the default value for that type is slower than leaving it uninitialized? I would imagine the JVM compiles these down to the same machine code.

IIRC in Java, leaving uninitialized and initializing with (what Java specifies as) the default value are synonyms.

The cast is removed but not the assignment. This is particularly relevant if the assignment imposes a memory fence, but even without that, the cost is measurable if you're doing enough unavoidable allocations in a critical section.
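A minimal sketch of the two forms being contrasted here (the class and field names are made up for illustration; the comments restate the claims above, not measured results):

```scala
class Slot[A] {
  var a: A = _                    // the constructor emits no store for this slot
  var b: A = null.asInstanceOf[A] // the cast is erased, but a store to `b` remains
}
```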
Why not have something called nullOrZero or something like that, which is a utility that returns null or a type's empty value. It doesn't need to be special for initialization, but it should be named so that people don't use it to introduce null without realizing it.

How variables should interact with null tracking should be its own discussion. Even with `_` syntax you're introducing the danger of an NPE. And why shouldn't `var x: String = null` be allowed if `= _` is? So I think it's a separate problem. Either you force variables not initialized to a non-null value to be `| Null` for safety, or you say that var fields are special.
I mean, var fields are already special. To be clear, I don't have any problem going through some extra type system pedantry to allow for uninitialized fields, though

Honestly the ideal syntax here really would be just leaving off the assignment, but of course that conflicts with

I guess my general conclusion is that this is not really that big of a deal. Changing it to

Particularly in light of the lateness of the hour -- and it is very late in the game to be having these kinds of conversations! -- my preference would be to just leave it as it is. Changes at this point which remove syntax should really only be considered under extreme circumstances, because otherwise we're never going to be able to, as an ecosystem, land this plane.

If status quo is considered an unacceptable outcome, then my second choice would be to lean into the magicness of

Hopefully it's clear why my preference is to leave it alone. Even a small change here (making
Note: `uninitialized` is defined `erased`, so it cannot be misused as a regular value.
Erasing is an optimization, that doesn't mean it can't use a more general method. I think C# has such a general method.

As long as it has null in its name, people won't use it where they wouldn't use null, but they should be able to when they want to.
> Note: `uninitialized` is defined `erased`, so it cannot be misused as a regular value.

What I meant by this is that there's a check that `uninitialized` cannot be used as a regular value. The only place it is legal in normal code is as the initializer of a mutable field. (Theoretically you could use it also in typelevel code, as long as no actual bytecode gets generated from that code.)

So, to compare: we go from a truly obscure use of `_` that's impossible to google to a value `uninitialized` that has to be imported from `scala.compiletime` and that has a doc comment explaining what it is. Looks like a clear win to me.

The other reasonable alternative I see is to eliminate `= _` or `= uninitialized` altogether, and recommend that it's substituted by

```scala
var x: T = null.asInstanceOf[T]
```

I find that worse in many dimensions:
- it abuses null
- it abuses asInstanceOf
- it encourages using the same idiom elsewhere for normal values.
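For reference, here is what the described replacement looks like in use, sketched against the `Memo` example from earlier in the thread:

```scala
import scala.compiletime.uninitialized

class Memo[A](x: => A):
  private var cached: A = uninitialized // formerly `private var cached: A = _`
  private var known: Boolean = false
  def force: A =
    if !known then
      known = true
      cached = x
    cached
```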
Possible generalization for later: Allow
I already suggested something that's better than both. Call it nullOrDefault[A] or something like that, make it legal anywhere but treat it specially in var field initializers.
- It's a general utility
- The only thing that's special is an optimization
- It does not abuse null or asInstanceOf
- It doesn't encourage using it where you wouldn't use null

Actually I would only recommend using it for initializers when the type is generic. If it's known to be a reference type you should write null, if it's numeric 0, and for Boolean false and for Unit (). That would make the code the least surprising. The fact that in reality the assignment instruction can be elided is an optimization.
Actually, no. The idea is that the

The problem with this is that in a null-safe world,
[This is essentially a duplicate of @smarter's comment, so you can just skip it.]

I considered nullOrDefault but discarded it because it would have the wrong type. It would have to be defined like this:

```scala
def nullOrDefault[T]: T | Null
```

That's the wrong type for primitive instances of T. `Int | Null` erases to `AnyRef`, not `Int`. It's also of dubious value. Normal types don't have default values in any useful sense. So the danger is that this would be used in areas where it is not appropriate.
Like I said, that's a more general problem. Is it legal to initialize it to null explicitly? If yes, how do you avoid the null type issue? If no, why not? It will be null anyway, it's silly to not be allowed to say so.

Regarding making it more general, the point is to be useful more generally. Once we are defining a method it could be useful elsewhere, however rarely. It's true that null.asInstanceOf is not a good way to provide that, but something should.

If you're not defining a method that's usable anywhere else, and you're not working within the type system, then special syntax is better. Maybe underscore is not the best, but something that looks like special syntax conveys that it can only be used in certain places. Something that looks like a method but can only be used in certain places, or has a name that only makes sense in such places, doesn't seem like good design to me.
It's not like any other

In addition,

Therefore, I don't think that the argument that "it's just an

I don't think

When trying to assess the type safety of a program, one needs to determine where soundness holes are introduced. To do so, we already have to look out for uses of

I concede that
Yes, in the sense that it is removed when used as the initializer of a field. I mentioned that above.
But `@compileTimeOnly` exists in Scala-2 and is equivalent.
`@compileTimeOnly` has nothing in common with `erased`. It just emits an error if any call to an `@compileTimeOnly` method survives until rechecks. That also applies to calls that are themselves within an `@compileTimeOnly` member. It's meant for fake methods that should be captured and replaced by a surrounding macro. For example, the `.value` of sbt.
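For illustration, a minimal sketch of the kind of fake method `@compileTimeOnly` is meant for (the name and message are made up, loosely modeled on sbt's `.value`):

```scala
import scala.annotation.compileTimeOnly

// Any call to this method that survives to later compiler phases is reported as an error;
// a surrounding macro is expected to capture and rewrite it away first.
@compileTimeOnly("`value` can only be used within a task or setting body")
def value[T]: T = ???
```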
Is this legal or not (with null checking)?

```scala
class C {
  var field: String = null
}
```
@nafg It's not legal.

And
Is there any motivating reason why `= _` should be allowed and not `= null`?
Either that means changing the meaning of `null` generally - making it inhabit `Nothing` rather than `Null` - or changing its meaning when on the RHS of a var. The latter seems definitely wrong to me, while the former seems too risky of a change.

Also, if I remember correctly from the discussions around -Xcheck-init, there's a difference drawn between `= _` and `= null`, where the former means uninitialised and the latter means initialised to null (similarly for 0/0.0/false..) - relevant to the initialisation checking that flag is defined for.
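A small sketch of the distinction described above, assuming explicit nulls are not enabled (the field names are made up):

```scala
class C {
  var a: String = _    // "uninitialised": the init checker expects an assignment before any read
  var b: String = null // "initialised to null": null is deliberately the field's starting value
}
```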
When is that difference useful (in the presence of null-aware types)?

I guess what I'm getting at is that for a var field, its initial value doesn't follow the same rules as its type claims about other reads and writes. That's a fact, but it is currently limited to `_`. But given that is the case, I don't see why the exception to its type can't be made broader.

Maybe the answer is that initialization checking sort of plugs that hole in soundness, and so the hole shouldn't be widened more than it can be plugged. But that just begs the question. It sounds like this is really a teeny bit of flow typing, so couldn't it be generalized to check if it's become non-null? After all, that's the underlying purpose of initialization checking, isn't it?
I don't follow these parts in particular, but I think I agree with you: I can't come up with how
An obscure use of `_` occurs in `var` definitions:
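The code block that stood here is not preserved in this copy; judging from the title and the examples quoted later in the thread, it showed the plain uninitialized-field form, roughly (`Cell` is a made-up name):

```scala
class Cell[T] {
  var x: T = _ // x starts out as the JVM default value for T's erasure (null, 0, false, ...)
}
```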
It defines a concrete variable `x` without an initial value, or rather the default initial value that the JVM assigns to object fields. It can only be used in a class or object, not to initialize a local variable. It is essential in situations like this:
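The snippet referred to here is also missing; it is the buffered-iterator pattern from the standard library, sketched below against the public `scala.collection` types (a condensed sketch, not the verbatim library source):

```scala
import scala.collection.{AbstractIterator, BufferedIterator}

def buffered[A](self: Iterator[A]): BufferedIterator[A] =
  new AbstractIterator[A] with BufferedIterator[A] {
    private var hd: A = _                 // no value of A is available here
    private var hdDefined: Boolean = false

    def head: A = {
      if (!hdDefined) { hd = self.next(); hdDefined = true }
      hd                                  // only read after hdDefined says it was assigned
    }
    def hasNext: Boolean = hdDefined || self.hasNext
    def next(): A =
      if (hdDefined) { hdDefined = false; hd } else self.next()
  }
```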
Here we cannot initialize `hd` with a value since its type is parametric, and we do not know a value for `A`. The logic in `AbstractIterator` makes sure that `hd` is read only after it is assigned, by checking `hdDefined` before reading `hd`.

So the idiom is rare but essential where it is needed. But can we find a less obscure notation that does not add another overload to `_`?

One possibility would be to define somewhere an abstraction like this:
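The definition that followed is not preserved in this copy. Judging from the surrounding prose, it was a helper of roughly this shape (a sketch; the body here is only a placeholder):

```scala
// A helper producing the JVM default value for T: null for reference types,
// 0 / 0.0 / false / () for the primitive types.
def defaultValue[T]: T = ???
```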
`defaultValue` can be expressed in terms of `_`:
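The `_`-based definition that followed is likewise missing; one plausible way to write it (`DefaultHolder` is a made-up helper, not from the original issue):

```scala
class DefaultHolder[T] {
  var value: T = _ // the field silently gets the JVM default value for T's erasure
}

def defaultValue[T]: T = (new DefaultHolder[T]).value
```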
It can also be a compiler intrinsic, and then it could replace `= _` in variable definitions. But I am a bit reluctant to do that, for fear that `defaultValue` will then also be used in situations where it is not appropriate. The big problem with `defaultValue` is that it returns `null` for all class types, including boxed numeric types. That's very confusing! We do not want to give this a more prominent status than what we have.

The situation would get worse with explicit nulls. In that case, the helper would presumably have to return `T | Null`. But then initializing a `var x: T` with it would be ill-typed; the variable would have to be declared with type `T | Null`. This is safe, but it defeats the purpose. What should happen for an uninitialized `var x: T` is that the programmer or the init-checker proves that `x` will be initialized with something else before it is first referenced. That's the intended meaning.
compiletime.erasedValue
This means there is no initial value at runtime (the value is erased), so there is no initial assignment. That can be allowed for objects and classes, since objects are bulk-initialized (but the init checker should check that such variables are not referenced before being assigned to). It could be allowed in the future also for local variables, if we can prove that the variable is initialized by assignment before use. Everywhere else `erasedValue` is illegal, by the current rules.

Admittedly, `erasedValue` is also obscure. However:
- `erasedValue` can be looked up, unlike `_`
- `erasedValue` does not add to the confusion relative to the other uses of `_`