This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit e240bcc

committed Aug 19, 2024
Update SIPs state
1 parent 831439c commit e240bcc

12 files changed

+3445
-45
lines changed
 

‎_sips/sips/alternative-bind-patterns.md

Lines changed: 0 additions & 7 deletions
This file was deleted.
Lines changed: 331 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,331 @@
1+
---
2+
layout: sip
3+
permalink: /sips/:title.html
4+
stage: pre-sip
5+
status: submitted
6+
presip-thread: https://contributors.scala-lang.org/t/pre-sip-bind-variables-for-alternative-patterns/6321/13
7+
title: SIP-60 - Bind variables within alternative patterns
8+
---
9+
10+
**By: Yilin Wei**
11+
12+
## History
13+
14+
| Date | Version |
15+
|---------------|--------------------|
16+
| Sep 17th 2023 | Initial Draft |
17+
| Jan 16th 2024 | Amendments |
18+
19+
## Summary
20+
21+
Pattern matching is one of the most commonly used features in Scala by beginners and experts alike. Most of
22+
the features of pattern matching compose beautifully — for example, a user who learns about bind variables
23+
and guard patterns can mix the two features intuitively.
24+
25+
One of the few outstanding cases where this is untrue is when mixing bind variables and alternative patterns. The part of the
26+
current [specification](https://scala-lang.org/files/archive/spec/2.13/08-pattern-matching.html) which we are concerned with is under section **8.1.12** and is copied below, with the relevant clause
27+
highlighted.
28+
29+
> … All alternative patterns are type checked with the expected type of the pattern. **They may not bind variables other than wildcards**. The alternative …
30+
31+
We propose that this restriction be lifted and this corner case be eliminated.
32+
33+
Removing the corner case would make the language easier to teach, reduce friction and allow users to express intent in a more natural manner.
34+
35+
## Motivation
36+
37+
## Scenario
38+
39+
The following scenario is shamelessly stolen from [PEP 636](https://peps.python.org/pep-0636), which introduces pattern matching to the
40+
Python language.
41+
42+
Suppose a user is writing a classic text adventure game such as [Zork](https://en.wikipedia.org/wiki/Zork). For readers unfamiliar with
43+
text adventure games, the player typically enters freeform text into the terminal in the form of commands to interact with the game
44+
world. Examples of commands might be `"pick up rabbit"` or `"open door"`.
45+
46+
Typically, the commands are tokenized and parsed. After the parsing stage we may end up with an encoding similar to the following:
47+
48+
```scala
49+
enum Word:
50+
  case Get, North, Go, Pick, Up
51+
  case Item(name: String)
52+
53+
case class Command(words: List[Word])
54+
```
55+
56+
In this encoding, the string `pick up jar`, would be parsed as `Command(List(Pick, Up, Item("jar")))`.
57+
58+
Once the command is parsed, we want to actually *do* something with the command. With this particular encoding,
59+
we would naturally reach for a pattern match — in the simplest case, we could get away with a single recursive function for
60+
our whole program.
61+
62+
Suppose we take the simplest example where we want to match on a command like `"north"`. The pattern match consists of
63+
matching on a single stable identifier, `North` and the code would look like this:
64+
65+
~~~ scala
66+
import Word.*
67+
68+
def loop(cmd: Command): Unit =
69+
  cmd match
70+
    case Command(North :: Nil) => // Code for going north
71+
~~~
72+
73+
However as we begin play-testing the actual text adventure, we observe that users type `"go north"`. We decide
74+
our program should treat the two distinct commands as synonyms. At this point we would reach for an alternative pattern `|` and
75+
refactor the code like so:
76+
77+
~~~ scala
78+
case Command(North :: Nil | Go :: North :: Nil) => // Code for going north
79+
~~~
80+
81+
This clearly expresses our intent that the two commands map to the same underlying logic.
82+
83+
Later we decide that we want more complex logic in our game; perhaps allowing the user to pick up
84+
items with a command like `pick up jar`. We would then extend our function with another case, binding the variable `name`:
85+
86+
~~~ scala
87+
case Command(Pick :: Up :: Item(name) :: Nil) => // Code for picking up items
88+
~~~
89+
90+
Again, we might realise through our play-testing that users type `get` as a synonym for `pick up`. After playing around
91+
with alternative patterns, we may reasonably write something like:
92+
93+
~~~ scala
94+
case Command(Pick :: Up :: Item(name) :: Nil | Get :: Item(name) :: Nil) => // Code for picking up items
95+
~~~
96+
97+
Unfortunately at this point, we are stopped in our tracks by the compiler. The bind variable for `name` cannot be used in conjunction with alternative patterns.
98+
We carefully consult the specification and find that this is not possible, so we must choose a different encoding.
99+
100+
We can, of course, work around it by hoisting the logic into a helper function in the nearest scope which allows function definitions:
101+
102+
~~~ scala
103+
def loop(cmd: Command): Unit =
104+
  def pickUp(item: String): Unit = // Code for picking up item
105+
  cmd match
106+
    case Command(Pick :: Up :: Item(name) :: Nil) => pickUp(name)
107+
    case Command(Get :: Item(name) :: Nil) => pickUp(name)
108+
~~~
109+
110+
Or any number of different encodings. However, all of them are less intuitive and less obvious than the code we tried to write.
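
For reference, the helper-function workaround can be written out in full against current Scala 3. This is a minimal sketch under stated assumptions, not part of the proposal: `describe` and `pickUp` are hypothetical names, and the function returns a `String` instead of performing game logic so that the result is easy to inspect.

```scala
enum Word:
  case Get, North, Go, Pick, Up
  case Item(name: String)

case class Command(words: List[Word])

import Word.*

// Hypothetical helper: returns a description instead of mutating game state.
def describe(cmd: Command): String =
  // The shared logic is hoisted into a local function because the
  // two synonyms cannot bind `name` within a single alternative pattern.
  def pickUp(item: String): String = s"picked up $item"
  cmd match
    case Command(North :: Nil | Go :: North :: Nil) => "went north"
    case Command(Pick :: Up :: Item(name) :: Nil)   => pickUp(name)
    case Command(Get :: Item(name) :: Nil)          => pickUp(name)
    case _                                          => "unknown command"
```

Note that the alternative without bind variables (`North :: Nil | Go :: North :: Nil`) is accepted as a single case, while the two item-binding synonyms need separate cases.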
111+
112+
## Commentary
113+
114+
Removing the restriction leads to more obvious encodings in the case of alternative patterns. Arguably, the language
115+
would be simpler and easier to teach — we do not have to remember that bind patterns and alternatives
116+
do not mix, nor do we need to teach newcomers the workarounds.
117+
118+
For languages which have pattern matching, a significant number also support the same feature. Languages such as [Rust](https://github.com/rust-lang/reference/pull/957) and [Python](https://peps.python.org/pep-0636/#or-patterns) have
119+
supported it for some time. While
120+
this is not a great reason for Scala to do the same, having the feature exist in other languages means that users
121+
are more likely to expect the feature.
122+
123+
A smaller benefit for existing users is that removing the corner case leads to code which is
124+
easier to review; the absolute code difference between adding a bind variable within an alternative versus switching to a different
125+
encoding entirely is smaller and conveys the intent of such changesets better.
126+
127+
It is acknowledged, however, that such cases where we share the same logic across alternative branches are relatively rare compared to
128+
the usage of pattern matching in general. The current restrictions are not too arduous to work around for experienced practitioners, which
129+
can be inferred from the relatively low number of comments from the original [issue](https://github.com/scala/bug/issues/182) first raised in 2007.
130+
131+
To summarize, the main arguments for the proposal are to make the language more consistent, simpler and easier to teach. The arguments
132+
against a change are that it will be low impact for the majority of existing users.
133+
134+
## Proposed solution
135+
136+
Removing the alternative restriction means that we need to specify some additional constraints. Intuitively, we
137+
need to consider the restrictions on variable bindings within each alternative branch, as well as the types inferred
138+
for each binding within the scope of the pattern.
139+
140+
## Bindings
141+
142+
The simplest case of mixing an alternative pattern and bind variables is where we have two `UnApply` methods, with
143+
a single alternative pattern. For now, we specifically only consider the case where each bind variable is of the same
144+
type, like so:
145+
146+
~~~ scala
147+
enum Foo:
148+
case Bar(x: Int)
149+
case Baz(y: Int)
150+
151+
def fun = this match
152+
case Bar(z) | Baz(z) => ... // z: Int
153+
~~~
154+
155+
For the expression to make sense with the current semantics around pattern matches, `z` must be defined in both branches; otherwise the
156+
case body would be nonsensical if `z` was referenced within it (see [missing variables](#missing-variables) for a proposed alternative).
157+
158+
Removing the restriction would also allow recursive alternative patterns:
159+
160+
~~~ scala
161+
enum Foo:
162+
case Bar(x: Int)
163+
case Baz(x: Int)
164+
165+
enum Qux:
166+
case Quux(y: Int)
167+
case Corge(x: Foo)
168+
169+
def fun = this match
170+
case Quux(z) | Corge(Bar(z) | Baz(z)) => ... // z: Int
171+
~~~
172+
173+
Using an `Ident` within an `UnApply` is not the only way to introduce a binding within the pattern scope.
174+
We also expect to be able to use an explicit binding using an `@` like this:
175+
176+
~~~ scala
177+
enum Foo:
178+
case Bar()
179+
case Baz(bar: Bar)
180+
181+
def fun = this match
182+
case Baz(x) | x @ Bar() => ... // x: Foo.Bar
183+
~~~
184+
185+
## Types
186+
187+
We propose that the type of each variable introduced in the scope of the pattern be the least upper-bound of the type
188+
inferred within each branch.
189+
190+
~~~ scala
191+
enum Foo:
192+
case Bar(x: Int)
193+
case Baz(y: String)
194+
195+
def fun = this match
196+
case Bar(x) | Baz(x) => // x: Int | String
197+
~~~
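
The proposed typing can be compared with what current Scala 3 already produces when the alternatives are written as separate cases. The sketch below is an illustration under assumptions, not the proposal's implementation: `payload` and `render` are hypothetical names, and the scrutinee is passed as a parameter rather than matched through `this`.

```scala
enum Foo:
  case Bar(x: Int)
  case Baz(y: String)

import Foo.*

// Today's equivalent of `case Bar(x) | Baz(x) => x`: one case per alternative.
// The result type is annotated explicitly because an inferred union type
// would otherwise be widened.
def payload(foo: Foo): Int | String = foo match
  case Bar(x) => x
  case Baz(x) => x

// A consumer of the union recovers the precise type with a further match.
def render(value: Int | String): String = value match
  case i: Int    => s"int $i"
  case s: String => s"string $s"
```

Under the proposal, `x` in the single combined case would have the same `Int | String` type that `payload` returns here.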
198+
199+
We do not expect any inference to happen between branches. For example, in the case of a GADT we would expect the second branch of
200+
the following case to match all instances of `Bar`, regardless of the type of `A`.
201+
202+
~~~ scala
203+
enum Foo[A]:
204+
case Bar(a: A)
205+
case Baz(i: Int) extends Foo[Int]
206+
207+
def fun = this match
208+
case Baz(x) | Bar(x) => // x: Int | A
209+
~~~
210+
211+
### Given bind variables
212+
213+
It is possible to introduce bindings to the contextual scope within a pattern match branch.
214+
215+
Since most bindings will be anonymous but be referred to within the branches, we expect the _types_ present in the contextual scope for each branch to be the same rather than the _names_.
216+
217+
~~~ scala
218+
case class Context()
219+
220+
def run(using ctx: Context): Unit = ???
221+
222+
enum Foo:
223+
case Bar(ctx: Context)
224+
case Baz(i: Int, ctx: Context)
225+
226+
def fun = this match
227+
case Bar(given Context) | Baz(_, given Context) => run // `Context` appears in both branches
228+
~~~
229+
230+
This raises the question of what to do in the case of an explicit `@` binding where the user binds a variable to the same _name_ but to different types. We can either expose a `String | Int` within the contextual scope, or simply reject the code as invalid.
231+
232+
~~~ scala
233+
enum Foo:
234+
case Bar(s: String)
235+
case Baz(i: Int)
236+
237+
def fun = this match
238+
case Bar(x @ given String) | Baz(x @ given Int) => ???
239+
~~~
240+
241+
To be consistent with the named bindings, we argue that the code should compile and that a contextual variable of type `String | Int` should be added to the scope.
242+
243+
### Quoted patterns
244+
245+
[Quoted patterns](https://docs.scala-lang.org/scala3/guides/macros/quotes.html#quoted-patterns) will not be supported in this SIP and the behaviour of quoted patterns will remain the same as currently i.e. any quoted pattern appearing in an alternative pattern binding a variable or type variable will be rejected as illegal.
246+
247+
### Alternatives
248+
249+
#### Enforcing a single type for a bound variable
250+
251+
We could constrain the type of each bound variable to be the same in every alternative branch. Notably, this is what languages without sub-typing, such as Rust, do.
252+
253+
However, since untagged unions are part of Scala 3 and both unions and alternatives are written with `|`, it felt more natural to discard this restriction.
254+
255+
#### Type ascriptions in alternative branches
256+
257+
Another suggestion is that an _explicit_ type ascription by a user ought to be defined for all branches. For example, in the currently proposed rules, the following code would infer the return type to be `Int | A` even though the user has written the statement `id: Int`.
258+
259+
~~~scala
260+
enum Foo[A]:
261+
case Bar[A](a: A)
262+
case Baz[A](a: A)
263+
264+
def test = this match
265+
case Bar(id: Int) | Baz(id) => id
266+
~~~
267+
268+
In the author's subjective opinion, it is more natural to view the alternative arms as separate branches — which would be equivalent to the function below.
269+
270+
~~~scala
271+
def test = this match
272+
case Bar(id: Int) => id
273+
case Baz(id) => id
274+
~~~
275+
276+
On the other hand, if it is decided that each bound variable ought to be the same type, then arguably "sharing" explicit type ascriptions across branches would reduce boilerplate.
277+
278+
#### Missing variables
279+
280+
Unlike in other languages, we could assign a type, `A | Null`, to a bind variable which is not present in all of the alternative branches. Rust, for example, is constrained by the fact that the size of a variable must be known and untagged unions do not exist.
281+
282+
Arguably, missing a variable entirely is more likely to be an error. For example, the absence of a requirement for `var` declarations before assignment in Python means that beginners can easily assign a value to the wrong variable.
283+
284+
It may be that enforcing the same bind variables within each branch ought to be left to a linter rather than being a hard restriction within the language itself.
285+
286+
## Specification
287+
288+
We do not believe there are any syntax changes since the current specification already allows the proposed syntax.
289+
290+
We propose that the following clauses be added to the specification:
291+
292+
Let $`p_1 | \ldots | p_n`$ be an alternative pattern at an arbitrary depth within a case pattern and let $`\Gamma_n`$ be the named scope associated with each alternative.
293+
294+
If $`p_i`$ is a quoted pattern binding a variable or type variable, the alternative pattern is considered invalid. Otherwise, let the named variables introduced within each alternative $`p_n`$, be $`x_i \in \Gamma_n`$ and the unnamed contextual variables within each alternative have the type $`T_i \in \Gamma_n`$.
295+
296+
Each $`p_n`$ must introduce the same set of bindings, i.e. for each $`n`$, $`\Gamma_n`$ must have the same **named** members $`\Gamma_{n+1}`$ and the set of $`{T_0, ... T_n}`$ must be the same.
297+
298+
If $`X_{n,i}`$, is the type of the binding $`x_i`$ within an alternative $`p_n`$, then the consequent type, $`X_i`$, of the
299+
variable $`x_i`$ within the pattern scope, $`\Gamma`$ is the least upper-bound of all the types $`X_{n, i}`$ associated with
300+
the variable, $`x_i`$ within each branch.
301+
302+
## Compatibility
303+
304+
We believe the changes would be backwards compatible.
305+
306+
## Related Work
307+
308+
The language feature exists in multiple languages. Of the more popular languages, Rust added the feature in [2021](https://github.com/rust-lang/reference/pull/957) and
309+
Python within [PEP 636](https://peps.python.org/pep-0636/#or-patterns), the pattern matching PEP in 2020. Of course, Python is untyped and Rust does not have sub-typing
310+
but the semantics proposed are similar to this proposal.
311+
312+
Within Scala, the [issue](https://github.com/scala/bug/issues/182) was first raised in 2007. The author is also aware of attempts to fix this issue by [Lionel Parreaux](https://github.com/dotty-staging/dotty/compare/main...LPTK:dotty:vars-in-pat-alts) and the associated [feature request](https://github.com/lampepfl/dotty-feature-requests/issues/12) which
313+
was not submitted to the main dotty repository.
314+
315+
The associated [thread](https://contributors.scala-lang.org/t/pre-sip-bind-variables-for-alternative-patterns/6321) has some extra discussion around semantics. Historically, there have been multiple similar suggestions — in [2023](https://contributors.scala-lang.org/t/qol-sound-binding-in-pattern-alternatives/6226) by Quentin Bernet and in [2021](https://contributors.scala-lang.org/t/could-it-be-possible-to-allow-variable-binging-in-patmat-alternatives-for-scala-3-x/5235) by Alexey Shuksto.
316+
317+
## Implementation
318+
319+
The author has a current in-progress implementation focused on the typer which compiles the examples with the expected types. Interested
320+
parties are welcome to see the WIP [here](https://github.com/lampepfl/dotty/compare/main...yilinwei:dotty:main).
321+
322+
### Further work
323+
324+
#### Quoted patterns
325+
326+
More investigation is needed to see how quoted patterns with bind variables in alternative patterns could be supported.
327+
328+
## Acknowledgements
329+
330+
Many thanks to **Zainab Ali** for proof-reading the draft, **Nicolas Stucki** and **Guillaume Martres** for their pointers on the dotty
331+
compiler codebase.

‎_sips/sips/better-fors.md

Lines changed: 381 additions & 0 deletions
@@ -0,0 +1,381 @@
1+
---
2+
layout: sip
3+
permalink: /sips/:title.html
4+
stage: design
5+
status: submitted
6+
title: SIP-62 - For comprehension improvements
7+
---
8+
9+
**By: Kacper Korban (VirtusLab)**
10+
11+
## History
12+
13+
| Date | Version |
14+
|---------------|--------------------|
15+
| June 6th 2023 | Initial Draft |
16+
| Feb 15th 2024 | Reviewed Version |
17+
18+
## Summary
19+
20+
`for`-comprehensions in Scala 3 are more usable than in Scala 2, but some pain points remain, relating to both the usability of `for`-comprehensions and the simplicity of their desugaring.
21+
22+
This SIP tries to address some of those problems by changing the specification of `for`-comprehensions. From the user's perspective, the biggest change is allowing aliases at the start of `for`-comprehensions, e.g.
23+
24+
```scala
25+
for {
26+
x = 1
27+
y <- Some(2)
28+
} yield x + y
29+
```
30+
31+
## Motivation
32+
33+
There are some clear pain points related to Scala 3's `for`-comprehensions, and they can be divided into two categories:
34+
35+
1. User-facing and code simplicity problems
36+
37+
Specifically, for the following example written in a Haskell-style do-comprehension
38+
39+
```haskell
40+
do
41+
a = largeExpr(arg)
42+
b <- doSth(a)
43+
combineM(a, b)
44+
```
45+
in Scala we would have to write
46+
47+
```scala
48+
val a = largeExpr(arg)
49+
for
50+
b <- doSth(a)
51+
x <- combineM(a, b)
52+
yield x
53+
```
54+
55+
This complicates the code, even in this simple example.
56+
2. The simplicity of desugared code
57+
58+
The second pain point is that the desugared code of `for`-comprehensions can often be surprisingly complicated.
59+
60+
e.g.
61+
```scala
62+
for
63+
a <- doSth(arg)
64+
b = a
65+
yield a + b
66+
```
67+
68+
Intuition would suggest that the desugared code will be of the form
69+
70+
```scala
71+
doSth(arg).map { a =>
72+
val b = a
73+
a + b
74+
}
75+
```
76+
77+
But because of the possibility of an `if` guard being immediately after the pure alias, the desugared code is of the form
78+
79+
```scala
80+
doSth(arg).map { a =>
81+
val b = a
82+
(a, b)
83+
}.map { case (a, b) =>
84+
a + b
85+
}
86+
```
87+
88+
These unnecessary assignments and additional function calls not only add unnecessary runtime overhead but can also block other optimizations from being performed.
89+
90+
## Proposed solution
91+
92+
This SIP suggests the following changes to `for` comprehensions:
93+
94+
1. Allow `for` comprehensions to start with pure aliases
95+
96+
e.g.
97+
```scala
98+
for
99+
a = 1
100+
b <- Some(2)
101+
c <- doSth(a)
102+
yield b + c
103+
```
104+
2. Simpler conditional desugaring of pure aliases. i.e. whenever a series of pure aliases is not immediately followed by an `if`, use a simpler way of desugaring.
105+
106+
e.g.
107+
```scala
108+
for
109+
a <- doSth(arg)
110+
b = a
111+
yield a + b
112+
```
113+
114+
will be desugared to
115+
116+
```scala
117+
doSth(arg).map { a =>
118+
val b = a
119+
a + b
120+
}
121+
```
122+
123+
but
124+
125+
```scala
126+
for
127+
a <- doSth(arg)
128+
b = a
129+
if b > 1
130+
yield a + b
131+
```
132+
133+
will be desugared to
134+
135+
```scala
136+
doSth(arg).map { a =>
137+
val b = a
138+
(a, b)
139+
}.withFilter { case (a, b) =>
140+
b > 1
141+
}.map { case (a, b) =>
142+
a + b
143+
}
144+
```
145+
146+
3. Avoiding redundant `map` calls if the yielded value is the same as the last bound value.
147+
148+
e.g.
149+
```scala
150+
for
151+
a <- List(1, 2, 3)
152+
yield a
153+
```
154+
155+
will just be desugared to
156+
157+
```scala
158+
List(1, 2, 3)
159+
```
160+
161+
### Detailed description
162+
163+
#### Ad 1. Allow `for` comprehensions to start with pure aliases
164+
165+
Allowing `for` comprehensions to start with pure aliases is a straightforward change.
166+
167+
The Enumerators syntax will be changed from:
168+
169+
```
170+
Enumerators ::= Generator {semi Enumerator | Guard}
171+
```
172+
173+
to
174+
175+
```
176+
Enumerators ::= {Pattern1 `=' Expr semi} Generator {semi Enumerator | Guard}
177+
```
178+
179+
This allows zero or more aliases before the first generator.
180+
181+
As far as desugaring is concerned, a `for` comprehension starting with pure aliases will generate a block with those aliases as `val` declarations and the rest of the desugared `for` as an expression. If the aliases are followed by a guard, the desugaring should result in an error.
182+
183+
A new desugaring rule will be added:
184+
185+
```scala
186+
For any N:
187+
for (P_1 = E_1; ... P_N = E_N; ...)
188+
==>
189+
{
190+
val x_1 @ P_1 = E_1
191+
...
192+
val x_N @ P_N = E_N
193+
for (...)
194+
}
195+
```
196+
197+
e.g.
198+
199+
```scala
200+
for
201+
a = 1
202+
b <- Some(2)
203+
c <- doSth(a)
204+
yield b + c
205+
```
206+
207+
will desugar to
208+
209+
```scala
210+
{
211+
val a = 1
212+
for
213+
b <- Some(2)
214+
c <- doSth(a)
215+
yield b + c
216+
}
217+
```
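
The target of this desugaring is already expressible in current Scala 3, which makes the intended behaviour easy to check by hand. The following is a minimal sketch; `doSth` is a hypothetical stand-in for the function used in the examples, and `hoisted` is an illustrative name.

```scala
// Hypothetical stand-in for the `doSth` used in the examples.
def doSth(n: Int): Option[Int] = Some(n * 10)

// What can be written today: the leading alias is hoisted out as a plain
// `val` in an enclosing block, followed by the remaining comprehension.
val hoisted: Option[Int] =
  val a = 1
  for
    b <- Some(2)
    c <- doSth(a)
  yield b + c
```

Here `hoisted` evaluates to `Some(12)`, which is the value the proposed `for` starting with `a = 1` is expected to produce.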
218+
219+
#### Ad 2. Simpler conditional desugaring of pure aliases. i.e. whenever a series of pure aliases is not immediately followed by an `if`, use a simpler way of desugaring.
220+
221+
Currently, for consistency, all pure aliases are desugared as if they are followed by an `if` condition, which makes the desugaring more complicated than expected.
222+
223+
e.g.
224+
225+
The following code:
226+
227+
```scala
228+
for
229+
a <- doSth(arg)
230+
b = a
231+
yield a + b
232+
```
233+
234+
will be desugared to:
235+
236+
```scala
237+
doSth(arg).map { a =>
238+
val b = a
239+
(a, b)
240+
}.map { case (a, b) =>
241+
a + b
242+
}
243+
```
244+
245+
The proposed change is to introduce a simpler desugaring for common cases, when aliases aren't followed by a guard, and keep the old desugaring method for the other cases.
246+
247+
New desugaring rules will be introduced for simple desugaring.
248+
249+
```scala
250+
For any N:
251+
for (P <- G; P_1 = E_1; ... P_N = E_N; ...)
252+
==>
253+
G.flatMap (P => for (P_1 = E_1; ... P_N = E_N; ...))
254+
255+
And:
256+
257+
for () yield E ==> E
258+
259+
(Where empty for-comprehensions are excluded by the parser)
260+
```
261+
262+
It delegates desugaring aliases to the newly introduced rule from the previous improvement, i.e.
263+
264+
```scala
265+
For any N:
266+
for (P_1 = E_1; ... P_N = E_N; ...)
267+
==>
268+
{
269+
val x_1 @ P_1 = E_1
270+
...
271+
val x_N @ P_N = E_N
272+
for (...)
273+
}
274+
```
275+
276+
One other rule also has to be changed, so that the current desugaring method, of passing all the aliases in a tuple with the result, will only be used when desugaring a generator, followed by some aliases, followed by a guard.
277+
278+
```scala
279+
For any N:
280+
for (P <- G; P_1 = E_1; ... P_N = E_N; if E; ...)
281+
==>
282+
for (TupleN(P, P_1, ... P_N) <-
283+
for (x @ P <- G) yield {
284+
val x_1 @ P_1 = E_1
285+
...
286+
val x_N @ P_N = E_N
287+
TupleN(x, x_1, ..., x_N)
288+
}; if E; ...)
289+
```
290+
291+
These changes will make the desugaring work in the following way:
292+
293+
```scala
294+
for
295+
a <- doSth(arg)
296+
b = a
297+
yield a + b
298+
```
299+
300+
will be desugared to
301+
302+
```scala
303+
doSth(arg).map { a =>
304+
val b = a
305+
a + b
306+
}
307+
```
308+
309+
but
310+
311+
```scala
312+
for
313+
a <- doSth(arg)
314+
b = a
315+
if b > 1
316+
yield a + b
317+
```
318+
319+
will be desugared to
320+
321+
```scala
322+
doSth(arg).map { a =>
323+
val b = a
324+
(a, b)
325+
}.withFilter { case (a, b) =>
326+
b > 1
327+
}.map { case (a, b) =>
328+
a + b
329+
}
330+
```
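
The guarded case keeps the tuple-passing desugaring, and its equivalence with the comprehension can be checked in current Scala. The sketch below assumes a hypothetical `doSth` returning a `List[Int]`; `viaFor` and `viaCalls` are illustrative names.

```scala
// Hypothetical stand-in for `doSth`, chosen so that the guard filters something.
def doSth(arg: Int): List[Int] = List(arg, arg + 1, arg + 2)

// The comprehension with an alias followed by a guard.
val viaFor: List[Int] =
  for
    a <- doSth(0)
    b = a
    if b > 1
  yield a + b

// The explicit call chain: the alias travels alongside `a` in a tuple
// so that the guard and the final `map` can both see it.
val viaCalls: List[Int] =
  doSth(0).map { a =>
    val b = a
    (a, b)
  }.withFilter { case (a, b) =>
    b > 1
  }.map { case (a, b) =>
    a + b
  }
```

Both values are `List(4)`: only `a = 2` passes the guard.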
331+
332+
#### Ad 3. Avoiding redundant `map` calls if the yielded value is the same as the last bound value.
333+
334+
This change is strictly an optimization. This allows for the compiler to get rid of the final `map` call, if the yielded value is the same as the last bound pattern. The pattern can be either a single variable binding or a tuple.
335+
336+
One desugaring rule has to be modified for this purpose.
337+
338+
```scala
339+
for (P <- G) yield P ==> G
340+
If P is a variable or a tuple of variables and G is not a withFilter.
341+
342+
for (P <- G) yield E ==> G.map (P => E)
343+
Otherwise
344+
```
345+
346+
e.g.
347+
```scala
348+
for
349+
a <- List(1, 2, 3)
350+
yield a
351+
```
352+
353+
will just be desugared to
354+
355+
```scala
356+
List(1, 2, 3)
357+
```
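
The side condition that `G` is not a `withFilter` matters because a filtered generator is not yet a collection of the expected type. A small sketch in current Scala, with illustrative names `xs`, `yielded` and `filtered`:

```scala
val xs: List[Int] = List(1, 2, 3)

// Yields the last bound variable unchanged; under the proposal the
// trailing `map` can be dropped, leaving just `xs`.
val yielded: List[Int] = for (a <- xs) yield a

// With a guard, the generator becomes `xs.withFilter(a => a > 1)`.
// The final `map` is still needed here to turn the filtered view
// back into a `List`.
val filtered: List[Int] = for (a <- xs if a > 1) yield a
```

The observable results are the same with or without the optimization for a well-behaved `map`; only the number of calls differs.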
358+
359+
### Compatibility
360+
361+
This change may change the semantics of some programs. It may remove some `map` calls in the desugared code, which may change the program semantics (if the `map` implementation was side-effecting).
362+
363+
For example the following code will now have only one `map` call, instead of two:
364+
```scala
365+
for
366+
a <- doSth(arg)
367+
b = a
368+
yield a + b
369+
```
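
The difference in the number of `map` calls can be made visible with a container that counts them. The sketch below writes out both desugared shapes by hand, so it does not depend on which desugaring a given compiler version applies; `Counted` is a hypothetical type invented for this illustration.

```scala
// Hypothetical container that records how often `map` has been invoked.
final class Counted[A](val value: A, val mapCalls: Int = 0):
  def map[B](f: A => B): Counted[B] = Counted(f(value), mapCalls + 1)

// Shape of the current desugaring: the alias is passed along in a tuple,
// which requires two `map` calls.
val current: Counted[Int] =
  Counted(1).map { a =>
    val b = a
    (a, b)
  }.map { case (a, b) => a + b }

// Shape of the proposed desugaring: a single `map` call.
val proposed: Counted[Int] =
  Counted(1).map { a =>
    val b = a
    a + b
  }
```

Both shapes compute the same value, but a program that observes `mapCalls` (or any other side effect of `map`) can tell them apart.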
370+
371+
### Other concerns
372+
373+
As far as I know, there are no widely used Scala 3 libraries that depend on the desugaring specification of `for`-comprehensions.
374+
375+
## Links
376+
377+
1. Scala contributors discussion thread (pre-SIP): https://contributors.scala-lang.org/t/pre-sip-improve-for-comprehensions-functionality/3509/51
378+
2. GitHub issue discussion about for desugaring: https://github.com/lampepfl/dotty/issues/2573
379+
3. Scala 2 implementation of some of the improvements: https://github.com/oleg-py/better-monadic-for
380+
4. Implementation of one of the simplifications: https://github.com/lampepfl/dotty/pull/16703
381+
5. Draft implementation branch: https://github.com/dotty-staging/dotty/tree/improved-fors

‎_sips/sips/for-comprehension-improvements.md

Lines changed: 0 additions & 7 deletions
This file was deleted.

‎_sips/sips/match-types-amendment-extractors-follow-aliases-and-singletons.md

Lines changed: 0 additions & 7 deletions
This file was deleted.

‎_sips/sips/match-types-spec.md

Lines changed: 12 additions & 2 deletions
@@ -286,8 +286,18 @@ At the top level, `variance = 1` and `scrutIsWidenedAbstract = false`.
286286
* If `q` is a skolem type `∃α:X`, fail as not specific.
287287
* Otherwise, compute `matchPattern(ti, q.Y, 0, scrutIsWidenedAbstract)`.
288288
* Otherwise, the underlying type definition of `q.Y` is of the form `= U`:
289-
* If `q` is a skolem type `∃α:X` and `U` refers to `α`, fail as not specific.
290-
* Otherwise, compute `matchPattern(ti, U, 0, scrutIsWidenedAbstract)`.
289+
* If `q` is not a skolem type `∃α:X`, compute `matchPattern(ti, U, 0, scrutIsWidenedAbstract)`.
290+
* Otherwise, let `U' = dropSkolem(U)` be computed as follows:
291+
* `dropSkolem(q)` is undefined.
292+
* `dropSkolem(p.T) = p'.T` where `p' = dropSkolem(p)` if the latter is defined. Otherwise:
293+
* If the underlying type of `p.T` is of the form `= V`, then `dropSkolem(V)`.
294+
* Otherwise `dropSkolem(p.T)` is undefined.
295+
* `dropSkolem(p.x) = p'.x` where `p' = dropSkolem(p)` if the latter is defined. Otherwise:
296+
* If the dealiased underlying type of `p.x` is a singleton type `r.y`, then `dropSkolem(r.y)`.
297+
* Otherwise `dropSkolem(p.x)` is undefined.
298+
* For all other types `Y`, `dropSkolem(Y)` is the type formed by replacing each component `Z` of `Y` by `dropSkolem(Z)`.
299+
* If `U'` is undefined, fail as not specific.
300+
* Otherwise, compute `matchPattern(ti, U', 0, scrutIsWidenedAbstract)`.
291301
* If `T` is a concrete type alias to a type lambda:
292302
* Let `P'` be the beta-reduction of `P`.
293303
* Compute `matchPattern(P', X, variance, scrutIsWidenedAbstract)`.

‎_sips/sips/mprove-the-syntax-of-context-bounds-and-givens.md

Lines changed: 0 additions & 7 deletions
This file was deleted.

‎_sips/sips/multiple-assignments.md

Lines changed: 328 additions & 4 deletions
@@ -1,7 +1,331 @@
11
---
2-
title: SIP-59 - Multiple assignments
3-
status: waiting-for-implementation
4-
pull-request-number: 73
2+
layout: sip
3+
permalink: /sips/:title.html
54
stage: implementation
6-
5+
status: waiting-for-implementation
6+
presip-thread: https://contributors.scala-lang.org/t/pre-sip-multiple-assignments/6425
7+
title: SIP-59 - Multiple Assignments
78
---
9+
10+
**By: Dimi Racordon**
11+
12+
## History
13+
14+
| Date | Version |
15+
|---------------|--------------------|
16+
| Jan 17th 2024 | Initial Draft |
17+
18+
## Summary
19+
20+
This proposal discusses the syntax and semantics of a construct to assign multiple variables with a single expression.
21+
This feature would simplify the implementation of operations expressed in terms of relationships between multiple variables, such as [`std::swap`](https://en.cppreference.com/w/cpp/algorithm/swap) in C++.
22+
23+
## Motivation
24+
25+
It happens that one has to assign multiple variables "at once" in an algorithm.
26+
For example, let's consider the Fibonacci sequence:
27+
28+
```scala
29+
class FibonacciIterator() extends Iterator[Int]:
30+
31+
private var a: Int = 0
32+
private var b: Int = 1
33+
34+
def hasNext = true
35+
def next() =
36+
val r = a
37+
val n = a + b
38+
a = b
39+
b = n
40+
r
41+
```
42+
43+
The same iterator could be rewritten more concisely if we could assign multiple variables at once.
44+
For example, we can write the following in Swift:
45+
46+
```swift
47+
struct FibonacciIterator: IteratorProtocol {
48+
49+
private var a: Int = 0
50+
private var b: Int = 1
51+
init() {}
52+
53+
mutating func next() -> Int? {
54+
defer { (a, b) = (b, a + b) }
55+
return a
56+
}
57+
58+
}
59+
```
60+
61+
Though the differences may seem frivolous at first glance, they are in fact important.
62+
If we look at a formal definition of the Fibonacci sequence (e.g., on [Wikipedia](https://en.wikipedia.org/wiki/Fibonacci_sequence)), we might see something like:
63+
64+
> The Fibonacci sequence is given by *F(n) = F(n-1) + F(n-2)* where *F(0) = 0* and *F(1) = 1*.
65+
66+
Although this declarative description says nothing about an evaluation order, it becomes a concern in our Scala implementation as we must encode the relationship into multiple operational steps.
67+
This decomposition offers opportunities to get things wrong:
68+
69+
```scala
70+
def next() =
71+
val r = a
72+
a = b
73+
b = a + b // invalid semantics, the value of `a` changed "too early"
74+
r
75+
```
76+
77+
In contrast, our Swift implementation can remain closer to the formal definition and is therefore more legible and less error-prone.
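
In current Scala, one way to avoid the ordering mistake is to go through a tuple pattern, so that both right-hand sides are evaluated before either variable is updated. This is a sketch of a workaround rather than the construct proposed here; `nextA` and `nextB` are illustrative names.

```scala
class FibonacciIterator extends Iterator[Int]:

  private var a: Int = 0
  private var b: Int = 1

  def hasNext: Boolean = true

  def next(): Int =
    val r = a
    // `b` and `a + b` are both computed from the old values
    // before `a` or `b` is reassigned.
    val (nextA, nextB) = (b, a + b)
    a = nextA
    b = nextB
    r
```

The extra pair of names is the boilerplate that a multiple-assignment construct would remove.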
78+
79+
Multiple assignments show up in many general-purpose algorithms (e.g., insertion sort, partition, min-max element, ...).
80+
But perhaps the most fundamental one is `swap`, which consists of exchanging two values.
81+
82+
We often swap values that are stored in some collection.
83+
In this particular case, all is well in Scala because we can ask the collection to swap elements at given positions:
84+
85+
```scala
86+
extension [T](self: mutable.ArrayBuffer[T])
87+
def swapAt(i: Int, j: Int) =
88+
val t = self(i)
89+
self(i) = self(j)
90+
self(j) = t
91+
92+
val a = mutable.ArrayBuffer(1, 2, 3)
93+
a.swapAt(0, 2)
94+
println(a) // ArrayBuffer(3, 2, 1)
95+
```
96+
97+
Sadly, one can't implement a generic swap method that wouldn't rely on the ability to index a container.
98+
The only way to express this operation in Scala is to "inline" the pattern implemented by `swapAt` every time we need to swap two values.
99+
100+
Having to rewrite this boilerplate is unfortunate.
101+
Here is an example in a realistic algorithm:
102+
103+
```scala
104+
extension [T](self: Seq[T])(using Ordering[T])
105+
def minMaxElements: Option[(T, T)] =
106+
import math.Ordering.Implicits.infixOrderingOps
107+
108+
// Return None for collections smaller than 2 elements.
109+
var i = self.iterator
110+
if (!i.hasNext) { return None }
111+
var l = i.next()
112+
if (!i.hasNext) { return None }
113+
var h = i.next()
114+
115+
// Confirm the initial bounds.
116+
    if (h < l) { val t = l; l = h; h = t }
117+
118+
// Process the remaining elements.
119+
def loop(): Option[(T, T)] =
120+
if (i.hasNext) {
121+
val n = i.next()
122+
if (n < l) { l = n } else if (n > h) { h = n }
123+
loop()
124+
} else {
125+
Some((l, h))
126+
}
127+
loop()
128+
```
129+
130+
*Note: implementation shamelessly copied from [swift-algorithms](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/MinMax.swift).*
131+
132+
The swap occurs in the middle of the method with the sequence of expressions `val t = l; l = h; h = t`.
133+
To borrow from the words of Edsger Dijkstra [1, Chapter 11]:
134+
135+
> [that] is cumbersome and ugly compared with the [multiple] assignment.
136+
137+
While `swap` is a very common operation, it's only an instance of a more general class of operations that are expressed in terms of relationships between multiple variables.
138+
The definition of the Fibonacci sequence is another example.
139+
140+
## Proposed solution
141+
142+
The proposed solution is to add a language construct to assign multiple variables in a single expression.
143+
Using this construct, swapping two values can be written as follows:
144+
145+
```scala
146+
var a = 2
147+
var b = 4
148+
(a, b) = (b, a)
149+
println(s"$a$b") // 42
150+
```
151+
152+
The above Fibonacci iterator can be rewritten as follows:
153+
154+
```scala
155+
class FibonacciIterator() extends Iterator[Int]:
156+
157+
private var a: Int = 0
158+
private var b: Int = 1
159+
160+
def hasNext = true
161+
def next() =
162+
val r = a
163+
(a, b) = (b, a + b)
164+
r
165+
```
166+
167+
Multiple assignments also alleviate the need for a swap method on collections, as the same idiomatic pattern can be reused to exchange elements at given indices:
168+
169+
```scala
170+
val a = mutable.ArrayBuffer(1, 2, 3)
171+
(a(0), a(2)) = (a(2), a(0))
172+
println(a) // ArrayBuffer(3, 2, 1)
173+
```
174+
175+
### Specification
176+
177+
A multiple assignment is an expression of the form `AssignTarget ‘=’ Expr` where:
178+
179+
```
180+
AssignTarget ::= ‘(’ AssignTargetNode {‘,’ AssignTargetNode} ‘)’
181+
AssignTargetNode ::= Expr | AssignTarget
182+
```
183+
184+
An assignment target describes a structural pattern that can only be matched by a compatible composition of tuples.
185+
For example, the following program is legal.
186+
187+
```scala
188+
def f: (Boolean, Int) = (true, 42)
189+
val a = mutable.ArrayBuffer(1, 2, 3)
190+
def b = a
191+
var x = false
192+
193+
(x, a(0)) = (false, 1337)
194+
(x, a(1)) = f
195+
((x, a(1)), b(2)) = (f, 9000)
196+
(x) = Tuple1(false)
197+
```
198+
199+
A mismatch between the structure of a multiple assignment's target and the result of its RHS is a type error.
200+
It cannot be detected during parsing because at this stage the compiler would not be able to determine the shape of an arbitrary expression's result.
201+
For example, all multiple assignments in the following program are ill-typed:
202+
203+
```scala
204+
def f: (Boolean, Int) = (true, 42)
205+
val a = mutable.ArrayBuffer(1, 2, 3)
206+
def b = a
207+
var x = false
208+
209+
(a(1), x) = f // type mismatch
210+
(x, a(1), b(2)) = (f, 9000) // structural mismatch
211+
(x) = false // structural mismatch
212+
(x) = (1, 2) // structural mismatch
213+
```
214+
215+
Likewise, `(x) = Tuple1(false)` is _not_ equivalent to `x = Tuple1(false)`.
216+
The former is a multiple assignment while the latter is a regular assignment, as described by the [current grammar](https://docs.scala-lang.org/scala3/reference/syntax.html) (see `Expr1`).
217+
Though this distinction is subtle, multiple assignments involving unary tuples should be rare.
218+
219+
The operational semantics of multiple assignments (aka concurrent assignments) have been studied extensively in scientific literature (e.g., [1, 2]).
220+
A first intuition is that the most desirable semantics can be achieved by fully evaluating the RHS of the assignment before assigning any expression in the LHS [1].
221+
However, additional considerations must be given w.r.t. the independence of the variables on the LHS to guarantee deterministic results.
222+
For example, consider the following expression:
223+
224+
```scala
225+
(x, x) = (1, 2)
226+
```
227+
228+
While one may conclude that such an expression should be an error [1], it is in general difficult to guarantee value independence in a language with pervasive reference semantics.
229+
Further, it is desirable to write expressions of the form `(a(0), a(2)) = (a(2), a(0))`, as shown in the previous section.
230+
Another complication is that multiple assignments should uphold the general left-to-right evaluation semantics of the Scala language.
231+
For example, `a.b = c` requires `a` to be evaluated _before_ `c`.
232+
233+
Note that regular assignments desugar to function calls (e.g., `a(b) = c` is sugar for `a.update(b, c)`).
234+
One property of these desugarings is that the RHS of the assignment is always the last expression evaluated before the method performing the assignment is called.
235+
Given this observation, we address the abovementioned issues by defining the following algorithm:
236+
237+
1. Traverse the LHS structure in inorder and for each leaf:
238+
- Evaluate each outermost subexpression to its value
239+
- Form a closure capturing these values and accepting a single argument to perform the desugared assignment
240+
- Associate that closure to the leaf
241+
2. Compute the value of the RHS, which forms a tree
242+
3. Traverse the LHS and RHS structures pairwise in inorder and for each leaf:
243+
   - Apply the closure previously associated with the LHS leaf to the corresponding RHS value
244+
245+
For instance, consider the following definitions.
246+
247+
```scala
248+
def f: (Boolean, Int) = (true, 42)
249+
val a = mutable.ArrayBuffer(1, 2, 3)
250+
def b = a
251+
var x = false
252+
```
253+
254+
The evaluation of the expression `((x, a(a(0))), b(2)) = (f, 9000)` is as follows:
255+
256+
1. form a closure `f0 = (rhs) => x_=(rhs)`
257+
2. evaluate `a(0)`; result is `1`
258+
3. form a closure `f1 = (rhs) => a.update(1, rhs)`
259+
4. evaluate `b`; result is `a`
260+
5. evaluate `2`
261+
6. form a closure `f2 = (rhs) => a.update(2, rhs)`
262+
7. evaluate `(f, 9000)`; result is `((true, 42), 9000)`
263+
8. evaluate `f0(true)`
264+
9. evaluate `f1(42)`
265+
10. evaluate `f2(9000)`
266+
267+
After the assignment, `x == true` and `a == ArrayBuffer(1, 42, 9000)`.
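
The steps above can be approximated by hand in current Scala. The following is a minimal sketch, not the desugaring a compiler would necessarily emit; the names `desugaredExample`, `f0`, `f1`, `f2`, `i1`, `b2`, `r0`, `r1`, and `r2` are illustrative assumptions.

```scala
import scala.collection.mutable

// Hand-written approximation of the evaluation steps for
// `((x, a(a(0))), b(2)) = (f, 9000)`.
def desugaredExample(): (Boolean, mutable.ArrayBuffer[Int]) =
  def f: (Boolean, Int) = (true, 42)
  val a = mutable.ArrayBuffer(1, 2, 3)
  def b = a
  var x = false

  // Steps 1-6: evaluate the subexpressions of each LHS leaf, left to right,
  // and capture them in a closure that performs the assignment later.
  val f0 = (rhs: Boolean) => { x = rhs }
  val i1 = a(0)                              // evaluates to 1
  val f1 = (rhs: Int) => a.update(i1, rhs)
  val b2 = b                                 // evaluates to `a`
  val f2 = (rhs: Int) => b2.update(2, rhs)

  // Step 7: evaluate the RHS once, after the LHS subexpressions.
  val ((r0, r1), r2) = (f, 9000)

  // Steps 8-10: apply the closures in LHS order.
  f0(r0)
  f1(r1)
  f2(r2)
  (x, a)
```

Because every index and receiver is captured before the RHS is evaluated, and every update happens after it, the result does not depend on how the updates interleave with the reads.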
268+
269+
The compiler is allowed to ignore this procedure and generate different code for optimization purposes as long as it can guarantee that such a change is not observable.
270+
For example, given two local variables `x` and `y`, their assignments in `(x, y) = (1, 2)` can be reordered or even performed in parallel.
271+
272+
### Compatibility
273+
274+
This proposal is purely additive and has no backward binary or TASTy compatibility consequences.
275+
The semantics of the proposed new construct is fully expressible in terms of desugaring into current syntax, interpreted with current semantics.
276+
277+
The proposed syntax is not currently legal Scala.
278+
Therefore no currently existing program could be interpreted with different semantics using a newer compiler version supporting multiple assignments.
279+
280+
### Other concerns
281+
282+
One understandable concern about the proposed syntax is that a multiple assignment looks like a pattern match, yet it has different semantics.
283+
For example:
284+
285+
```scala
286+
val (a(x), b) = (true, "!") // 1
287+
288+
(a(x), b) = (true, "!") // 2
289+
```
290+
291+
If `a` is an instance of a type with a companion extractor object, the two lines above have completely different semantics.
292+
The first declares two local bindings `x` and `b`, applying pattern matching to determine their value from the tuple `(true, "!")`.
293+
The second assigns the values `true` and `"!"` to `a(x)` and `b`, respectively.
294+
295+
Though possibly surprising, the difference in behavior is easy to explain.
296+
The first line applies pattern matching because it starts with `val`.
297+
The second doesn't because it involves no pattern matching introducer.
298+
Further, note that a similar situation can already be reproduced in current Scala:
299+
300+
```scala
301+
val a(x) = true // 1
302+
303+
a(x) = true // 2
304+
```
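
To make the contrast concrete, here is a small sketch that compiles in current Scala 3. The object `a`, its `stored` field, and the function `patternVersusAssignment` are hypothetical and exist only for illustration; `a` is deliberately given both an `unapply` and an `update` method so that both lines are legal.

```scala
// Hypothetical object offering both an extractor and an `update` method.
object a:
  var stored: Map[String, Boolean] = Map.empty
  // Called by pattern matching; returning `Some` keeps the pattern irrefutable.
  def unapply(v: Boolean): Some[String] = Some(if v then "yes" else "no")
  // Called by the assignment sugar `a(key) = value`.
  def update(key: String, value: Boolean): Unit =
    stored = stored.updated(key, value)

def patternVersusAssignment(): (String, Map[String, Boolean]) =
  val a(x) = true   // 1: pattern definition, binds `x` via `a.unapply(true)`
  a(x) = true       // 2: regular assignment, sugar for `a.update(x, true)`
  (x, a.stored)
```

The first line introduces a new binding `x`, while the second reads the existing `x` and mutates `a`.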
305+
306+
## Alternatives
307+
308+
The current proposal supports arbitrary tree structures on the LHS of the assignment.
309+
A simpler alternative would be to only support flat sequences, allowing the syntax to dispense with parentheses.
310+
311+
```scala
312+
a, b = b, a
313+
```
314+
315+
While this approach is more lightweight, the reduced expressiveness inhibits potentially interesting use cases.
316+
Further, consistently using tuple syntax on both sides of the assignment operator clearly distinguishes regular and multiple assignments.
317+
318+
## Related work
319+
320+
A Pre-SIP discussion took place prior to this proposal (see [here](https://contributors.scala-lang.org/t/pre-sip-multiple-assignments/6425/1)).
321+
322+
Multiple assignments are present in many contemporary languages.
323+
This proposal already illustrated them in Swift, but they are also commonly used in Python.
324+
Multiple assignments have also been studied extensively in scientific literature (e.g., [1, 2]).
325+
326+
## FAQ
327+
328+
## References
329+
330+
1. Edsger W. Dijkstra: A Discipline of Programming. Prentice-Hall 1976, ISBN 013215871X
331+
2. Ralph-Johan Back, Joakim von Wright: Refinement Calculus - A Systematic Introduction. Graduate Texts in Computer Science, Springer 1998, ISBN 978-0-387-98417-9

‎_sips/sips/named-tuples.md

Lines changed: 774 additions & 4 deletions
Large diffs are not rendered by default.

‎_sips/sips/typeclasses-syntax.md

Lines changed: 685 additions & 0 deletions
Large diffs are not rendered by default.

‎_sips/sips/unroll-default-arguments-for-binary-compatibility.md

Lines changed: 0 additions & 7 deletions
This file was deleted.

‎_sips/sips/unroll-default-arguments.md

Lines changed: 934 additions & 0 deletions
Large diffs are not rendered by default.
