Skip to content

rewrite rules mean implicit function variables cannot be safely passed as arguments #10889

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
elfprince13 opened this issue Dec 22, 2020 · 9 comments · Fixed by #14210
Closed

Comments

@elfprince13
Copy link

(discussion carried over from #10825 at @LPTK's suggestion).

Minimized code

import scala.annotation.tailrec

object Explode {
  trait Txn {}
  type AtomicOp[Z] = Txn ?=> Z

  object AbortAndRetry extends scala.util.control.ControlThrowable("abort and retry this transaction") {}

  def beginTxn:Txn = new Txn {}
  def commit(using txn:Txn) = throw AbortAndRetry

  @tailrec def atomic[Z](txnOp:AtomicOp[Z]):Z = {
    try {
      given Txn = beginTxn
      val ret:Z = txnOp
      commit
      ret
    } catch {
      case AbortAndRetry => atomic(txnOp)
    }
  }

  def main(args:Array[String]) = {
    Console.println(atomic {
      "hello world"
    })
  }
}

Output

StackOverflowError originating at Explode$.atomic$$anonfun$1(Explode.scala:19):

      case AbortAndRetry => atomic(txnOp)

Expectation

STMs and other optimistic approaches to speculative execution of composite atomic operations typically rely on an abort/retry loop, and are a pretty standard example for the benefits of contextual abstractions (see, e.g, ScalaSTM).

Unfortunately, the current version of dotty uses the following rule for context functions passed as arguments:

Conversely, if the expected type of an expression E is a context function type
(T_1, ..., T_n) ?=> U and E is not already an
context function literal, E is converted to a context function literal by rewriting it to

 (x_1: T1, ..., x_n: Tn) ?=> E

where the names x_1, ..., x_n are arbitrary. This expansion is performed
before the expression E is typechecked, which means that x_1, ..., x_n
are available as givens in E.

This means that even if E is already a variable of type (T_1, ..., T_n) ?=> U, the wrapper is still produced.
In the abort/retry loop handler structure above, this means that (a) a potentially very large number of transient objects are invisibly produced in a performance-sensitive context, (b) the stack space consumed by invoking txnOp grows linearly with the number of abort/retry cycles, and, in the pathological case, potentially side-effecting control flow with a StackOverflowError.

At a more basic level it breaks one of the fundamental assumptions that I think most programmers have about type systems, which is that if I have a variable x:S and function f:S=>T, then invoking f(x) will actually pass the value x to f.

We can work around this by falling back to something which is closer to the Scala 2 style of "implicit functions":

import scala.annotation.tailrec

object Explode {
  trait Txn {}
  type AtomicOp[Z] = Txn ?=> Z
  type VanillaAtomicOp[Z] = Txn => Z

  object AbortAndRetry extends scala.util.control.ControlThrowable("abort and retry this transaction") {}

  def beginTxn:Txn = new Txn {}
  def commit(using txn:Txn) = throw AbortAndRetry

  @tailrec def atomic[Z](txn:VanillaAtomicOp[Z]):Z = {
    try {
      given Txn = beginTxn
      val ret:Z = txn.asInstanceOf[AtomicOp[Z]]
      commit
      ret
    } catch {
      case AbortAndRetry => atomic(txn)
    }
  }

  def main(args:Array[String]) = {
    Console.println(atomic {
      (txn:Txn) =>
        given Txn = txn
        "hello world"
    })
  }
}

But this is clearly a lot less elegant and feels like it should not be necessary. In fact it runs counter to the stated goal of implicit functions:

The new formulation of comp might seem like a minor syntactic twist, but it has far-reaching consequences. Because implicit Ord[T] => Boolean is now a type, it can be abstracted over by giving it a name and using only the name afterwards. This is exploited the following example, which shows how to approach the configuration problem [Kiselyov and Shan 2004] in a type-safe manner. The goal is to propagate a run-time value, the modulus, while doing arithmetic with the usual functions, add and mul, uncluttered.

[ Odersky, Blanvillain, Liu, Biboudis, Miller, and Stucki. 2018. Simplicitly: Foundations and Applications of Implicit Function Types ]

@elfprince13
Copy link
Author

The "obvious" fix to the spec is:

if E is not already an context function literal or an expression of type (x_1: T1, ..., x_n: Tn) ?=> E

and from my (admittedly fast) read through of the Simplicitly paper, I didn't find a theoretical reason not to do this (and the quote above suggests that the version I propose is in fact the actual intent of the paper)

However the next sentence from the docs:

This expansion is performed before the expression E is typechecked, which means that x_1, ..., x_n are available as givens in E.

makes it sound like there are some implementation difficulties to doing this naively.

I've implemented my fair share of interpreters and type checkers for toy languages, but I am by no means a compiler engineer (and certainly not familiar with the dotty code base) so I'm not sure where a reasonable patch would even begin (perhaps conditionally type-checking E before deciding whether it needs to be rewritten); however, I feel very strongly that it would be a shame to leave a wart like this in a language (and language feature!) whose design was largely motivated by removing warts.

@odersky
Copy link
Contributor

odersky commented Dec 22, 2020

This behavior is deeply ingrained in both the theory and implementation. The only possible improvement I see is if we could apply an optimization that shortcuts closure creation and application pairs. That would be a worthwhile project to undertake.

@elfprince13
Copy link
Author

I feel like if it's implemented as an optimization, it would need an annotation like @tailrec to let the programmer force an error if the optimization cannot be applied.

@LPTK
Copy link
Contributor

LPTK commented Dec 22, 2020

@elfprince13 where would you put the annotation? I think it would just make everything more confusing, as it's quite a specialized case.

An optimization like this sounds rather straightforward to implement, and from your example it looks like it could be useful to guarantee that the optimization is always performed.

There was a similar story with for comprehensions and their terminal map calls (they currently occur even when we end up mapping the identity function), which sometimes cause stack overflows. A similar guaranteed optimization would likely be useful there too.

@elfprince13
Copy link
Author

I definitely have no objection to not having to add annotations as long as there is a guarantee somewhere.

@elfprince13
Copy link
Author

Okay, a couple questions.

  1. Is the code that actually does this rewrite the snippet here?
    https://github.com/lampepfl/dotty/blob/3e471ab8e6ed8ebd83819a440d21043f3c05ef92/compiler/src/dotty/tools/dotc/typer/Typer.scala#L2577-L2597

  2. If so, is the main thing needed to implement the optimization a MiniPhase that runs after Typer in dotty.tools.dotc.transform and recognizes the tree structure produced by the above snippet (and undoes it)?

  3. Is this really better than trying to special-case the original rewrite? I took another pass through the Simplicitly paper and found this snippet:

    The typing rule (?→ I) type checks an expression by assuming its expansion to an implicit function, but only if the expected type of the expression is of an implicit function type. This last point is the reason why we do not have a syntax for implicit functions: if we did, this rule would have been applied endlessly! While it is a simple check in the compiler to stop applying this rule if the term is already an implicit function, avoiding to specify such a rule makes the calculus more clear.

    Which I think is talking about this use case?

@odersky
Copy link
Contributor

odersky commented Dec 26, 2020

  1. Is the code that actually does this rewrite the snippet here?

Correct.

  1. If so, is the main thing needed to implement the optimization a MiniPhase that runs after Typer in dotty.tools.dotc.transform and recognizes the tree structure produced by the above snippet (and undoes it)?

Yes, that seems to be the best way forward. We do something similar for call-by-name parameters.

  1. Is this really better than trying to special-case the original rewrite?

You can't special case the original rewrite since at that point you don't have the type of any tree.

@elfprince13
Copy link
Author

You can't special case the original rewrite since at that point you don't have the type of any tree.

Understood.

Yes, that seems to be the best way forward. We do something similar for call-by-name parameters.

Okay, so it looks like the three relevant classes are TransformByNameApply, ByNameClosures, and ElimByName, and in particular, this appears to be the special-case that checks if a by-name argument has already been thunked and avoids rethunking it:
https://github.com/lampepfl/dotty/blob/ac2cd05403813a6b0ab918ad3b7e19b97d7bfa10/compiler/src/dotty/tools/dotc/transform/TransformByNameApply.scala#L52-L54

Presumably the starting point would be to base a class off of TransformByNameApply's boilerplate with a different implementation of transformArg.

Presumably if formal is a context function, I would want to match on some closure, Closure(env, meth, EmptyTree), but what I'm unsure of is if its then sufficient to look at env for a captured value whose type is a subtype of formal, or if I need to somehow get the body of meth and look for an Apply()?

@elfprince13
Copy link
Author

@odersky @LPTK - Any follow-up on the above question? Would be happy to put a PR together for this in the next few weeks while I'm likely working on a conference submission that would benefit from the enhancement.

elfprince13 added a commit to elfprince13/dotty that referenced this issue Oct 14, 2021
… contextual closures if they wrap a context function of the expected type. Add a runtime test to probe for stackoverflow if this optimization is not performed.
odersky added a commit to dotty-staging/dotty that referenced this issue Jan 4, 2022
Fixes scala#10889. Alternative to scala#13750, from which the test is taken.
olsdavis pushed a commit to olsdavis/dotty that referenced this issue Apr 4, 2022
Fixes scala#10889. Alternative to scala#13750, from which the test is taken.
@Kordyjan Kordyjan added this to the 3.1.2 milestone Aug 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants