Skip to content

String interpolation optimization miniphase #3961

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions compiler/src/dotty/tools/dotc/Compiler.scala
Original file line number Diff line number Diff line change
Expand Up @@ -7,16 +7,15 @@ import Periods._
import Symbols._
import Types._
import Scopes._
import typer.{FrontEnd, Typer, ImportInfo, RefChecks}
import reporting.{Reporter, ConsoleReporter}
import typer.{FrontEnd, ImportInfo, RefChecks, Typer}
import reporting.{ConsoleReporter, Reporter}
import Phases.Phase
import transform._
import util.FreshNameCreator
import core.DenotTransformers.DenotTransformer
import core.Denotations.SingleDenotation

import dotty.tools.backend.jvm.{LabelDefs, GenBCode, CollectSuperCalls}
import dotty.tools.dotc.transform.localopt.Simplify
import dotty.tools.backend.jvm.{CollectSuperCalls, GenBCode, LabelDefs}
import dotty.tools.dotc.transform.localopt.{Simplify, StringInterpolatorOpt}

/** The central class of the dotc compiler. The job of a compiler is to create
* runs, which process given `phases` in a given `rootContext`.
Expand Down Expand Up @@ -78,6 +77,7 @@ class Compiler {
new PatternMatcher, // Compile pattern matches
new ExplicitOuter, // Add accessors to outer classes from nested ones.
new ExplicitSelf, // Make references to non-trivial self types explicit as casts
new StringInterpolatorOpt, // Optimizes raw and s string interpolators by rewriting them to string concatentations
new CrossCastAnd, // Normalize selections involving intersection types.
new Splitter) :: // Expand selections involving union types into conditionals
List(new ErasedDecls, // Removes all erased defs and vals decls (except for parameters)
Expand Down
10 changes: 10 additions & 0 deletions compiler/src/dotty/tools/dotc/core/Definitions.scala
Original file line number Diff line number Diff line change
Expand Up @@ -583,6 +583,16 @@ class Definitions {
lazy val StringAdd_plusR = StringAddClass.requiredMethodRef(nme.raw.PLUS)
def StringAdd_+(implicit ctx: Context) = StringAdd_plusR.symbol

lazy val StringContextType: TypeRef = ctx.requiredClassRef("scala.StringContext")
def StringContextClass(implicit ctx: Context) = StringContextType.symbol.asClass
lazy val StringContextSR = StringContextClass.requiredMethodRef(nme.s)
def StringContextS(implicit ctx: Context) = StringContextSR.symbol
lazy val StringContextRawR = StringContextClass.requiredMethodRef(nme.raw_)
def StringContextRaw(implicit ctx: Context) = StringContextRawR.symbol
def StringContextModule(implicit ctx: Context) = StringContextClass.companionModule
lazy val StringContextModule_applyR = StringContextModule.requiredMethodRef(nme.apply)
def StringContextModule_apply(implicit ctx: Context) = StringContextModule_applyR.symbol
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarter Can you comment on this? Should we expose only the ones used in the optimisation (i.e. StringContextRaw, StringContextS, StringContextModule_apply)? Should these values be inlined in the mini phase and be initialised in prepareForUnit?


lazy val PartialFunctionType: TypeRef = ctx.requiredClassRef("scala.PartialFunction")
def PartialFunctionClass(implicit ctx: Context) = PartialFunctionType.symbol.asClass
lazy val AbstractPartialFunctionType: TypeRef = ctx.requiredClassRef("scala.runtime.AbstractPartialFunction")
Expand Down
2 changes: 2 additions & 0 deletions compiler/src/dotty/tools/dotc/core/StdNames.scala
Original file line number Diff line number Diff line change
Expand Up @@ -486,6 +486,7 @@ object StdNames {
val productElement: N = "productElement"
val productIterator: N = "productIterator"
val productPrefix: N = "productPrefix"
val raw_ : N = "raw"
val readResolve: N = "readResolve"
val reflect : N = "reflect"
val reify : N = "reify"
Expand All @@ -495,6 +496,7 @@ object StdNames {
val runtime: N = "runtime"
val runtimeClass: N = "runtimeClass"
val runtimeMirror: N = "runtimeMirror"
val s: N = "s"
val sameElements: N = "sameElements"
val scala_ : N = "scala"
val scalaShadowing : N = "scalaShadowing"
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,109 @@
package dotty.tools.dotc.transform.localopt

import dotty.tools.dotc.ast.Trees._
import dotty.tools.dotc.ast.tpd
import dotty.tools.dotc.core.Constants.Constant
import dotty.tools.dotc.core.Contexts.Context
import dotty.tools.dotc.core.StdNames._
import dotty.tools.dotc.core.Symbols._
import dotty.tools.dotc.transform.MegaPhase.MiniPhase

/**
* MiniPhase to transform s and raw string interpolators from using StringContext to string
* concatenation. Since string concatenation uses the Java String builder, we get a performance
* improvement in terms of these two interpolators.
*
* More info here:
* https://medium.com/@dkomanov/scala-string-interpolation-performance-21dc85e83afd
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the author attribution. Also add a comment explaining what this phase does

class StringInterpolatorOpt extends MiniPhase {
import tpd._

override def phaseName: String = "stringInterpolatorOpt"

/** Matches a list of constant literals */
private object Literals {
def unapply(tree: SeqLiteral)(implicit ctx: Context): Option[List[Literal]] = {
tree.elems match {
case literals if literals.forall(_.isInstanceOf[Literal]) =>
Some(literals.map(_.asInstanceOf[Literal]))
case _ => None
}
}
}

private object StringContextIdent {
def unapply(tree: Ident)(implicit ctx: Context): Option[Ident] = {
if (tree.symbol.eq(defn.StringContextModule)) Some(tree) else None
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline this extractor into StringContextApply


private object StringContextApply {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that this extractor is just a test and does not return any result (but the tree itself). You can return a Boolean instead:

private object StringContextApply {
  def unapply(tree: Select)(implicit ctx: Context): Boolean =
    tree.symbol.eq(defn.StringContextModule_apply) && {
      val qualifier = tree.qualifier
      qualifier.isInstanceOf[Ident] && qualifier.symbol.eq(efn.StringContextModule)
    }
}

def unapply(tree: Select)(implicit ctx: Context): Option[Select] = {
tree match {
case Select(StringContextIdent(_), _)
if tree.symbol.eq(defn.StringContextModule_apply) => Some(tree)
case _ => None
}
}
}

/** Matches an s or raw string interpolator */
private object SOrRawInterpolator {
def unapply(tree: Tree)(implicit ctx: Context): Option[(List[Literal], List[Tree])] = {
if (tree.symbol.eq(defn.StringContextRaw) || tree.symbol.eq(defn.StringContextS)) {
tree match {
case Apply(Select(Apply(StringContextApply(_), List(Literals(strs))), _),
List(SeqLiteral(elems, _))) if elems.length == strs.length - 1 =>
Some(strs, elems)
case _ => None
}
} else None
}
}

/**
* Match trees that resemble s and raw string interpolations. In the case of the s
* interpolator, escapes the string constants. Exposes the string constants as well as
* the variable references.
*/
private object StringContextIntrinsic {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment explaining what this extractor does

def unapply(tree: Apply)(implicit ctx: Context): Option[(List[Literal], List[Tree])] = {
tree match {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would first check if tree.symbol is StringContext.s or StringContext.raw for performance

case SOrRawInterpolator(strs, elems) =>
if (tree.symbol == defn.StringContextRaw) Some(strs, elems)
else { // tree.symbol == defn.StringContextS
try {
val escapedStrs = strs.map { str =>
val escapedValue = StringContext.processEscapes(str.const.stringValue)
cpy.Literal(str)(Constant(escapedValue))
}
Some(escapedStrs, elems)
} catch {
case _: StringContext.InvalidEscapeException => None
}
}
case _ => None
}
}
}

override def transformApply(tree: Apply)(implicit ctx: Context): Tree = {
tree match {
case StringContextIntrinsic(strs: List[Literal], elems: List[Tree]) =>
val stri = strs.iterator
val elemi = elems.iterator
var result: Tree = stri.next
def concat(tree: Tree): Unit = {
result = result.select(defn.String_+).appliedTo(tree)
}
while (elemi.hasNext) {
concat(elemi.next)
val str = stri.next
if (!str.const.stringValue.isEmpty) concat(str)
}
result
case _ => tree
}
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
package dotty.tools.backend.jvm

import org.junit.Assert._
import org.junit.Test

class StringInterpolatorOptTest extends DottyBytecodeTest {
import ASMConverters._

@Test def testRawInterpolator = {
val source =
"""
|class Foo {
| val one = 1
| val two = "two"
| val three = 3.0
|
| def meth1: String = raw"$one plus $two$three\n"
| def meth2: String = "" + one + " plus " + two + three + "\\n"
|}
""".stripMargin

checkBCode(source) { dir =>
val clsIn = dir.lookupName("Foo.class", directory = false).input
val clsNode = loadClassNode(clsIn)
val meth1 = getMethod(clsNode, "meth1")
val meth2 = getMethod(clsNode, "meth2")

val instructions1 = instructionsFromMethod(meth1)
val instructions2 = instructionsFromMethod(meth2)

assert(instructions1 == instructions2,
"the `` string interpolator incorrectly converts to string concatenation\n" +
diffInstructions(instructions1, instructions2))
}
}

@Test def testSInterpolator = {
val source =
"""
|class Foo {
| val one = 1
| val two = "two"
| val three = 3.0
|
| def meth1: String = s"$one plus $two$three\n"
| def meth2: String = "" + one + " plus " + two + three + "\n"
|}
""".stripMargin

checkBCode(source) { dir =>
val clsIn = dir.lookupName("Foo.class", directory = false).input
val clsNode = loadClassNode(clsIn)
val meth1 = getMethod(clsNode, "meth1")
val meth2 = getMethod(clsNode, "meth2")

val instructions1 = instructionsFromMethod(meth1)
val instructions2 = instructionsFromMethod(meth2)

assert(instructions1 == instructions2,
"the `s` string interpolator incorrectly converts to string concatenation\n" +
diffInstructions(instructions1, instructions2))
}
}
}
12 changes: 12 additions & 0 deletions tests/run/interpolation-opt.check
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
1 plus two\nis 3.0
1 plus two
is 3.0
a1two3.0b
a1two3.0b


Hello World
Side effect!
Foo Bar
Side effect n2!
Titi Toto
29 changes: 29 additions & 0 deletions tests/run/interpolation-opt.scala
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
object Test extends App {

val one = 1
val two = "two"
val three = 3.0

// Test escaping
println(raw"$one plus $two\nis $three")
println(s"$one plus $two\nis $three")

// Test empty strings between elements
println(raw"a$one$two${three}b")
println(s"a$one$two${three}b")

// Test empty string interpolators
println(raw"")
println(s"")

// Make sure that StringContext still works with idents
val foo = "Hello"
val bar = "World"
println(StringContext(foo, bar).s(" "))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test for:

def myStringContext= { println("Side effect!"); StringContext }
println(myStringContext(foo, bar).s(" ")) // this shouldn't be optimised away

I am not sure it is handled by your implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the test and looks like it handles it correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my example was erroneous. What I had is mind was:

println(myStringContext("Foo", "Bar").s(" ")) // this shouldn't be optimised away

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one too, should not be optimised:

println({ println("Side effect n2!"); StringContext }.apply("Titi", "Toto").s(" ")) // this shouldn't be optimised away

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests and a fix.


def myStringContext= { println("Side effect!"); StringContext }
println(myStringContext("Foo", "Bar").s(" ")) // this shouldn't be optimised away

// this shouldn't be optimised away
println({ println("Side effect n2!"); StringContext }.apply("Titi", "Toto").s(" "))
}