Use explicit destinations in codegen to avoid uselessly jumping around. #14890

sjrd · 2022-04-08T11:03:43Z

~~CI for now. This needs some cleanup and better tests that the bytecode is indeed improved in a number of situations. (notably with pattern matching)~~

Previously, the codegen's main method genLoad always generated code that loaded the value on the stack before continuing. There were a number of situations where genLoad would be directly followed by unconditional jumps to instructions performing more jumps, returns and throws. This generated more spurious jumps than necessary, along with artifact dead code.

We solve these limitations by introducing LoadDestinations that specify the destination of a loaded value:

* FallThrough: as previously, load the value on the stack and continue.
* Jump(label): load the value on the stack and jump to the given label.
* Return: return the value from the enclosing method.
* Throw: throw the value.

We generalize genLoad as genLoadTo, taking a specific destination for the loaded value. genLoadTo can "push down" its destination into all control flow structures (except Trys, because of their cleanups). With that, when we get to the end of what amounts to "basic blocks", we know exactly the ultimate destination of the loaded value. We can therefore directly jump, return or throw to the final destination.
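The push-down can be illustrated with a minimal sketch on a toy expression language. This is not the code in BCodeBodyBuilder: `Expr`, `ToyCodegen`, `compile` and `absBody` are hypothetical names introduced here, the emitted "bytecode" is a list of strings, and only Int values are modeled. Only the four destination names and the `genLoad`/`genLoadTo` split come from the description above.

```scala
// Toy model only: illustrative types, not the compiler's own definitions.
enum LoadDestination:
  case FallThrough          // leave the value on the stack and continue
  case Jump(label: String)  // leave the value on the stack and jump
  case Return               // return the value from the enclosing method
  case Throw                // throw the value

enum Expr:
  case Local(index: Int)    // an Int local variable
  case Const(value: Int)
  case Neg(operand: Expr)
  // if lhs < rhs then thenp else elsep
  case IfLess(lhs: Expr, rhs: Expr, thenp: Expr, elsep: Expr)

final class ToyCodegen:
  import LoadDestination.*, Expr.*

  private val out = scala.collection.mutable.ListBuffer.empty[String]
  private var labelCount = 0

  private def freshLabel(): String =
    labelCount += 1
    s"Label($labelCount)"

  private def emit(instr: String): Unit = out += instr

  def result: List[String] = out.toList

  /** The old entry point is the special case with a FallThrough destination. */
  def genLoad(e: Expr): Unit = genLoadTo(e, FallThrough)

  def genLoadTo(e: Expr, dest: LoadDestination): Unit = e match
    case Local(i) => emit(s"ILOAD $i"); finish(dest)
    case Const(v) =>
      emit(if v >= 0 && v <= 5 then s"ICONST_$v" else s"LDC $v")
      finish(dest)
    case Neg(op) => genLoad(op); emit("INEG"); finish(dest)
    case IfLess(lhs, rhs, thenp, elsep) =>
      val elseLabel = freshLabel()
      genLoad(lhs)
      genLoad(rhs)
      emit(s"IF_ICMPGE $elseLabel")
      dest match
        case FallThrough =>
          // Both branches must meet again, so a join label is needed.
          val join = freshLabel()
          genLoadTo(thenp, Jump(join))
          emit(s"$elseLabel:")
          genLoadTo(elsep, FallThrough)
          emit(s"$join:")
        case _ =>
          // Push the destination down: each branch ends the flow itself.
          genLoadTo(thenp, dest)
          emit(s"$elseLabel:")
          genLoadTo(elsep, dest)

  /** Emitted at the end of a "basic block", once the value is on the stack. */
  private def finish(dest: LoadDestination): Unit = dest match
    case FallThrough => ()
    case Jump(label) => emit(s"GOTO $label")
    case Return      => emit("IRETURN")
    case Throw       => emit("ATHROW") // for completeness; the toy has no throwable values

def compile(e: Expr, dest: LoadDestination): List[String] =
  val gen = ToyCodegen()
  gen.genLoadTo(e, dest)
  gen.result

// Body of `def abs(x: Int): Int = if x < 0 then -x else x`, with x in local slot 1.
val absBody: Expr =
  Expr.IfLess(Expr.Local(1), Expr.Const(0), Expr.Neg(Expr.Local(1)), Expr.Local(1))
```

In this sketch, `compile(absBody, LoadDestination.FallThrough) :+ "IRETURN"` models the previous scheme (load, then return), while `compile(absBody, LoadDestination.Return)` models the new one, which needs no join label and no GOTO.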

This produces less bytecode, notably because fewer labels are necessary. For example, the method:

  def abs(x: Int): Int = if x < 0 then -x else x

previously generated bytecode like

  ILOAD 1
  ICONST_0
  IF_ICMPGE Label(1)
  ILOAD 1
  INEG
  GOTO Label(2)
  Label(1):
  ILOAD 1
  Label(2):
  IRETURN

Now, instead of jumping to Label(2), we directly perform an IRETURN:

  ILOAD 1
  ICONST_0
  IF_ICMPGE Label(1)
  ILOAD 1
  INEG
  IRETURN
  Label(1):
  ILOAD 1
  IRETURN

While the changes are not very impressive on that simple example, they become more important in more complex cases, notably with pattern matching. Examples can be found in the changed bytecode tests.

An added benefit is that genLoadTo knows when loading a value results in an unconditional control flow change (jump, return or throw). It can then avoid inserting any useless adaptation. This removes all the dead bytecode that the codegen used to generate as artifacts of its own compilation scheme. (It will still generate dead bytecode if the original source code/inlined code contains dead code.)
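As a rough illustration of the dead-code point, here is a small self-contained sketch. It reads "adaptation" loosely as whatever code would normally be emitted after the load to reach the destination; the names `Dest`, `Tree`, `naive`, `aware` and `reach` are hypothetical and do not appear in the compiler.

```scala
// Hypothetical toy model, not the compiler's data structures.
enum Dest:
  case FallThrough
  case Jump(label: String)

enum Tree:
  case IntLit(value: Int)
  case ReturnOf(expr: Tree) // an explicit `return expr` in the source

import Dest.*, Tree.*

/** Code that takes an already loaded value to its destination. */
def reach(dest: Dest): List[String] = dest match
  case FallThrough => Nil
  case Jump(label) => List(s"GOTO $label")

/** Naive scheme: always load, then always emit the code reaching `dest`. */
def naive(t: Tree, dest: Dest): List[String] =
  val load = t match
    case IntLit(v)   => List(s"LDC $v")
    case ReturnOf(e) => naive(e, FallThrough) :+ "IRETURN"
  load ++ reach(dest) // after an IRETURN this trailing code can never run

/** Destination-aware scheme: a tree that already left the method gets no tail. */
def aware(t: Tree, dest: Dest): List[String] = t match
  case IntLit(v)   => s"LDC $v" :: reach(dest)
  case ReturnOf(e) => aware(e, FallThrough) :+ "IRETURN" // unconditional: skip reach(dest)
```

For `ReturnOf(IntLit(7))` with destination `Jump("L1")`, `naive` ends with an unreachable `GOTO L1` after the `IRETURN`, whereas `aware` stops at the `IRETURN`.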

sjrd · 2022-04-13T09:54:38Z

compiler/src/dotty/tools/backend/jvm/BCodeBodyBuilder.scala

-          case Labeled(bind, expr) if tpeTK(body) == UNIT =>
-            // this is the shape of tailrec methods
-            val loop = programPoint(bind.symbol)
-            markProgramPoint(loop)
-            genLoad(expr, UNIT)
-            bc goTo loop


This optimization is now performed "by construction" of the generic LoadDestination infrastructure. :)

sjrd · 2022-04-13T09:57:07Z

compiler/src/dotty/tools/backend/jvm/BCodeSkelBuilder.scala

-            case (_: Return) | Block(_, (_: Return)) => ()
-            case (_: Apply) | Block(_, (_: Apply)) if trimmedRhs.symbol eq defn.throwMethod => ()


These two optimizations are also taken care of by the general LoadDestination infrastructure.

sjrd · 2022-04-27T13:11:54Z

Ping @lrytz ?

lrytz

This is great, definitely worth backporting to Scala 2! Not that trivial, but LGTM, I couldn't spot any mistakes.

compiler/src/dotty/tools/backend/jvm/BCodeBodyBuilder.scala

sjrd self-assigned this Apr 8, 2022

sjrd force-pushed the codegen-destinations branch 3 times, most recently from 75c1f23 to d86e68b Compare April 13, 2022 07:30

sjrd added 2 commits April 13, 2022 10:49

Add bytecode tests with the status quo of codegen control flow.

f50f72f

sjrd force-pushed the codegen-destinations branch from d86e68b to 4a2889f Compare April 13, 2022 09:45

sjrd changed the title ~~WiP Use explicit destinations in codegen to avoid uselessly jumping around.~~ Use explicit destinations in codegen to avoid uselessly jumping around. Apr 13, 2022

sjrd marked this pull request as ready for review April 13, 2022 09:47

sjrd requested a review from lrytz April 13, 2022 09:47

sjrd assigned lrytz and unassigned sjrd Apr 13, 2022

sjrd commented Apr 13, 2022

lrytz approved these changes Apr 29, 2022

sjrd merged commit 949c704 into scala:main May 2, 2022

sjrd deleted the codegen-destinations branch May 2, 2022 08:33

sjrd mentioned this pull request May 2, 2022

Use explicit destinations in codegen to avoid uselessly jumping around. scala/scala#10022

Merged

smarter mentioned this pull request Jul 5, 2022

Inlining an extension method produces bad bytecode #15585

Open

Kordyjan added this to the 3.2.0 milestone Aug 2, 2023

sjrd mentioned this pull request Feb 19, 2024

Potentially unnecessary athrow is emitted when return is inside a nested block #5064

Closed
