
Performance issues with match type reduction #12267


Closed · odersky opened this issue Apr 29, 2021 · 11 comments

odersky (Contributor) commented Apr 29, 2021

#12153 introduced a performance regression for the stdlib test. I tracked it down to attempted match type reductions that seem very slow.

I saw that overall there were 566 attempts at match type reduction, presumably when compiling Tuple.scala (and maybe also Tuples.scala). Without the change to translucentSuperType there were none. But the change is necessary, and 566 does not seem like an excessive number. Nevertheless, these reduction attempts (all of which fail in the end) seem responsible for the significant slowdown of stdlib compilation as a whole.

We should get to the bottom of this. As a first step, we should try to get a flamegraph or other profile just for compiling Tuple.scala and Tuples.scala. Once we know where the slowdown happens we should see what we can do to optimize.
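For reference, the reductions in question presumably involve the match types declared in Tuple.scala. Here is a minimal sketch (simplified, not the exact stdlib source) of such a definition, and of a use that forces the compiler to attempt a reduction:

```scala
// Simplified sketch of the kind of match type declared in Tuple.scala.
// Reducing Concat[X, Y] makes the compiler match X against the cases below,
// recursively; this is the work behind the reduction attempts counted above.
type Concat[X <: Tuple, Y <: Tuple] <: Tuple = X match
  case EmptyTuple => Y
  case h *: t     => h *: Concat[t, Y]

// Type-checking this value forces the reduction
// Concat[(Int, String), Boolean *: EmptyTuple]  ==  (Int, String, Boolean).
val abc: Concat[(Int, String), Boolean *: EmptyTuple] = (1, "a", true)
```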

liufengyun (Contributor) commented Apr 30, 2021

WIP update:

- Flamegraph for compiling stdlib
- Compile times after warmup (on my local machine):

| compiled sources           |    before |     after |
|----------------------------|-----------|-----------|
| stdLib213                  | 11804.010 | 18015.115 |
| Tuple.scala + Tuples.scala |   478.229 |   492.295 |
| Tuples.scala               |   378.240 |   424.442 |
| Tuple.scala                |   171.700 |   170.791 |

odersky (Contributor, Author) commented Apr 30, 2021

So it's clearly not Tuple.scala where the tryMatchAlias calls are made. Questions to ask: for which files are they made, and why do they take so much time? I instrumented everything else; it's really just those ~600 calls to tryMatchAlias that make all the difference. Nothing else is invoked at all.

liufengyun (Contributor) commented Apr 30, 2021

From the data we have, it seems that the slowdown comes from the exhaustivity check (mostly the redundancy check). If we disable the exhaustivity check, the numbers go back to what they were before.

The current flamegraph does not show enough stack depth to see where it reaches MatchType.tryNormalize. I'm collecting new data with a higher sampling frequency to see if we can get more information about the call stack.

odersky (Contributor, Author) commented Apr 30, 2021

What I am seeing is:

- the added tryMatchAlias calls are all made when compiling the individual Tuple classes Tuple1 to Tuple22.
- at least some of the calls arise from checkAllOverrides, and I suspect that the class being investigated there is Tuple (in Tuple.scala), which defines a lot of match types and functions taking match types.

This seems to be a very specific case. The only classes inheriting from Tuple are Tuple1 to Tuple22, and a slowdown for them is not indicative of a slowdown elsewhere. On the other hand, we should verify whether there is some bottleneck in match type reduction. Such a big slowdown for ~600 calls looks excessive.
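To make the checkAllOverrides connection concrete, here is a minimal, self-contained sketch of the pattern (the names are made up, not the actual Tuple sources): a parent declares a member whose signature mentions a match type, so the override check for every subclass has to compare member types that can only be related by attempting match type reduction.

```scala
// Hypothetical reduction of the pattern (not scala.Tuple itself): the parent
// exposes a member typed with a match type, so RefChecks' checkAllOverrides
// runs frozen subtype checks involving that match type for every subclass.
trait TupleLike:
  type Head[X <: Tuple] = X match
    case h *: t => h
  def head[X <: NonEmptyTuple](x: X): Head[X]

// Each concrete subclass (think Tuple1 .. Tuple22) triggers an override check
// against TupleLike, and with it the match type reduction attempts.
class Pair extends TupleLike:
  def head[X <: NonEmptyTuple](x: X): Head[X] = ??? // body irrelevant here
```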

odersky (Contributor, Author) commented Apr 30, 2021

I verified that all calls to tryMatchAlias originate from RefChecks, with a stack trace like this:

at dotty.tools.dotc.core.TypeComparer$.isSubTypeWhenFrozen(TypeComparer.scala:2619)
at dotty.tools.dotc.core.Types$Type.frozen_$less$colon$less(Types.scala:1026)
at dotty.tools.dotc.core.Types$Type.overrides(Types.scala:1067)
at dotty.tools.dotc.typer.RefChecks$.checkMemberTypesOK$4$$anonfun$3$$anonfun$3(RefChecks.scala:775)
at scala.runtime.function.JProcedure1.apply(JProcedure1.java:15)
at scala.runtime.function.JProcedure1.apply(JProcedure1.java:10)
at scala.collection.immutable.List.foreach(List.scala:333)
at dotty.tools.dotc.typer.RefChecks$.checkMemberTypesOK$5$$anonfun$4(RefChecks.scala:781)
at scala.runtime.function.JProcedure1.apply(JProcedure1.java:15)
at scala.runtime.function.JProcedure1.apply(JProcedure1.java:10)
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:553)
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:551)
at dotty.tools.dotc.util.HashSet$EntryIterator.foreach(HashSet.scala:162)
at dotty.tools.dotc.util.ReadOnlySet.foreach(ReadOnlySet.scala:16)
at dotty.tools.dotc.typer.RefChecks$.checkMemberTypesOK$1(RefChecks.scala:781)
at dotty.tools.dotc.typer.RefChecks$.dotty$tools$dotc$typer$RefChecks$$$checkAllOverrides(RefChecks.scala:801)
at dotty.tools.dotc.typer.RefChecks.transformTemplate(RefChecks.scala:1225)

So I don't think exhaustivity was to blame for the original slowdown (but I noticed a further slowdown on top of the first one, which might well be due to it).

liufengyun (Contributor) commented Apr 30, 2021

> So I don't think exhaustivity was to blame for the original slowdown (but I noticed a further slowdown on top of the first one, which might well be due to it).

I agree. The exhaustivity check becomes much more expensive after #12153. In the flamegraph, if you search for patmat., a large part is purple; in the flamegraph before the change, it's hardly visible.

odersky (Contributor, Author) commented Apr 30, 2021

#12300 gives some more data. @liufengyun can you try to dig deeper on this?

liufengyun (Contributor) commented:

> #12300 gives some more data. @liufengyun can you try to dig deeper on this?

Thanks for the tips, I'll conduct a deeper analysis of the problem.

liufengyun (Contributor) commented:

[WIP Update]

Before #12153, maxConstraint.size is 11. After #12153, it is 288:

--- a/compiler/src/dotty/tools/dotc/core/ConstraintRunInfo.scala
+++ b/compiler/src/dotty/tools/dotc/core/ConstraintRunInfo.scala
@@ -8,6 +8,7 @@ trait ConstraintRunInfo { self: Run =>
   private var maxSize = 0
   private var maxConstraint: Constraint = _
   def recordConstraintSize(c: Constraint, size: Int): Unit =
+    if size > 100 then println("Constraint.size = " + size)
     if (size > maxSize) {
       maxSize = size
       maxConstraint = c

And we have 4,638,553 recorded constraints with size > 100; most of them are around size 288. That also explains the different behavior when searching for OrderingConstraint in the flamegraphs.
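As a side note, here is a sketch of how that count could be collected with the same hook shown in the diff above; the oversizedCount counter is an assumption of mine, not part of the actual instrumentation:

```scala
// Sketch only: ConstraintRunInfo as shown in the diff above, extended with a
// counter for constraints above the threshold. oversizedCount is hypothetical;
// the surrounding trait lives in dotty.tools.dotc.core.
trait ConstraintRunInfo { self: Run =>
  private var maxSize = 0
  private var maxConstraint: Constraint = _
  private var oversizedCount = 0                  // hypothetical counter
  def recordConstraintSize(c: Constraint, size: Int): Unit =
    if size > 100 then oversizedCount += 1        // ~4.6M hits when compiling stdlib
    if (size > maxSize) {
      maxSize = size
      maxConstraint = c
    }
}
```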

liufengyun (Contributor) commented May 3, 2021

The root cause of the performance regression is that refchecks produces a large constraint which is then propagated to all later phases. A fix is proposed in a7ff067.

contraint.size = 0 before phase typer
contraint.size = 1 after phase typer
contraint.size = 1 before phase inlinedPositions
contraint.size = 1 after phase inlinedPositions
contraint.size = 1 before phase posttyper
contraint.size = 1 after phase posttyper
contraint.size = 1 before phase pickler
contraint.size = 1 after phase pickler
contraint.size = 1 before phase inlining
contraint.size = 1 after phase inlining
contraint.size = 1 before phase postInlining
contraint.size = 1 after phase postInlining
contraint.size = 1 before phase staging
contraint.size = 1 after phase staging
contraint.size = 1 before phase pickleQuotes
contraint.size = 1 after phase pickleQuotes
contraint.size = 1 before phase MegaPhase{firstTransform, checkReentrant, elimPackagePrefixes, cookComments, checkStatic, betaReduce, inlineVals, expandSAMs}
contraint.size = 1 after phase MegaPhase{firstTransform, checkReentrant, elimPackagePrefixes, cookComments, checkStatic, betaReduce, inlineVals, expandSAMs}
contraint.size = 1 before phase MegaPhase{elimRepeated, protectedAccessors, extmethods, uncacheGivenAliases, byNameClosures, hoistSuperArgs, specializeApplyMethods, refchecks}
contraint.size = 287 after phase MegaPhase{elimRepeated, protectedAccessors, extmethods, uncacheGivenAliases, byNameClosures, hoistSuperArgs, specializeApplyMethods, refchecks}
contraint.size = 287 before phase MegaPhase{elimOpaque, tryCatchPatterns, patternMatcher, explicitOuter, explicitSelf, elimByName, stringInterpolatorOpt}
contraint.size = 287 after phase MegaPhase{elimOpaque, tryCatchPatterns, patternMatcher, explicitOuter, explicitSelf, elimByName, stringInterpolatorOpt}
contraint.size = 287 before phase MegaPhase{pruneErasedDefs, uninitializedDefs, inlinePatterns, vcInlineMethods, seqLiterals, intercepted, getters, specializeFunctions, liftTry, collectNullableFields, elimOuterSelect, resolveSuper, functionXXLForwarders, paramForwarding, genericTuples, letOverApply, arrayConstructors}
contraint.size = 287 after phase MegaPhase{pruneErasedDefs, uninitializedDefs, inlinePatterns, vcInlineMethods, seqLiterals, intercepted, getters, specializeFunctions, liftTry, collectNullableFields, elimOuterSelect, resolveSuper, functionXXLForwarders, paramForwarding, genericTuples, letOverApply, arrayConstructors}
contraint.size = 287 before phase erasure
contraint.size = 287 after phase erasure
contraint.size = 287 before phase MegaPhase{elimErasedValueType, pureStats, vcElideAllocations, arrayApply, elimPolyFunction, tailrec, completeJavaEnums, mixin, lazyVals, memoize, nonLocalReturns, capturedVars}
contraint.size = 287 after phase MegaPhase{elimErasedValueType, pureStats, vcElideAllocations, arrayApply, elimPolyFunction, tailrec, completeJavaEnums, mixin, lazyVals, memoize, nonLocalReturns, capturedVars}
contraint.size = 287 before phase constructors
contraint.size = 287 after phase constructors
contraint.size = 287 before phase MegaPhase{lambdaLift, elimStaticThis, countOuterAccesses}
contraint.size = 287 after phase MegaPhase{lambdaLift, elimStaticThis, countOuterAccesses}
contraint.size = 287 before phase MegaPhase{dropOuterAccessors, flatten, renameLifted, transformWildcards, moveStatic, expandPrivate, restoreScopes, selectStatic, collectSuperCalls, repeatableAnnotations}
contraint.size = 287 after phase MegaPhase{dropOuterAccessors, flatten, renameLifted, transformWildcards, moveStatic, expandPrivate, restoreScopes, selectStatic, collectSuperCalls, repeatableAnnotations}
contraint.size = 287 before phase genBCode
contraint.size = 287 after phase genBCode

liufengyun added a commit to dotty-staging/dotty that referenced this issue (May 3, 2021): "To enforce the invariant, we need to ensure that tvars are instantiated at end of phases."

liufengyun (Contributor) commented:

Fixed in #12333.
