
Don't retain picklers until backend. #1725


Merged · 7 commits · Nov 22, 2016

Conversation

@odersky (Contributor) commented Nov 18, 2016

Changes to reduce memory footprint and improve statistic information.

Review by @DarkDimius ?

The memory footprint captured by the pickler seems to be about
one third of the total footprint, so we gain a lot by making this
data die sooner rather than later.
@DarkDimius (Contributor):

@odersky, this change would break Linker, as I need the TASTY not to be assembled before I finish adding information to it.

@DarkDimius (Contributor):

I don't think this is the source of the problem. We did not assemble TASTY until the backend from the very start. Unless TASTY has recently become a lot less compact than it used to be.

@odersky (Contributor, Author) commented Nov 18, 2016

Here is some more data. To reproduce:

  • set the enabled val in Stats to true.
  • compile with -Yheartbeat

We generate

  • 334K untyped trees
  • 967K typed trees at end of typer, of which 553K are retained (reachable from unit.tpdTree)
  • 3557K typed trees at end of run, of which 852K are retained

That's a lot of trees!

We also generate 5231K cacheable types. They are hash-consed into 1232K unique types. That's still a lot of unique types! We have to figure out what they are.

It seems the majority of the memory footprint at end of run is made up of these two components: retained trees and unique types.

@DarkDimius (Contributor):

To add to your data, around 1 year ago unassembled tasty was taking around
1/10th of total retained size of objects reachable through ctx.compilationUnit.


@odersky (Contributor, Author) commented Nov 18, 2016

More data: the junit tests with this commit now take 525s on my machine; before they took 556s. So it's a noticeable improvement: 6% to be exact.

@odersky (Contributor, Author) commented Nov 18, 2016

@DarkDimius The largest part of retained data is an IdentityHashMap which maps every tree node to an address. Since we have a lot of trees after typer, that's costing us. I believe we get about 25% reduction in memory footprint and 6% increase in speed by not retaining the picklers. Given these improvements I'd like to be able to do them.

How about we retain picklers only if linker is present? What is a good way to communicate that to the pickler?
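The shape of that per-tree map can be sketched as follows (a toy model with made-up names, not dotty's actual TreePickler):

```scala
import java.util.IdentityHashMap

// Toy model: the pickler keeps an identity map from every tree node to its
// TASTY address, so its size grows with the number of tree nodes produced
// by typer, not with the number of top-level classes.
class Tree

def addressMap(trees: List[Tree]): IdentityHashMap[Tree, Integer] = {
  val addrOfTree = new IdentityHashMap[Tree, Integer]()
  trees.zipWithIndex.foreach { case (t, addr) => addrOfTree.put(t, addr) }
  addrOfTree
}
// Retaining the pickler retains one entry per node; dropping it frees them all.
```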

@DarkDimius (Contributor):

How about we retain picklers only if linker is present? What is a good way to communicate that to the pickler?

Sounds good. I propose to add one more method on the Phase class, writesTasty: Boolean, and propagate it similarly to how we do with erasedTypes, flatClasses, etc.: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Phases.scala#L333, but this one will accumulate in the other direction (it will go through next instead of prev).
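A sketch of that propagation (toy phases with simplified names, not dotty's actual Phases machinery): where erasedTypes-style flags accumulate from previous phases, writesTasty would accumulate from later ones, telling each phase whether picklers must still be retained when it runs.

```scala
// Toy model: a flag that accumulates through `next` rather than `prev`.
case class Phase(name: String, writesTasty: Boolean)

// For each phase: true iff this phase or any later phase still writes TASTY,
// i.e. whether the picklers must be kept alive when this phase runs.
def mustRetainPicklers(phases: List[Phase]): List[Boolean] =
  phases.scanRight(false)((p, later) => p.writesTasty || later).init

val pipeline = List(
  Phase("typer",   writesTasty = false),
  Phase("pickler", writesTasty = true),
  Phase("linker",  writesTasty = true),  // present only in Linker builds
  Phase("erasure", writesTasty = false)
)
```

With this pipeline, every phase up to and including linker must retain the picklers, and erasure onwards may drop them.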

@odersky (Contributor, Author) commented Nov 18, 2016

@DarkDimius Actually, when we run the tests we do use multiple runs, right? And these do leak. So that could explain the large amounts of memory. The Dotty bootstrap can be done comfortably in 1G, and in fact only has about 400MB of strongly reachable memory, so the baseline memory consumption cannot be the problem.

@DarkDimius (Contributor):

And these do leak.

Sorry, I didn't understand. What leaks? I don't think we leak CompilationUnits.

@smarter (Member) commented Nov 18, 2016

I don't think we leak CompilationUnits.

I think we do, via Context#compilationUnit and outer contexts. Also, I've been looking at the heap dump of #1636 (comment) and pretty much everything that consumes memory comes from the picklers.

@DarkDimius (Contributor) quoted the diff:

    unit.picklers += (cls -> pickler)
    if (ctx.settings.YtestPickler.value) {
      beforePickling(cls) = tree.show
      picklers(cls) = pickler
@smarter (Member):

I think this goes in the right direction, but in my heap dump I noticed that a massive amount of memory was taken by beforePickling, so this may not be enough to get the tests to run smoothly. It may be helpful to do beforePickling(cls) = null and picklers(cls) = null in testUnpickler to free up these resources.
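A minimal sketch of that suggestion (hypothetical, simplified bookkeeping; the real maps live in the Pickler phase):

```scala
import scala.collection.mutable

// Keep the printed tree only until testUnpickler has consumed it, then drop
// the entry so the map no longer pins the (potentially huge) string.
val beforePickling = mutable.HashMap.empty[String, String]

def testUnpickler(cls: String): String = {
  val printed = beforePickling(cls)
  // ... unpickle cls and compare its printout against `printed` ...
  beforePickling -= cls // release the retained resources
  printed
}
```

Removing the entry (rather than just overwriting it with null) lets the garbage collector reclaim both the key and the value once the comparison is done.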

@smarter (Member) commented Nov 18, 2016

@DarkDimius I was talking about leaks across runs (since we run some tests with #runs 2)

@DarkDimius (Contributor):

Yes, this would run after each junit test and ensure that at most one context survived.
If a test has multiple runs, it should ensure that contexts don't start accumulating.

@smarter (Member) commented Nov 18, 2016

But context escape detection is disabled: 53cd512

@odersky (Contributor, Author) commented Nov 18, 2016

I reverted 53cd512 to re-enable context leak detection. Unfortunately, I did not find a leak even though one was clearly present (as verified by Yourkit). The root of the leak was ProtoTypes.dummyTree. Because we do copyOnWrite, dummyTree.withTypeUnchecked stores a type in the dummyTree object and that type can refer back to the context root through lastDenotation.

Question: Can we fix context leak detection? How can we verify that it keeps working? It's the general problem of test infrastructure that itself goes untested.
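The leak pattern described here can be reduced to a few lines (toy types with hypothetical names, not the real Trees/ProtoTypes code): a globally cached "untyped" tree whose first withTypeUnchecked writes the type in place, so whatever the type references stays strongly reachable through the global.

```scala
// Toy reduction of the dummyTree leak: copy-on-write means the *first*
// type assignment mutates the shared instance instead of copying it.
final class Typ(val refersTo: AnyRef) // stands in for a Type holding a lastDenotation

final class Tree {
  private var myTpe: Typ = null
  def tpe: Typ = myTpe
  def withTypeUnchecked(t: Typ): Tree =
    if (myTpe == null) { myTpe = t; this }            // in place: a global Tree now pins t
    else { val c = new Tree; c.withTypeUnchecked(t) } // already typed: copy instead
}

object Globals { lazy val dummyTree = new Tree } // plays the role of ProtoTypes.dummyTree
```

Once dummyTree has been given a type that reaches a context base, that context base can never be collected, which is exactly the behavior observed in YourKit.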

Re-enable context-leak detection by reverting 53cd512.

But leak detection seems to be leaky itself :-)

@odersky (Contributor, Author) commented Nov 18, 2016

@DarkDimius I was wondering about the need to keep picklers. What exactly is written by the Linker? and at what phase? I am asking because the largest part of memory consumed by the picklers is the map from all tree nodes to their addresses. But as soon as you transform a single tree that map is outdated! So, can we throw it away in pickler?

The lazy val `dummyTree` acquires a type because of
copy-on-write and that type can refer via lastDenotation
to a context base.
@DarkDimius (Contributor):

The root of the leak was ProtoTypes.dummyTree

The contextEscape detection ensures that there is no more than one resident context; we used to have multiple places that each cached a single context. I wouldn't call this a leak.

@DarkDimius (Contributor):

I was wondering about the need to keep picklers. What exactly is written by the Linker? and at what phase?

I'm adding new top-level trees, not modifying existing ones, and after compactify has been called that is no longer possible. I'm doing this in the phase right after Pickler, and the approach where I call assembleParts there is perfectly fine with me.

`initInfo` was retained in Symbols. When called from `Namer`, `initInfo` referred to
a completer, which referred to a context.

With this space leak plugged, we can now compile core/Comments.scala (460 lines)
1000 times with -Xmx400M.

There still seems to be a space leak on the order of 200KB per run, though. But that seems
to have to do with symbols, not contexts.

@odersky (Contributor, Author) commented Nov 19, 2016

the contextEscape detection ensures that there's no more than 1 resident context, we used to have multiple places that cache a single context. I wouldn't call this a leak.

No, there were definitely two ContextBases alive. But it's true that dummyTree alone could only ever capture a single context.

The other space leak involving initInfo was worse. There we captured one context for each run.

Do we have an example where context escape detection detects an escaping context? If not we should disable it again. No use to get a false sense of security.

Previously only the FrontEnd got a fresh FreshNameCreator for
each run, the other phases used a global one instead. This means
that compiling the same file several times would create different
synthetic names and classes on each run.
@odersky (Contributor, Author) commented Nov 19, 2016

With the last commit, we can now compile Comments.scala 1000 times in 300M. There still seems to be a space leak but it's under 200K per run.

@odersky (Contributor, Author) commented Nov 19, 2016

I was wondering about the need to keep picklers. What exactly is written by the Linker? and at what phase?

I'm adding new top-level trees, not modifying existing ones, and after compactify has been called that is no longer possible. I'm doing this in the phase right after Pickler, and the approach where I call assembleParts there is perfectly fine with me.

Why not run the phase immediately before pickler then? The problem is, for maximum efficiency we really want to recycle the storage for the huge maps for each file. So we do not even want to keep them alive until the end of pickler, never mind the next phase.

@DarkDimius (Contributor):

Why not run the phase immediately before pickler then? The problem is, for maximum efficiency we really want to recycle the storage for the huge maps for each file. So we do not even want to keep them alive until the end of pickler, never mind the next phase.

Sounds like a good idea; I hadn't thought of that.

@smarter (Member) commented Nov 22, 2016

LGTM, merging because I'm still having memory issues in #1636 and I'd like to see if this helps.

@smarter merged commit 34d64f3 into scala:master on Nov 22, 2016.
@allanrenucci deleted the change-pickle-early branch on December 14, 2017.