
Don't retain picklers until backend. #1725


Merged · 7 commits · Nov 22, 2016

Conversation

@odersky (Contributor) commented Nov 18, 2016

Changes to reduce memory footprint and improve statistic information.

Review by @DarkDimius ?

The memory footprint captured by the pickler seems to be about
one third of the total footprint, so we gain a lot by making this
data die sooner rather than later.
@DarkDimius (Contributor):

@odersky, this change would break Linker, as I need the TASTY not to be assembled before I finish adding information to it.

@DarkDimius (Contributor):

I don't think this is the source of the problem. We did not assemble TASTY until the backend from the very start. Unless TASTY has recently become a lot less compact than it used to be.

@odersky (Contributor, Author) commented Nov 18, 2016

Here is some more data. To reproduce:

  • set the enabled val in Stats to true.
  • compile with -Yheartbeat

We generate

  • 334K untyped trees
  • 967K typed trees at end of typer, of which 553K are retained (reachable from unit.tpdTree)
  • 3557K typed trees at end of run, of which 852K are retained

That's a lot of trees!

We also generate 5231K cacheable types. They are hash-consed into 1232K unique types. That's still a lot of unique types! We have to figure out what they are.

It seems the majority of the memory footprint at end of run is made up of these two components: retained trees and unique types.

@DarkDimius (Contributor):

To add to your data, around 1 year ago unassembled tasty was taking around
1/10th of total retained size of objects reachable through ctx.compilationUnit.


@odersky (Contributor, Author) commented Nov 18, 2016

More data: the junit tests with this commit now take 525s on my machine; before they took 556s. So it's a noticeable improvement: 6% to be exact.

@odersky (Contributor, Author) commented Nov 18, 2016

@DarkDimius The largest part of retained data is an IdentityHashMap which maps every tree node to an address. Since we have a lot of trees after typer, that's costing us. I believe we get about 25% reduction in memory footprint and 6% increase in speed by not retaining the picklers. Given these improvements I'd like to be able to do them.

How about we retain picklers only if linker is present? What is a good way to communicate that to the pickler?
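The shape of that per-tree map can be sketched as follows (a toy model with made-up names, not dotty's actual TreePickler):

```scala
import java.util.IdentityHashMap

// Toy model: the pickler keeps an identity map from every tree node to its
// TASTY address, so its size grows with the number of tree nodes produced
// by typer, not with the number of top-level classes.
class Tree

def addressMap(trees: List[Tree]): IdentityHashMap[Tree, Integer] = {
  val addrOfTree = new IdentityHashMap[Tree, Integer]()
  trees.zipWithIndex.foreach { case (t, addr) => addrOfTree.put(t, addr) }
  addrOfTree
}
// Retaining the pickler retains one entry per node; dropping it frees them all.
```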

@DarkDimius (Contributor):

How about we retain picklers only if linker is present? What is a good way to communicate that to the pickler?

Sounds good. I propose to add one more method on the Phase class, writesTasty: Boolean, and propagate it similarly to how we do with erasedTypes, flatClasses, etc.: https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Phases.scala#L333, but this one will accumulate in the other direction (it will go through next instead of prev).
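A sketch of that propagation (toy phases with simplified names, not dotty's actual Phases machinery): where erasedTypes-style flags accumulate from previous phases, writesTasty would accumulate from later ones, telling each phase whether picklers must still be retained when it runs.

```scala
// Toy model: a flag that accumulates through `next` rather than `prev`.
case class Phase(name: String, writesTasty: Boolean)

// For each phase: true iff this phase or any later phase still writes TASTY,
// i.e. whether the picklers must be kept alive when this phase runs.
def mustRetainPicklers(phases: List[Phase]): List[Boolean] =
  phases.scanRight(false)((p, later) => p.writesTasty || later).init

val pipeline = List(
  Phase("typer",   writesTasty = false),
  Phase("pickler", writesTasty = true),
  Phase("linker",  writesTasty = true),  // present only in Linker builds
  Phase("erasure", writesTasty = false)
)
```

With this pipeline, every phase up to and including linker must retain the picklers, and erasure onwards may drop them.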

@odersky (Contributor, Author) commented Nov 18, 2016

@DarkDimius Actually, when we run the tests we do use multiple runs, right? And these do leak. So that could explain the large amounts of memory. The Dotty bootstrap can be done comfortably in 1G, and in fact only has about 400MB of strongly reachable memory, so the baseline memory consumption cannot be the problem.

@DarkDimius (Contributor):

And these do leak.

Sorry, I didn't understand. What leaks? I don't think we leak CompilationUnits.

@smarter (Member) commented Nov 18, 2016

I don't think we leak CompilationUnits.

I think we do, via Context#compilationUnit and outer contexts. Also, I've been looking at the heap dump of #1636 (comment) and pretty much everything that consumes memory comes from the picklers.

@DarkDimius (Contributor) quoted the diff:

    unit.picklers += (cls -> pickler)
    if (ctx.settings.YtestPickler.value) {
      beforePickling(cls) = tree.show
      picklers(cls) = pickler
@smarter (Member):

I think this goes in the right direction, but in my heap dump I noticed that a massive amount of memory was taken by beforePickling, so this may not be enough to get the tests to run smoothly. It may be helpful to do beforePickling(cls) = null and picklers(cls) = null in testUnpickler to free up these resources.
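A minimal sketch of that suggestion (hypothetical, simplified bookkeeping; the real maps live in the Pickler phase):

```scala
import scala.collection.mutable

// Keep the printed tree only until testUnpickler has consumed it, then drop
// the entry so the map no longer pins the (potentially huge) string.
val beforePickling = mutable.HashMap.empty[String, String]

def testUnpickler(cls: String): String = {
  val printed = beforePickling(cls)
  // ... unpickle cls and compare its printout against `printed` ...
  beforePickling -= cls // release the retained resources
  printed
}
```

Removing the entry (rather than just overwriting it with null) lets the garbage collector reclaim both the key and the value once the comparison is done.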

@smarter (Member) commented Nov 18, 2016

@DarkDimius I was talking about leaks across runs (since we run some tests with #runs 2)

@DarkDimius (Contributor):

Yes, this would run after each junit test and ensure that at most one context survived.
If a test has multiple runs, it should ensure that contexts don't start accumulating.

@smarter (Member) commented Nov 18, 2016

But context escape detection is disabled: 53cd512

@odersky (Contributor, Author) commented Nov 18, 2016

I reverted 53cd512 to re-enable context leak detection. Unfortunately, I did not find a leak even though one was clearly present (as verified by Yourkit). The root of the leak was ProtoTypes.dummyTree. Because we do copyOnWrite, dummyTree.withTypeUnchecked stores a type in the dummyTree object and that type can refer back to the context root through lastDenotation.

Question: Can we fix context leak detection? How can we verify that it keeps working? It's the general problem of test infrastructure that itself goes untested.
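The leak pattern described here can be reduced to a few lines (toy types with hypothetical names, not the real Trees/ProtoTypes code): a globally cached "untyped" tree whose first withTypeUnchecked writes the type in place, so whatever the type references stays strongly reachable through the global.

```scala
// Toy reduction of the dummyTree leak: copy-on-write means the *first*
// type assignment mutates the shared instance instead of copying it.
final class Typ(val refersTo: AnyRef) // stands in for a Type holding a lastDenotation

final class Tree {
  private var myTpe: Typ = null
  def tpe: Typ = myTpe
  def withTypeUnchecked(t: Typ): Tree =
    if (myTpe == null) { myTpe = t; this }            // in place: a global Tree now pins t
    else { val c = new Tree; c.withTypeUnchecked(t) } // already typed: copy instead
}

object Globals { lazy val dummyTree = new Tree } // plays the role of ProtoTypes.dummyTree
```

Once dummyTree has been given a type that reaches a context base, that context base can never be collected, which is exactly the behavior observed in YourKit.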

Re-enable context-leak detection by reverting 53cd512.

But leak detection seems to be leaky itself :-)

@odersky (Contributor, Author) commented Nov 18, 2016

@DarkDimius I was wondering about the need to keep picklers. What exactly is written by the Linker? and at what phase? I am asking because the largest part of memory consumed by the picklers is the map from all tree nodes to their addresses. But as soon as you transform a single tree that map is outdated! So, can we throw it away in pickler?

The lazy val `dummyTree` acquires a type because of
copy-on-write and that type can refer via lastDenotation
to a context base.
@DarkDimius (Contributor):

The root of the leak was ProtoTypes.dummyTree

The contextEscape detection ensures that there is no more than one resident context; we used to have multiple places that each cached a single context. I wouldn't call this a leak.

@DarkDimius (Contributor):

I was wondering about the need to keep picklers. What exactly is written by the Linker? and at what phase?

I'm adding new top-level trees, not modifying existing ones, and after compactify has been called that is no longer possible. I'm doing this in the phase right after Pickler, and the approach where I call assembleParts there is perfectly fine with me.

`initInfo` was retained in Symbols. When called from `Namer`, `initInfo` referred to
a completer, which referred to a context.

With this space leak plugged, we can now compile core/Comments.scala (460 lines)
1000 times with -Xmx400M.

There still seems to be a space leak on the order of 200KB per run, though. But that seems
to have to do with symbols, not contexts.

@odersky (Contributor, Author) commented Nov 19, 2016

the contextEscape detection ensures that there's no more than 1 resident context, we used to have multiple places that cache a single context. I wouldn't call this a leak.

No, there were definitely two ContextBases alive. But it's true that dummyTree alone could only ever capture a single context.

The other space leak involving initInfo was worse. There we captured one context for each run.

Do we have an example where context escape detection detects an escaping context? If not we should disable it again. No use to get a false sense of security.

Previously only the FrontEnd got a fresh FreshNameCreator for
each run, the other phases used a global one instead. This means
that compiling the same file several times would create different
synthetic names and classes on each run.
@odersky (Contributor, Author) commented Nov 19, 2016

With the last commit, we can now compile Comments.scala 1000 times in 300M. There still seems to be a space leak but it's under 200K per run.

@odersky (Contributor, Author) commented Nov 19, 2016

I was wondering about the need to keep picklers. What exactly is written by the Linker? and at what phase?

I'm adding new top-level trees, not modifying existing ones, and after compactify has been called that is no longer possible. I'm doing this in the phase right after Pickler, and the approach where I call assembleParts there is perfectly fine with me.

Why not run the phase immediately before pickler then? The problem is, for maximum efficiency we really want to recycle the storage for the huge maps for each file. So we do not even want to keep them alive until the end of pickler, never mind the next phase.

@DarkDimius (Contributor):

Why not run the phase immediately before pickler then? The problem is, for maximum efficiency we really want to recycle the storage for the huge maps for each file. So we do not even want to keep them alive until the end of pickler, never mind the next phase.

Sounds like a good idea; I hadn't thought of that.

@smarter (Member) commented Nov 22, 2016

LGTM, merging because I'm still having memory issues in #1636 and I'd like to see if this helps.

@smarter merged commit 34d64f3 into scala:master on Nov 22, 2016.
@allanrenucci deleted the change-pickle-early branch on December 14, 2017.