Don't retain picklers until backend. #1725
Conversation
The memory footprint captured by the pickler seems to be about 1/3 of the total footprint, so we gain a lot by making it die sooner rather than later.
@odersky, this change would break the Linker, as I need the TASTY to not be assembled before I finish adding information to it.
I don't think this is the source of the problem. We did not assemble TASTY until the backend from the very start, unless TASTY recently became a lot less compact than it used to be.
Here is some more data. To reproduce:
We generate
That's a lot of trees! We also generate 5231K cacheable types. They are hash-consed into 1232K unique types. That's still a lot of unique types! I have to figure out what they are. It seems the majority of the memory footprint at the end of a run is made up of these two components: retained trees and unique types.
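A minimal sketch of the hash-consing idea behind those unique-type numbers (the names and types below are illustrative stand-ins, not dotc's actual uniques machinery): many structurally equal type requests collapse into one canonical instance.

```scala
import scala.collection.mutable

// Illustrative only: structurally equal types are stored once, so 5231K
// cacheable type requests can collapse into 1232K unique instances.
sealed trait Type
case class AppliedType(tycon: String, args: List[Type]) extends Type

object UniqueTypes {
  private val uniques = mutable.HashMap.empty[Type, Type]

  // Return the canonical instance for a structurally equal type.
  def unique(tp: Type): Type = uniques.getOrElseUpdate(tp, tp)
}
```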
To add to your data, around one year ago unassembled TASTY was taking around
More data: the junit tests with this commit now take 525s on my machine; before, they took 556s. So it's a noticeable improvement of about 6%.
@DarkDimius The largest part of the retained data is an IdentityHashMap which maps every tree node to an address. Since we have a lot of trees after typer, that's costing us. I believe we get about a 25% reduction in memory footprint and a 6% increase in speed by not retaining the picklers. Given these improvements, I'd like to be able to do them. How about we retain picklers only if the linker is present? What is a good way to communicate that to the pickler?
Sounds good. I propose to add one more method in the Phase class.
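A rough sketch of what such a hook could look like, assuming a boolean query on Phase that the pickler consults; the name `retainsPicklers` and the wiring are hypothetical, not the actual dotc API:

```scala
// Hypothetical: a Phase-level flag that Pickler can query to decide whether
// the per-class picklers must stay alive for a later phase such as Linker.
abstract class Phase {
  def phaseName: String

  /** Does this phase need the picklers built by Pickler to stay alive? */
  def retainsPicklers: Boolean = false
}

class Linker extends Phase {
  def phaseName: String = "linker"
  override def retainsPicklers: Boolean = true
}

object PicklerSketch {
  // Pickler would keep its picklers only if some later phase asks for them.
  def shouldRetain(laterPhases: List[Phase]): Boolean =
    laterPhases.exists(_.retainsPicklers)
}
```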
@DarkDimius Actually, when we run the tests we do use multiple runs, right? And these do leak. So that could explain the large amounts of memory. The Dotty bootstrap can be done comfortably in 1G, and in fact only has about 400MB of strongly reachable memory. So the baseline memory consumption cannot be the problem.
Sorry, I didn't understand. What leaks? I don't think we leak
I think we do, via
@smarter, Contexts shouldn't leak as our JUnit tests check for it: https://github.com/lampepfl/dotty/blob/master/test/test/ContextEscapeDetector.java
```scala
unit.picklers += (cls -> pickler)
if (ctx.settings.YtestPickler.value) {
  beforePickling(cls) = tree.show
  picklers(cls) = pickler
```
I think this goes in the right direction, but in my heap dump I noticed that a massive amount of memory was taken by `beforePickling`, so that may not be enough to get the tests to run smoothly. It may be helpful to do `beforePickling(cls) = null` and `picklers(cls) = null` in `testUnpickler` to free up these resources.
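A minimal sketch of the suggested cleanup, assuming `beforePickling` and `picklers` are mutable maps keyed by the class symbol; the types below are stand-ins, and the entries are removed rather than set to null, which frees the same memory:

```scala
import scala.collection.mutable

object PicklerTestSketch {
  // Stand-ins for dotc's ClassSymbol and TastyPickler; illustrative only.
  type ClassSym = AnyRef
  type TastyPickler = AnyRef

  val beforePickling = mutable.Map.empty[ClassSym, String]
  val picklers = mutable.Map.empty[ClassSym, TastyPickler]

  def testUnpickler(cls: ClassSym): Unit = {
    // ... unpickle and compare the result against beforePickling(cls) ...
    // Drop the per-class entries afterwards so the large show-strings and
    // the picklers' tree-address maps become garbage-collectable.
    beforePickling -= cls
    picklers -= cls
  }
}
```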
@DarkDimius I was talking about leaks across runs (since we run some tests with
Yes, this would run after each junit test and ensure that at most one context survives.
But context escape detection is disabled: 53cd512
I reverted 53cd512 to re-enable context leak detection. Unfortunately, it did not find a leak even though one was clearly present (as verified by YourKit). The root of the leak was

Question: Can we fix context leak detection? How can we verify that it keeps working? It's the general problem of test infrastructure, which itself goes untested.
@DarkDimius I was wondering about the need to keep picklers. What exactly is written by the Linker, and at what phase? I am asking because the largest part of the memory consumed by the picklers is the map from all tree nodes to their addresses. But as soon as you transform a single tree, that map is outdated! So, can we throw it away in Pickler?
The lazy val `dummyTree` acquires a type because of copy-on-write, and that type can refer via `lastDenotation` to a context base.
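A schematic of the retention chain being described; it skips the copy-on-write mechanics and uses made-up stand-ins for the tree, denotation, and context-base types:

```scala
// Schematic only: once the lazy val is forced during some run, the cached
// value holds a reference chain back to that run's context base, keeping it
// alive across runs.
class ContextBase                  // heavy, per-compiler-instance state
class Denotation(val base: ContextBase)
class TreeType { var lastDenotation: Denotation = null }

object TreeSketch {
  lazy val dummyTree: TreeType = {
    val tpe = new TreeType
    tpe.lastDenotation = new Denotation(new ContextBase) // retains the base
    tpe
  }
}
```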
The context escape detection ensures that there's no more than one resident context; we used to have multiple places that cache a single context. I wouldn't call this a leak.
I'm adding new top-level trees, not modifying existing ones, and after the
`initInfo` was retained in Symbols. When called from `Namer`, `initInfo` referred to a completer, which referred to a context. With this space leak plugged, we can now compile core/Comments.scala (460 lines) 1000 times with -Xmx400M. There still seems to be a space leak on the order of 200KB per run, though. But that seems to have to do with symbols, not contexts.
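A schematic of the retention path described in this commit message and the kind of fix that plugs it; `Symbol`, `Completer`, and `Context` here are simplified stand-ins, not the dotc classes:

```scala
// Schematic only: the completer created during naming closes over a Context;
// as long as the symbol keeps its initInfo around after completion, that
// Context (and everything it reaches) stays strongly reachable.
class Context
class Completer(val ctx: Context) {
  def complete(): String = "computed info"
}

class Symbol(private var initInfo: Completer) {
  private var info: String = null

  def ensureCompleted(): String = {
    if (info == null) {
      info = initInfo.complete()
      initInfo = null // plug the leak: drop the completer and its Context
    }
    info
  }
}
```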
No, there were definitely two ContextBases alive. But it's true that

The other space leak involving

Do we have an example where context escape detection detects an escaping context? If not, we should disable it again. There's no use in getting a false sense of security.
Previously, only the FrontEnd got a fresh FreshNameCreator for each run; the other phases used a global one instead. This meant that compiling the same file several times would create different synthetic names and classes on each run.
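A minimal sketch of the difference, assuming a counter-based fresh-name creator (the real FreshNameCreator is richer): with one creator per Run, repeated compilations of the same file start counting from zero again and produce the same synthetic names.

```scala
import scala.collection.mutable

// Counter-based fresh names: a compiler-global creator keeps counting across
// runs, so the "same" synthetic name differs from run to run; a per-run
// creator restarts and makes repeated compilations reproducible.
class FreshNameCreator {
  private val counters = mutable.Map.empty[String, Int].withDefaultValue(0)

  def fresh(prefix: String): String = {
    val n = counters(prefix)
    counters(prefix) = n + 1
    s"$prefix$n"
  }
}

class Run {
  // One creator per Run instead of one shared, global instance.
  val freshNames = new FreshNameCreator
}
```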
With the last commit, we can now compile
Why not run the phase immediately before Pickler, then? The problem is that, for maximum efficiency, we really want to recycle the storage for the huge maps for each file. So we do not even want to keep them alive until the end of Pickler, never mind the next phase.
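A hedged sketch of the recycling idea, assuming the dominant structure is an identity map from tree nodes to TASTY addresses; the object and method names are illustrative, not the actual pickler code:

```scala
import java.util.IdentityHashMap

// Reuse one big identity map for every compilation unit instead of
// allocating (and retaining) a fresh one per file: fill it while pickling,
// then clear it before moving on to the next unit.
object AddrRecycling {
  private val treeAddrs = new IdentityHashMap[AnyRef, Integer]()

  def pickleUnit(trees: Seq[AnyRef]): Unit = {
    try {
      trees.zipWithIndex.foreach { case (t, i) => treeAddrs.put(t, i) }
      // ... emit TASTY using treeAddrs ...
    } finally treeAddrs.clear() // recycle the storage for the next unit
  }
}
```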
Sounds like a good idea; I didn't think about it.
LGTM, merging because I'm still having memory issues in #1636 and I'd like to see if this helps.
Changes to reduce the memory footprint and improve statistics information.
Review by @DarkDimius?