Increase the maximum heap size on Jenkins #1030

Merged
merged 1 commit into scala:master on Jan 17, 2016

Conversation

@smarter (Member) commented Jan 16, 2016

Review by @odersky or @DarkDimius

@smarter (Member, Author) commented Jan 16, 2016

[info] OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000006ded80000, 2463629312, 0) failed; error='Cannot allocate memory' (errno=12)
[info] #
[info] # There is insufficient memory for the Java Runtime Environment to continue.
[info] # Native memory allocation (mmap) failed to map 2463629312 bytes for committing reserved memory.
[info] # An error report file with more information is saved as:
[info] # /home/jenkins/workspace/dotty-master-validate-partest/hs_err_pid24921.log

I think this means that we're trying to allocate more memory than available on the Jenkins VM.

@SethTisue, @adriaanm : How much memory do the VMs have? Is this something that could be increased?

In the meantime, I'll try reducing -Xmx a bit and see if I can find a middle ground.

@DarkDimius (Contributor)

@smarter

Quoting @adriaanm

behemoths are c4.2xlarge, the others c4.xlarge

c4.2xlarge = 15 GB RAM,
c4.xlarge = 7.5 GB RAM

@smarter (Member, Author) commented Jan 16, 2016

@DarkDimius : can we reduce the parallelism so that we use less memory?

@smarter force-pushed the add/more-memory branch 3 times, most recently from 977681d to 29ea17c on January 16, 2016 at 18:54
We're getting a lot of OutOfMemoryErrors when the maximum heap size is 1
GB, but we cannot increase it too much without using up all the memory
available on the Jenkins instances, so let's see if 1.1 GB is enough.

Also stop using a custom -Xss; the default of 1 MB should be good enough.
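
A minimal sketch of what this kind of change could look like in an sbt 0.13 build, assuming the test JVMs receive their flags through `javaOptions` (the actual keys and values in dotty's build are not reproduced here):

```scala
// Hypothetical sbt fragment, not dotty's actual build definition:
// cap the forked test JVMs' heap at 1.1 GB and drop the custom -Xss
// so the JVM default (about 1 MB per thread) applies.
javaOptions in Test := (javaOptions in Test).value
  .filterNot(opt => opt.startsWith("-Xmx") || opt.startsWith("-Xss")) :+
  "-Xmx1100m"
```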
@smarter force-pushed the add/more-memory branch 2 times, most recently from 1e0965f to 6e8fcde on January 16, 2016 at 20:12
@smarter changed the title from "Use the same memory options everywhere" to "Increase the maximum heap size on Jenkins" on Jan 16, 2016
@smarter (Member, Author) commented Jan 16, 2016

OK, I've done four runs with the maximum heap size increased from 1 GB to 1.1 GB for Jenkins and they all succeeded, so I propose we merge this now and worry about a longer-term solution later. I didn't change the memory settings used when running the tests locally because decreasing the heap size there would probably slow down the tests.

@odersky (Contributor) commented Jan 16, 2016

Can we give it 1.5G? That would give us some margin for more complicated tests.


@smarter (Member, Author) commented Jan 16, 2016

Can we give it 1.5G? That would give us some margin for more complicated tests.

Unfortunately not. I tried that at first but got the failure described in #1030 (comment): apparently we're using up all the RAM available on the VM instance. 1.1G is the max I could get away with; even 1.25G is too much. So this is really a short-term fix.

@smarter (Member, Author) commented Jan 16, 2016

Note that if the VM instances had a swap partition, we at least wouldn't get a crash when using too much RAM; things would just get very slow. An even better fix would be to figure out how we're using up the 15 GB of RAM on the VM instances and address that (I have no idea how many JVMs we run in parallel or what controls this).

@DarkDimius (Contributor)

I have no idea how many JVMs we run in parallel and what controls this

We run 1 JVM per core.
Those Amazon EC2 VMs have 1.875 GB RAM per core.

@smarter (Member, Author) commented Jan 16, 2016

We run 1 JVM per core.

You mean partest starts 1 JVM per core? But you also have to take into account the JVM used by sbt itself, right?

@smarter (Member, Author) commented Jan 16, 2016

If N = the number of cores, could we try running partest with N-1 JVMs instead of N?
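
If such a knob exists, a hedged sketch of how the worker count could be derived, leaving one core's worth of headroom (`partestWorkers` and the way it would reach partest are assumptions, not actual build settings):

```scala
// Hypothetical: use one fewer test worker than there are cores,
// so the sbt JVM and the OS keep some memory and CPU headroom.
val partestWorkers: Int =
  math.max(1, Runtime.getRuntime.availableProcessors() - 1)
```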

@DarkDimius (Contributor)

c4.2xlarge: 8 cores, 15 GB RAM.

Even if we run 9 JVMs, that's 1.66 GB per JVM.
And this is an upper bound: I wouldn't expect sbt to actually allocate all the memory it's allowed while compiling Dotty, nor would I expect Dotty itself to use that much memory. It simply shouldn't, even when compiling itself. Unless we have a memory leak.
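
That per-JVM budget can be checked with a quick back-of-the-envelope calculation; a small Scala sketch of the arithmetic, using only the figures quoted above:

```scala
// Memory budget for a c4.2xlarge worker: 8 cores, 15 GB RAM.
val totalRamGb = 15.0
val jvms       = 8 + 1              // one partest JVM per core, plus sbt itself
val perJvmGb   = totalRamGb / jvms  // ≈ 1.66 GB per JVM, as an upper bound
println(f"$perJvmGb%.2f GB per JVM")
// Caveat: -Xmx bounds only the heap, so each JVM's real footprint is larger.
```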

@smarter (Member, Author) commented Jan 16, 2016

You also have to take into account that some RAM is used by the OS itself, and that the total memory usage of a JVM is bigger than its heap size (though I don't know by how much), so we might be close to the limit.

@smarter (Member, Author) commented Jan 16, 2016

(One thing that might be worth investigating: in the error I showed above, the amount of memory that the JVM tried to "commit" was 2463629312 bytes ≈ 2.3 GB. This is much bigger than the maximum heap size of 1.5G I specified, so something weird may be going on here. We might have more details once scala/scala-jenkins-infra#157 is merged and we can take a look at the log file.)
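
For reference, the heap is only one of several regions the JVM commits; a hedged illustration of the flags that bound the other large ones (the values here are made up for illustration, not taken from the build):

```scala
// Illustrative only: -Xmx caps the heap, but thread stacks, class metadata
// and the JIT code cache are all committed on top of it, which may help
// explain a ~2.3 GB commit against a 1.5 GB heap.
javaOptions in Test ++= Seq(
  "-Xmx1100m",                      // max heap
  "-Xss1m",                         // stack per thread; N compiler threads add up
  "-XX:MaxMetaspaceSize=256m",      // class metadata (Java 8+)
  "-XX:ReservedCodeCacheSize=128m"  // JIT-compiled code
)
```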

@odersky merged commit 6e8fcde into scala:master on Jan 17, 2016
@odersky (Contributor) commented Jan 17, 2016

Wait: partest runs N threads in one VM; otherwise we would not have seen data races. Each thread runs a full compiler, and a full compiler needs a couple hundred MB to compile a significant chunk of sources. So it seems to me that if the max heap is fixed, we could still reduce memory pressure by reducing N. What is N at the moment?


@smarter (Member, Author) commented Jan 17, 2016

What is N at the moment?

It's the number of cores on the machine; here's an attempt at reducing it: #1032

smarter added a commit to dotty-staging/dotty that referenced this pull request Jan 17, 2016
All of our recent memory-related test failures since
scala#1030 was merged seem to be caused
by t7880.scala. It intentionally tries to trigger an OutOfMemoryError,
but since we don't pass -Xmx to our run tests, it's possible that
we fill up the host's memory before we reach the maximum heap
size of the JVM.

Ideally, we would specify a -Xmx for run tests (scalac uses 1 GB).
Unfortunately, in the version of partest we use this is tricky because we
need to set the system property "partest.java_opts". If we upgrade our
partest to the latest release, we can instead specify it by setting the
`javaOpts` argument of the `SuiteRunner` constructor, see
scala/scala-partest@7c4659e
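
A sketch of the two approaches the message describes, with the upgrade path shown schematically (the surrounding build wiring is an assumption):

```scala
// Current partest: the only hook is the partest.java_opts system property,
// which must be set in the JVM that launches the tests.
sys.props("partest.java_opts") = "-Xmx1g" // 1 GB, matching scalac's choice

// After upgrading partest, the flags could instead be passed to the
// SuiteRunner constructor via its javaOpts argument, e.g.
//   new SuiteRunner(..., javaOpts = "-Xmx1g", ...)
// (see scala/scala-partest@7c4659e for the actual signature).
```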
@SethTisue (Member)

@SethTisue, @adriaanm : How much memory do the VMs have? Is this something that could be increased?

Adriaan has some numbers at https://github.com/scala/scala-jenkins-infra/blob/master/doc/design.md, though I'm not entirely sure they're current. If you want to poke around yourself, we should probably get you access to the Jenkins nodes so you can ssh in, investigate on your own, try experiments from the command line, etc.; see https://github.com/scala/scala-jenkins-infra/blob/master/doc/client-setup.md#hosts-and-ssh-config

@smarter (Member, Author) commented Jan 18, 2016

@SethTisue: That could be useful in the future, yes. I guess you'll need me to give you an ssh key; if so, you can use: http://guillaume.martres.me/id_rsa.pub

@SethTisue (Member)

@adriaanm could you help Guillaume with this? I realize this is a part of the process I don't know about; the docs just say "Send Adriaan your public key... He will use it to encrypt your credentials".

@adriaanm (Contributor)

Here's the section where you documented how to do this: https://github.com/scala/scala-jenkins-infra/blob/master/doc/client-setup.md#get-your-public-key-added :-)

@SethTisue (Member)

You can lead a horse to documentation, but you can't make him drink. Even if you are the horse.

@SethTisue (Member)

Hmm, I merged scala/scala-jenkins-infra#159, ran `knife cookbook upload scala-jenkins-infra` as usual, and ran `chef-client` on jenkins-master, but I didn't see anything about the change in the resulting output, and I don't see Guillaume's key in `~ec2-user/.ssh/authorized_keys`. Is there something else we need to do to make the change kick in?

@adriaanm (Contributor)

`default['authorized_keys']['jenkins']` is for the jenkins user

@SethTisue (Member)

Hmm, do I need to run `chef-client` on the workers?

@adriaanm (Contributor)

Yes, but they should run that on a schedule.

@SethTisue (Member)

OK. @smarter, if this isn't already working for you, let us know.

@allanrenucci deleted the add/more-memory branch on December 14, 2017 at 16:57