One of our services is failing due to out of memory #5973
Labels: bug, closed-for-staleness, p2, response-requested
Bug Description
We are encountering a service failure due to a memory leak.
This graph shows the gradual increase in heap usage for our service. The service eventually runs out of memory and k8s restarts it, which is why a new line starts after each one ends.
Upon investigation, we found that AWS SDK v2 registers connection managers in a map within `IdleConnectionReaper` and deregisters them later, but the heap dump shows a large amount of retained space that the GC is not reclaiming.
Reference: aws-sdk-java-v2/http-clients/apache-client/src/main/java/software/amazon/awssdk/http/apache/internal/conn/IdleConnectionReaper.java, line 36 in b2fe89e
Above is a screenshot of the heap dump visualised in VisualVM. You can see 15k+ instances of `PoolingHttpClientConnectionManager`, accounting for 512MB+ of retained space that could have been garbage collected.
My suspicion is that the `deregister` method is not getting called once the API call to AWS is complete.
We are using AWS SDK v2 to talk to S3 and the EMR Serverless API.
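A minimal sketch of how a register-without-deregister pattern can keep objects reachable, using only the Java standard library. It is not the SDK's code: `ReaperRegistry`, `SketchClient`, and `LeakSketch` are hypothetical stand-ins, and it assumes (unconfirmed for our service) that deregistration only happens when a client is closed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical driver: compares creating clients with and without closing them.
class LeakSketch {
    // Returns how many registry entries remain after creating n clients that are never closed.
    static int createWithoutClosing(int n) {
        int before = ReaperRegistry.size();
        for (int i = 0; i < n; i++) {
            new SketchClient(); // reference is dropped, but the registry still holds the manager
        }
        return ReaperRegistry.size() - before;
    }

    // Returns how many registry entries remain after creating and closing n clients.
    static int createAndClose(int n) {
        int before = ReaperRegistry.size();
        for (int i = 0; i < n; i++) {
            try (SketchClient client = new SketchClient()) {
                // a real service would issue its API call here
            }
        }
        return ReaperRegistry.size() - before;
    }

    public static void main(String[] args) {
        System.out.println("entries retained without close(): " + createWithoutClosing(1000));
        System.out.println("entries retained with close(): " + createAndClose(1000));
    }
}

// Hypothetical stand-in for a reaper that tracks connection managers in a long-lived map.
final class ReaperRegistry {
    // Static map: every registered manager stays strongly reachable until it is removed.
    private static final Map<Object, Long> MANAGERS = new ConcurrentHashMap<>();

    static void register(Object manager, long maxIdleMillis) {
        MANAGERS.put(manager, maxIdleMillis);
    }

    static void deregister(Object manager) {
        MANAGERS.remove(manager);
    }

    static int size() {
        return MANAGERS.size();
    }
}

// Hypothetical client that registers its connection manager on creation
// and deregisters it only in close().
final class SketchClient implements AutoCloseable {
    private final Object connectionManager = new Object();

    SketchClient() {
        ReaperRegistry.register(connectionManager, 60_000L);
    }

    @Override
    public void close() {
        ReaperRegistry.deregister(connectionManager);
    }
}
```

If the service behaves like the first method, for example by building a new SDK client per request and never closing it, the retained count grows with every call. SDK v2 service clients are `AutoCloseable`, so reusing one client or closing each client with try-with-resources may be worth checking.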
Regression Issue
Expected Behavior
The `deregisterConnectionManager` method should be called every time a task using the connection manager is completed. This would ensure proper memory management by allowing garbage collection to free up memory, preventing memory leaks that ultimately lead to service failures.
Current Behavior
Currently, when the `connectionManager` is registered, the `deregisterConnectionManager` method is not called, preventing garbage collection from releasing unused memory. This results in a gradual memory buildup, eventually leading to an out-of-memory failure. (Reference: aws-sdk-java-v2/http-clients/apache-client/src/main/java/software/amazon/awssdk/http/apache/internal/conn/IdleConnectionReaper.java, line 36 in b2fe89e)
Reproduction Steps
registerConnectionManager
.deregisterConnectionManager
method is invoked.
Possible Solution
No response
Additional Information/Context
No response
AWS SDK for Java Version
awsV2SdkVers = '2.20.38'
JDK Version
ENV JAVA_VERSION="21.0.6+7-1~20.04.1"
Operating System and Version
Ubuntu 20.04