DynamoDB connection pool tied up when interrupting #2936

Closed
andrewyoo opened this issue Dec 22, 2021 · 6 comments
Labels: bug, closed-for-staleness, p2

Comments

andrewyoo commented Dec 22, 2021

Describe the bug

In my service, I was time-limiting a block of code that runs DynamoDB queries. After enough timeouts, I started seeing the following error: SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool. It appears that if I interrupt the DDB client, the connection stays tied up forever, and eventually no connection is available for further calls.

After several runs, I noticed that if I call future.cancel(false) (no interrupt) instead of future.cancel(true), the service remains stable, although the worker threads are then allowed to run to completion. With cancel(true), I verified with the Java SDK client metrics that LeasedConcurrency goes up and never comes back down.

Expected behavior

When the DDB client aborts due to an interrupt (AbortedException), the HTTP connection is released.

Current behavior

When the DDB client aborts due to an interrupt, the HTTP connection is NOT released, which eventually depletes the HTTP connection pool.

Steps to Reproduce

Something like this:

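import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final Future<?> future = executorService.submit(() -> {
    // some longer-running DDB calls
});

// Stays null if the call times out or fails, so it can be logged safely in finally.
Object response = null;
try {
    response = future.get(MS_LIMIT_TO_RESPOND, TimeUnit.MILLISECONDS);
} catch (TimeoutException exception) {
    LOG.warn("Took longer than allotted {} ms to generate response.", MS_LIMIT_TO_RESPOND, exception);
} catch (InterruptedException | ExecutionException exception) {
    LOG.error("Failed to generate response", exception);
} finally {
    // cancel(true) interrupts the worker thread if it is still running, which triggers the issue.
    // cancel(false) does not interrupt, and the issue does not occur.
    future.cancel(true);
    LOG.info("Returning response {}", response);
}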
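
The difference between the two cancel modes can be seen without the SDK. The following is a minimal sketch using only the JDK: Thread.sleep stands in for a slow blocking DynamoDB call, and the class and method names (CancelDemo, workerInterruptedAfterCancel) are hypothetical. It only shows whether the worker thread receives an interrupt; it does not reproduce the connection leak itself.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class CancelDemo {

    /** Starts a blocking task, cancels it, and reports whether the worker thread was interrupted. */
    static boolean workerInterruptedAfterCancel(boolean mayInterruptIfRunning) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);
        try {
            Future<?> future = executor.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(300); // stand-in for a slow blocking call (assumption)
                } catch (InterruptedException e) {
                    interrupted.set(true); // reached only when the thread is interrupted
                } finally {
                    finished.countDown();
                }
            });
            started.await();                      // make sure the task is running
            future.cancel(mayInterruptIfRunning); // true interrupts the worker, false does not
            finished.await(5, TimeUnit.SECONDS);  // wait for the worker to exit
            return interrupted.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("cancel(true) interrupted worker: " + workerInterruptedAfterCancel(true));
        System.out.println("cancel(false) interrupted worker: " + workerInterruptedAfterCancel(false));
    }
}
```

In both cases the Future is marked cancelled. With cancel(false) the task keeps running until it finishes on its own, which matches the observation that the threads are allowed to finish.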

Possible Solution

Other tickets I saw about connection pool timeouts concerned S3 and closing the object to ensure the connection is released. I think that on an SDK AbortedException, or whichever exception is raised on interrupt, the DDB connection should be closed.

Context

I was trying to limit the execution time on my service. If it didn't complete within a time limit, it would return an empty response.

AWS Java SDK version used

2

JDK version used

1.8

Operating System and version

Amazon Linux

andrewyoo added the bug and needs-triage labels Dec 22, 2021

jocull commented Mar 7, 2022

My team is also encountering this problem. The investigation we have done seems to indicate that connections leased from the pool are mishandled in org.apache.http.impl.execchain.MainClientExec#execute

If the process is interrupted at the wrong time the connections will be lost. What can we do to fix this? Is it a known Apache client issue?

debora-ito self-assigned this Apr 12, 2022

debora-ito (Member) commented

@andrewyoo @jocull I'm sorry for losing track of this. Are you still experiencing the issue?

Are you closing the data stream after it's consumed from the query response? Issues with connections that are not being released are usually associated with the resources not being properly closed.

debora-ito added the p2 and response-requested labels and removed the needs-triage label Mar 29, 2023

andrewyoo (Author) commented

@debora-ito I don't understand how your question relates to this ticket. In my use case, I had a DDB client (DynamoDbClient.create()) and I was interrupting a query. Because I was interrupting early, there was no response or data stream to close.

As for whether I am still experiencing it: I stopped interrupting the DDB requests to avoid this issue, so I can't confirm whether it is still a problem.

jocull commented Mar 30, 2023

Our situation was the same as Andrew's - setting an interrupt on the thread running the request. We resorted to making requests with the async SDK and blocking on the results, but I would honestly prefer not to. It has been a year and we have not tried this again.
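
The "async call, then block on the result" pattern can be sketched with the JDK alone. This is an illustration under stated assumptions, not the code used by anyone in this thread: a plain CompletableFuture stands in for the future returned by the SDK's async client, and AsyncTimeLimit and getOrFallback are hypothetical names. The relevant JDK behavior is that java.util.concurrent.CompletableFuture ignores the mayInterruptIfRunning flag of cancel, so timing out this way does not interrupt a thread in the middle of a request. Whether the SDK forwards the cancellation to the in-flight request is not established in this thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class AsyncTimeLimit {

    /** Blocks on an async result for at most timeoutMs and returns fallback on timeout or failure. */
    static <T> T getOrFallback(CompletableFuture<T> pending, long timeoutMs, T fallback) {
        try {
            return pending.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return fallback;
        } finally {
            // No effect if already complete. CompletableFuture ignores the boolean flag,
            // so no worker thread is interrupted here.
            pending.cancel(true);
        }
    }

    /** Stand-in for a slow remote call (assumption: replaces a real async query). */
    static String slowCall(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "item";
    }

    public static void main(String[] args) {
        CompletableFuture<String> slow = CompletableFuture.supplyAsync(() -> slowCall(500));
        System.out.println("slow call result: '" + getOrFallback(slow, 50, "") + "'");

        CompletableFuture<String> fast = CompletableFuture.completedFuture("item");
        System.out.println("fast call result: '" + getOrFallback(fast, 50, "") + "'");
    }
}
```

The caller still gets a bounded wait and an empty fallback, as in the original use case, but the time limit is enforced by abandoning the future rather than by interrupting the thread that performs the HTTP request.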

I did mention above the suspect code in the Apache library. It's possible that has been patched now but I have not revisited Apache change logs.

github-actions bot removed the response-requested label Mar 30, 2023

debora-ito (Member) commented

We released a fix via #4066.

The fix is available in Java SDK version 2.20.83 and later.

@andrewyoo @jocull I know the fix is not easy to test due to the nature of the issue and because you changed to async, but let us know of any other issues you find after upgrading to a newer version.

debora-ito added the closing-soon label Jun 21, 2023
github-actions bot added the closed-for-staleness label and removed the closing-soon label Jun 23, 2023

jocull commented Jun 29, 2023

@debora-ito We switched back to the synchronous SDK with the new version and tested both locally and in a load-tested environment. We could not reproduce the issue this time, so I believe it is fixed. Thank you! 😄

aws-sdk-java-automation added a commit that referenced this issue Mar 27, 2024
…c23fff38b

Pull request: release <- staging/c49f7241-9953-4bac-9425-ff8c23fff38b