Fixed a bug in the netty client, where a future may not always be completed. #1217


Merged
merged 1 commit on Apr 19, 2019

Conversation

Contributor

@millems millems commented Apr 19, 2019

If a service closes a connection between when a channel is acquired and when handlers are attached, channel writes could disappear and the response future would never be completed. This change introduces health checks and retries for channel acquisition, which fixes the majority of cases without failing requests, plus one last check after handlers are attached to ensure the channel hasn't been closed since the channel pool health check. Fixes #1207.
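
To make the mechanism concrete, here is a minimal sketch of a health-checking ChannelPool decorator, assuming the behavior described above. It is an illustration only, not the SDK's actual HealthCheckedChannelPool; the class name, the MAX_RETRIES bound, and the use of Channel.isActive() as the health check are assumptions.

import io.netty.channel.Channel;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.pool.ChannelPool;
import io.netty.util.concurrent.Future;
import io.netty.util.concurrent.Promise;

import java.io.IOException;

final class HealthCheckedChannelPoolSketch implements ChannelPool {
    private static final int MAX_RETRIES = 5; // assumed retry bound

    private final EventLoopGroup eventLoopGroup;
    private final ChannelPool delegate;

    HealthCheckedChannelPoolSketch(EventLoopGroup eventLoopGroup, ChannelPool delegate) {
        this.eventLoopGroup = eventLoopGroup;
        this.delegate = delegate;
    }

    @Override
    public Future<Channel> acquire() {
        return acquire(eventLoopGroup.next().newPromise());
    }

    @Override
    public Future<Channel> acquire(Promise<Channel> resultFuture) {
        tryAcquire(resultFuture, 0);
        return resultFuture;
    }

    // Acquire from the delegate pool, retrying if the channel handed back is already closed.
    private void tryAcquire(Promise<Channel> resultFuture, int attempt) {
        if (attempt >= MAX_RETRIES) {
            resultFuture.tryFailure(new IOException(
                "Could not acquire a healthy channel after " + MAX_RETRIES + " attempts."));
            return;
        }

        Future<Channel> acquireFuture = delegate.acquire();
        acquireFuture.addListener(ignored -> {
            if (!acquireFuture.isSuccess()) {
                resultFuture.tryFailure(acquireFuture.cause());
                return;
            }

            Channel channel = acquireFuture.getNow();
            if (channel.isActive()) {
                if (!resultFuture.trySuccess(channel)) {
                    // The caller already gave up (e.g. timed out); return the channel to the pool.
                    delegate.release(channel);
                }
            } else {
                // The service closed the connection while the channel sat in the pool:
                // discard it and try again instead of failing the request.
                channel.close();
                delegate.release(channel);
                tryAcquire(resultFuture, attempt + 1);
            }
        });
    }

    @Override
    public Future<Void> release(Channel channel) {
        if (!channel.isActive()) {
            channel.close(); // unhealthy channels checked in are closed
        }
        return delegate.release(channel);
    }

    @Override
    public Future<Void> release(Channel channel, Promise<Void> promise) {
        if (!channel.isActive()) {
            channel.close();
        }
        return delegate.release(channel, promise);
    }

    @Override
    public void close() {
        delegate.close();
    }
}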

@millems millems force-pushed the millem/fix-unfinished-futures branch 2 times, most recently from 1d9d0a0 to 738bc71 on April 19, 2019 at 21:11
public HealthCheckedChannelPool(EventLoopGroup eventLoopGroup,
                                NettyConfiguration configuration,
                                ChannelPool delegate) {
    this.eventLoopGroup = eventLoopGroup;
Contributor

Any reason to use the whole group instead of a single loop?

Contributor Author

Load balancing? It didn't seem like it would be terrible to try to better balance things across the whole group instead of an individual event loop.
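
For context, the two options under discussion differ roughly as shown below. This is a hedged illustration rather than the SDK's code; the NioEventLoopGroup, task bodies, and delays are arbitrary.

import io.netty.channel.EventLoop;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;

import java.util.concurrent.TimeUnit;

final class SchedulingChoiceDemo {
    public static void main(String[] args) {
        EventLoopGroup group = new NioEventLoopGroup(4);

        // Using the whole group: next() picks a loop per call, spreading
        // scheduled work across the group's loops.
        group.next().schedule(() -> System.out.println("task on whichever loop next() picked"),
                              10, TimeUnit.MILLISECONDS);

        // Alternative raised in the review: hold a single loop and schedule everything there.
        EventLoop pinned = group.next();
        pinned.schedule(() -> System.out.println("task on the pinned loop"),
                        10, TimeUnit.MILLISECONDS);

        group.shutdownGracefully();
    }
}

By default, successive next() calls rotate across the group's loops, which is the load balancing the author mentions.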

 * Time out the provided acquire future, if it hasn't already been completed.
 */
private void timeoutAcquire(Promise<Channel> resultFuture) {
    if (!resultFuture.isDone()) {
Contributor

This check shouldn't be necessary

Contributor Author

Originally, I didn't want the overhead of allocating the IOException (filling in the stack trace is apparently pretty expensive), because I wasn't cancelling the timeout on success. Now that I'm cancelling the timeout task, it isn't really needed, so I don't mind removing it.
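
The arrangement the author describes, scheduling the timeout and cancelling the scheduled task once the acquire promise completes, looks roughly like this sketch. The class and field names are made up for illustration; this is not the PR's final code.

import io.netty.channel.Channel;
import io.netty.channel.EventLoopGroup;
import io.netty.util.concurrent.Promise;
import io.netty.util.concurrent.ScheduledFuture;

import java.io.IOException;
import java.util.concurrent.TimeUnit;

final class AcquireTimeoutSketch {
    private final EventLoopGroup eventLoopGroup;
    private final long acquireTimeoutMillis;

    AcquireTimeoutSketch(EventLoopGroup eventLoopGroup, long acquireTimeoutMillis) {
        this.eventLoopGroup = eventLoopGroup;
        this.acquireTimeoutMillis = acquireTimeoutMillis;
    }

    // Schedule a timeout for the acquire, and cancel the scheduled task once the
    // promise completes, so the IOException (and the cost of filling in its stack
    // trace) is only paid when the timeout actually fires.
    void scheduleTimeout(Promise<Channel> resultFuture) {
        ScheduledFuture<?> timeoutTask =
            eventLoopGroup.next().schedule(() -> timeoutAcquire(resultFuture),
                                           acquireTimeoutMillis, TimeUnit.MILLISECONDS);
        resultFuture.addListener(ignored -> timeoutTask.cancel(false));
    }

    private void timeoutAcquire(Promise<Channel> resultFuture) {
        // tryFailure() is a no-op if the promise already completed, so an explicit
        // isDone() guard would only be an optimization to skip the exception allocation.
        resultFuture.tryFailure(new IOException(
            "Acquire operation took longer than " + acquireTimeoutMillis + " milliseconds."));
    }
}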

@@ -224,6 +225,10 @@ private ChannelPool createChannelPool(Bootstrap bootstrap, ChannelPipelineInitia
// from the underlying pool, the channel is closed and released.
channelPool = new CancellableAcquireChannelPool(bootstrap.config().group().next(), channelPool);

// Wrap the channel pool to guarantee all channels checked out are healthy, and all unhealthy channels checked in are
// closed.
channelPool = new HealthCheckedChannelPool(bootstrap.config().group(), configuration, channelPool);
Contributor

We should switch with CancellableAcquireChannelPool

Contributor Author

Fixed.
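
The reason the wrapping order matters: each of these pool classes decorates the pool constructed before it, so whichever wrapper is applied last is the outermost one and runs first on every acquire(). A generic pass-through decorator, sketched below with assumed names (the SDK classes are independent implementations), shows the shape both wrappers share.

import io.netty.channel.Channel;
import io.netty.channel.pool.ChannelPool;
import io.netty.util.concurrent.Future;
import io.netty.util.concurrent.Promise;

// Illustrative base class: every call is forwarded to the wrapped pool.
abstract class DelegatingChannelPoolSketch implements ChannelPool {
    protected final ChannelPool delegate;

    protected DelegatingChannelPoolSketch(ChannelPool delegate) {
        this.delegate = delegate;
    }

    @Override
    public Future<Channel> acquire() {
        return delegate.acquire();
    }

    @Override
    public Future<Channel> acquire(Promise<Channel> promise) {
        return delegate.acquire(promise);
    }

    @Override
    public Future<Void> release(Channel channel) {
        return delegate.release(channel);
    }

    @Override
    public Future<Void> release(Channel channel, Promise<Void> promise) {
        return delegate.release(channel, promise);
    }

    @Override
    public void close() {
        delegate.close();
    }
}

Each concrete wrapper overrides acquire() and release() to add its own behavior before delegating, which is why swapping the two constructor calls in the diff above changes which check sees an acquire first.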

@millems millems force-pushed the millem/fix-unfinished-futures branch from 738bc71 to 79801aa on April 19, 2019 at 21:35
Contributor
@dagnir dagnir left a comment


:shipit:


codecov-io commented Apr 19, 2019

Codecov Report

Merging #1217 into master will increase coverage by 0.01%.
The diff coverage is 78.84%.


@@             Coverage Diff              @@
##             master    #1217      +/-   ##
============================================
+ Coverage     59.01%   59.02%   +0.01%     
- Complexity     4612     4632      +20     
============================================
  Files           746      747       +1     
  Lines         23134    23182      +48     
  Branches       1733     1737       +4     
============================================
+ Hits          13653    13684      +31     
- Misses         8786     8804      +18     
+ Partials        695      694       -1
Impacted Files | Coverage Δ | Complexity Δ
...awssdk/http/nio/netty/NettyNioAsyncHttpClient.java | 66.66% <100%> (+0.21%) | 22 <0> (ø) ⬇️
.../http/nio/netty/internal/NettyRequestExecutor.java | 61.95% <22.22%> (-2.29%) | 29 <0> (ø)
...p/nio/netty/internal/HealthCheckedChannelPool.java | 90.47% <90.47%> (ø) | 16 <16> (?)
...o/netty/internal/utils/BetterFixedChannelPool.java | 52.48% <0%> (-6.63%) | 14% <0%> (ø)
...p/nio/netty/internal/DelegatingEventLoopGroup.java | 25% <0%> (+3.12%) | 6% <0%> (+1%) ⬆️
...nio/netty/internal/OldConnectionReaperHandler.java | 90.9% <0%> (+9.09%) | 13% <0%> (+1%) ⬆️
...netty/internal/SslCloseCompletionEventHandler.java | 100% <0%> (+14.28%) | 5% <0%> (+1%) ⬆️


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 64a1c78...79801aa.

Successfully merging this pull request may close these issues.

Sending SQS message using async client sometimes never completes the Future