Skip to content

EndOfStreamException in ResetConnectionAsync with Amazon Aurora #208

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sherloqholmes opened this issue Mar 27, 2017 · 13 comments
Closed

EndOfStreamException in ResetConnectionAsync with Amazon Aurora #208

sherloqholmes opened this issue Mar 27, 2017 · 13 comments
Labels

Comments

@sherloqholmes
Copy link

sherloqholmes commented Mar 27, 2017

When running a lot of queries (SELECT, INSERT, UPDATE, DELETE) at the same moment on the same server we are seeing an increasing amount of EndOfStreamExceptions.

We were able to reproduce this issue on a server with ±100 concurrent users.

What happend

  1. A load test was running on an endpoint that was doing a simple select on the database.
  2. Some other requests were coming in that resulted in an insert operation into the database, these operations failed.

Stacktrace

System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 MySql.Data.ValueTaskExtensions+<>c__DisplayClass0_0`2.<ContinueWith>b__0(Task`1 task):0
 System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke():36
 System.Threading.Tasks.Task.Execute():16
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 MySql.Data.Serialization.MySqlSession.TryAsyncContinuation(Task`1 task):21
 System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke():36
 System.Threading.Tasks.Task.Execute():16
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 System.Runtime.CompilerServices.TaskAwaiter`1.GetResult():0
 System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable`1+ConfiguredValueTaskAwaiter.GetResult():30
 MySql.Data.Serialization.MySqlSession+<ResetConnectionAsync>d__23.MoveNext():1323
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 MySql.Data.MySqlClient.ConnectionPool+<GetSessionAsync>d__0.MoveNext():917
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 MySql.Data.MySqlClient.MySqlConnection+<CreateSessionAsync>d__57.MoveNext():276
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 MySql.Data.MySqlClient.MySqlConnection+<OpenAsync>d__11.MoveNext():225
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 Microsoft.EntityFrameworkCore.Storage.Internal.MySqlRelationalConnection+<OpenAsync>d__47.MoveNext() in C:\Bamboo-home\xml-data\build-dir\SC-AD-JOB1\src\ThirdParty\Pomelo.EntityFrameworkCore.MySql\src\Pomelo.EntityFrameworkCore.MySql\Storage\Internal\MySqlRelationalConnection.cs:368
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 Microsoft.EntityFrameworkCore.Update.Internal.MySqlBatchExecutor+<ExecuteAsync>d__1.MoveNext() in C:\Bamboo-home\xml-data\build-dir\SC-AD-JOB1\src\ThirdParty\Pomelo.EntityFrameworkCore.MySql\src\Pomelo.EntityFrameworkCore.MySql\Update\Internal\MySqlBatchExecutor.cs:52
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager+<SaveChangesAsync>d__54.MoveNext():139
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager+<SaveChangesAsync>d__52.MoveNext():265
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 Microsoft.EntityFrameworkCore.DbContext+<SaveChangesAsync>d__35.MoveNext():216
 System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task):38
 System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task):40
 System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task):8
@sherloqholmes sherloqholmes changed the title [EndOfStreamException: Attempted to read past the end of the stream.] EndOfStreamException: Attempted to read past the end of the stream Mar 27, 2017
@bgrainger
Copy link
Member

What version of MySQL Server are you using? What is the OS of the client and server? What is your connection string (without the password)?

@sherloqholmes
Copy link
Author

MySQL Server
5.6.10

Server OS
Windows Server 2012 R2

Connection String
"Server=xxx.us-east-1.rds.amazonaws.com;database=xxx;uid=xxx;pwd=xxx;Convert Zero Datetime=true;"

@bgrainger
Copy link
Member

Given the IL offset in the exception call stack, I suspect this line is throwing the exception.

I haven't been able to reproduce with MySQL Server 5.6.29 locally (with 100 concurrent readers and one inserter/deleter).

Given the server hostname, I assume your server is Amazon RDS? It's possible this could be an RDS-specific bug (similar to #169 for Azure). I will need to set up a RDS instance for testing to confirm this.

Finally, I see Microsoft.EntityFrameworkCore.DbContext in the call stack. Are you using EF Core? Which library and version? (And which version of MySqlConnector?)

@sherloqholmes
Copy link
Author

That's correct, server is on Amazon RDS.

We are using EF Core (Pomelo) for MySQL with the latest version of MySqlConnector (0.15.1)

@bgrainger
Copy link
Member

Is it Amazon Aurora, MariaDB, or MySQL? (All of these speak the MySQL protocol but could have (very) different backends.)

@sherloqholmes
Copy link
Author

sherloqholmes commented Mar 29, 2017

Aurora

@bgrainger
Copy link
Member

I set up an db.r3.xlarge Aurora RDS instance and executed a high-concurrency test with up to 150 readers and multiple writers (on a simple table schema), but was unable to reproduce the exception yet. (Note that my test didn't use EF Core, so that library wasn't tested.)

@sherloqholmes
Copy link
Author

Hmm strange...
Let me see if I can reproduce it with some more info.

@torsten-github
Copy link

I have the same problem with RDS Aurora (Amazon). All latest versions.

@bgrainger
Copy link
Member

Are you using EF Core (Pomelo), or just MySqlConnector? Can you describe your scenario (that might help us develop a repro) or provide a small code sample that demonstrates it?

@bnabholz
Copy link
Contributor

Short version: Putting a try/catch around session.ResetConnectionAsync() in ConnectionPool.GetSessionAsync (and falling back to creating a new MySqlSession) effectively works around this. I will submit a PR for review, but I thought others might benefit from it being documented here.

Long version:

I've been battling this problem since switching to Aurora from SQL Server, and it's almost been frustrating enough to switch back. We've experienced it on both read and write queries, and when it happens it will affect connections from multiple app servers in our farm, for several seconds. I am all but certain it is a problem with Aurora. It looks like after Aurora challenges the client for authentication and the client responds, it will abruptly close the connection. It is reminiscent of #169 but instead of "Access denied for user", it is sending an RST to close the connection.

Without any "on-demand" way to reproduce it I'm not sure how to go about confirming it’s an Aurora bug. Cloning our database and running various stress tests got me nowhere, but during normal usage we experience this problem several times per day. We have ~50 connections and ~15% utilization on an r3.large instance. We do have frequent connections due to setting Connection Lifetime to 5 seconds, so that failover to the writable primary instance happens more quickly. In my testing, the default Connection Lifetime setting of 0 (forever), as well as Pooling=False, made no difference.

One interesting thing is that I did not experience this problem in MySqlConnector if TLS was disabled via Ssl Mode=None. Since the default mode is "Preferred" I would guess that most users' connections are using TLS. So that might be a quick workaround for some users, but I wanted to keep TLS so I’ve just been putting up with it.

MySql.Data experiences this regardless of TLS being on or off, which is actually what let me to start using MySqlConnector in the first place. There are some issues on MySQL's bug tracker about similar problems. In the first one the user is using RDS but it's not clear if it's Aurora or not:

MySQL Bugs: #76597: Reading from stream failed

The problem is narrowed down to a Stream.Read returning 0 but doesn't get any further than that. Another issue was opened:

MySQL Bugs: #86056: MySqlException (0x80004005): Reading from the stream has failed.

where @bgrainger rightly points out that the client code is doing the correct thing - if you read 0 bytes you don't have much choice but to throw, close the connection, or both.

Here are the exceptions I’ve observed from different clients / configurations:

MySql.Data, TLS enabled:

System.Security.Authentication.AuthenticationException: A call to SSPI failed, see inner exception. ---> System.ComponentModel.Win32Exception: The message received was unexpected or badly formatted
   --- End of inner exception stack trace ---
   at System.Net.Security.SslState.StartSendAuthResetSignal(ProtocolToken message, AsyncProtocolRequest asyncRequest, Exception exception)
   at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ForceAuthentication(Boolean receiveFirst, Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessAuthentication(LazyAsyncResult lazyResult)
   at MySql.Data.MySqlClient.NativeDriver.StartSSL()
   at MySql.Data.MySqlClient.NativeDriver.Open()
   at MySql.Data.MySqlClient.Driver.Open()
   at MySql.Data.MySqlClient.Driver.Create(MySqlConnectionStringBuilder settings)
   at MySql.Data.MySqlClient.MySqlPool.CreateNewPooledConnection()
   at MySql.Data.MySqlClient.MySqlPool.GetPooledConnection()
   at MySql.Data.MySqlClient.MySqlPool.TryToGetDriver()
   at MySql.Data.MySqlClient.MySqlPool.GetConnection()
   at MySql.Data.MySqlClient.MySqlConnection.Open()

MySql.Data, TLS disabled via Ssl Mode=None:

MySql.Data.MySqlClient.MySqlException (0x80004005): Authentication to host ‘xxxxxx-cluster.cluster-xxxxxx.us-east-2.rds.amazonaws.com' for user 'test user' using method 'mysql_native_password' failed with message: Reading from the stream has failed. ---> MySql.Data.MySqlClient.MySqlException (0x80004005): Reading from the stream has failed. ---> System.IO.EndOfStreamException: Attempted to read past the end of the stream.
   at MySql.Data.MySqlClient.MySqlStream.ReadFully(Stream stream, Byte[] buffer, Int32 offset, Int32 count)
   at MySql.Data.MySqlClient.MySqlStream.LoadPacket()
   at MySql.Data.MySqlClient.MySqlStream.LoadPacket()
   at MySql.Data.MySqlClient.MySqlStream.ReadPacket()
   at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.ReadPacket()
   at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.AuthenticationFailed(Exception ex)
   at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.ReadPacket()
   at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.Authenticate(Boolean reset)
   at MySql.Data.MySqlClient.NativeDriver.Authenticate(String authMethod, Boolean reset)
   at MySql.Data.MySqlClient.NativeDriver.Open()
   at MySql.Data.MySqlClient.Driver.Open()
   at MySql.Data.MySqlClient.Driver.Create(MySqlConnectionStringBuilder settings)
   at MySql.Data.MySqlClient.MySqlConnection.Open()

MySqlConnector, TLS enabled (basically identical to what @sherloqholmes provided):

System.IO.EndOfStreamException: Attempted to read past the end of the stream.
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at MySql.Data.ValueTaskExtensions.ContinueWith[T,TResult](ValueTask`1 valueTask, Func`2 continuation)
   at MySql.Data.Protocol.Serialization.ProtocolUtility.ReadPayloadAsync(BufferedByteReader bufferedByteReader, IByteHandler byteHandler, Func`1 getNextSequenceNumber, ArraySegment`1 previousPayloads, ProtocolErrorBehavior protocolErrorBehavior, IOBehavior ioBehavior)
   at MySql.Data.Protocol.Serialization.StandardPayloadHandler.ReadPayloadAsync(ProtocolErrorBehavior protocolErrorBehavior, IOBehavior ioBehavior)
   at MySql.Data.Serialization.MySqlSession.TryAsync(Func`3 func, IOBehavior ioBehavior, CancellationToken cancellationToken)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MySql.Data.Serialization.MySqlSession.TryAsyncContinuation(Task`1 task)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable`1.ConfiguredValueTaskAwaiter.GetResult()
   at MySql.Data.Serialization.MySqlSession.<ResetConnectionAsync>d__46.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MySql.Data.MySqlClient.ConnectionPool.<GetSessionAsync>d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MySql.Data.MySqlClient.MySqlConnection.<CreateSessionAsync>d__66.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MySql.Data.MySqlClient.MySqlConnection.<OpenAsync>d__17.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MySql.Data.MySqlClient.MySqlConnection.Open()

As I was reviewing logs yesterday, I realized that every single EndOfStreamException stack trace has ResetConnectionAsync in it. I figured the easy approach would be to just try/catch around ResetConnectionAsync and if it throws, go ahead and release the connection from the pool and create a new MySqlSession. But while I was at it I put try/catches in MySqlSession.ResetConnectionAsync around each ReceiveReplyAsync call to figure out where things were going wrong.

2017-09-13 01:24:12,395 INFO MySqlSession: ResetConnectionAsync failed trying new challenge mysql_native_password.
2017-09-13 01:24:12,396 INFO ConnectionPool: ResetConnectionAsync failed, will instantiate new session.
System.IO.EndOfStreamException: Expected at least 4 bytes, got 0 bytes.
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at MySql.Data.ValueTaskExtensions.<>c__DisplayClass0_0`2.<ContinueWith>b__0(Task`1 task)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at MySql.Data.Serialization.MySqlSession.TryAsyncContinuation(Task`1 task)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable`1.ConfiguredValueTaskAwaiter.GetResult()
   at MySql.Data.Serialization.MySqlSession.<ResetConnectionAsync>d__53.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredTaskAwaitable.ConfiguredTaskAwaiter.GetResult()
   at MySql.Data.MySqlClient.ConnectionPool.<GetSessionAsync>d__1.MoveNext()

Walking through the ResetConnectionAsync method:

  1. The first check is for MySQL server version >= 5.7.3, which is required to support ResetConnectionPayload. Aurora is MySQL 5.6.10 compatible, so ResetConnectionPayload isn't being sent.

  2. Instead, ChangeUserPayload is being sent, with the password hashed from the contents of AuthPluginData. As noted in the code comments, MariaDB is OK with this but MySQL is not.

  3. As expected, the server responds with HeaderByte set to AuthenticationMethodSwitchRequestPayload.Signature.

  4. A new authentication response is sent, and the next ReceiveReplyAsync throws because the server has closed its side of the connection and the read from the stream yields 0 bytes. I'm still not sure why. It might be a question for the Aurora team if there's enough evidence this is an Aurora-only problem.

I initially thought this situation was triggered by trying to reuse the challenge from the initial handshake, but I don't believe the MySql.Data client does that and it appears to affect both MySql.Data and MySqlConnector. Is this the right way to fix this (working around a possible server bug)? I don't know, but in the end it makes my application work better and I don't see much downside to covering up a failed connection reset / authentication and returning a new session. Curious to hear from anyone else who has experienced this problem.

bnabholz added a commit to bnabholz/MySqlConnector that referenced this issue Sep 13, 2017
bgrainger added a commit that referenced this issue Sep 13, 2017
Try/catch ResetConnectionAsync (works around #208).
@bgrainger
Copy link
Member

@bnabholz Thanks for the detailed investigation and patch!

Gracefully recovering from failure to reset the connection (then opening a new connection) sounds like the right approach to me. It's obviously less-than-ideal that it takes a few network roundtrips for the reset connection to eventually fail, but there's not much we can do about that.

It does sound like an Aurora bug (from your description) but if it only happens with TLS it would be difficult to get a packet capture to "prove" it.

@bgrainger bgrainger added the bug label Oct 19, 2017
@bgrainger bgrainger changed the title EndOfStreamException: Attempted to read past the end of the stream EndOfStreamException in ResetConnectionAsync with Amazon Aurora Oct 19, 2017
@bgrainger
Copy link
Member

This was fixed in 0.26.4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

4 participants