Skip to content

Commit f327058

Browse files
committed
packets: Check connection liveness before writing query
This commit contains a potential fix to the issue reported in #657. As a summary: when a MySQL server kills a connection on the server-side (either because it is actively pruning connections, or because the connection has hit the server-side timeout), the Go MySQL client does not immediately become aware of the connection being dead. Because of the way TCP works, the client cannot know that the connection has received a RST packet from the server (i.e. the server-side has closed) until it actually reads from it. This causes an unfortunate bug wherein a MySQL idle connection is pulled from the connection pool, a query packet is written to it without error, and then the query fails with an "unexpected EOF" error when trying to read the response packet. Since the initial write to the socket does not fail with an error, it is generally not safe to return `driver.ErrBadConn` when the read fails, because in theory the write could have arrived to the server and could have been committed. Returning `ErrBadConn` could lead to duplicate inserts on the database and data corruption because of the way the Go SQL package performs retries. In order to significantly reduce the circumstances where this "unexpected EOF" error is returned for stale connections, this commit performs a liveness check before writing a new query. When do we check? ----------------- This check is not performed for all writes. Go 1.10 introduced a new `sql/driver` interface called `driver.SessionResetter`, which calls the `ResetSession` method on any connections _when they are returned to the connection pool_. Since performing the liveness check during `ResetSession` is not particularly useful (the connection can spend a long time in the pool before it's checked out again, and become stale), we simply mark the connection with a `reset` flag instead. This `reset` flag is then checked from `mysqlConn.writePacket` to perform the liveness checks. This ensures that the liveness check will only be performed for the first query on a connection that has been checked out of the connection pool. These are pretty much the semantics we want: a fresh connection from the pool is more likely to be stale, and it has not performed any previous writes that could cause data corruption. If a connection is being consistently used by the client (i.e. through an open transaction), we do NOT perform liveness checks. If MySQL Server kills such active connection, we want to bubble up the error to the user because any silent retrying can and will lead to data corruption. Since the `ResetSession` interface is only available in Go 1.10+, the liveness checks will only be performed starting with that Go version. How do we check? ---------------- To perform the actual liveness test on the connection, we use the new `syscall.Conn` interface which is available for all `net.Conn`s since Go 1.9. The `SyscallConn` method returns a `RawConn` that lets us read directly from the connection's file descriptor using syscalls, and skipping the default read pipeline of the Go runtime. When reading directly from the file descriptor using `syscall.Read`, we pass in a 1-length buffer, as passing a 0-length buffer will always result in a 0-length read, and the 1-length buffer will never be filled because we're not expecting any reads from MySQL before we have written any request packets in a fresh connection. All sockets created in the Go runtime are set to non-blocking (O_NONBLOCK). Consequently, we can detect a socket that has been closed on the server-side because the `read` syscall will return a 0-length read _and_ no error. We assume that any other errors returned from the `read` also mean the connection is in a bad state, except for `EAGAIN`/`EWOULDBLOCK`, which is the expected return for a healthy non-blocking socket in this circumstance. Because of the dependency on `syscall.Conn`, liveness checks can only be performed in Go 1.9+. This restriction however overlaps with the fact that we only mark connections as having been reset in Go 1.10+, as explained in the previous section.
1 parent 2c9d54f commit f327058

File tree

6 files changed

+124
-0
lines changed

6 files changed

+124
-0
lines changed

Diff for: AUTHORS

+1
Original file line numberDiff line numberDiff line change
@@ -88,6 +88,7 @@ Zhenye Xie <xiezhenye at gmail.com>
8888

8989
Barracuda Networks, Inc.
9090
Counting Ltd.
91+
GitHub Inc.
9192
Google Inc.
9293
InfoSum Ltd.
9394
Keybase Inc.

Diff for: conncheck.go

+53
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
// Go MySQL Driver - A MySQL-Driver for Go's database/sql package
2+
//
3+
// Copyright 2019 The Go-MySQL-Driver Authors. All rights reserved.
4+
//
5+
// This Source Code Form is subject to the terms of the Mozilla Public
6+
// License, v. 2.0. If a copy of the MPL was not distributed with this file,
7+
// You can obtain one at http://mozilla.org/MPL/2.0/.
8+
9+
// +build !windows
10+
11+
package mysql
12+
13+
import (
14+
"errors"
15+
"io"
16+
"net"
17+
"syscall"
18+
)
19+
20+
var errUnexpectedRead = errors.New("unexpected read from socket")
21+
22+
func connCheck(c net.Conn) error {
23+
var (
24+
n int
25+
err error
26+
buff [1]byte
27+
)
28+
29+
sconn, ok := c.(syscall.Conn)
30+
if !ok {
31+
return nil
32+
}
33+
rc, err := sconn.SyscallConn()
34+
if err != nil {
35+
return err
36+
}
37+
rerr := rc.Read(func(fd uintptr) bool {
38+
n, err = syscall.Read(int(fd), buff[:])
39+
return true
40+
})
41+
switch {
42+
case rerr != nil:
43+
return rerr
44+
case n == 0 && err == nil:
45+
return io.EOF
46+
case n > 0:
47+
return errUnexpectedRead
48+
case err == syscall.EAGAIN || err == syscall.EWOULDBLOCK:
49+
return nil
50+
default:
51+
return err
52+
}
53+
}

Diff for: conncheck_test.go

+38
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
// Go MySQL Driver - A MySQL-Driver for Go's database/sql package
2+
//
3+
// Copyright 2013 The Go-MySQL-Driver Authors. All rights reserved.
4+
//
5+
// This Source Code Form is subject to the terms of the Mozilla Public
6+
// License, v. 2.0. If a copy of the MPL was not distributed with this file,
7+
// You can obtain one at http://mozilla.org/MPL/2.0/.
8+
9+
// +build go1.10,!windows
10+
11+
package mysql
12+
13+
import (
14+
"testing"
15+
"time"
16+
)
17+
18+
func TestStaleConnectionChecks(t *testing.T) {
19+
runTests(t, dsn, func(dbt *DBTest) {
20+
dbt.mustExec("SET @@SESSION.wait_timeout = 2")
21+
22+
if err := dbt.db.Ping(); err != nil {
23+
dbt.Fatal(err)
24+
}
25+
26+
// wait for MySQL to close our connection
27+
time.Sleep(3 * time.Second)
28+
29+
tx, err := dbt.db.Begin()
30+
if err != nil {
31+
dbt.Fatal(err)
32+
}
33+
34+
if err := tx.Rollback(); err != nil {
35+
dbt.Fatal(err)
36+
}
37+
})
38+
}

Diff for: conncheck_windows.go

+15
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
package mysql
2+
3+
// Go MySQL Driver - A MySQL-Driver for Go's database/sql package
4+
//
5+
// Copyright 2019 The Go-MySQL-Driver Authors. All rights reserved.
6+
//
7+
// This Source Code Form is subject to the terms of the Mozilla Public
8+
// License, v. 2.0. If a copy of the MPL was not distributed with this file,
9+
// You can obtain one at http://mozilla.org/MPL/2.0/.
10+
11+
import "net"
12+
13+
func connCheck(c net.Conn) error {
14+
return nil
15+
}

Diff for: connection.go

+3
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@ import (
2222
type mysqlConn struct {
2323
buf buffer
2424
netConn net.Conn
25+
rawConn net.Conn
2526
affectedRows uint64
2627
insertId uint64
2728
cfg *Config
@@ -40,6 +41,7 @@ type mysqlConn struct {
4041
finished chan<- struct{}
4142
canceled atomicError // set non-nil if conn is canceled
4243
closed atomicBool // set when conn is closed, before closech is closed
44+
reset atomicBool // set when the Go SQL package calls ResetSession
4345
}
4446

4547
// Handles parameters set in DSN after the connection is established
@@ -639,5 +641,6 @@ func (mc *mysqlConn) ResetSession(ctx context.Context) error {
639641
if mc.closed.IsSet() {
640642
return driver.ErrBadConn
641643
}
644+
mc.reset.Set(true)
642645
return nil
643646
}

Diff for: packets.go

+14
Original file line numberDiff line numberDiff line change
@@ -96,6 +96,19 @@ func (mc *mysqlConn) writePacket(data []byte) error {
9696
return ErrPktTooLarge
9797
}
9898

99+
if mc.reset.IsSet() {
100+
mc.reset.Set(false)
101+
conn := mc.netConn
102+
if mc.rawConn != nil {
103+
conn = mc.rawConn
104+
}
105+
if err := connCheck(conn); err != nil {
106+
errLog.Print("closing bad idle connection: ", err)
107+
mc.Close()
108+
return driver.ErrBadConn
109+
}
110+
}
111+
99112
for {
100113
var size int
101114
if pktLen >= maxPacketSize {
@@ -332,6 +345,7 @@ func (mc *mysqlConn) writeHandshakeResponsePacket(authResp []byte, plugin string
332345
if err := tlsConn.Handshake(); err != nil {
333346
return err
334347
}
348+
mc.rawConn = mc.netConn
335349
mc.netConn = tlsConn
336350
mc.buf.nc = tlsConn
337351
}

0 commit comments

Comments
 (0)