packets: implemented compression protocol #649

Closed
wants to merge 37 commits

Conversation


@bLamarche413 bLamarche413 commented Aug 11, 2017

Description

Implemented the MySQL compression protocol. The new feature is enabled by adding compress=1 to the DSN.
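For example, enabling it from client code looks like this (a minimal sketch; the credentials, host, and database name in the DSN are placeholders):

package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// compress=1 in the DSN turns on the compression protocol for this connection.
	db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/dbname?compress=1")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := db.Ping(); err != nil {
		log.Fatal(err)
	}
}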

Checklist

  • Code compiles correctly
  • Created tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary
  • Added myself / the copyright holder to the AUTHORS file

@methane
Member

methane commented Aug 12, 2017

Could you show benchmark results for compressed, uncompressed (new), and uncompressed (current)?

@bLamarche413
Author

bLamarche413 commented Aug 16, 2017

For reference, here is the documentation I consulted.

Benchmarks:
Here are the old benchmark stats:

BenchmarkQueryContext/1-4  	   20000	     98673 ns/op	     729 B/op	      22 allocs/op
BenchmarkQueryContext/2-4  	   20000	     89651 ns/op	     728 B/op	      22 allocs/op
BenchmarkQueryContext/3-4  	   20000	     91343 ns/op	     729 B/op	      22 allocs/op
BenchmarkQueryContext/4-4  	   20000	     88788 ns/op	     729 B/op	      22 allocs/op
BenchmarkExecContext/1-4   	   20000	     97542 ns/op	     728 B/op	      22 allocs/op
BenchmarkExecContext/2-4   	   20000	     91997 ns/op	     728 B/op	      22 allocs/op
BenchmarkExecContext/3-4   	   20000	     89143 ns/op	     728 B/op	      22 allocs/op
BenchmarkExecContext/4-4   	   10000	    104725 ns/op	     728 B/op	      22 allocs/op
BenchmarkQuery-4           	   20000	     89338 ns/op	     748 B/op	      23 allocs/op
BenchmarkExec-4            	   20000	     81922 ns/op	      67 B/op	       3 allocs/op
BenchmarkRoundtripTxt-4    	   10000	    198159 ns/op	   15938 B/op	      16 allocs/op
BenchmarkRoundtripBin-4    	   10000	    162701 ns/op	     695 B/op	      18 allocs/op
BenchmarkInterpolation-4   	 2000000	       820 ns/op	     176 B/op	       1 allocs/op
BenchmarkParseDSN-4        	  200000	      9042 ns/op	    6896 B/op	      63 allocs/op
PASS
ok  	github.com/go-sql-driver/mysql	53.699s

Here are the new benchmark stats:
BenchmarkQueryCompression does the same thing as BenchmarkQuery, but with compression turned on.

BenchmarkQueryContext/1-4     	   10000	    100547 ns/op	     732 B/op	      22 allocs/op
BenchmarkQueryContext/2-4     	   20000	     91023 ns/op	     728 B/op	      22 allocs/op
BenchmarkQueryContext/3-4     	   20000	     90626 ns/op	     729 B/op	      22 allocs/op
BenchmarkQueryContext/4-4     	   10000	    100878 ns/op	     729 B/op	      22 allocs/op
BenchmarkExecContext/1-4      	   10000	    157982 ns/op	     728 B/op	      22 allocs/op
BenchmarkExecContext/2-4      	   10000	    104514 ns/op	     728 B/op	      22 allocs/op
BenchmarkExecContext/3-4      	   10000	    101186 ns/op	     728 B/op	      22 allocs/op
BenchmarkExecContext/4-4      	   20000	     99318 ns/op	     728 B/op	      22 allocs/op
BenchmarkQuery-4              	   10000	    105718 ns/op	     752 B/op	      23 allocs/op
BenchmarkQueryCompression-4   	   10000	    104518 ns/op	     972 B/op	      27 allocs/op
BenchmarkExec-4               	   20000	     82831 ns/op	      67 B/op	       3 allocs/op
BenchmarkRoundtripTxt-4       	   10000	    247710 ns/op	   15938 B/op	      16 allocs/op
BenchmarkRoundtripBin-4       	   10000	    163471 ns/op	     695 B/op	      18 allocs/op
BenchmarkInterpolation-4      	 1000000	      1303 ns/op	     176 B/op	       1 allocs/op
BenchmarkParseDSN-4           	  100000	     13798 ns/op	    6896 B/op	      63 allocs/op
PASS
ok  	github.com/go-sql-driver/mysql	46.113s

Thank you for your comments on my pull request!

connection.go Outdated
	parseTime bool
	strict    bool
	reader    packetReader
	writer    io.Writer
Member

Do we really need to carry around both reader and writer as well as buf and netConn?
Can't those be unified into one interface?

This interface might also have some sort of sequence reset function. Then sequence and compressionSequence could be internal states of the respective struct implementing the interface.
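Not from the patch, just to illustrate the shape of what is being suggested; names like packetConn, readPacket, writePacket, and resetSequence are hypothetical:

// packetConn is a hypothetical interface unifying packet reads and writes.
// A concrete implementation would wrap netConn/buf and own the sequence counters.
type packetConn interface {
	readPacket() ([]byte, error) // read the next protocol packet
	writePacket(p []byte) error  // write one protocol packet
	resetSequence()              // reset sequence (and compressionSequence) at a command boundary
}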

Author

@bLamarche413 bLamarche413 Aug 17, 2017

My understanding of your comment is that you suggest merging reader with buf, and writer with netConn. Because both buf and netConn are used in various capacities outside of simple reading and writing (e.g. setting deadlines or getting the complete buffer), it's not easy to subsume buf and netConn into reader and writer respectively.

I wrote things this way to avoid having the addition of compression significantly affect buffer.go. While this refactoring sounds like a great idea, I think it may be out of scope for this ticket and would be better undertaken separately, with some design/brainstorming done beforehand. That would also avoid obscuring the scope of this pull request, which is only the addition of the compression protocol.

Regarding sequence numbers: because sequence and compressionSequence need to be reset on the basis of higher-level actions (e.g. issuing a new MySQL command), a writer-level abstraction should not know about connection-level MySQL logic.

compress.go Outdated
}

func (cw *compressedWriter) writeComprPacketToNetwork(data []byte, uncomprLength int) error {
data = append(cw.header, data...)
Member

This won't avoid allocation.

Member

To avoid the allocation, you can require that 7 bytes of space are reserved in the payload by the caller.
The caller can reserve it like this:

cHeader := make([]byte, 7) // reserved space for the 7-byte compressed packet header
var cBuff = new(bytes.Buffer)
zw := zlib.NewWriter(cBuff)
...
cBuff.Reset()
cBuff.Write(cHeader) // placeholder header goes in first; compressed payload follows in the same buffer
zw.Reset(cBuff)
...


maxPayloadLength := maxPacketSize - 4

for length >= maxPayloadLength {
Member

Why are both length := len(data) - 4 and maxPayloadLength := maxPacketSize - 4 required?
How about this?

maxPayload := maxPacketSize - 4

for len(data) > maxPayload {
...
    data = data[maxPayload:]
}

compress.go Outdated
lenSmall := len(dataSmall)

var b bytes.Buffer
writer := zlib.NewWriter(&b)
Member

bytes.Buffer and zlib's writer are resettable.
At the very least, move them outside the loop.

Author

I can cache those on the compressedWriter and reset both. This is a more significant gain than simply declaring them before the loop, since in the majority of use cases the loop is never actually entered.
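Roughly what I have in mind (a sketch only; the field and helper names here are mine, not the final patch):

package mysql

import (
	"bytes"
	"compress/zlib"
	"io"
)

type compressedWriter struct {
	connWriter io.Writer
	buf        bytes.Buffer // cached compression buffer, reused across writes
	zw         *zlib.Writer // cached zlib writer, reset onto buf before each use
}

func newCompressedWriter(connWriter io.Writer) *compressedWriter {
	cw := &compressedWriter{connWriter: connWriter}
	cw.zw = zlib.NewWriter(&cw.buf)
	return cw
}

// compress deflates payload into the cached buffer and returns the result.
// The returned slice is only valid until the next call.
func (cw *compressedWriter) compress(payload []byte) ([]byte, error) {
	cw.buf.Reset()
	cw.zw.Reset(&cw.buf)
	if _, err := cw.zw.Write(payload); err != nil {
		return nil, err
	}
	if err := cw.zw.Close(); err != nil {
		return nil, err
	}
	return cw.buf.Bytes(), nil
}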

compress.go Outdated
return totalBytes, nil
}

func (cw *compressedWriter) writeComprPacketToNetwork(data []byte, uncomprLength int) error {
Member

This name seems a bit long. writeCompressedPacket() may be a better name.
payload is also better than data.

Author

I'll change data to payload. As for compressedWriter's writeComprPacketToNetwork, I was thinking of renaming it to writeToNetwork for readability and brevity. Giving compressedWriter both a Write and a WriteCompressedPacket method would not clarify the difference between the two functions, namely that writeComprPacketToNetwork does the actual network write, whereas Write is the top-level function that performs all the needed steps.

@methane
Member

methane commented Aug 22, 2017

FYI, the current benchmarks use data that is too small to show zlib's performance.

@julienschmidt
Member

ping

This pull request needs some changes (and a rebase to resolve the conflicts)

@bLamarche413
Author

I've made all the changes suggested above except for changing the benchmark tests, and I wanted to ask how you would recommend I do this. Generally, the benefit of compression only becomes apparent when several rows of data are received at once. However, I'm not sure how to arrange that with go-sql-driver, as rows appear to be fetched one at a time through a cursor. Would you suggest a test where only one row is read, but that row contains a very large string?
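For reference, this is the kind of benchmark I was picturing: a single row carrying a large, repetitive string so zlib has something to compress. The DSN, the table-free SELECT, and the payload size are placeholders, not a final proposal:

package mysql

import (
	"database/sql"
	"strings"
	"testing"
)

// benchmarkLargeRow selects a single row containing a large repetitive
// string, which compresses well, over a connection described by dsn.
func benchmarkLargeRow(b *testing.B, dsn string) {
	db, err := sql.Open("mysql", dsn)
	if err != nil {
		b.Fatal(err)
	}
	defer db.Close()

	payload := strings.Repeat("compressme", 10000) // ~100 KB of repetitive data
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		var got string
		if err := db.QueryRow("SELECT ?", payload).Scan(&got); err != nil {
			b.Fatal(err)
		}
	}
}

func BenchmarkLargeRowCompressed(b *testing.B) {
	benchmarkLargeRow(b, "root:pw@tcp(127.0.0.1:3306)/test?compress=1") // placeholder DSN
}

func BenchmarkLargeRowUncompressed(b *testing.B) {
	benchmarkLargeRow(b, "root:pw@tcp(127.0.0.1:3306)/test") // placeholder DSN
}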

Member

@julienschmidt julienschmidt left a comment

The general concept seems fine to me and with some minor changes it could be merged.

But I still think that with smarter abstraction the implementation could integrate a bit better with the existing code.

Anyway, thank you for working on this long-missing feature.

connection.go Outdated
	sequence            uint8
	compressionSequence uint8
	parseTime           bool
	reader              packetReader
Member

Please move all interfaces (reader and writer) to the top (after netConn).

dsn.go Outdated
@@ -57,6 +57,7 @@ type Config struct {
MultiStatements bool // Allow multiple statements in one query
ParseTime bool // Parse time values to time.Time
RejectReadOnly bool // Reject read-only connections
Compression bool // Compress packets
Member

The struct fields should have the same name as the DSN param
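For illustration, assuming the DSN parameter keeps the name compress, the renamed field might look something like this (a sketch, not the actual diff):

type Config struct {
	// ...
	Compress bool // Compress packets; corresponds to the "compress" DSN parameter
}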

Author

done.

compress.go Outdated
)


type packetReader interface {
Member

This is also used when no compression is in use. It is probably better to move this interface to connection.go.

	return &compressedWriter{
		connWriter: connWriter,
		mc:         mc,
		zw:         zlib.NewWriter(new(bytes.Buffer)),
Member

@julienschmidt julienschmidt Oct 12, 2017

Does it really need to introduce two extra buffers?
Would it make sense to combine the reader and writer (e.g. as a virtual buffer) and share some data, like the mc and the buffer?

Author

It has been a while since I've looked at my own code, so I may be wrong about this, but between compressedReader and compressedWriter, it appears the only thing they share is a reference to mc. The reader uses a slice, but that slice needs to act as memory between reads, so I don't think it would make sense to share it. Please let me know what you meant by this comment!

Member

The reader uses a slice, but that slice needs to act as memory between reads so I don't think it would make sense to share it

That answers my question. Currently we only have one buffer for each connection, which is shared between reads and writes.

compress.go Outdated

defer cr.zr.Close()

//use existing capacity in bytesBuf if possible
Member

space missing (// use instead of //use)

@julienschmidt
Member

Please also run go fmt before pushing code. The CI builds are currently failing because it finds some improperly formatted code.

@julienschmidt
Member

@methane:
BenchmarkSmall-4: 86824 ns/op vs 99420 ns/op
That's roughly +15% ((99420 - 86824) / 86824 ≈ 14.5%). I wouldn't call that negligible.

@methane methane mentioned this pull request Apr 10, 2018
@dolmen
Contributor

dolmen commented Jun 15, 2018

Please rebase your own master branch on top of origin/master instead of merging origin/master into yours:

git remote update
git rebase origin/master
git push -f

@methane
Member

methane commented Jun 15, 2018

No need to rebase, since we use "Squash and merge".
Merging master helps with reviewing the changes made since the last review.

@methane
Member

methane commented Jul 30, 2018

I ran the benchmark again on a more stable Linux machine. I can't see a significant difference:

benchmark            old ns/op     new ns/op     delta
BenchmarkSmall-6     56154         54070         -3.71%
BenchmarkPlain-6     2502622       2464086       -1.54%

@julienschmidt julienschmidt modified the milestones: v1.5.0, v1.6.0 Apr 24, 2019
@iambudi

iambudi commented Apr 13, 2020

How is it going now?

@mhemmings

Any updates on this? What's left to do, and is there anything I can help with?

@julienschmidt julienschmidt modified the milestones: v1.6.0, v1.7.0 Jun 1, 2020
@ghost

ghost commented Aug 3, 2020

Any updates on this PR?

@deltacat

Thanks for all your great efforts.
I just want to know: is anyone still working on this important feature?

@kolkov

kolkov commented Jan 2, 2021

Hi! Any updates on this PR?

@cassya-remitly

I'm also interested in this PR 😄

@bLamarche413
Author

Hello everyone -- I will likely not be picking this up again. Things have gotten hectic in my life and I can't give this the proper attention. However, anyone is welcome to pick it up where I left off!

@methane
Member

methane commented Mar 11, 2024

continued in #1487
