Skip to content

MySQL 8.3 will introduce new GTID format #845

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
2 of 3 tasks
lance6716 opened this issue Jan 18, 2024 · 8 comments · Fixed by #990
Open
2 of 3 tasks

MySQL 8.3 will introduce new GTID format #845

lance6716 opened this issue Jan 18, 2024 · 8 comments · Fixed by #990

Comments

@lance6716
Copy link
Collaborator

lance6716 commented Jan 18, 2024

The new GTID format is UUID:<TAG>:NUMBER, in which <TAG> is an arbitrary string up to 8 characters long

https://dev.mysql.com/doc/relnotes/mysql/8.3/en/news-8-3-0.html

thanks for @dveeden to remind us

@dveeden
Copy link
Collaborator

dveeden commented Nov 13, 2024

An example event from mysqlbinlog --hexdump

# at 220825
#241112 21:51:23 server id 1  end_log_pos 220908 CRC32 0x71013dac 
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags 
# 00035e99 cb bf 33 67   2a   01 00 00 00   53 00 00 00   ec 5e 03 00   00 00
# 00035eac 02 78 00 00 00 02 aa aa  aa aa 88 88 66 66 44 44 |.x..........ffDD|
# 00035ebc 22 22 22 22 22 22 04 73  20 1b 06 14 73 65 63 6f |.......s....seco|
# 00035ecc 6e 64 74 65 73 74 08 c1  0e 0a d1 0e 0c 7f 23 35 |ndtest.........5|
# 00035edc 48 61 bd 26 06 10 59 04  12 a3 ff 0a ac 3d 01 71 |Ha....Y........q|
# 	GTID	last_committed=472	sequence_number=474	rbr_only=yes	original_committed_timestamp=1731444683060515	immediate_commit_timestamp=1731444683060515	transaction_length=278
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
# original_commit_timestamp=1731444683060515 (2024-11-12 21:51:23.060515 CET)
# immediate_commit_timestamp=1731444683060515 (2024-11-12 21:51:23.060515 CET)
/*!80001 SET @@session.original_commit_timestamp=1731444683060515*//*!*/;
/*!80014 SET @@session.original_server_version=90100*//*!*/;
/*!80014 SET @@session.immediate_server_version=90100*//*!*/;
SET @@SESSION.GTID_NEXT= '55555555-4444-3333-2222-111111111111:secondtest:111111'/*!*/;

The event is serialized with the new mysql::serialization library. The docs for that can be found here: https://dev.mysql.com/doc/dev/mysql-server/latest/PageLibsMysqlSerialization.html

The gtid_flags here are the 00 00 00 that follow 02 78. Other events set this to 00 00 02 which means rbr_only=no.

The UUID of the GTID is encoded as aa aa aa aa 88 88 66 66 44 44 22 22 22 22 22 22.

A >> 1 seems to decode this for this specific value but fails for other values.

>>> print(f'{0xaaaaaaaa888866664444222222222222 >> 1:0x}');
55555555444433332222111111111111

The checksum here is ac 3d 01 71 which matches the decoded CRC32 0x71013dac.

According to libs/mysql/binlog/event/control_events.h the fields are like this:

  • gtid_flags (00 00 00)
  • UUID (aa aa aa aa 88 88 66 66 44 44 22 22 22 22 22 22
  • GNO (must be part of 04 73 20 1b 06 14)
  • TAG (73 65 63 6f 6e 64 74 65 73 74)
  • last_committed
  • sequence_number
  • immediate_commit_timestamp
  • original_commit_timestamp
  • transaction_length
  • immediate_server_version
  • original_server_version
  • commit_group_ticket

@dveeden
Copy link
Collaborator

dveeden commented Nov 21, 2024

The PREVIOUS_GTIDS_LOG_EVENT seems to change in MySQL 9.1 related to the tagged format.

Examples:

MySQL 9.1

# at 127
#241120 13:05:13 server id 1  end_log_pos 199 CRC32 0xcac3bcd2 
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags 
# 0000007f 79 d0 3d 67   23   01 00 00 00   48 00 00 00   c7 00 00 00   80 00
# 00000092 01 01 00 00 00 00 00 01  89 6e 78 82 18 fe 11 ef |.........nx.....|
# 000000a2 ab 88 22 22 2d 34 d4 11  00 01 00 00 00 00 00 00 |.....4..........|
# 000000b2 00 01 00 00 00 00 00 00  00 04 00 00 00 00 00 00 |................|
# 000000c2 00 d2 bc c3 ca                                   |.....|
# 	Previous-GTIDs
# 896e7882-18fe-11ef-ab88-22222d34d411:1-3

Note the 01 01 00 00 00 00 00 01

# at 127
#241120 10:36:02 server id 1  end_log_pos 158 CRC32 0xa7f45443 
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags 
# 0000007f 82 ad 3d 67   23   01 00 00 00   1f 00 00 00   9e 00 00 00   80 00
# 00000092 01 00 00 00 00 00 00 01  43 54 f4 a7             |........CT..|
# 	Previous-GTIDs
# [empty]

Note the 01 00 00 00 00 00 00 01

MySQL 8.0

# at 126
#241120 21:07:16 server id 1  end_log_pos 197 CRC32 0x7441e971 
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags 
# 0000007e 74 41 3e 67   23   01 00 00 00   47 00 00 00   c5 00 00 00   80 00
# 00000091 01 00 00 00 00 00 00 00  9d 44 42 ab a7 7a 11 ef |.........DB..z..|
# 000000a1 b2 08 d2 88 cc 2a 5b 7d  01 00 00 00 00 00 00 00 |................|
# 000000b1 01 00 00 00 00 00 00 00  02 00 00 00 00 00 00 00 |................|
# 000000c1 71 e9 41 74                                      |q.At|
# 	Previous-GTIDs
# 9d4442ab-a77a-11ef-b208-d288cc2a5b7d:1

Note the 01 00 00 00 00 00 00 00

# at 126
#241120 21:05:05 server id 1  end_log_pos 157 CRC32 0x27158e8a 
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags 
# 0000007e f1 40 3e 67   23   01 00 00 00   1f 00 00 00   9d 00 00 00   80 00
# 00000091 00 00 00 00 00 00 00 00  8a 8e 15 27             |............|
# 	Previous-GTIDs
# [empty]

Note the 00 00 00 00 00 00 00 00

@dveeden
Copy link
Collaborator

dveeden commented Nov 21, 2024

A first attempt in decoding this. Some info on this is available here: https://github.com/mysql/mysql-server/blob/61a3a1d8ef15512396b4c2af46e922a19bf2b174/sql/rpl_gtid_set.cc#L1364

package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	cases := []struct {
		b []byte
	}{
		{
			[]byte{1, 2, 0, 0, 0, 0, 0, 1}, // tagged, sidnr=2
		},
		{
			[]byte{1, 1, 0, 0, 0, 0, 0, 1}, // tagged, sidnr=1
		},
		{
			[]byte{1, 0, 0, 0, 0, 0, 0, 1}, // tagged, sidnr=0
		},
		{
			[]byte{0, 0, 0, 0, 0, 0, 0, 0}, // untagged, sidnr=0
		},
		{
			[]byte{1, 0, 0, 0, 0, 0, 0, 0}, // untagged, sidnr=1
		},
	}

	for _, c := range cases {
		fmt.Printf("\nraw bytes= %08b\n", c.b)
		if c.b[7] == 1 {
			fmt.Println("tagged gtid format")

			sid_mask := []byte{0, 255, 255, 255, 255, 255, 255, 0}

			// Apply the mask
			for i, _ := range c.b {
				c.b[i] &= sid_mask[i]
			}
			c.b = append(c.b, 0)

			// sidnr, encoded
			fmt.Printf("sidnr    = %08b\n", c.b[1:])

			// sidnr
			n := binary.LittleEndian.Uint64(c.b[1:])
			fmt.Printf("sidnr    = %d\n", n)
		} else {
			fmt.Println("classic gtid format")
			n := binary.LittleEndian.Uint64(c.b)
			fmt.Printf("sidnr    = %d\n", n)
		}
	}
}

dveeden added a commit to dveeden/go-mysql that referenced this issue Nov 21, 2024
Issue: closes go-mysql-org#845

The `PreviousGTIDsEvent` / `PREVIOUS_GTIDS_LOG_EVENT` has changed to
work with tagged GTIDs.

First the `uuidCount` has changed, it encodes the GTID format. Here
format 1 is tagged and format 0 is untagged.

Then each entry may have a tag. If there is a tag then the uuid itself
isn't printed but the tag is appended to the last entry.

Examples:
- `896e7882-18fe-11ef-ab88-22222d34d411:1-3`
  regular format, compatible with both formats
- `896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1`
  tagged format. Combination of
  `896e7882-18fe-11ef-ab88-22222d34d411:1-4` and
  `896e7882-18fe-11ef-ab88-22222d34d411:aaaa:1`
- `896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1:abc:1-3:bbbbb:1:bbbbbb:1:x:1,896e7882-18fe-11ef-ab88-22222d34d412:1-2`
  Combination of:
  ```
  896e7882-18fe-11ef-ab88-22222d34d411:1-4
                                      :aaaa:1
                                      🔤1-3
                                      :bbbbb:1
                                      :bbbbbb:1
                                      ❌1,
  896e7882-18fe-11ef-ab88-22222d34d412:1-2
  ```

Please also see: `mysqlbinlog --read-from-remote-server --hexdump $binlogfile` to see how MySQL encodes/decodes this.

See also:
- https://dev.mysql.com/doc/refman/8.4/en/replication-gtids-concepts.html
dveeden added a commit to dveeden/go-mysql that referenced this issue Nov 21, 2024
Issue: closes go-mysql-org#845

The `PreviousGTIDsEvent` / `PREVIOUS_GTIDS_LOG_EVENT` has changed to
work with tagged GTIDs.

First the `uuidCount` has changed, it encodes the GTID format. Here
format 1 is tagged and format 0 is untagged.

Then each entry may have a tag. If there is a tag then the uuid itself
isn't printed but the tag is appended to the last entry.

Examples:

`896e7882-18fe-11ef-ab88-22222d34d411:1-3`

regular format, compatible with both formats

`896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1`

tagged format.

Combination of

- `896e7882-18fe-11ef-ab88-22222d34d411:1-4`
- `896e7882-18fe-11ef-ab88-22222d34d411:aaaa:1`

`896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1:abc:1-3:bbbbb:1:bbbbbb:1:x:1,896e7882-18fe-11ef-ab88-22222d34d412:1-2`

Combination of:
```
896e7882-18fe-11ef-ab88-22222d34d411:1-4
                                    :aaaa:1
                                    🔤1-3
                                    :bbbbb:1
                                    :bbbbbb:1
                                    ❌1,
896e7882-18fe-11ef-ab88-22222d34d412:1-2
```

Please also see: `mysqlbinlog --read-from-remote-server --hexdump $binlogfile` to see how MySQL encodes/decodes this.

See also:
- https://dev.mysql.com/doc/refman/8.4/en/replication-gtids-concepts.html
dveeden added a commit to dveeden/go-mysql that referenced this issue Nov 21, 2024
Issue: ref go-mysql-org#845

The `PreviousGTIDsEvent` / `PREVIOUS_GTIDS_LOG_EVENT` has changed to
work with tagged GTIDs.

First the `uuidCount` has changed, it encodes the GTID format. Here
format 1 is tagged and format 0 is untagged.

Then each entry may have a tag. If there is a tag then the uuid itself
isn't printed but the tag is appended to the last entry.

Examples:

`896e7882-18fe-11ef-ab88-22222d34d411:1-3`

regular format, compatible with both formats

`896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1`

tagged format.

Combination of

- `896e7882-18fe-11ef-ab88-22222d34d411:1-4`
- `896e7882-18fe-11ef-ab88-22222d34d411:aaaa:1`

`896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1:abc:1-3:bbbbb:1:bbbbbb:1:x:1,896e7882-18fe-11ef-ab88-22222d34d412:1-2`

Combination of:
```
896e7882-18fe-11ef-ab88-22222d34d411:1-4
                                    :aaaa:1
                                    🔤1-3
                                    :bbbbb:1
                                    :bbbbbb:1
                                    ❌1,
896e7882-18fe-11ef-ab88-22222d34d412:1-2
```

Please also see: `mysqlbinlog --read-from-remote-server --hexdump $binlogfile` to see how MySQL encodes/decodes this.

See also:
- https://dev.mysql.com/doc/refman/8.4/en/replication-gtids-concepts.html
@dveeden
Copy link
Collaborator

dveeden commented Nov 21, 2024

Related: https://bugs.mysql.com/bug.php?id=116747

lance6716 added a commit that referenced this issue Dec 17, 2024
* replication: Support GTID tag in PreviousGTIDsEvent

Issue: ref #845

The `PreviousGTIDsEvent` / `PREVIOUS_GTIDS_LOG_EVENT` has changed to
work with tagged GTIDs.

First the `uuidCount` has changed, it encodes the GTID format. Here
format 1 is tagged and format 0 is untagged.

Then each entry may have a tag. If there is a tag then the uuid itself
isn't printed but the tag is appended to the last entry.

Examples:

`896e7882-18fe-11ef-ab88-22222d34d411:1-3`

regular format, compatible with both formats

`896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1`

tagged format.

Combination of

- `896e7882-18fe-11ef-ab88-22222d34d411:1-4`
- `896e7882-18fe-11ef-ab88-22222d34d411:aaaa:1`

`896e7882-18fe-11ef-ab88-22222d34d411:1-4:aaaa:1:abc:1-3:bbbbb:1:bbbbbb:1:x:1,896e7882-18fe-11ef-ab88-22222d34d412:1-2`

Combination of:
```
896e7882-18fe-11ef-ab88-22222d34d411:1-4
                                    :aaaa:1
                                    🔤1-3
                                    :bbbbb:1
                                    :bbbbbb:1
                                    ❌1,
896e7882-18fe-11ef-ab88-22222d34d412:1-2
```

Please also see: `mysqlbinlog --read-from-remote-server --hexdump $binlogfile` to see how MySQL encodes/decodes this.

See also:
- https://dev.mysql.com/doc/refman/8.4/en/replication-gtids-concepts.html

* Add test

* Add more tests

* Update replication/event.go

Co-authored-by: lance6716 <[email protected]>

* Update based on review

* add my suggestions

Signed-off-by: lance6716 <[email protected]>

* add test for long tag

Signed-off-by: lance6716 <[email protected]>

---------

Signed-off-by: lance6716 <[email protected]>
Co-authored-by: lance6716 <[email protected]>
@dveeden
Copy link
Collaborator

dveeden commented Feb 11, 2025

One step of progress:

  • I can now encode/decode UUIDs in the same way
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// Example Event:
// # at 158
// #250211 17:53:57 server id 1  end_log_pos 240 CRC32 0x4031f7db
// # Position  Timestamp   Type   Source ID        Size      Source Pos    Flags
// # 0000009e a5 80 ab 67   2a   01 00 00 00   52 00 00 00   f0 00 00 00   00 00
// # 000000b1 02 76 00 00 02 02 25 02  dc f0 09 02 30 f9 03 22 |.v..........0...|
// # 000000c1 bd 03 ad 02 21 02 44 44  5a 68 51 03 22 04 04 06 |......DDZhQ.....|
// # 000000d1 0c 61 61 62 62 63 63 08  00 0a 04 0c 7f be e9 c3 |.aabbcc.........|
// # 000000e1 ab e0 2d 06 10 39 03 12  c3 02 0b db f7 31 40    |.....9.......1.|
// # 	GTID	last_committed=0	sequence_number=1	rbr_only=no	original_committed_timestamp=1739292837931454	immediate_commit_timestamp=1739292837931454	transaction_length=206
// # original_commit_timestamp=1739292837931454 (2025-02-11 17:53:57.931454 CET)
// # immediate_commit_timestamp=1739292837931454 (2025-02-11 17:53:57.931454 CET)
// /*!80001 SET @@session.original_commit_timestamp=1739292837931454*//*!*/;
// /*!80014 SET @@session.original_server_version=90200*//*!*/;
// /*!80014 SET @@session.immediate_server_version=90200*//*!*/;
// SET @@SESSION.GTID_NEXT= '896e7882-18fe-11ef-ab88-22222d34d411:aabbcc:1'/*!*/;

// In this event:
// 25 02 dc f0 09 02 30 f9 03 22 bd 03 ad 02 21 02 44 44 5a 68 51 03 22
// is the serialized version of this
// 896e7882-18fe-11ef-ab88-22222d34d411

func main() {
	// UUID encoded as part of a MySQL GTID with a tag
	// As used in Binlog Event Type: 0x23
	uuid := "896e7882-18fe-11ef-ab88-22222d34d411"

	uuid = strings.ReplaceAll(uuid, "-", "")
	fmt.Println(uuid)

	v2, _ := hex.DecodeString(uuid)
	fmt.Printf("\n%s\t%s\n", "Decoded", "Encoded")
	for _, c := range v2 {
		fmt.Printf("%x", c)
		if c < 0x80 {
			fmt.Printf("\t%02x\n", c*2)
		} else if c < 0xc0 {
			v := c - 0x80
			fmt.Printf("\t%02x 02\n", (v<<2)+1)
		} else if c <= 0xff {
			v := c - 0xc0
			fmt.Printf("\t%02x 03\n", (v<<2)+1)
		} else {
			fmt.Printf("\n")
		}
	}
}

Output:

896e788218fe11efab8822222d34d411

Decoded Encoded
89      25 02
6e      dc
78      f0
82      09 02
18      30
fe      f9 03
11      22
ef      bd 03
ab      ad 02
88      21 02
22      44
22      44
2d      5a
34      68
d4      51 03
11      22

(See also https://dev.mysql.com/doc/dev/mysql-server/9.2.0/PageLibsMysqlSerialization.html )

This is the event body: (excluding header and checksum) (so 59 bytes)

// # 000000b1 02 76 00 00 02 02 25 02  dc f0 09 02 30 f9 03 22 |.v..........0...|
// # 000000c1 bd 03 ad 02 21 02 44 44  5a 68 51 03 22 04 04 06 |......DDZhQ.....|
// # 000000d1 0c 61 61 62 62 63 63 08  00 0a 04 0c 7f be e9 c3 |.aabbcc.........|
// # 000000e1 ab e0 2d 06 10 39 03 12  c3 02 0b

The event body has:

  • 02: decoded is 01, which is the serialization_version_number
  • 76: decoded (0x76/0x2=3B) is 59, which is the length (excluding header and checksum)
  • 00: last_non_ignorable_field_id
  • 00 02 02: currently unknown
  • 25 02 dc f0 09 02 30 f9 03 22 bd 03 ad 02 21 02 44 44 5a 68 51 03 22: Encoded version of 896e7882-18fe-11ef-ab88-22222d34d411
  • 04 04 06: currently unknown
  • 0c: Tag Length
  • 61 61 62 62 63 63: Tag
  • The rest is unknown, but should include the sequence number, maybe some timestamps and maybe versions.

@dveeden
Copy link
Collaborator

dveeden commented Feb 11, 2025

The fields are defined here:
https://github.com/mysql/mysql-server/blob/6b6d3ed3d5c6591b446276184642d7d0504ecc86/libs/mysql/binlog/event/control_events.h#L1107-L1178

With this I think I now have the format:

Event:

# at 158
#250211 11:39:29 server id 1  end_log_pos 250 CRC32 0x4b2c9430
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags
# 0000009e e1 28 ab 67   2a   01 00 00 00   5c 00 00 00   fa 00 00 00   00 00
# 000000b1 02 8a 00 00 02 02 ee 81  02 c1 02 01 03 41 03 81 |.............A..|
# 000000c1 03 c1 03 c5 03 22 22 ee  fd 03 ee fd 03 ee fd 03 |................|
# 000000d1 04 d9 03 06 18 61 61 61  61 62 62 62 62 63 63 63 |.....aaaabbbbccc|
# 000000e1 63 08 00 0a 04 0c 7f 39  7d 89 70 db 2d 06 10 5d |c......9..p.....|
# 000000f1 03 12 c3 02 0b 30 94 2c  4b                      |.....0..K|
# 	GTID	last_committed=0	sequence_number=1	rbr_only=no	original_committed_timestamp=1739270369410361	immediate_commit_timestamp=1739270369410361	transaction_length=215
# original_commit_timestamp=1739270369410361 (2025-02-11 11:39:29.410361 CET)
# immediate_commit_timestamp=1739270369410361 (2025-02-11 11:39:29.410361 CET)
/*!80001 SET @@session.original_commit_timestamp=1739270369410361*//*!*/;
/*!80014 SET @@session.original_server_version=90200*//*!*/;
/*!80014 SET @@session.immediate_server_version=90200*//*!*/;
SET @@SESSION.GTID_NEXT= '77a0b0c0-d0e0-f0f1-1111-77ff77ff77ff:aaaabbbbcccc:123'/*!*/;

Body:

# 000000b1 02 8a 00 00 02 02 ee 81  02 c1 02 01 03 41 03 81 |.............A..|
# 000000c1 03 c1 03 c5 03 22 22 ee  fd 03 ee fd 03 ee fd 03 |................|
# 000000d1 04 d9 03 06 18 61 61 61  61 62 62 62 62 63 63 63 |.....aaaabbbbccc|
# 000000e1 63 08 00 0a 04 0c 7f 39  7d 89 70 db 2d 06 10 5d |c......9..p.....|
# 000000f1 03 12 c3 02 0b

Decoded format:

02: version 1
8a: length
00: last ignorable field
Field
00: gtid_flags      02 (rbr_only=no)
02: uuid            ee 81  02 c1 02 01 03 41 03 81 03 c1 03 c5 03 22 22 ee fd 03 ee fd 03 ee fd 03
04: gno             d9 03 // 123
06: tag             18 61 61 61 61 62 62 62 62 63 63 63
08: last commited   00
0a: sequencenr      04
0c: im commit ts    7f 39  7d 89 70 db 2d 06
0e: orig commit ts  <missing>
10: tx len          5d 03    // 215
12: im srv ver      c3 02 0b // 90200
14: orig srv ver    <missing>
16: group ticket    <missing>

@dveeden
Copy link
Collaborator

dveeden commented Feb 11, 2025

For the number encoding:

Encoded Decoded Method Raw Shifted Marker
d9 03 123 0x03d9>>3 00000011 11011001 00000000 01111011 001
04 1 0x04>>2 00000100 00000001 00
7f 39 7d 89 70 db 2d 06 1739270369410361 0x062ddb70897d397f>>8
5d 03 215 0x035d>>2 00000011 01011101 00000000 11010111 01
c3 02 0b 90200 0x0b02c3>>3 00001011 00000010 11000011 00000001 01100000 01011000 011

@dveeden
Copy link
Collaborator

dveeden commented Mar 5, 2025

mysql.ParseMysqlGTIDSet() doesn't yet support this right?

@dveeden dveeden reopened this Mar 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants