Skip to content

Trailing zeros in fixed-length binary fields are dropped #400

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
oseemann opened this issue May 11, 2023 · 1 comment · Fixed by #401
Closed

Trailing zeros in fixed-length binary fields are dropped #400

oseemann opened this issue May 11, 2023 · 1 comment · Fixed by #401

Comments

@oseemann
Copy link
Contributor

Observed behaviour:
Values of columns with fixed-length binary type (e.g. binary(4)) with trailing zeros are not properly restored when decoding the binlog. E.g. the value \x01\x02\x00\x00 stored in the a binary(4) column is captured as \x01\x02.

Expected behaviour:
The row event should capture the same value as is returned in MySQL. E.g. in the above case the value should be \x01\x02\x00\x00.

Further information:
See https://dev.mysql.com/doc/refman/5.7/en/binary-varbinary.html

When BINARY values are stored, they are right-padded with the pad value to the specified length. The pad value is 0x00 (the zero byte). Values are right-padded with 0x00 for inserts, and no trailing bytes are removed for retrievals.

It appears that the 0-padding for fixed-length binary fields happens at retrieval time and not at write time. I.e. any trailing zeros are NOT written to disk (and the binlog), because they are padded at retrieval anyway.

The following test cases demonstrate the issue:

diff --git a/pymysqlreplication/tests/test_data_type.py b/pymysqlreplication/tests/test_data_type.py
index 0b3c81a..1b261ea 100644
--- a/pymysqlreplication/tests/test_data_type.py
+++ b/pymysqlreplication/tests/test_data_type.py
@@ -101,6 +101,18 @@ class TestDataType(base.PyMySQLReplicationTestCase):

         return event

+    def test_varbinary(self):
+        create_query = "CREATE TABLE test(b VARBINARY(4))"
+        insert_query = "INSERT INTO test VALUES(UNHEX('ff010000'))"
+        event = self.create_and_insert_value(create_query, insert_query)
+        self.assertEqual(event.rows[0]["values"]["b"], b'\xff\x01\x00\x00')
+
+    def test_fixed_width_binary(self):
+        create_query = "CREATE TABLE test(b BINARY(4))"
+        insert_query = "INSERT INTO test VALUES(UNHEX('ff010000'))"
+        event = self.create_and_insert_value(create_query, insert_query)
+        self.assertEqual(event.rows[0]["values"]["b"], b'\xff\x01\x00\x00')
+

Here the test_varbinary test suceeds, but the test_fixed_width_binary tests fails, with

>       self.assertEqual(event.rows[0]["values"]["b"], b'\xff\x01\x00\x00')
E       AssertionError: b'\xff\x01' != b'\xff\x01\x00\x00'

From the column_schema we know when a field is of fixed-length binary type and can do the zero-padding at retrieval time in the library as well. I'll be working on a fix.

oseemann added a commit to oseemann/python-mysql-replication that referenced this issue May 13, 2023
MySQL is zero-padding fixed-length binary fields [1], but is not storing
trailing zeros to the binlog. Consequently, when reading values of
fixed-length binary fields, the value must be padded with zeros up to
the specified length.

The `test_fixed_length_binary` test case exemplifies the issue.
`varbinary` columns are not padded and not affected.

This commit extends the `information_schema` query in
`__get_table_information` to fetch information about the length
limitation of the field and stores this information in the
`fixed_binary_length` attribute of `Column` instances.

Upon decoding of binary fields (which share the same binlog field type
value of `254` as strings), the value is zero padded at the end, when it
does not meet the specified length as retrieved from the information
schema.

Fixes julien-duponchelle#400.

[1] https://dev.mysql.com/doc/refman/5.7/en/binary-varbinary.html
julien-duponchelle pushed a commit that referenced this issue Jun 3, 2023
MySQL is zero-padding fixed-length binary fields [1], but is not storing
trailing zeros to the binlog. Consequently, when reading values of
fixed-length binary fields, the value must be padded with zeros up to
the specified length.

The `test_fixed_length_binary` test case exemplifies the issue.
`varbinary` columns are not padded and not affected.

This commit extends the `information_schema` query in
`__get_table_information` to fetch information about the length
limitation of the field and stores this information in the
`fixed_binary_length` attribute of `Column` instances.

Upon decoding of binary fields (which share the same binlog field type
value of `254` as strings), the value is zero padded at the end, when it
does not meet the specified length as retrieved from the information
schema.

Fixes #400.

[1] https://dev.mysql.com/doc/refman/5.7/en/binary-varbinary.html
julien-duponchelle pushed a commit that referenced this issue Jun 3, 2023
MySQL is zero-padding fixed-length binary fields [1], but is not storing
trailing zeros to the binlog. Consequently, when reading values of
fixed-length binary fields, the value must be padded with zeros up to
the specified length.

The `test_fixed_length_binary` test case exemplifies the issue.
`varbinary` columns are not padded and not affected.

This commit extends the `information_schema` query in
`__get_table_information` to fetch information about the length
limitation of the field and stores this information in the
`fixed_binary_length` attribute of `Column` instances.

Upon decoding of binary fields (which share the same binlog field type
value of `254` as strings), the value is zero padded at the end, when it
does not meet the specified length as retrieved from the information
schema.

Fixes #400.

[1] https://dev.mysql.com/doc/refman/5.7/en/binary-varbinary.html
@julien-duponchelle
Copy link
Owner

I published a 0.41 release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants