Untrusted binlog row snapshot data >=v1.0 #510

Closed
jongwony opened this issue Oct 6, 2023 · 9 comments · Fixed by #516

jongwony commented Oct 6, 2023

Binlog events are triggered correctly, but not all of the snapshot data is received.

At the same log position, versions 0.43 and 1.0.1 produce different snapshots: in 1.0.1 the values contain only a null key.

In 0.43:

{"before_values": {"id": 2877297, "user_id": 669862, "send_type": "push", "category": "marketing", "is_approved": 0, "create_time": "2021-12-09T10:20:24", "update_time": "2023-08-24T17:16:53"}, "after_values": {"id": 2877297, "user_id": 669862, "send_type": "push", "category": "marketing", "is_approved": 0, "create_time": "2021-12-09T10:20:24", "update_time": "2023-10-06T14:36:24"}}

In 1.0.1:

{"before_values": {"null": "2023-08-24T17:16:53"}, "after_values": {"null": "2023-10-06T14:36:24"}}
mysql> show variables like '%binlog%';
+------------------------------------------------+----------------------+
| Variable_name                                  | Value                |
+------------------------------------------------+----------------------+
| aurora_binlog_reserved_event_bytes             | 1024                 |
| binlog_cache_size                              | 32768                |
| binlog_checksum                                | CRC32                |
| binlog_direct_non_transactional_updates        | OFF                  |
| binlog_encryption                              | OFF                  |
| binlog_error_action                            | ABORT_SERVER         |
| binlog_expire_logs_seconds                     | 0                    |
| binlog_format                                  | ROW                  |
| binlog_group_commit_sync_delay                 | 0                    |
| binlog_group_commit_sync_no_delay_count        | 0                    |
| binlog_gtid_simple_recovery                    | ON                   |
| binlog_max_flush_queue_time                    | 0                    |
| binlog_order_commits                           | ON                   |
| binlog_rotate_encryption_master_key_at_startup | OFF                  |
| binlog_row_event_max_size                      | 8192                 |
| binlog_row_image                               | FULL                 |
| binlog_row_metadata                            | MINIMAL              |
| binlog_row_value_options                       |                      |
| binlog_rows_query_log_events                   | OFF                  |
| binlog_stmt_cache_size                         | 32768                |
| binlog_transaction_compression                 | OFF                  |
| binlog_transaction_compression_level_zstd      | 3                    |
| binlog_transaction_dependency_history_size     | 25000                |
| binlog_transaction_dependency_tracking         | COMMIT_ORDER         |
| innodb_api_enable_binlog                       | OFF                  |
| log_statements_unsafe_for_binlog               | ON                   |
| max_binlog_cache_size                          | 18446744073709547520 |
| max_binlog_size                                | 134217728            |
| max_binlog_stmt_cache_size                     | 18446744073709547520 |
| sync_binlog                                    | 1                    |
+------------------------------------------------+----------------------+
30 rows in set (0.02 sec)

I think that starting from 1.0, binlog_row_metadata needs to be changed from MINIMAL to FULL, but backward compatibility with MINIMAL still seems necessary.

We have not yet tested whether this also happens with binlog_row_metadata=FULL; the only change in our deployment was upgrading the mysql-replication package.
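
For reference, here is a minimal sketch of how the snapshots above were dumped (the exact script isn't included here, so the connection settings and the event filter are assumptions):

from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import UpdateRowsEvent
import json

# Assumed connection settings; replace with your own endpoint.
mysql_settings = {"host": "127.0.0.1", "port": 3306, "user": "root", "passwd": ""}

stream = BinLogStreamReader(
    connection_settings=mysql_settings,
    server_id=100,
    only_events=[UpdateRowsEvent],
)

for binlogevent in stream:
    for row in binlogevent.rows:
        # On >=1.0 with binlog_row_metadata=MINIMAL, the column keys in
        # before_values / after_values come out as None ("null" after json.dumps).
        print(json.dumps(row, default=str))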

sean-k1 (Collaborator) commented Oct 6, 2023

@jongwony

Can you try versions 1.0.2 and 0.45.1?

1.0.1 is a yanked release.

jongwony (Author) commented:

@sean-k1 I tried both versions, 0.45.1 and 1.0.2.

In 0.45.1:

{"before_values": {"id": 2877297, "user_id": 669862, "send_type": "push", "category": "marketing", "is_approved": 0, "create_time": "2021-12-09T10:20:24", "update_time": "2023-08-24T17:16:53"}, "after_values": {"id": 2877297, "user_id": 669862, "send_type": "push", "category": "marketing", "is_approved": 0, "create_time": "2021-12-09T10:20:24", "update_time": "2023-10-06T14:36:24"}}

In 1.0.2:

{"before_values": {"null": "2023-08-24T17:16:53"}, "after_values": {"null": "2023-10-06T14:36:24"}}

The same problem occurred.

sean-k1 (Collaborator) commented Oct 10, 2023

@jongwony
Thanks for reporting the issue.

The JSON parsing logic is exactly the same in 0.45.1 and 1.0.2, so I think this is a different issue.

Can you let me know your BinLogStreamReader settings?

from pymysqlreplication import BinLogStreamReader

# mysql_settings: your connection settings dict
stream = BinLogStreamReader(connection_settings=mysql_settings, server_id=100)

for binlogevent in stream:
    binlogevent.dump()  # add your case here

My case (version 1.0.2):

[Screenshot 2023-10-10 1:38 PM]

SQL query:

CREATE TABLE test_json_v2 (
                            id INT primary KEY,
                            c JSON
                        );

INSERT INTO test_json_v2 VALUES (102, '{"id": 2877297, "user_id": 669862, "send_type": "push", "category": "marketing", "is_approved": 0, "create_time": "2021-12-09T10:20:24", "update_time": "2023-08-24T17:16:53"}');

UPDATE test_json_v2 SET c = '{"id": 2877297, "user_id": 669862, "send_type": "push", "category": "marketing", "is_approved": 0, "create_time": "2021-12-09T10:20:24", "update_time": "2023-10-06T14:36:24"}' WHERE id = 102;

The dump output looks like this:


=== QueryEvent ===
Date: 2023-10-10T04:14:49
Log position: 28079
Event size: 239
Read bytes: 239
Schema: b'pymysqlreplication_test'
Execution time: 0
Query: CREATE TABLE test_json_v2 (
                            id INT primary KEY,
                            c JSON
                        )

=== QueryEvent ===
Date: 2023-10-10T04:28:54
Log position: 28252
Event size: 71
Read bytes: 71
Schema: b'pymysqlreplication_test'
Execution time: 0
Query: BEGIN

=== TableMapEvent ===
Date: 2023-10-10T04:28:54
Log position: 28331
Event size: 56
Read bytes: 56
Table id: 3183
Schema: pymysqlreplication_test
Table: test_json_v2
Columns: 2

=== WriteRowsEvent ===
Date: 2023-10-10T04:28:54
Log position: 28551
Event size: 197
Read bytes: 12
Table: pymysqlreplication_test.test_json_v2
Affected columns: 2
Changed rows: 1
Column Name Information Flag: False
Values:
--
* None : {b'id': 2877297, b'user_id': 669862, b'category': b'marketing', b'send_type': b'push', b'create_time': b'2021-12-09T10:20:24', b'is_approved': 0, b'update_time': b'2023-08-24T17:16:53'}

=== XidEvent ===
Date: 2023-10-10T04:28:54
Log position: 28582
Event size: 8
Read bytes: 8
Transaction ID: 5998

=== QueryEvent ===
Date: 2023-10-10T04:28:54
Log position: 28764
Event size: 80
Read bytes: 80
Schema: b'pymysqlreplication_test'
Execution time: 0
Query: BEGIN

=== TableMapEvent ===
Date: 2023-10-10T04:28:54
Log position: 28843
Event size: 56
Read bytes: 56
Table id: 3183
Schema: pymysqlreplication_test
Table: test_json_v2
Columns: 2

=== UpdateRowsEvent ===
Date: 2023-10-10T04:28:54
Log position: 29249
Event size: 383
Read bytes: 13
Table: pymysqlreplication_test.test_json_v2
Affected columns: 2
Changed rows: 1
Column Name Information Flag: False
Values:
--
*None:{b'id': 2877297, b'user_id': 669862, b'category': b'marketing', b'send_type': b'push', b'create_time': b'2021-12-09T10:20:24', b'is_approved': 0, b'update_time': b'2023-08-24T17:16:53'}=>{b'id': 2877297, b'user_id': 669862, b'category': b'marketing', b'send_type': b'push', b'create_time': b'2021-12-09T10:20:24', b'is_approved': 0, b'update_time': b'2023-10-06T14:36:24'}

=== XidEvent ===
Date: 2023-10-10T04:28:54
Log position: 29280
Event size: 8
Read bytes: 8
Transaction ID: 5999

I think this is the normal case when we set binlog_row_metadata = "MINIMAL":

*None:{b'id': 2877297, b'user_id': 669862, b'category': b'marketing', b'send_type': b'push', b'create_time': b'2021-12-09T10:20:24', b'is_approved': 0, b'update_time': b'2023-08-24T17:16:53'}=>{b'id': 2877297, b'user_id': 669862, b'category': b'marketing', b'send_type': b'push', b'create_time': b'2021-12-09T10:20:24', b'is_approved': 0, b'update_time': b'2023-10-06T14:36:24'}

What I don't understand is this case:

{"before_values": {"null": "2023-08-24T17:16:53"}, "after_values": {"null": "2023-10-06T14:36:24"}}

Can you let me know how I can reproduce this case?

jongwony (Author) commented:

Sorry for the confusion. This data is not from a JSON column; it is an ordinary table.

The MINIMAL problem you mentioned seems to be the right explanation. What I meant is that the keys of before_values and after_values should be the column name (c in your example), not None.

In the previous version, 0.45.1, column names were resolved even with binlog_row_metadata=MINIMAL.

jongwony (Author) commented:

If you re-run the test case you showed with the older version at the same binlog position, I expect the column keys to be the column names rather than None.

jongwony (Author) commented:

0.45.1 event.dump()

[image: 0.45.1 dump output]

1.0.2 event.dump()

[image: 1.0.2 dump output]

sean-k1 (Collaborator) commented Oct 10, 2023

@jongwony
Now I understand the problem!

# row_event.py
        for i in range(0, nb_columns):
            column = self.columns[i]
            name = self.table_map[self.table_id].columns[i].name
            unsigned = self.table_map[self.table_id].columns[i].unsigned

            # Problem: with binlog_row_metadata=MINIMAL the column name is
            # unknown, so `name` is None here.
            values[name] = self.__read_values_name(
                column,
                null_bitmap,
                null_bitmap_index,
                cols_bitmap,
                unsigned,
                i,
            )

We don't know the column name, so name is None.
Because a Python dict reuses the None key, every assignment overwrites the previous value, so you only see the value of the last column in order, update_time.
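
A tiny illustration of the overwrite (the values below are placeholders taken from the example row):

# With MINIMAL metadata every column name resolves to None, so each
# assignment reuses the same None key and overwrites the previous value.
values = {}
for name, value in [(None, 2877297), (None, 669862), (None, "2023-10-06T14:36:24")]:
    values[name] = value

print(values)  # {None: '2023-10-06T14:36:24'} -- only the last column survives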

So if you change binlog_row_metadata from MINIMAL to FULL, it will resolve the issue.
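
A quick way to verify the server-side setting, sketched with pymysql (which this library depends on); the connection parameters are placeholders, and on managed services such as Aurora the variable is normally changed through a DB parameter group rather than SET GLOBAL:

import pymysql

# Placeholder connection parameters.
conn = pymysql.connect(host="127.0.0.1", port=3306, user="root", password="")
with conn.cursor() as cur:
    cur.execute("SHOW VARIABLES LIKE 'binlog_row_metadata'")
    print(cur.fetchone())  # expect ('binlog_row_metadata', 'FULL') after the change
conn.close()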

mindler-sasu commented:

binlog_row_metadata is not a valid variable for MySQL 5.7.

sean-k1 (Collaborator) commented Oct 11, 2023

binlog_row_metadata is not a valid variable for MySQL 5.7.

@mindler-sasu

Yes, that is why we keep supporting version 0.45.1 for MySQL 5.7.
