
Strange error in event_row.py #56


Closed
bohnster opened this issue Dec 18, 2013 · 18 comments

@bohnster

Hi,

Nice library.

Here is a problem that has started happening, and I don't know why. Here is the Python error:

File "/usr/lib/python2.6/site-packages/pymysqlreplication/binlogstream.py", line 146, in fetchone
self.use_checksum)
File "/usr/lib/python2.6/site-packages/pymysqlreplication/packet.py", line 93, in __init

ctl_connection)
File "/usr/lib/python2.6/site-packages/pymysqlreplication/row_event.py", line 491, in init
column_schema = self.column_schemas[i]
IndexError: tuple index out of range

I think this is due to some MySQL internal query that pymysqlreplication has trouble with. The event type that appears to trigger the failure is a QueryEvent, but the event is an internal MySQL event. Here are some events that occur just before the failure; you can see that these are internal MySQL queries:

=== QueryEvent ===
Date: 2013-12-11T16:31:15
Log position: 30510
Event size: 80
Read bytes: 80
Schema: mysql
Execution time: 0
Query: TRUNCATE TABLE time_zone_transition
=== QueryEvent ===
Date: 2013-12-11T16:31:15
Log position: 30614
Event size: 85
Read bytes: 85
Schema: mysql
Execution time: 0
Query: TRUNCATE TABLE time_zone_transition_type
=== QueryEvent ===
Date: 2013-12-11T16:31:15
Log position: 30683
Event size: 50
Read bytes: 50
Schema: mysql
Execution time: 0
Query: BEGIN

^^ error occurs immediately after this

You can see that the date is 2013.12.11.

I looked at the source code where the error occurs:

485 for i in range(0, len(column_types)):
486 column_type = column_types[i]
487 column_schema = self.column_schemas[i]
488 col = Column(byte2int(column_type), column_schema, from_packet)
489 self.columns.append(col)

The problem is that self.column_schemas is None, so the index lookup fails.

Any ideas on this? Thanks.

@bjoernhaeuser
Collaborator

At which position do you start?

@bohnster
Author

Thanks for the quick reply. This is the first QueryEvent processed:

=== QueryEvent ===
Date: 2013-01-07T03:28:53
Log position: 211
Event size: 85
Read bytes: 85
Schema: test
Execution time: 0
Query: create table cb(id int, name varchar(32))

That creates a table.

I think this is a variant of issue #36. It may be that MySQL itself creates/drops an internal table, and the error occurs because the TableMap lookup cannot succeed once the table no longer exists. Did issue #36 ever get fixed? If so, I can try the fix and see if it resolves this problem.

Thanks.

@bjoernhaeuser
Collaborator

You are missing the TABLE_MAP_EVENT for this table. Without that, this error happens.

@bohnster
Author

How did the TABLE_MAP_EVENT come to be missing? I don't understand how it was working fine for a long time and then all of a sudden stopped with this error.

@bjoernhaeuser
Collaborator

You are not starting at a valid position in the log file. Can you show us the code where you connect?

@bohnster
Author

Hi, here is the Python code to start up the reader:

    binlogfile, binlogpos, resume_stream = (None, None, False)
    # Initialize stream reader:
    stream = BinLogStreamReader(connection_settings=self.dsn,
                                server_id=1016683,
                                log_file=binlogfile,
                                log_pos=binlogpos,
                                resume_stream=resume_stream,
                                only_events=[DeleteRowsEvent, WriteRowsEvent, UpdateRowsEvent, QueryEvent],
                                blocking=True)

Then here is the loop that processes the events:

def getStream(self, eventHandler):
    """
    Gets the stream of events from the binlog.
    """
    for binlogevent in self.sReader:
        print binlogevent.dump()  # to help debug this
        try:
            method = getattr(eventHandler, 'handle_' + binlogevent.__class__.__name__)
            method(binlogevent, self.sReader.log_file, self.sReader.log_pos)
        except AttributeError:  # should never happen since we only process events we specify
            method = getattr(eventHandler, 'logUnknownEvent')
            self.logger.logInfo(method(binlogevent, self.sReader.log_file, self.sReader.log_pos))

@banks
Contributor

banks commented Jul 3, 2014

I've found at least two other ways this error can occur: one is a fundamental limitation of this library's design, one is a bug:

  1. If you are replaying old binary logs, the library may see a "new" table id and will look up the *current* schema. If the table had more columns when the log was written than it does now when you are playing it back, you will run into this error. This is a fundamental limitation of the approach: it only really works if you don't change the schema, OR if you don't rely on being able to start replaying old logs from before a schema change (which, if you think about it, is in practice the same as never changing the schema, given that replication is asynchronous). If you only ever add columns to the end of a table and never remove them, it's also safe.
  2. This one is a bug: the table id in the replication log is NOT a persistent mapping in MySQL. That means every time the MySQL server restarts, you might see a table id re-used in the binary log that was used for another table in previous log files. This library doesn't take that into account, so if you run through several log files in one script execution, where the older log files were written before a restart and the later ones after it, you will eventually hit this error (or harder-to-diagnose ones) because it tries to decode the row schema based on a totally different table. There is a relatively simple fix: whenever MySQL restarts, it rotates to a new log file, so if you clear the table map in memory on each rotate event this problem can't happen.

I can submit a PR for the fix if required, but it's literally just self.table_map = {} in the fetchone method, right after the new file and position are set by the rotate event.
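For illustration, a minimal sketch of the same idea from the consumer side: clearing any per-table-id cache whenever a RotateEvent comes through. The connection settings and the cache are hypothetical; the actual fix is the one-liner inside fetchone described above.

    from pymysqlreplication import BinLogStreamReader
    from pymysqlreplication.event import RotateEvent
    from pymysqlreplication.row_event import WriteRowsEvent

    # Hypothetical connection settings, for illustration only.
    MYSQL_SETTINGS = {"host": "127.0.0.1", "port": 3306, "user": "repl", "passwd": ""}

    stream = BinLogStreamReader(connection_settings=MYSQL_SETTINGS,
                                server_id=100,
                                only_events=[RotateEvent, WriteRowsEvent],
                                blocking=True)

    table_cache = {}  # consumer-side cache keyed by table id

    for binlogevent in stream:
        if isinstance(binlogevent, RotateEvent):
            # A server restart always forces a rotate, and table ids are not
            # persistent across restarts, so any cached mapping may now be stale.
            table_cache.clear()
            continue
        # ... handle row events here, repopulating table_cache as needed ...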

@julien-duponchelle
Owner

For point 1, I agree it's a limitation of the design; I have no idea how to fix it.

For point 2, I would appreciate a PR if you have time, because then you can appear as a contributor in the project. It's a one-liner but a tricky bug.

@banks
Contributor

banks commented Jul 7, 2014

Yeah, regarding point 1, I'm not sure there is a better option in a general purpose library like this.

FWIW, I'm planning a change-capture system for MySQL that ultimately won't use this library, but I came across it in the process.

In my design, I plan to have each table that is being replicated manually set up initially by taking the CREATE TABLE SQL from the current schema and actually creating a replica on a local MySQL instance (no data will be written, just an empty copy of the table). I'll then use DESCRIBE TABLE, or query the schema info like this lib does, to derive a custom encoding schema for each row (for example an Avro schema). To keep up with schema changes on the master, I plan to listen for Query events in the binlog that contain ALTER TABLE syntax. Instead of parsing that myself, I'll just run those queries on the MySQL table I have locally and then regenerate the new row schema from that table.

So to be clear, the local tables will never hold any data; they just have the structure and allow me to "play back" the ALTER statements so that I can keep an up-to-date schema for each row, rather than relying on being able to get the schema back from the master without a race whenever I read an ALTER. Since my system will need to maintain causal consistency, each update will have a monotonically increasing "transaction id" that will survive master failover etc. (with some effort). I can then maintain a list of row schemas and which ranges of transaction ids each one applies to, so that clients can always decode with the correct schema without storing it on a per-row basis.

Using something like Avro, along with a log of which rows/updates correspond to which version, allows schema changes to be handled relatively nicely by clients: no synchronisation to recompile stuff when the schema changes for a table, easy consumption by strongly-typed languages, and compact row snapshots since no default values or field names need to be present.
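A rough sketch of the "replay the DDL locally" step described above, assuming a PyMySQL connection to a schema-only local server. The connection details, database name, and the ALTER-detection regex are illustrative only, not part of this library.

    import re
    import pymysql  # pymysqlreplication itself is built on PyMySQL

    # Hypothetical local "schema only" server: it holds empty copies of the
    # replicated tables and only ever sees replayed DDL, never row data.
    schema_conn = pymysql.connect(host="127.0.0.1", port=3307, user="root",
                                  password="", database="replica_schema",
                                  autocommit=True)

    ALTER_RE = re.compile(r"^\s*ALTER\s+TABLE\s+`?(\w+)`?", re.IGNORECASE)

    def maybe_replay_ddl(query_event):
        """If a QueryEvent carries an ALTER TABLE, replay it on the local copy
        and return the table's fresh column list, otherwise return None."""
        match = ALTER_RE.match(query_event.query)
        if match is None:
            return None
        table = match.group(1)
        with schema_conn.cursor() as cur:
            cur.execute(query_event.query)  # replay the DDL against the empty copy
            cur.execute(
                "SELECT COLUMN_NAME, COLUMN_TYPE "
                "FROM information_schema.COLUMNS "
                "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s "
                "ORDER BY ORDINAL_POSITION",
                ("replica_schema", table))
            return cur.fetchall()  # the regenerated row schema for this table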

I made a pull request for point 2 as requested. Hope this is helpful!

@julien-duponchelle
Owner

Your solution should work and it's very smart.

@dmariassy

Hi, and thanks for the very handy library! 👍

Although this thread has been closed for a while, I think that the issue pointed out by @banks continues to pose a serious problem for many potential users. The ability to reliably replay events that were created before the streaming process started (or while the process was temporarily down) is required in almost all real-world use-cases.

But I think that adding this functionality would not be that difficult if we (as suggested by banks and implemented by debezium.io):

  1. Captured QueryEvents that changed the underlying schema,
  2. Executed the statements on a second MySQL instance,
  3. Pointed the BinLogStreamReader's __get_table_information method to this server for table mapping.

This way, the RowEvent that is currently being processed would always be validated against the schema that was in place at the time of its creation.

As a matter of fact, this is the design I followed when monkey-patching your library, which is now used in production at our company :)
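For anyone curious, a rough sketch of what such a monkey-patch could look like. The private method name comes from this thread, the column list mirrors the information_schema lookup the reader normally performs against the master, and the exact fields expected may vary between library versions, so treat this as an outline under those assumptions rather than the patch running in production.

    import pymysql
    from pymysqlreplication import BinLogStreamReader

    # Hypothetical schema-tracking server (the "second MySQL instance" above);
    # it only ever receives replayed DDL, so no row data lives there.
    schema_conn = pymysql.connect(host="127.0.0.1", port=3307, user="root",
                                  password="",
                                  cursorclass=pymysql.cursors.DictCursor)

    def get_table_information(self, schema, table):
        # Same kind of information_schema lookup the reader normally runs against
        # the master, but pointed at the schema-tracking server instead.
        with schema_conn.cursor() as cur:
            cur.execute(
                "SELECT COLUMN_NAME, COLLATION_NAME, CHARACTER_SET_NAME, "
                "       COLUMN_COMMENT, COLUMN_TYPE, COLUMN_KEY "
                "FROM information_schema.columns "
                "WHERE table_schema = %s AND table_name = %s "
                "ORDER BY ORDINAL_POSITION",
                (schema, table))
            return cur.fetchall()

    # The method is private, so the patch has to go through Python's mangled name.
    BinLogStreamReader._BinLogStreamReader__get_table_information = get_table_information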

I have taken a few minutes to apply these changes to your project's source code and would be happy to submit a PR if you think that adding this feature would be useful.

However, I have not yet come up with a way that would allow me to test this feature in your environment. We're using a pretty different test set-up, which includes all the components we use in our data pipeline, and I wouldn't be able to simply transfer our scripts to your CI env. In any case, I think that testing this feature would require using Docker, Vagrant or some other tool that allows for the total sandboxing of components given the necessity to launch two MySQL servers.

Let me know what you think!

@banks
Contributor

banks commented Aug 4, 2016

Nitpick:

as suggested by banks and implemented by debezium.io

Debezium actually doesn't use a second MySQL instance; they rolled their own MySQL DDL parser and build schema mappings that way. It's nice in that they don't need the operational complexity of a MySQL instance to manage schemas, but like the Yelp guys I was always wary of the effort required to do that and keep it perfectly in sync and bug-free across MySQL releases/behaviour. I can't comment on their implementation though.

Nice that the Yelp guys ended up with an almost identical solution to the one I described above over two years ago, though... I never built mine, so it's nice to see the idea was sound!

Just a note for anyone else who digs up this thread.

@julien-duponchelle
Owner

Deploying a second instance does not seem possible for some people. Do you think it's possible to implement it as optional behavior?

@banks
Contributor

banks commented Aug 5, 2016

This is an old, closed thread; if you have feature suggestions, the authors are more likely to notice them on a new issue...

FWIW what I described above and Yelp mentioned recently is not really a "second instance" - there is no copy of the data. It's just a way to parse the schema changes in DDL statements without rolling your own parser.

If you really don't like the overhead of that solution, debezium.io already rolled the other one.

Also I think python-MySQL-replication is too low level for this kind of feature as I mentioned - I suggest you check out a higher level framework like debezium if you don't want to build that stuff out yourself.


@felexkemboi

Is there a way I can get the raw SQL queries so that I can replay them against PostgreSQL?

@julien-duponchelle
Owner

julien-duponchelle commented Jun 28, 2019 via email

@felexkemboi

My problem is that I would like to get the raw queries and then consume them, i.e. get the raw SQL queries and then write those queries to another database.

@felexkemboi

=== WriteRowsEvent ===
Date: 2019-06-28T03:21:53
Log position: 1838
Event size: 35
Read bytes: 12
Table: student.marks
Affected columns: 5
Changed rows: 1
Values:

This is the output I am getting when I run my script. I would love to extract the raw SQL query that made that change.
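To make the distinction in this thread concrete: QueryEvents carry the original SQL text (the Query: lines in the dumps above), while row events only carry decoded values, so any statement has to be rebuilt from them. A minimal sketch, with hypothetical connection settings and a deliberately naive rebuild:

    from pymysqlreplication import BinLogStreamReader
    from pymysqlreplication.event import QueryEvent
    from pymysqlreplication.row_event import WriteRowsEvent

    # Hypothetical connection settings, for illustration only.
    MYSQL_SETTINGS = {"host": "127.0.0.1", "port": 3306, "user": "repl", "passwd": ""}

    stream = BinLogStreamReader(connection_settings=MYSQL_SETTINGS, server_id=200,
                                only_events=[QueryEvent, WriteRowsEvent])

    for event in stream:
        if isinstance(event, QueryEvent):
            # Statement-based events (DDL and the like) carry the original SQL.
            print(event.query)
        elif isinstance(event, WriteRowsEvent):
            # Row events carry decoded values only; rebuilding SQL from them is
            # naive here (no quoting/escaping) and only meant to show the shape.
            for row in event.rows:
                columns = ", ".join(row["values"].keys())
                placeholders = ", ".join(["%s"] * len(row["values"]))
                statement = "INSERT INTO %s.%s (%s) VALUES (%s)" % (
                    event.schema, event.table, columns, placeholders)
                print(statement, tuple(row["values"].values()))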
