Skip to content

Proposal: Implicit commit detection #1529

Open
@steverukuts

Description

@steverukuts

Overview

  1. MySQL silently commits your transaction when you manipulate a temporary table after starting a transaction.
  2. The MySQL protocol provides a way to detect this mistake, but I don't think this driver uses it.
  3. The proposed solution however is quite radical and deserves considerable debate.
  4. Perhaps opening up the query pipeline for extension could be an alternative move.

Problem statement

Let's take a look at an example of how you can corrupt your data by accident:

-- We set up an example database and table:
CREATE DATABASE `example`;
USE `example`;
CREATE TABLE `DemoData` (Value VARCHAR(20));

-- And now we start a transaction
BEGIN TRANSACTION;

-- Let's create some temporary storage:
CREATE TEMPORARY TABLE temp (value VARCHAR(20));

-- And now let's add an index on our temporary table:
CREATE INDEX temp_value ON temp (value);

-- Now let's add a sample row to DemoData (the real table, defined before we started the transaction)
INSERT INTO `DemoData` VALUES ('example value');

-- Change of plan, let's undo all that work. DemoData should be empty again, right?
ROLLBACK;

-- Surprise! You actually already committed your changes, and your rollback didn't work:
SELECT * FROM DemoData; -- returns one row!

The correct thing for the developer to do here is to set up the table outside of the transaction scope, but if they miss that, they will silently commit the transaction, and every subsequent command is run outside of the transaction.

To the developer moving from SQL Server to MySQL, it looks like you are within a transaction, and you may not discover the truth for many years. This is just an example from a real bug I have seen, but there are other ways to implicitly commit a transaction in this way.

Solution overview

Fortunately, the MySQL protocol provides a way for us to find this problem. The OkPayload can be used to detect this problem after executing a query. You can look at the ServerStatus bitfield, which contains an InTransaction flag.

The InTransaction flag will be in the OkPacket of every query in the transaction until the next query that is not in the transaction. To refer back to my earlier example:

CREATE TEMPORARY TABLE temp (value VARCHAR(20));   -- InTransaction
CREATE INDEX temp_value ON temp (value);           -- InTransaction (this statement causes an implicit commit)
INSERT INTO `DemoData` VALUES ('example value');   -- Not InTransaction

So, when the developer starts a transaction, the driver should check the OkPayload's ServerStatus flags for InTransaction, and if it finds that the transaction is no longer active despite not disposing of the transaction yet, an exception can be raised to notify the developer that the last command actually committed the transaction.

Example API usage:

connection.Execute("CREATE TABLE `DemoData` (Value VARCHAR(20));");  // not InTransaction, but we don't care as we aren't in a transaction scope

using (var txn = connection.BeginTransaction())
{
    txn.Execute("CREATE TEMPORARY TABLE temp (value VARCHAR(20));"); // InTransaction
    txn.Execute("CREATE INDEX temp_value ON temp (value);");         // InTransaction
    txn.Execute("INSERT INTO `DemoData` VALUES ('example value');"); // throws an exception - "the last statement caused an implicit commit!"
}

This behaviour should not be enabled by default. Even though it represents a bug in the developer's code, the unfortunate thing about receiving the transaction state in the OK packet means that the last INSERT statement was actually committed to the database, so enabling this behaviour as a default will definitely cause more problems than it solves.

For discussion

  1. Does the project team agree that this idea has merit?
  2. I am happy to implement this if desired but wanted to know that the work would have a chance of being merged.
  3. Is there an alternative way to implement this, perhaps via some kind of middleware or add-on? I did go looking for an alternative way to implement this without changing the driver itself but couldn't figure that out.
  4. Is there some server or client configuration that mitigates this mistake that I have somehow overlooked?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions