Skip to content

Transaction support for connections #586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Oct 4, 2013
Merged

Transaction support for connections #586

merged 10 commits into from
Oct 4, 2013

Conversation

jflitton
Copy link
Contributor

@jflitton jflitton commented Sep 9, 2013

I've added beginTransaction, commit, and rollback functions to the connection. I've also added transaction state tracking so that calling beginTransaction, commit, and rollback multiple times will not result in additional commands to the database.

The pool's releaseConnection method will now auto-rollback if a transaction has begun but not yet been committed.

Does this align approximately with how you had planned on implementing transactions? Happy to change anything that needs changing.

Thanks!

Jeff Flitton

…rollback to releaseConnection at the pool level.
Connection.prototype.beginTransaction = function(cb) {
if (this.inTransaction) {
// Already in a transaction, call back
return cb();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't an error be passed into the cb here? Same for the other similar conditions below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It certainly could be. I chose this silent approach because it allows me to mindlessly always begin a transaction in function B even though function A might have already opened one. Certainly I could check connection.inTransaction everywhere in my data-access layer though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my perspective transactions are supposed to be a mechanism for being very precise about encapsulating a group of operations. Being able to accidentally merge two transactions into one seems rather undesirable to me, even if convenient in some cases. Would be good to hear how other people feel about this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I agree, if there is going to be an API for transactions, they should keep the user from doing things like starting a transaction while one is in progress. This would probably signify some faulty program logic.

@felixge
Copy link
Collaborator

felixge commented Sep 9, 2013

LGTM. Just to make sure I understand this properly: This API is needed so the connection pool can be aware of transactions? It's not really useful for individual connections, right?

@felixge
Copy link
Collaborator

felixge commented Sep 9, 2013

@jflitton I also just added you as a contributor on the github repo. plz don't merge yet until we've discussed this a little further so

@jflitton
Copy link
Contributor Author

jflitton commented Sep 9, 2013

That's correct, this is really only useful when using connection pooling. If a single, shared connection was in use throughout the application transactions would be unusable.

@felixge
Copy link
Collaborator

felixge commented Sep 9, 2013

If a single, shared connection was in use throughout the application transactions would be unusable.

Huh? I understand why you wouldn't need a high level API for them, but not why they'd be unusable.

@felixge
Copy link
Collaborator

felixge commented Sep 9, 2013

Anyway, this patch could also use some docs.

Also, I've just realized another problem: All the this.inTransaction is wrong as you're setting it when the methods are called. However, the connection is not in a transaction (or out of one) until the enqueued query has actually been executed.

@jflitton
Copy link
Contributor Author

jflitton commented Sep 9, 2013

If inTransaction isn't updated until the queries execute, then checking inTransaction at runtime to determine if it's appropriate to being a transaction becomes problematic, as that query might be queued. This is another symptom of the approach I'm using to "mix and match" various functions that sometimes create their own transaction and sometimes are part of another, higher-level transaction. I think passing errors on beginTransaction, commit, and rollback and changing inTransaction to toggle during the callback is the right choice, and I plan to change my project to use this approach.

@dresende
Copy link
Collaborator

I like this and need it. Although a pool is needed in a concurrent app, it's also important in a single connection (otherwise how can you rollback?).

@felixge
Copy link
Collaborator

felixge commented Sep 10, 2013

I think the inTransaction logic is still wrong. Consider this case:

  • User calls beginTransaction()
  • User queues a few queries and a commit()
  • The beginTransaction() finishes, so the inTransaction is true
  • The user now wants to queue another beginTransaction and receives an error, even so this seems perfectly ok given that he's already queued the closing of the previous transaction.

Question: How does MySQL handle this by itself? What happens if you try to begin multiple transactions? If MySQL gives us an error in this case, we should forward that to the user, and consider it as a fatal error that closes the connection.

As far as the pool is concerned: IIRC there is a way to "reset" the state of a MySQL connection using the CHANGE_USER command. Unless the mysql server people fucked up, this should also rollback any non-commited transactions. It would also allow us to avoid this additional API. Sorry for just bringing this up now, it's been a while that I've thought about these issues.

@dresende
Copy link
Collaborator

I think calling START TRANSACTION multiple times works (creating nested transactions), no errors are given. Also, I think using non transactional tables will make a ROLLBACK useless without the user knowing (no errors given, no rollback done).

@kai-koch
Copy link
Collaborator

This patch looks promising!

But I got a few points to be considered before merging (Mainly from a documentation point of view and overlapping with issues @felixge and @dresende have.):

  • For clarity reasons: START TRANSACTION instead of the alias BEGIN should be used to build the query in .beginTransaction. Although it is an allowed alias, the BEGIN statement is also used in the "Compound-Statement Syntax” and could be a source of confusion, IMO.
  • From the MySQL-manual http://dev.mysql.com/doc/refman/5.5/en/commit.html :

If you use tables that are not transaction-safe within a transaction, changes to those tables are stored at once, regardless of the status of autocommit mode.

While the user should know on what kind of tables transaction can be used, an additional link to the MySQL documentation would not hurt.

If you issue a ROLLBACK statement after updating a nontransactional table within a transaction, an ER_WARNING_NOT_COMPLETE_ROLLBACK warning occurs. Changes to transaction-safe tables are rolled back, but not changes to nontransaction-safe tables.

How this warning is handled, should also be documented. (Does the user have to catch this, does the API drop it silently?)

  • As I under stand the code of this patch, consequent calls to .beginTransaction do not enqueue START TRANSACTION, until the .inTransaction property is set to false by a COMMIT or ROLLBACK, but simply ignores it.

This is different from the default MySQL behaviour., where a second START TRANSACTION before a COMMIT or ROLLBACK implies an auto-COMMIT.
It needs to be documented!

LOCK TABLES is not transaction-safe and implicitly commits any active transaction before attempting to lock the tables.

If a user enqueues a LOCK TABLES statement, all statements before that are auto-commited, with out the API noticing it in the .inTransaction property. This might result in an incomplete and unexpected ROLLBACK result or an error (Does a ``COMMITorROLLBACK` with out an `START TRANSACTION` give an error?).
Again the user is responsible for not doing or keeping track of this.
But even for users having used other DBMS-systems with transactions, there should be a big "RTFM" on "MySQL transactions before using them" in the documentation. Or else node-mysql has a lot of whole new issues opened on a daily basis. ;-)

I do not want to do nit-picking and I know this project is not about educating the users on "using their DBMS right".
But in the long run avoiding future issues, with solid documentation and a link to the MYSQL documentation here and there, can only benefit the time spent by the core developers on "new" issues and the community around this project as whole.

@jflitton
Copy link
Contributor Author

Thanks for the feedback everyone!

As far as handling the incomplete rollback, is there any precedent within the project for handling warnings? I'm inclined to silently catch the error, but a non-fatal error is also an option. Open to other ideas.

As far as detecting an implicit commit that occurs as a result of LOCK TABLES. The problem is bigger in that almost any DDL statement will perform an implicit commit as well. I'm not sure if there's a practical solution at this time to detect all these. Perhaps a warning in the documentation and a link to MySQL's transaction documentation is sufficient?

@ravi
Copy link

ravi commented Sep 10, 2013

This might be of interest:

http://dev.mysql.com/doc/refman/5.0/en/implicit-commit.html

Transactions cannot be nested. This is a consequence of the implicit commit
performed for any current transaction when you issue a START TRANSACTION
statement or one of its synonyms.

@ravi
Copy link

ravi commented Sep 10, 2013

Also, 2 cents regarding @felixge's point about user queueing a new transaction once a commit() has been queued. I am not sure I see the underlying issue as any different from the standard node async callback model. Admittedly without an understanding of how the MySQL protocol works w.r.t a queued commit message, it seems to me that the user should begin a new transaction only upon successful completion of the commit.

@kai-koch
Copy link
Collaborator

I do not think, that the implicit-commits can be avoided. We would need a complete SQL parser, that keeps track of every statement and the current state.
I would suggest stating in the documentation, that this is only basic transaction support for e.g a couple of inserts to minimize HD access or the like and that the user has to implement own logic to keep track of LOCK-, DDL-statements etc.

@felixge
Copy link
Collaborator

felixge commented Sep 10, 2013

@ravi node-mysql queues the operations the user wants to complete for convenience. So asking a mysql connection to perform a new operation before the previous one has completed is quite common and needs to be supported well.

Anyway, I'm short on time so I won't be able to contribute much more to this discussion this week. I trust @kai-koch and @dresende to make the right calls here.

@dresende
Copy link
Collaborator

Reading comments from @kai-koch and @ravi I think the best approach is:

  • Do an auto-commit when a second transaction is started without committing the first one. Later on we can make an option to trigger an error or something but for now we should mimic mysql client behaviour (it just accepts the second one, auto-commiting the first one);
  • Add documentation and references to mysql about nontransactional tables and rollback and locks. Keep this stuff out of the code for now;

We can later on see if we can have a better approach, perhaps dealing with the warnings, but this simple API is what people should expect from a "raw" mysql node client.

@jflitton
Copy link
Contributor Author

If we proceed with this plan, it sounds to me like just binding beginTransaction(), commit(), and rollback() to the queries "START TRANSACTION", "COMMIT", and "ROLLBACK", respectively would work. This means that no transaction state checking would be performed when these functions are called. The API would not throw an error when beginTransaction() is called multiple times in a row, nor would it throw one when commit() or rollback() are called without a beginTransaction(). These are the current behaviors in MySQL 5.x. Furthermore, it would not return an error or warning when non-transactional tables are in use.

One remaining question is: Should releasing a connection back to the pool result in an implicit commit or rollback?

If there's a consensus on this approach, that's fine with me. I can make the code and documentation changes accordingly.

@dresende
Copy link
Collaborator

My plan is to keep it simple for now so everyone agrees to merge :)

We can then add some inteligence, like throwing when calling .beginTransaction twice if perhaps an option is not set, or .release() then connection on commit/rollback (once again, with an option).

Have you tried making a rollback in a MyISAM table? The server doesn't return anything?

@ravi
Copy link

ravi commented Sep 11, 2013

FWIW: I agree with @dresende's reasoning w.r.t implementation/rollout.

@felixge your point about queueing is well taken, but I think a transaction (START TRANSACTION or COMMIT/ROLLBACK) is different… a user should (for his/her own sake) treat it as an explicit synchronisation point. Please ignore… this is not a significant point.

@kai-koch
Copy link
Collaborator

What @dresende said

@tellnes
Copy link
Collaborator

tellnes commented Sep 11, 2013

What about putting startTransaction, commit and release on the Pool instance?

Pool#startTransaction - Calls Pool#getConnection then it runs START TRANSACTION
Pool#commit - Runs COMMIT then Pool#release
Pool#rollback - Runs ROLLBACK then Pool#release
Pool#release - Runs either commit or release based on a config flag, or maybe throws?

@dresende
Copy link
Collaborator

Transactions are also useful in a single connection (to be able to rollback). Think of an administration task you run everyday and does something. I might not need multiple connections but want a transaction at some point.. just in case.

We can then add a Pool.beginTransaction that would get a connection and call connection.beginTransaction passing the initial callback but that's just a shortcut.

@ronkorving
Copy link
Contributor

@tellnes: I love that idea!
release without commit could maybe just rollback?

@ravi
Copy link

ravi commented Sep 12, 2013

I am confused: why would be want connection level operations (transactions) to be available as methods at the pool level? There's no Pool#query (AFAIK). IMHO, shortcutting (as @dresende rightly calls it) might create a leaky abstraction and encourage laziness.

@dresende
Copy link
Collaborator

@jflitton please make the changes to make this simple as I described, then add a warning about the stuff we discussed here. We can then merge it and improve it later on.

@jflitton
Copy link
Contributor Author

Hey guys, any update on pulling this in? Any additional feedback welcome.

jflitton added a commit that referenced this pull request Oct 4, 2013
Basic transaction support for connections
@jflitton jflitton merged commit 14b93f7 into mysqljs:master Oct 4, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

9 participants