-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Unhandled exception in Parser.parseLengthCodedNumber #867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Wow, that you so much for the details explanation and the code, tables, etc. I'll work on investigating what is going on here :) |
@matthew-dean-hp are you able to reproduce the error reliably? I may have a fix and would like you to be able to also confirm the fix, especially since it involved a lot of changes. |
If at all possible, please install the current development version ( |
You can also try this with version 2.4.0 |
I can reproduce this very easily so I'll give those versions a go later today and report back. Thanks for responding so promptly |
So here's the output after doing (
|
Thank you very much :) Unfortunately I'm still not 100% sure what is happening, so I'm going to have yet another version of this library for you to install and run the same debug stuff for some more collection later today. |
@matthew-dean-hp I just want to apologize for not getting back to you yesterday, I completely forgot about this. |
@dougwilson Any further progress on deducing the cause of this bug? If there's anything else I can do to help please just let me know |
Hi @matthew-dean-hp I'm so sorry. I plan to get you another version of this module that will provide me with more debugging information for you to paste back to me. |
I'm' sorry for the delay :( Can you please using the following command: $ npm install felixge/node-mysql#debug Then re-run your script to cause the issue and then post the new results? Hopefully this should be the last thing I need to determine what the issue is. |
So I'm pretty sure this isn't the result you were hoping for, but if I run with either or both configs in debug mode I cannot now reproduce the bug. If I run in non-debug mode for both configs I still get an unhandled exception but it has changed. Here's the last 100 lines or so:
If all you've done is add extra debug lines then this is starting to feel to me like a race condition somewhere. Let me know if there's anything else I can do to help |
No, the error actually still looks just as wrong as all the others you've provided. Without the debug output, though, It's almost impossible for me to determine (you can actually exclude your |
My apologies, I completely missed that. We do now have an error:
|
Awesome! I think this should do it. I saw from your error I actually made a really dumb typo in the code. I updated the debug version again if you're willing to re-run and re-post? :) |
Oh, and one one tweak to your posting instructions: since this occurs when there are multiple threads all running, I need to get the output from just one of them. So when you run it and see output with a bunch of garbage (like the |
How's this:
|
OK, that looks perfect :) I should be able to address it very soon! That you so much in helping me out here. |
By the way, have you tried to run the |
Nevermind. I believe this is coming back as Multi-Resultset packets, which I don't think we support. I'll check it out and if so, we'll have support soon! |
@matthew-dean-hp can you add |
So it doesn't throw an unhandled exception but neither does the query actually execute. In the query callback I get an error of:
|
Right, I figured it wouldn't. I just wanted to confirm the issue is because we do not currently support "MULTI_RESULTS" yet the client is advertising it does to the server, which is why the exceptions are thrown. Supporting "MULTI_RESULTS" itself is actually optional in the protocol and I'll add it, but I just wanted to make sure that was what you were seeing was all :) I've put in the feature request here: #885 This issue is basically that the library is lying about it's capabilities to the server. |
Well at least we've narrowed down the cause of the issue. One thing that is interesting is that if I run similar stored procedures in a non cluster aware fashion then everything works fine. This includes the return of multiple result sets from a CALL. I know this is kind of a 'how long is a piece of string' type of question, but do you have an idea as to what sort of time-frame this might be implemented in? Not trying to hold you to anything, I'm just trying to get a best guess for my own planning. |
I want to take a stab at it this weekend, so basically the intended timeline is very short :) I'm disappointed that it took so long just to figure out what it was in the first place, especially since this library was not sending the correct set of capabilities to the server, which would have brought up the issue as a feature request immediately and would probably have been implemented by now :) |
@matthew-dean-hp when you get the chance, can you try out your original code with the latest version from the following command to see if you're still experiencing the issue? $ npm install felixge/node-mysql#debug |
It still gives an error
|
Hm, I'm still at a loss here. Everything should be implemented and is testing fine. I'm going to go and install/setup a MariaDB to use, since that is what you are using instead of MySQL server. Have you tried against a MySQL server, or only against MariaDB? My biggest roadblock so far is that I don't seem to be able to reproduce. I know you noted the following:
So After I get MariaDB setup, I'm going to try and replicate that as much as I can. |
Another thing you can try in the meantime is perhaps replace your |
I haven't tried anything other than the specific mariadb version stated in the original post. I've tried mysql2 and it also throws a similar error. I'm also going to try and set up a mysql test to see if it's any different. Hopefully one of us works it out and it's just something with this specific version |
Could you post an issue to mysql2 ( with debug enabled as well )? |
Done see sidorares/node-mysql2#113 |
@matthew-dean-hp awesome, thanks :) I'll continue to research and follow with @sidorares finds. This issue is strange, indeed. |
So I've now managed to reproduce this on a pair of mysql 5.5.28 instances. Now I'm still not sure the exact cause of the problem but I've since discovered it only occurs with the setting:
I would therefore guess that the problem is somehow a conflict between two different calls and then the error is somehow not being handled correctly. I understand that the general recommendation, for performance reasons, is that this setting be 1 but I haven't found anywhere suggesting this is a requirement of galera. What do you think? |
One other thing. To reproduce the issue in mysql I followed this guide http://getasysadmin.com/2013/03/how-to-setup-galera-3-node-cluster-on-ubuntu-12-04/ creating two virtualbox instances with fresh installs of ubuntu 12.04. The only substantive difference is the above setting that had to be changed to break things. Hope that helps. |
Some thoughts: packets just before error:
0231362435613736633639652d343937652d343963372d393530312d61313562643930373933396505000003fe00000a00
"5, 0, 0, 3, 0xfe, 0, 0, 0xa, 0" is actually EOF packet (with header) - 0 warnings count, 0xa status flags |
As far as I have been able to determine it's either an issue in the code mixing the bytes up (though I now doubt that after I had @matthew-dean-hp test with a completely different algorithm and tested with |
@dougwilson, I really can't see this being packet corruption on the network. Whilst I could imagine an issue with virtualbox's internal network connections, I first came across this problem using two mysql instances on the same machine connected on a loopback interface. |
It was just speculation :) I added "though I now doubt that", because, well, I don't think it is, especially because it changes based on a setting in the MySQL server. |
By the way, I just want to say sorry that this bug has been so crazy :( I hope you don't feel like we are ignoring you, but just the opposite :) |
I'm starting to think this is definitely a galera issue. Basically I've reconstructed the test using node-mariasql and it's returning to me occasionally an Error 2027, Malformed Packet. Given the vastly different pedigree of their library and this one it seems likely that this is either something peculiar in my setup or it's a very real bug in Galera. It would be very helpful if you could reproduce this just to make sure I'm not going crazy. There's still the issue of the unhandled-ness of the exception but I think that's probably it at your end. What do you think? |
Wow, that is actually using the C client library from MariaDB itself, so that really does make it seem like a server bug to me.
Yep. I'll make sure it's fixed :) |
So @matthew-dean-hp I know this is still an issue, but I think we can all agree that it may not be an issue with these clients, but rather something with the server. In the meantime, I have published 2.4.2 that should make all the errors you've posted above caused by this actually catch-able instead of crashing the Node.js process. |
When `Dial()` returned error and it's `Timeout() == true`, return ErrBadConn to database/sql retry new connection.
I have a stored procedure that needs to be called frequently and I have tried to make the call aware of my galera cluster by using a PoolCluster. If I send all the procedure calls to a single node in the cluster then everything works fine, however if I try to distribute the load by getting connections in a round-robin fashion I sometimes get the following exception when running the procedure:
I've managed to produce the simplest possible test case which can be done using the below SQL and node.js code. If I run the stored procedure SQL as individual statements the exception does not seem to happen and neither does it happen if I remove either the SELECT or the second INSERT. I'm pretty sure my cluster is set-up correctly and hopefully someone else is able to reproduce this. If it helps I'm running MariaDB 5.5.38.
The text was updated successfully, but these errors were encountered: