Queries are not sent until previous queries are done, nullifying all advantages of using Node #660
Comments
The PostgreSQL server itself only supports one active query per client connection. Libpq will barf if you try to send a query while one is already pending. node-postgres does its best to make it easy not to send two at the same time by internally buffering subsequent queries until the current one is processed. You're wrong in assuming this nullifies the advantage of node.js.

If you want to issue multiple queries at the same time, you need to connect multiple clients and spread the queries across them. By default the pool holds 20 clients, but you can modify this. Note: the PostgreSQL backend only supports about 500 connected clients at a time, depending on the amount of RAM your server has. The best way to do this is to use the built-in connection pool. We handle millions of requests per day with node.js & PostgreSQL - certainly the advantages are not nullified for us.

Furthermore, if all you're doing in your script is 1 million inserts, then you're not really taking full advantage of node.js. What node allows is non-blocking i/o, which lets you do other things while your queries are being processed by the PostgreSQL backend. So: issue a query, handle a web request, read a file - all of it delegated out to the operating system, which only calls back into node when the operation is finished.
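A minimal sketch of the pool-based approach described above, assuming the current `pg` Pool API; the `items` table and the `max` value here are only illustrative:

```js
const { Pool } = require('pg')

// Up to 20 clients; further queries wait in the pool's internal queue.
const pool = new Pool({ max: 20 })

async function insertMany(ids) {
  // Each query is dispatched to whichever client is free, so up to
  // `max` statements can be in flight against the server at once.
  await Promise.all(
    ids.map((id) => pool.query('INSERT INTO items(id) VALUES($1)', [id]))
  )
  await pool.end()
}

insertMany([1, 2, 3]).catch(console.error)
```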
This is most likely wrong; the PostgreSQL documentation clearly states otherwise. It's most likely actually the case, because otherwise PostgreSQL would be horribly designed: it would make no sense whatsoever to read additional commands while processing one just for the sake of rejecting them. In fact, they even say so explicitly. Thus, the server doesn't seem to even look at the socket while processing queries, which means it can't even notice whether or not you waited for ReadyForQuery before sending further queries. You should probably reopen the issue unless this is somehow incorrect.
I know for a fact that libpq prevents you from sending more than 1 command at a time. Run this code:

```js
var Client = require('pg-native')
var client = new Client()
client.connectSync()
// the sleep call needs to be wrapped in SELECT to be a valid statement
client.query('SELECT pg_sleep(30)', function(err) {
  console.log('done 1')
})
client.query('SELECT pg_sleep(30)', function(err) {
  if (err) throw err
  console.log('done 2')
  client.end()
})
```

It seems like there is a way to dump multiple command statements into the socket without waiting for results, but I don't know if it works with parameterized queries, and more importantly - what does this actually help with? You're still going to be sending the same amount of data across the socket, and you have no way to properly deal with errors in earlier queries. In your example in particular you're still going to be waiting for the first query to finish before you get the second query's results.

I think there is room to do something special with an explicit API for when you want to blast queries at the server and ignore any errors encountered earlier, but it seems very edge-case, and since libpq doesn't support it I'm not going to add it to node-postgres directly. I think it could be done with a separate module in the way https://github.com/brianc/node-pg-query-stream is done, but it ain't gonna happen in node-postgres itself.

I stand by what I said earlier: sending a query & waiting for the response (in a non-blocking way) in no way nullifies any advantage of node.js. Matter of fact, the PostgreSQL team made sure you can only send 1 query at a time in a fully non-blocking way even in their official C client. I'd encourage you to build a separate module you can pass into `client.query`.
Also - you can actually chain multiple statements together in a single command & send it & get the results:

```js
var statements = 'INSERT INTO blah(id) VALUES(1); INSERT INTO something(age) VALUES(100);'
client.query(statements)
```

Still not generally a good idea due to error handling.
Maybe libpq does (just like your implementation right now), but I believe it's highly likely you can just pass multiple commands on the socket, since this library does not use libpq. And the main reason not to use libpq is to implement proper asynchronous semantics... The docs say it's possible, and it makes no sense for it to be impossible.
There's no need to ignore errors. Just queue every query that has been sent but not yet answered, and each time the server returns success or an error, pop the oldest one and call its callback.
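As an illustration, here is a minimal sketch of that in-flight queue; the `PipelineClient` name and the faked transport are hypothetical, not anything in node-postgres:

```js
// Hypothetical sketch: pipeline queries by remembering the order they were sent in.
function PipelineClient(send) {
  this.send = send     // writes a query to the socket (faked below)
  this.pending = []    // callbacks, in the order their queries were sent
}

PipelineClient.prototype.query = function (text, callback) {
  this.pending.push(callback)  // remember whose turn it is
  this.send(text)              // send immediately, without waiting
}

// The server answers strictly in the order the queries were sent, so each
// result (or error) belongs to the oldest pending callback.
PipelineClient.prototype.handleResult = function (err, rows) {
  var callback = this.pending.shift()
  callback(err, rows)
}

// Fake transport so the example runs standalone: every query "succeeds" after 10 ms.
var client = new PipelineClient(function (text) {
  setTimeout(function () { client.handleResult(null, [text]) }, 10)
})

client.query('SELECT 1', function (err, rows) { console.log('first:', rows) })
client.query('SELECT 2', function (err, rows) { console.log('second:', rows) })
```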
It helps with latency, obviously. Say the network round trip to the database takes 1 ms and you issue 10000 inserts or updates in parallel using async.parallel, Promise.all or similar. Because each query waits for the previous one to finish, those 10000 round trips happen one after another, so you spend 10 seconds on network latency alone, and that wait is completely unnecessary. Without the unnecessary round trips, the whole 10000 inserts would take much less than a second, so this implementation just made such an application unusably slow. This is ESSENTIAL for applications that do things like that, and useful for everything else (you might shave 10-100 ms or more per request on a typical web application if your database is on another machine and the application is properly written).
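To make the arithmetic concrete, a sketch of the pattern being described, using the same `pg.connect` style as the transaction example below; the `items` table and the 1 ms figure are assumptions:

```js
var pg = require('pg')

pg.connect(function (err, client, done) {
  if (err) throw err

  var ids = []
  for (var i = 0; i < 10000; i++) ids.push(i)

  // All 10000 queries are handed to the client "in parallel", but the client
  // only sends the next one after the previous result arrives, so with a 1 ms
  // round trip this takes roughly 10000 * 1 ms = 10 seconds of pure latency.
  Promise.all(ids.map(function (id) {
    return new Promise(function (resolve, reject) {
      client.query('INSERT INTO items(id) VALUES($1)', [id], function (err) {
        if (err) return reject(err)
        resolve()
      })
    })
  })).then(function () { done() }, function (err) { done(err) })
})
```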
I'm happy to review any pull requests you'd like to submit. 😄
Actually, it looks like a 2-line patch that removes the call to Query.submit from Client._pulseQueryQueue and adds a call to it in Client.query might be all that is needed. EDIT: actually, that probably breaks copyFrom; more sophistication is needed for that.
How would you support this:

```js
pg.connect(function(err, client, done) {
  client.query('BEGIN', function(err) {
    if(err) return done(err)
    client.query('INSERT INTO account(user_id, amount) VALUES(1, 100)', function(err) {
      if(err) return done(err)
      client.query('INSERT INTO account(user_id, amount) VALUES(2, -100)', function(err) {
        if(err) return done(err)
        client.query('COMMIT', done)
      })
    })
  })
})
```
In your example the queries are sent sequentially (i.e. you are waiting for the callback to be called before issuing the next query), so there is no difference. Anyway, see the pull request at #662
Can we reopen it since Postgres 14 implemented pipeline mode? https://www.postgresql.org/docs/14/libpq-pipeline-mode.html
Yeah, maybe we can open a new issue instead? This one is older than the hills. It's a good feature, though; I'd like to add it soon.
Hi Brianc, I opened #2646
This module waits to send a query until the results of previous queries arrive from PostgreSQL.
This is an issue that nullifies all advantages of using Node: the whole point of using Node is that you can easily make requests in parallel and receive answers asynchronously.
With this implementation, doing N queries results in N sequential database roundtrips, even if the queries are issued in parallel, when of course more than one roundtrip should only be needed if subsequent queries depend on the output of previous ones.
For instance, if you try to insert a million objects in parallel with this library in its current state, using separate queries, you will create a million unnecessary database roundtrips.
You can see this easily with the following code:
Just run it through strace to get this result:
In case it's not obvious, this is the expected output: