
implement pipelining mode allowed by libpq 14 #2646

Open
abenhamdine opened this issue Oct 31, 2021 · 20 comments

Comments

@abenhamdine
Contributor

abenhamdine commented Oct 31, 2021

libpq shipped with Postgres 14 now allows using pipelining mode
https://www.postgresql.org/docs/14/libpq-pipeline-mode.html

As stated by these docs:

While the pipeline API was introduced in PostgreSQL 14, it is a client-side feature which doesn't require special server support and works on any server that supports the v3 extended query protocol.

I don't know exactly when the v3 extended query protocol appeared (it dates back to PostgreSQL 7.4, released in 2003), but it has been around for a long time, so all supported versions of the pg server (9.6 and above) should be able to support pipeline mode.

The related commit in the postgres repository is here: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=acb7e4eb6b

For the record, a previous PR was proposed at #662, but it predated libpq support, so the PR was closed because it would have created a discrepancy between the native bindings (which use libpq) and the js client modified by the PR.

IMHO this mode should be opt-in, at least for the current major version:

  • because it's not so clear (to me at least) how some workflows are handled (what if you send a DDL query adding a column and then immediately a DML query using the new column?)
  • to avoid subtle breaking changes for users (especially on old versions of pg)
  • for memory consumption reasons (from the pg docs: Pipeline mode also generally consumes more memory on both the client and server)

edit : I made an (incomplete) attempt in this PR : #2706

@jadbox
Contributor

jadbox commented Feb 1, 2022

Any update on this?

@abenhamdine
Contributor Author

abenhamdine commented Feb 1, 2022

Any update on this?

I did some tests locally with a fork built upon #662 and noticed some interesting perf gains in my use cases (between 5% and 20%).
I will send a PR as a POC.
However it addresses only the js driver, not the native one.

@marcbachmann

marcbachmann commented Feb 8, 2022

Explicit pipelining would be interesting.
E.g. ioredis supports client.pipeline([['get', 'foo'], ['get', 'bar']]).exec().

pg could do the same to ensure the queries are sent on the same connection without any other slow query interfering.
e.g. const [[errA, resA], [errB, resB]] = await pool.pipeline([{text, values}, {text, values}]).

Or

const [{err: errA, rows: rowsA}, {err: errB, rows: rowsB}] = await pool.pipeline([{text, values}, {text, values}])
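A sketch of how such a pipeline helper could behave (the `pipeline` function here is hypothetical, not part of pg's API): submit every query up front on one client so the driver can queue them back-to-back on a single connection, then settle each result into an `[error, result]` pair:

```javascript
// Hypothetical pipeline helper: fire all queries before awaiting any,
// then map each settled promise to an [error, result] pair.
async function pipeline(client, queries) {
  const pending = queries.map((q) => client.query(q)); // no await between submits
  const settled = await Promise.allSettled(pending);
  return settled.map((r) =>
    r.status === 'fulfilled' ? [null, r.value] : [r.reason, null]
  );
}
```

Any object with a promise-returning `query()` works here, so the shape can be tried against a stub before wiring it into a real client.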

@abenhamdine
Contributor Author

Explicit pipelining would be interesting.

I don't get your point. If pipelining is possible, why would you want to enable it explicitly rather than always use it?

I can understand putting it behind a flag (e.g. --pipelining at the connection level) until we are sure it's safe for every use case, but I fail to see why it should be explicit in the API.

@marcbachmann

With transactions everything is executed on a single client, so that will work as desired.

But when using a pool, a pipeline's queries would get distributed among all the clients. In some cases you might want to execute them on the same client instance so other queries won't be scheduled in between. But maybe that's not worth implementing.

@abenhamdine
Contributor Author

abenhamdine commented Feb 16, 2022

But when using a pool, pipeline would get distributed among all the clients

Indeed, but that is the case with or without pipelining, so why would it be an issue? Or perhaps I'm missing something in your explanation?

EDIT:
oh sorry, it looks like I missed the important part of your post:

so other queries won't be scheduled in between

Yes, that's part of my concerns (also about mixing DML/DDL queries).
Perhaps indeed it should be possible to opt in (or opt out?) at the pool and/or client level.

@elchinoo

But when using a pool, pipeline would get distributed among all the clients. In some cases you might want to execute it on the same client instance so other queries won't be scheduled in between. But maybe that's not worth it to implement.

I'm not sure what the implementation in node would look like, but this should not be possible using libpq. I'm investigating this feature, and (I might be wrong) what I found is that this just allows batch dispatch transparently from the client side. The pipeline essentially makes the network file descriptor non-blocking and allows the client to submit multiple requests in a non-blocking way. All the requests go through the same SOCK_FD, and the client now needs to monitor the associated SOCK_FD (select, poll, epoll, io_uring, kqueue, etc.).

Not sure if the complexity it adds to the client side pays off. There are cases where it can improve performance by reducing network trip time, because the lib allows the client to push the requests to the lib and then call sync (a sort of flush), which sends everything to the server at the same time, incurring only 1 RTT instead of multiple. But the SOCK_FD used will be the same.

Another alternative is to call sync after every request (query). The calls will be asynchronous and won't block, but they will again go through the same SOCK_FD, and the responses, even though asynchronous, will be sequential. Let's say we have 5 queries A, B, C, D and E. The 1st takes 1 second, the 2nd 30 seconds, and all the subsequent ones take 1 second. We might expect to receive all the responses except query B within 1~2 seconds, but this won't happen because they are returned sequentially. It will be A=1s, B=31s, C=32s, etc...
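The head-of-line blocking described above can be sketched as a toy model, assuming responses are delivered strictly in submission order: each arrival time is the running sum of execution times, so one slow query delays every later result.

```javascript
// Toy model: on a pipelined connection, responses come back in submission
// order, so each query's result arrives at the cumulative sum of all
// execution times up to and including it.
function arrivalTimes(execSeconds) {
  let elapsed = 0;
  return execSeconds.map((s) => (elapsed += s));
}

// Queries A..E taking 1, 30, 1, 1, 1 seconds arrive at:
// arrivalTimes([1, 30, 1, 1, 1]) → [1, 31, 32, 33, 34]
```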

I just started investigating this feature in libpq and I might have missed something or not gotten to the nice features yet, but as I said, ATM the client implementation cost is quite high for the benefits it brings. It needs to handle the SOCK_FD, but apart from the RTT savings, I can't see much more.

@abenhamdine
Contributor Author

abenhamdine commented Feb 28, 2022

I'm not sure how the implementation in node would look like but this should not be possible using libpq. I'm investigating this feature, and (I might be wrong) but what I found is this is just allowing batch dispatch transparently from client side. The pipeline essentially makes the network file descriptor non-blocking and allow the client submit multiple requests in a non-blocking way. All the requests are going through the same SOCK_FD and the client needs now to monitor the associated SOCK_FD (select, poll, epoll, io_uring, kqueue, etc...).

Yes, pipelining essentially consists of reusing the same socket without waiting for the previous request's response.

Not sure if the complexity it adds to the client side pays off. There are cases it can improve performance reducing/saving network trip time because the lib allows the client to push the requests to the lib and then call sync (sort of flush) and it will send all to the server at the same time, having only 1 RTT instead of multiples. But the SOCK_FD used will be the same.

Saving network trips can provide significant perf gains, see the comments and benchmark in #2706 (nodejs implementation, but the saving of network trips would be the same I assume).

@elchinoo

elchinoo commented Feb 28, 2022

Saving network trips can provide significant perf gains, see the comments and benchmark in #2706 (nodejs implementation, but the saving of network trips would be the same I assume)

I have no doubt that on a high-latency network the gains could be significant, but I was skeptical about how much gain we could get otherwise. I'm still concerned about the complexity it may add to some client implementations, but after reading the points (and benchmark) in the cases you referred to, I'm starting to rethink it. Thanks for sharing!

@tapz

tapz commented Oct 16, 2023

Is the postgresql server executing the queries in parallel or one at a time when using a pipeline? When not using a connection pool, but a single connection, pipelining would be really useful. The best way would be to use it like in redis, but the response should be similar to Promise.allSettled.

const [{status, value}, {status, value}] = await client.pipeline().query(sql1).query(sql2).exec();
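A sketch of that chainable shape (the `pipeline()` builder is hypothetical, not part of pg's API): collect queries, run them all against one client, and settle them like Promise.allSettled does.

```javascript
// Hypothetical chainable pipeline builder: query() accumulates statements
// and returns the builder; exec() runs them all and settles each result
// into the { status, value } / { status, reason } shape of allSettled.
function pipelineBuilder(client) {
  const queries = [];
  return {
    query(sql) {
      queries.push(sql);
      return this; // enable chaining: .query(a).query(b).exec()
    },
    exec() {
      return Promise.allSettled(queries.map((q) => client.query(q)));
    },
  };
}
```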

@boromisp
Contributor

With pipelining you can reduce the idle time between queries on a single connection, by buffering the next query ahead of time.

The server will not execute multiple queries in parallel on the same connection, but whenever it finishes with one query, it won't have to wait a full network roundtrip for the next. This becomes significant if the query execution times are comparable to the network latency.
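A back-of-the-envelope model of that point (a simplification that assumes the next query is always buffered in time and ignores bandwidth): without pipelining each query pays a full round trip; with it, only the first one does.

```javascript
// Simplified wall-clock model for n queries of execMs each over a link
// with rttMs round-trip latency. Without pipelining, every query waits a
// full round trip; pipelined, the server runs back-to-back after one RTT.
function totalTimeMs(n, execMs, rttMs, pipelined) {
  return pipelined ? rttMs + n * execMs : n * (rttMs + execMs);
}

// 100 queries, 2 ms execution each, 10 ms round trip:
// totalTimeMs(100, 2, 10, false) → 1200
// totalTimeMs(100, 2, 10, true)  → 210
```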

@tapz

tapz commented Oct 16, 2023

It would be nice to run multiple queries in parallel over just one connection. Forking in the backend would probably be faster than the client opening multiple connections.

@abenhamdine
Contributor Author

Would be nice to run multiple queries in parallel with just one connection. Forking in the backend probably would be faster than client opening multiple connections.

Pipelining has nothing to do with query execution concurrency/parallelism. With or without pipeline mode, the PostgreSQL server executes the queries sequentially (while using its parallelism capabilities if enabled); pipelining just allows both sides of the connection to work concurrently when possible, and minimizes round-trip time.

@tapz

tapz commented Oct 16, 2023

@abenhamdine I said it would be nice. And it is only possible with pipelining.

@cesco69
Contributor

cesco69 commented Jan 23, 2024

Any news?

@abenhamdine
Contributor Author

No, unfortunately: I don't have time to finish the PR (and I didn't even look at the native driver)

@cesco69 cesco69 mentioned this issue Apr 10, 2024
@cesco69
Contributor

cesco69 commented Jun 10, 2024

For everyone who wants to test @abenhamdine's PR, here is a simple snippet that overrides pg to implement the #2706 changes.

Side note: I noticed a considerable improvement in performance.

import * as pg from 'pg';

// Monkey-patch adapted from #2706: submit queued queries to the connection
// without waiting for the previous response. Note that prototype-level
// defaults are shared across instances, so per-instance state is created
// lazily below before first use.
pg.Connection.prototype.submittedNamedStatements = {};

pg.Client.prototype.pipelining = true;
pg.Client.prototype.handshakeDone = false;

const original_handleReadyForQuery = pg.Client.prototype._handleReadyForQuery;
pg.Client.prototype._handleReadyForQuery = function () {
    // Don't start pipelining until the startup handshake has finished.
    this.handshakeDone = true;
    original_handleReadyForQuery.call(this);
};

pg.Client.prototype._pulseQueryQueue = function () {
    // Per-instance queue of queries already sent but not yet answered
    // (a prototype-level array would be shared by every client).
    if (!Object.prototype.hasOwnProperty.call(this, 'sentQueryQueue')) {
        this.sentQueryQueue = [];
    }

    if (this.pipelining) {
        if (!this.handshakeDone) {
            return
        }

        // Submit queued queries immediately instead of waiting for each
        // result; a "blocking" query pauses submission until the
        // connection has drained.
        while (!this.blocked || (this.activeQuery === null && this.sentQueryQueue.length === 0)) {
            const query = this.queryQueue.shift()
            if (!query) break
            const queryError = query.submit(this.connection)
            if (queryError) {
                process.nextTick(() => {
                    // Report the error on the failed query itself
                    // (this.activeQuery may still be null at this point).
                    query.handleError(queryError, this.connection)
                    this.readyForQuery = true
                    this._pulseQueryQueue()
                })
            }
            this.blocked = query.blocking
            this.sentQueryQueue.push(query)
            if (query.name) {
                this.connection.submittedNamedStatements[query.name] = query.text
            }
        }
    }

    if (this.readyForQuery === true) {
        // In pipelining mode the next active query was already sent above;
        // otherwise submit it now, one at a time.
        this.activeQuery = this.pipelining ? this.sentQueryQueue.shift() : this.queryQueue.shift()

        if (this.activeQuery) {
            this.readyForQuery = false
            this.hasExecuted = true

            if (!this.pipelining) {
                const queryError = this.activeQuery.submit(this.connection)
                if (queryError) {
                    process.nextTick(() => {
                        this.activeQuery.handleError(queryError, this.connection)
                        this.readyForQuery = true
                        this._pulseQueryQueue()
                    })
                }
            }
        } else if (this.hasExecuted) {
            this.activeQuery = null
            this.emit('drain')
        }
    }
};
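With the override applied, the intended usage pattern is to issue queries without awaiting each one, so the driver can queue them back-to-back on the single connection. The same pattern, shown against a stub client since a real pg.Client needs a live server:

```javascript
// Usage pattern a pipelining client benefits from: fire all queries first
// (no await between submits), then await the batch together.
async function runBatch(client, texts) {
  const pending = texts.map((t) => client.query(t)); // all submitted up front
  return Promise.all(pending);
}
```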

@mcollina

@brianc I’d like to resume this work. Would you accept a PR?

@cesco69
Contributor

cesco69 commented Jan 13, 2025

@mcollina

@brianc doesn't seem very active; perhaps it's better to draw the attention of @charmander (see #2646 (comment))

@brianc
Owner

brianc commented Jan 13, 2025

I'm here! Yeah I'm very much open to PRs all the time. Sometimes a bit delayed because life is...life...but I'm still here. :)
