Skip to content

Massive performance difference between pg and pg-native #1993

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
gajus opened this issue Oct 20, 2019 · 36 comments
Closed

Massive performance difference between pg and pg-native #1993

gajus opened this issue Oct 20, 2019 · 36 comments

Comments

@gajus
Copy link
Contributor

gajus commented Oct 20, 2019

I have been trying to get to the bottom of this issue #1952 and along the way I have discovered that the latter issue is not present when using pg-native. However, what I also discovered is that there is massive performance difference between pg and pg-native. Without any change to the workload, and only by changing pg with pg-native, my database CPU usage dropped from an avg. 70% to an avg. 35%.

I suggest that it would be made more clear in the documentation that there is a considerable performance difference between using pg and pg-native. Switching from pg to pg-native will allow us to downgrade our server and save a significant amount of money.

In terms of what changes, as far as I can tell, the biggest difference is how much time is spent in ClientRead state. pg-native compared to pg spends a lot less time in ClientRead state. This appears to imply slow (?) parsing or some other client-side issue handling the response/ sending parameter values.

@gajus
Copy link
Contributor Author

gajus commented Oct 21, 2019

Screenshot 2019-10-21 at 01 46 28

Screenshot 2019-10-21 at 01 46 21

Here is what a switch from pg-native to pg looks over an hour's use.

@vitaly-t
Copy link
Contributor

vitaly-t commented Oct 21, 2019

This would depend a lot on type, size and frequency of data being exchanged, and even Node.js version. For example, if you are exchanging a lot of binary objects, it may have a significant impact. It would be difficult to generalize for documentation. And for an average use case the performance difference is negligible.

@gajus
Copy link
Contributor Author

gajus commented Oct 21, 2019

90%+ of the queries are simple primary key lookups returning couple of columns, e.g.

SELECT id FROM country WHERE code_alpha_2 = $1

I would think that this is the case for most applications.

There is no binary data being read/ written.

@vitaly-t
Copy link
Contributor

vitaly-t commented Oct 21, 2019

And what is the average number of rows returned in that test? If it is too many, the JavaScript version may simply be eating more memory due to more native JavaScript serialization calls between garbage collections.

The latter would depend a lot on the Node.js version, and can also be profiled via Node.js parameters.

@gajus
Copy link
Contributor Author

gajus commented Oct 21, 2019

Looking at the top queries in the pg_stat_statements, most of them return 1 or none rows.

maybeOne and maybeOneFirst are the most used query methods.

Are there any real downsides to using pg-native apart from:

Some of the modules using advanced features of PostgreSQL such as pg-query-stream, pg-cursor,and pg-copy-streams need to operate directly on the binary stream and therefore are incompatible with the native bindings.

?

@vitaly-t
Copy link
Contributor

vitaly-t commented Oct 21, 2019

Interesting. If what you are telling is accurate, then it contradicts great many tests conducted here by many developers, including myself.

There may be something special in how or where your app runs that manifests such a performance discrepancy.

Are there any real downsides to using pg-native apart from...

It can be a real installation pain sometimes 😄 Other than that, NO.

@gajus
Copy link
Contributor Author

gajus commented Oct 22, 2019

Is there a benchmark suite I can run against my infrastructure? Happy to contribute whatever helps to identify the underlying cause.

It can be a real installation pain sometimes 😄 Other than that, NO.

Wasn't an issue using a standard Docker node image.

@murukesh
Copy link

@gajus @brianc @vitaly-t I am experiencing the same problem with node-postgres driver too. Posted it here : #1952 (comment)

@brianc
Copy link
Owner

brianc commented Dec 19, 2019

I think I might have a fix - would you be willing to try it out by running a branch of this? I'd like to confirm it before releasing if possible. It shouldn't negatively impact things. I noticed about 200 queries per second difference on a small quick benchmark I wrote locally.

The difference being I'm testing against a local postgres instance...my hunch is it might be faster when testing against an instance over a longer/slower network.

@murukesh
Copy link

please share the branch

@brianc
Copy link
Owner

brianc commented Dec 19, 2019

@murukesh #2031

Would you be able to pull that down & see if it helps? I have a couple other ideas too...I'm experimenting w/ stuff over here

@ghost
Copy link

ghost commented May 9, 2020

Hi,
How did you config pg-native and pg?
Could you Plz, share your configs? @gajus

@tinder-ydong
Copy link

Hi,
How did you config pg-native and pg?
Could you Plz, share your configs? @gajus

add pg-native to package.json as a dependency, then add this to your environment variables
https://github.com/brianc/node-postgres/blob/master/packages/pg/lib/index.js#L33

@jfirebaugh
Copy link

We were seeing a lot of time spent in the ClientRead state on pg 7.18.2, but it mostly disappeared after upgrading to 8.5.1. I don't know if there were specific fixes targeted at this -- if so, thanks! In any case, the upgrade seems to have resolved the issue.

@ohjeyong
Copy link

For me, it is completely solved by upgrading to 8.5.1

@bfelbo
Copy link

bfelbo commented Jan 29, 2022

Does anyone know if there's still a substantial performance difference between pg and pg-native on the latest version?

@gajus
Copy link
Contributor Author

gajus commented Jan 29, 2022

OP: No, there is not. We have since switched back to using JavaScript driver and have scaled our program a lot more. pg driver itself was never the bottleneck.

@bfelbo
Copy link

bfelbo commented Jan 30, 2022

That’s really useful to know, thanks!

What made you switch back? Are there any benefits to using the JS driver if you don’t use any of the advanced features?

@gajus
Copy link
Contributor Author

gajus commented Jan 30, 2022

I cannot recall the specifics, but we did hit several issues with the native driver and debugging them was always a mystery. JavaScript driver has a lot more maintainers and has since released many patches.

@mgagliardo91
Copy link

Any update on this?

@valentinoPereira
Copy link

Should we continue using the pgNative or switch to the non-native javascript driver? What are the use cases for the existence both of them?

@gajus
Copy link
Contributor Author

gajus commented Mar 12, 2024

pg-native has not been updated for 2 years. Safe to say it is not in active development.

@brianc
Copy link
Owner

brianc commented Mar 12, 2024

I'm coming back to work on node-postgres full time in about a month. Been building a startup and now I'm done w/ that & going to spend 3-6 months (at least) just doing open source. I know it doesn't solve the immediate need but hope you find some encouragement in that I intend to update pg-native along with many other things 🔜 . In the mean time....I'm sorry 😢

Also, not sure why the client read issue...seems like it might be gone? Happy to dive into that next week as well.

@rishi-raj-jain
Copy link

Does anyone know if there's still a substantial performance difference between pg and pg-native on the latest version?

I still see it. Switching to pg-native just magically improves noticeable perf.

@melroy89
Copy link

pg-native has not been updated for 2 years. Safe to say it is not in active development.

Not true.. https://www.npmjs.com/package/pg-native latest update is 1 month ago.

@gajus
Copy link
Contributor Author

gajus commented Feb 12, 2025

@brianc To the best of your knowledge, is there a benefit to preferring pg-native currently over the JavaScript implementation? Speaking strictly from the performance perspective.

@brianc
Copy link
Owner

brianc commented Feb 12, 2025

Speaking strictly from the performance perspective.

Hmm I haven't done an apples to apples benchmark pitting the two against eachother in quite a while. I think it depends probably somewhat on the version of node you're using as I am not sure what the performance implications are for crossing the C++/JavaScript boundary. I can run a couple very rough benchmarks here and let ya know what I see...but probably best bet is to check them both in your own app code & see?

I will say, fwiw, I rarely use pg-native in production....pg has always been fast enough & I like to limit native deps just because there are occasionally shenanigans with installing them depending on the system.

I'll run a quick "no parameters" and "parameterized" benchmark. The other difference I could see is the "cold-start connection" speed could be different - the initial handshake between client and server takes significantly longer than actually running a query & getting the result...and its possible pg-native is better there. 🤷‍♂

I'll do a couple benches here & post what I see...

@brianc
Copy link
Owner

brianc commented Feb 12, 2025

Okay so...for what it's worth:

my package.json

{
  "name": "pgbenchasdf",
  "version": "1.0.0",
  "main": "index.js",
  "license": "MIT",
  "dependencies": {
    "pg": "^8.13.2",
    "pg-native": "^3.2.1"
  }
}

my index.js:

const pg = process.env.PGNATIVE ? require("pg").native : require("pg");

const { Client } = pg;

async function bench(name, benchFunc) {
  const client = new Client();
  await client.connect();

  // warm-up
  for (let i = 0; i < 50; i++) {
    await benchFunc(client);
  }

  // run 10 loops of 100 queries, measure the time of each inner loop
  const runtimes = [];
  for (let i = 0; i < 10; i++) {
    const start = Date.now();
    for (let j = 0; j < 100; j++) {
      await benchFunc(client);
    }
    // collect all the runtimes in an array
    runtimes.push(Date.now() - start);
  }

  // print the average runtime
  const avgRuntime = runtimes.reduce((a, b) => a + b, 0) / runtimes.length;
  console.log(`${name}: ${avgRuntime}ms avg (${runtimes.join("ms, ")}ms)`);

  await client.end();
}

async function main() {
  await bench("simple", async (client) => {
    await client.query("SELECT NOW()");
  });

  await bench("large result", async (client) => {
    await client.query("SELECT * FROM generate_series(1, 10000)");
  });

  await bench("parameters", async (client) => {
    await client.query("SELECT $1::text, $2::text", ["hello", "world"]);
  });
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

My results:

√ ~/src/pgbench % PGNATIVE=true node .
simple: 6.3ms avg (10ms, 9ms, 5ms, 6ms, 5ms, 6ms, 5ms, 5ms, 6ms, 6ms)
large result: 458.7ms avg (390ms, 502ms, 448ms, 453ms, 509ms, 444ms, 446ms, 444ms, 487ms, 464ms)
parameters: 5.9ms avg (9ms, 8ms, 7ms, 5ms, 5ms, 5ms, 5ms, 5ms, 5ms, 5ms)
√ ~/src/pgbench % node .              
simple: 6.1ms avg (10ms, 9ms, 6ms, 5ms, 6ms, 6ms, 5ms, 5ms, 5ms, 4ms)
large result: 277.6ms avg (277ms, 283ms, 283ms, 273ms, 277ms, 276ms, 282ms, 273ms, 274ms, 278ms)
parameters: 6.1ms avg (10ms, 8ms, 5ms, 6ms, 5ms, 7ms, 5ms, 5ms, 5ms, 5ms)
√ ~/src/pgbench % node --version
v20.10.0

Looks like, to me, pg-native is slower on larger result sets. At least on my particular system with this version of node! Hope this helps? I think the tl;dr is they're pretty close. Not saying either version is as fast as it could ever possibly be...but I did do quite some perf work on non pg-native maybe 1-2 years ago now? Got some pretty big wins at the time.

@gajus
Copy link
Contributor Author

gajus commented Feb 12, 2025

Thank you Brian.

I am doing a real-world stress test and will share the findings.

@brianc
Copy link
Owner

brianc commented Feb 12, 2025 via email

@gajus
Copy link
Contributor Author

gajus commented Feb 12, 2025

Okay, so for context we have 2,000+ tests that cover a broad spectrum of scenarios (basically, everything that's possible to do with https://contra.com/). I would expect that the completion time of these tests is a somewhat accurate approximation of how the performance would even out across real-user sessions.

I replaced pg with pg-native and ran tests to see how this would affect test completion times.

shard pg-native pg
1 3m 40s 3m 3s
2 4m 5s 3m 33s
3 4m 22s 3m 39s
4 4m 25s 3m 37s
5 5m 10s 3m 13s

Somewhat to my surprise, pg-native underperforms on every shard.

Interestingly, in another test set (vitest vs playwright), pg-native did almost the same and sometimes better. However, the difference is ~2%.

@gajus
Copy link
Contributor Author

gajus commented Feb 12, 2025

Didn't realize I am the author of the original issue.

I think it it is safe to close.

Based on the most recent tests, it looks like pg is as good if not better than pg-native.

@gajus gajus closed this as completed Feb 12, 2025
@brianc
Copy link
Owner

brianc commented Feb 12, 2025

Thank you so much!!!

@charmander
Copy link
Collaborator

Maybe it’s drastic, but if they’re close to performance parity now: any thoughts on deprecating pg-native in a future major version of pg? It doubles the maintenance burden for many parts of the library, it has the usual added complexities of native bindings, and I know it’s full of subtle inconsistencies/bugs on rarer code paths compared to the JS implementation. @brianc

@brianc
Copy link
Owner

brianc commented Feb 13, 2025

That sounds kinda nice, actually. So looking at npm stats, pg was installed back of the napkin 6.5 million times a week over the past month. pg-native was installed 90k times per week. I think probably best thing would be to spin pg-native back out of the mono-repo...maybe put it in with node-libpq since they're pretty married together, and then just be open to someone maintaining it or taking it over. The API could split away from pg itself at that point or 🤷 ... The only thing I know for sure is there are a few edge-case things pg-native might handle better like binary encoding on the result records instead of text encoding, but I dunno if there's even an option exposed to enable that. Again...separate library would make those things still available to anyone who wanted/needed them but we personally wouldn't be on the hook for maintaining it. I know from my own experience I kinda always dread working on node-libpq because its like "oh boy c++ build dependencies, how fun!"

@brianc
Copy link
Owner

brianc commented Feb 13, 2025

There are a few other things too like not being able to properly stream results with node-libpq / pg-native. There are ways in the C of libpq to do those things but exposing it was beyond my time & skill level at the time - so not everything pg can do (cursors, query-stream, copy-streams) can be done by pg-native.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests