A throughput experiment demonstrates significant variation with specific maxUnconfirmedMessages values #750
-
I posted this question on the Google group but did not get any feedback, so let's try it here too. While testing streams with 4.1.0 and Java client 1.0.0 I found some strange behavior. I have a performance test where I publish 10,000 messages to a stream and measure the throughput (messages/second). The stream, the environment, and the producer are recreated before each run. The producer uses all defaults, except for a different maxUnconfirmedMessages setting in each run. I run the same test series twice, once with 64 bytes per message and once with 2048 bytes per message. The broker is a single node running on Kubernetes in Docker Desktop; I run my JUnit test natively on Windows from my IDE. I expect the throughput to increase as I use bigger values for maxUnconfirmedMessages, and that is mostly the trend; however, there are some very strange outliers. maxUnconfirmedMessages=10 seems to always produce almost exactly 100 messages per second, while 9 or 11 is two orders of magnitude higher, with both 64- and 2048-byte messages. The 64-byte message throughput also shows a dip at maxUnconfirmedMessages=100 (but not at 99 or 101), and there are some weird variations around 1000, see below. Can someone explain to me what is going on here?
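For concreteness, here is a minimal sketch of the kind of test described, assuming the RabbitMQ Stream Java client; the URI, stream name, and timing approach are illustrative placeholders, not the actual test code:

```java
import com.rabbitmq.stream.Environment;
import com.rabbitmq.stream.Producer;

import java.util.concurrent.CountDownLatch;

public class StreamThroughputSketch {

    public static void main(String[] args) throws Exception {
        int messageCount = 10_000;
        byte[] body = new byte[64]; // 2048 in the second series

        try (Environment environment = Environment.builder()
                .uri("rabbitmq-stream://localhost:5552") // placeholder URI
                .build()) {

            environment.streamCreator().stream("throughput-test").create();

            Producer producer = environment.producerBuilder()
                    .stream("throughput-test")
                    .maxUnconfirmedMessages(10) // the value varied between runs
                    .build();

            CountDownLatch confirmed = new CountDownLatch(messageCount);
            long start = System.nanoTime();
            for (int i = 0; i < messageCount; i++) {
                producer.send(
                        producer.messageBuilder().addData(body).build(),
                        status -> confirmed.countDown());
            }
            confirmed.await(); // wait until the broker confirms everything
            double seconds = (System.nanoTime() - start) / 1_000_000_000.0;
            System.out.printf("%.0f messages/second%n", messageCount / seconds);

            producer.close();
        }
    }
}
```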
-
@pvaiko not with a single spreadsheet screenshot as the problem definition. I don't know what this test does, but the numbers are much lower than what I get on a several-year-old machine with a fast SSD, 8 cores, and no containers or VMs between Stream PerfTest and the hardware. java -jar stream-perf-test.jar produces numbers well over a million messages per second.
Therefore I conclude that your environment is resource-constrained for what even a single stream can do. Throughput variation for streams can be directly linked to any I/O throughput variation in the container and does not depend on the maximum number of unconfirmed messages alone. That variable is probably not isolated nearly as much as you expect it to be. In fact, I would not expect the maximum number of unconfirmed messages to matter much; there are likely significantly more important factors, such as I/O throughput, since streams are relatively CPU-light but very I/O-heavy.
-
So to make this a fairer comparison, I have tried a few more runs on a 10-core machine (from early 2022). I don't know what @pvaiko's test does, but let's assume that Stream PerfTest's --confirms setting is a reasonable equivalent. The default of that setting is 10K, which already tells you what our team believes to be a reasonable number, given that Stream PerfTest can approach 1.5-2M messages a second.

--confirms 999
java -jar stream-perf-test.jar --confirms=999 produces:
--confirms 1000
java -jar stream-perf-test.jar --confirms=1000 produces:
--confirms 1001
java -jar stream-perf-test.jar --confirms=1001 produces:
--confirms 20000
java -jar stream-perf-test.jar --confirms=20000 produces:
Conclusion
Stream PerfTest does not reproduce this kind of variability in an environment where a reasonably fast three-year-old SSD is used and there isn't much disk I/O variability. The problem can be in the benchmark; benchmarks are difficult to get right, as we still sometimes discover with Stream PerfTest and PerfTest even after a decade of experience. Or it can be in disk or network I/O variability in @pvaiko's environment. The more virtualization or containerization layers there are, the greater and less predictable that variability can be.
-
Have you tried with 10?
And then with 9:
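Given the runs shown above, these were presumably java -jar stream-perf-test.jar --confirms=10 and java -jar stream-perf-test.jar --confirms=9.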
-
I have filed #751; we'll see what @acogoluegnes thinks. In any case, using values higher than 100 is a perfectly acceptable, if not recommended, thing to do, so I'm moving on.
There's also ProducerBuilder#dynamicBatch, which is not enabled by default but can be used if you care more about throughput than latency.
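A sketch of opting in, reusing the environment and the placeholder stream name from the earlier example:

```java
// A sketch, not a recommended configuration: opt into dynamic batching,
// trading some publish latency for potentially higher throughput.
// "throughput-test" is the placeholder stream name from the earlier example.
Producer producer = environment.producerBuilder()
        .stream("throughput-test")
        .dynamicBatch(true)
        .build();
```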