Use BitSet instead of LinkedHashSet to improve the performance of SlotRange construction #2525

daihuabin · 2023-03-08T05:03:16Z

Here are the JMH test results for BitSet and LinkedHashSet

Benchmark              (slotSize)  (startFrom)   Mode  Cnt     Score     Error   Units
Scratch.bitSet                100          100  thrpt   10  6584.879 ± 178.558  ops/ms
Scratch.bitSet                100         3000  thrpt   10  5546.809 ± 216.817  ops/ms
Scratch.bitSet                100         6000  thrpt   10  4024.224 ± 111.127  ops/ms
Scratch.bitSet               3000          100  thrpt   10   197.832 ±  27.294  ops/ms
Scratch.bitSet               3000         3000  thrpt   10   201.136 ±  32.989  ops/ms
Scratch.bitSet               3000         6000  thrpt   10   203.528 ±  23.295  ops/ms
Scratch.bitSet               6000          100  thrpt   10   101.227 ±  14.600  ops/ms
Scratch.bitSet               6000         3000  thrpt   10   101.928 ±  19.558  ops/ms
Scratch.bitSet               6000         6000  thrpt   10   100.366 ±  17.370  ops/ms
Scratch.linkedHashSet         100          100  thrpt   10  1556.557 ±  95.599  ops/ms
Scratch.linkedHashSet         100         3000  thrpt   10  1553.883 ±  40.297  ops/ms
Scratch.linkedHashSet         100         6000  thrpt   10  1563.915 ±  29.968  ops/ms
Scratch.linkedHashSet        3000          100  thrpt   10    62.365 ±   2.632  ops/ms
Scratch.linkedHashSet        3000         3000  thrpt   10    61.362 ±   3.366  ops/ms
Scratch.linkedHashSet        3000         6000  thrpt   10    61.484 ±   4.668  ops/ms
Scratch.linkedHashSet        6000          100  thrpt   10    29.789 ±   1.172  ops/ms
Scratch.linkedHashSet        6000         3000  thrpt   10    29.868 ±   1.360  ops/ms
Scratch.linkedHashSet        6000         6000  thrpt   10    29.746 ±   1.843  ops/ms

Here is the JMH test code

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.BitSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@Fork(value = 1)
@Warmup(iterations = 1)
@Measurement(iterations = 10, time = 1)
@Threads(Threads.MAX)
public class Scratch {

    Set<Integer> slots;

    @Param({"100", "3000", "6000"})
    int slotSize;

    @Param({"100", "3000", "6000"})
    int startFrom;

    @Setup(Level.Iteration)
    public void init() {
        slots = new LinkedHashSet<>();
        for (int i = startFrom; i < startFrom + slotSize; i++) {
            slots.add(i);
        }
    }

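    @Benchmark
    public LinkedHashSet<Integer> linkedHashSet() {
        LinkedHashSet<Integer> integers = new LinkedHashSet<>();
        for (Integer slot : slots) {
            integers.add(slot);
        }
        // Return the result so the JIT cannot discard the loop as dead code.
        return integers;
    }

    @Benchmark
    public BitSet bitSet() {
        BitSet set = new BitSet();
        for (Integer slot : slots) {
            set.set(slot);
        }
        // Return the result so the JIT cannot discard the loop as dead code.
        return set;
    }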

    public static void main(String[] args) throws RunnerException {
        Options options = new OptionsBuilder().include(Scratch.class.getSimpleName()).build();
        new Runner(options).run();
    }
}

You have read the Spring Data contribution guidelines.
You use the code formatters provided here and have them applied to your changes. Don’t submit any formatting related changes.
You submit test cases (unit or integration tests) that back your changes.
You added yourself as author in the headers of the classes you touched. Amend the date range in the Apache license header if needed. For new types, add the license header (copy from another file and set the current year only).


jxblum · 2023-06-15T02:47:39Z

First, there is little question that a BitSet will outperform a LinkedHashSet in this use case, not only in throughput but also in memory consumption, particularly since the purpose of RedisClusterNode.SlotRange is simply to track which "slots" a particular Redis cluster node is responsible for when the data set stored in Redis is sharded (partitioned) across a cluster.
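As a rough illustration of that point, here is a minimal sketch of a slot range backed by a BitSet. The class name BitSetSlotRange and its methods are hypothetical and are not the actual RedisClusterNode.SlotRange implementation; the only outside fact assumed is that a Redis Cluster has 16384 hash slots.

```java
import java.util.BitSet;

// Hypothetical sketch, not the actual RedisClusterNode.SlotRange implementation.
final class BitSetSlotRange {

    // Redis Cluster divides the key space into 16384 hash slots.
    static final int SLOT_COUNT = 16384;

    private final BitSet slots = new BitSet(SLOT_COUNT);

    // Marks every slot in [lowerBound, upperBound] (both inclusive) as served.
    BitSetSlotRange(int lowerBound, int upperBound) {
        // BitSet.set(from, to) treats 'to' as exclusive, hence the + 1.
        slots.set(lowerBound, upperBound + 1);
    }

    // A single bit test: no boxing to Integer and no hashing.
    boolean contains(int slot) {
        return slot >= 0 && slots.get(slot);
    }

    // Number of slots tracked by this range.
    int size() {
        return slots.cardinality();
    }

    public static void main(String[] args) {
        BitSetSlotRange range = new BitSetSlotRange(0, 5460);
        System.out.println(range.contains(100));
        System.out.println(range.contains(6000));
        System.out.println(range.size());
    }
}
```

A BitSet sized for all 16384 slots is backed by 256 long values (2 KiB), whereas a LinkedHashSet<Integer> allocates a separate entry object for every slot it holds.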

However, a "throughput" measurement for adding slots to the Set that internally tracks which slots a Redis cluster node manages (performed only once, on startup of a clustered connection) is less interesting than measuring the "average (access) time" of determining whether a particular Redis cluster node manages a specific (hashed) slot. That lookup cost is a function of the size of the Redis cluster, since not all nodes manage all slots, and the lookup provides essential information in certain key operations.

Therefore, I created another benchmark to measure the "average time" of calling the RedisClusterNode.SlotRange.contains(slot) method, which is called by the RedisClusterNode.servesSlot(slot) method.

The RedisClusterNode.servesSlot(..) method is used when assessing the topology of the cluster, in Spring Data Redis's ClusterTopology class. In particular, it is used to determine the master node managing the slot that contains a key, which is essential during write operations to ensure a degree of consistency when the value of the key changes, most likely in a concurrent environment.

It is also used to determine all the nodes in the cluster that serve a particular slot. This is useful during reads in a replicated, clustered environment in order to improve read performance (both throughput and latency).
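The two lookups described above can be sketched roughly as follows. This is an illustration only: TopologySketch, Node, masterForSlot and nodesServingSlot are hypothetical names and do not mirror the actual ClusterTopology API, and the sketch assumes that a replica reports the same slots as its master.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch of slot-based node lookup; not the actual ClusterTopology class.
final class TopologySketch {

    // Minimal stand-in for a cluster node: an id, a role flag, and the slots it serves.
    static final class Node {
        final String id;
        final boolean master;
        final BitSet slots = new BitSet();

        Node(String id, boolean master, int firstSlot, int lastSlot) {
            this.id = id;
            this.master = master;
            this.slots.set(firstSlot, lastSlot + 1); // upper bound is exclusive
        }

        boolean servesSlot(int slot) {
            return slot >= 0 && slots.get(slot);
        }
    }

    private final List<Node> nodes;

    TopologySketch(List<Node> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    // Write path: the first master serving the slot, or null if none does.
    Node masterForSlot(int slot) {
        for (Node node : nodes) {
            if (node.master && node.servesSlot(slot)) {
                return node;
            }
        }
        return null;
    }

    // Read path: every node (master or replica) serving the slot.
    List<Node> nodesServingSlot(int slot) {
        List<Node> result = new ArrayList<>();
        for (Node node : nodes) {
            if (node.servesSlot(slot)) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("master-a", true, 0, 8191));
        nodes.add(new Node("master-b", true, 8192, 16383));
        nodes.add(new Node("replica-a", false, 0, 8191)); // assumed to mirror master-a
        TopologySketch topology = new TopologySketch(nodes);
        System.out.println(topology.masterForSlot(100).id);
        System.out.println(topology.nodesServingSlot(100).size());
    }
}
```

Each lookup calls servesSlot once per node, so the cost of a single contains check is multiplied by the number of nodes in the cluster.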

jxblum · 2023-06-15T02:54:29Z

My findings are consistent with the expected outcome: there is a sizable performance improvement when using a BitSet to assess whether a Redis cluster node manages a slot. By comparison:

Using LinkedHashSet:

Benchmark                                                         (iterations)  Mode  Cnt   Score    Error  Units
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          2000  avgt    3   6.716 ± 23.909  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          3000  avgt    3  10.528 ± 11.847  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          5000  avgt    3  21.953 ± 64.360  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot         10000  avgt    3  58.908 ± 13.313  us/op

Using BitSet:

Benchmark                                                         (iterations)  Mode  Cnt  Score   Error  Units
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          2000  avgt    3  0.899 ± 0.266  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          3000  avgt    3  1.345 ± 0.041  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          5000  avgt    3  2.327 ± 0.896  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot         10000  avgt    3  4.597 ± 4.279  us/op
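To put the two tables side by side, the following small sketch divides the LinkedHashSet score by the BitSet score for each iterations value. The class name ContainsSpeedup is hypothetical, and the numbers are copied from the tables above. Only the Score column is used; with Cnt = 3 the error margins are wide (for LinkedHashSet some exceed the score itself), so the ratios should be read as indicative only.

```java
// Hypothetical helper that compares the Score columns of the two tables above.
final class ContainsSpeedup {

    static final int[] ITERATIONS = {2000, 3000, 5000, 10000};

    // Average time in us/op, copied from the benchmark output.
    static final double[] LINKED_HASH_SET_US = {6.716, 10.528, 21.953, 58.908};
    static final double[] BIT_SET_US = {0.899, 1.345, 2.327, 4.597};

    // How many times longer the LinkedHashSet run took than the BitSet run.
    static double ratio(int index) {
        return LINKED_HASH_SET_US[index] / BIT_SET_US[index];
    }

    public static void main(String[] args) {
        for (int i = 0; i < ITERATIONS.length; i++) {
            System.out.printf("%5d iterations: %.2fx%n", ITERATIONS[i], ratio(i));
        }
    }
}
```

On these scores the ratio is roughly 7.5x at 2000 iterations and roughly 12.8x at 10000 iterations.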


jxblum · 2023-06-15T03:03:59Z

Of course, using a BitSet should also improve the startup time of a Spring Data Redis application on average, to a degree that depends on environmental factors, but again this is a function of the cluster size and the number of configured slots.


spring-projects-issues added the status: waiting-for-triage label Mar 8, 2023

daihuabin force-pushed the main branch from 29a1810 to 7a883b2 Mar 9, 2023 01:12

Commit 0ea5127: Use BitSet instead of LinkedHashSet to improve the performance of SlotRange construction

daihuabin force-pushed the main branch from 7a883b2 to 0ea5127 Mar 9, 2023 11:18

daihuabin closed this Mar 9, 2023

daihuabin reopened this Mar 9, 2023

mp911de added the type: enhancement label and removed the status: waiting-for-triage label Mar 14, 2023

mp911de assigned jxblum Mar 27, 2023

jxblum force-pushed the main branch from 6afe1e0 to f1492e1 Jun 14, 2023 19:55

jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 14, 2023
4da95f5: Polish.

jxblum added this to the 3.2 M1 (2023.1.0) milestone Jun 15, 2023

jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
44157e4: Prepare topic branch for PR spring-projects#2525.

jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
5ce3f63: Polish.

jxblum pushed a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
1d293d8: Use BitSet instead of LinkedHashSet to improve the performance of SlotRange construction.

jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
8f58c1f: Polish.

jxblum pushed a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
1ed5421: Use BitSet instead of LinkedHashSet to improve the performance of SlotRange construction.

jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
2b21d93: Polish.

jxblum modified the milestones: 3.2 M1 (2023.1.0), 3.1.1 (2023.0.1) Jun 15, 2023

jxblum closed this Jun 15, 2023
