Use BitSet instead of LinkedHashSet to improve the performance of SlotRange construction #2525

Closed

daihuabin
Contributor

Here are the JMH benchmark results for BitSet and LinkedHashSet (throughput of populating each set; higher is better):

Benchmark              (slotSize)  (startFrom)   Mode  Cnt     Score     Error   Units
Scratch.bitSet                100          100  thrpt   10  6584.879 ± 178.558  ops/ms
Scratch.bitSet                100         3000  thrpt   10  5546.809 ± 216.817  ops/ms
Scratch.bitSet                100         6000  thrpt   10  4024.224 ± 111.127  ops/ms
Scratch.bitSet               3000          100  thrpt   10   197.832 ±  27.294  ops/ms
Scratch.bitSet               3000         3000  thrpt   10   201.136 ±  32.989  ops/ms
Scratch.bitSet               3000         6000  thrpt   10   203.528 ±  23.295  ops/ms
Scratch.bitSet               6000          100  thrpt   10   101.227 ±  14.600  ops/ms
Scratch.bitSet               6000         3000  thrpt   10   101.928 ±  19.558  ops/ms
Scratch.bitSet               6000         6000  thrpt   10   100.366 ±  17.370  ops/ms
Scratch.linkedHashSet         100          100  thrpt   10  1556.557 ±  95.599  ops/ms
Scratch.linkedHashSet         100         3000  thrpt   10  1553.883 ±  40.297  ops/ms
Scratch.linkedHashSet         100         6000  thrpt   10  1563.915 ±  29.968  ops/ms
Scratch.linkedHashSet        3000          100  thrpt   10    62.365 ±   2.632  ops/ms
Scratch.linkedHashSet        3000         3000  thrpt   10    61.362 ±   3.366  ops/ms
Scratch.linkedHashSet        3000         6000  thrpt   10    61.484 ±   4.668  ops/ms
Scratch.linkedHashSet        6000          100  thrpt   10    29.789 ±   1.172  ops/ms
Scratch.linkedHashSet        6000         3000  thrpt   10    29.868 ±   1.360  ops/ms
Scratch.linkedHashSet        6000         6000  thrpt   10    29.746 ±   1.843  ops/ms

Here is the JMH benchmark code:

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.BitSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@Fork(value = 1)
@Warmup(iterations = 1)
@Measurement(iterations = 10, time = 1)
@Threads(Threads.MAX)
public class Scratch {

    Set<Integer> slots;

    @Param({"100", "3000", "6000"})
    int slotSize;

    @Param({"100", "3000", "6000"})
    int startFrom;

    @Setup(Level.Iteration)
    public void init() {
        slots = new LinkedHashSet<>();
        for (int i = startFrom; i < startFrom + slotSize; i++) {
            slots.add(i);
        }
    }

    @Benchmark
    public void linkedHashSet() {
        LinkedHashSet<Integer> integers = new LinkedHashSet<>();
        for (Integer slot : slots) {
            integers.add(slot);
        }
    }

    @Benchmark
    public void bitSet() {
        BitSet set = new BitSet();
        for (Integer slot : slots) {
            set.set(slot);
        }
    }

    public static void main(String[] args) throws RunnerException {
        Options options = new OptionsBuilder().include(Scratch.class.getSimpleName()).build();
        new Runner(options).run();
    }
}
  • You have read the Spring Data contribution guidelines.
  • You use the code formatters provided here and have them applied to your changes. Don’t submit any formatting related changes.
  • You submit test cases (unit or integration tests) that back your changes.
  • You added yourself as author in the headers of the classes you touched. Amend the date range in the Apache license header if needed. For new types, add the license header (copy from another file and set the current year only).

@daihuabin daihuabin closed this Mar 9, 2023
@daihuabin daihuabin reopened this Mar 9, 2023
@mp911de mp911de added the type: enhancement label and removed the status: waiting-for-triage label Mar 14, 2023
jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 14, 2023
@jxblum
Contributor

jxblum commented Jun 15, 2023

First, there is little question that a BitSet will outperform a LinkedHashSet in this use case, not only in throughput but also in memory consumption. The sole purpose of RedisClusterNode.SlotRange is to track which "slots" a particular Redis cluster node is responsible for when the data set stored in Redis is sharded (partitioned) across a cluster.
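To make the memory point concrete, here is a minimal sketch. The class name SlotFootprintSketch is hypothetical and not part of Spring Data Redis; the only outside fact it relies on is that Redis Cluster uses 16384 hash slots. It counts the 64-bit words a BitSet needs to mark every slot. A LinkedHashSet, by contrast, keeps one linked map entry and one Integer reference per element, so its footprint grows with the number of slots stored rather than staying at a fixed number of words.

```java
import java.util.BitSet;

// Hypothetical demo class for illustration; not part of Spring Data Redis.
final class SlotFootprintSketch {

    // Redis Cluster divides the key space into 16384 hash slots (0..16383).
    static final int SLOT_COUNT = 16384;

    /** Returns the number of 64-bit words a BitSet uses to mark every slot. */
    static int wordsForAllSlots() {
        BitSet slots = new BitSet(SLOT_COUNT);
        slots.set(0, SLOT_COUNT); // mark slots 0..16383 as served
        return slots.toLongArray().length;
    }

    public static void main(String[] args) {
        int words = wordsForAllSlots();
        // 16384 bits / 64 bits per long = 256 longs = 2048 bytes of bit storage
        System.out.println(words + " longs, " + (words * Long.BYTES) + " bytes of bit storage");
    }
}
```

The figure covers only the bit storage itself, not the object headers of the BitSet or its backing array.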

That said, a "throughput" measurement for adding slots to the Set that tracks which slots a Redis cluster node manages is less interesting, because that population happens only once, on startup of a clustered connection. More interesting is the "average (access) time" for determining whether a particular Redis cluster node manages a specific (hashed) slot. That cost is a function of the size of the Redis cluster, since not all nodes manage all slots, and the lookup is essential to certain key operations.

Therefore, I created another benchmark to measure the "average time" of calling the RedisClusterNode.SlotRange.contains(slot) method, which is called by the RedisClusterNode.servesSlot(slot) method.
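For readers unfamiliar with the lookup being measured, the following is a rough sketch of what a BitSet-backed slot range with a contains(slot) check could look like. BitSetSlotRangeSketch is a hypothetical stand-in, not the actual RedisClusterNode.SlotRange implementation; its constructor and method shapes are assumptions made for illustration.

```java
import java.util.BitSet;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for RedisClusterNode.SlotRange; the real class may differ.
final class BitSetSlotRangeSketch {

    private final BitSet range = new BitSet();

    BitSetSlotRangeSketch(Collection<Integer> slots) {
        for (Integer slot : slots) {
            this.range.set(slot); // one bit per served slot
        }
    }

    /** Constant-time membership test: no hashing, no boxing. */
    boolean contains(int slot) {
        // BitSet.get throws for negative indexes, so guard against them here.
        return slot >= 0 && this.range.get(slot);
    }

    /** Rebuilds an ordered Set view for callers that still expect a Set. */
    Set<Integer> getSlots() {
        Set<Integer> slots = new LinkedHashSet<>();
        for (int slot = range.nextSetBit(0); slot >= 0; slot = range.nextSetBit(slot + 1)) {
            slots.add(slot);
        }
        return slots;
    }
}
```

A call such as new BitSetSlotRangeSketch(Arrays.asList(100, 101, 5000)).contains(101) reads a single bit, whereas the LinkedHashSet variant has to box the argument, hash it, and walk a bucket.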

The RedisClusterNode.servesSlot(..) method is used when assessing the topology of the cluster, in Spring Data Redis's ClusterTopology class. In particular, it determines the master node managing the slot that contains a key, which is essential during write operations to ensure a degree of consistency when the value of the key changes, most likely in a concurrent environment.

It is also used to determine all the nodes in the cluster that serve a particular slot. This is useful during reads in a replicated, clustered environment, where it can improve read performance (both throughput and latency).
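The two topology lookups described above can be sketched as follows. This is an illustration only: TopologyLookupSketch, its nested Node type, and both method names are hypothetical and do not reproduce the ClusterTopology API. The point is that each lookup calls servesSlot once per node, so the cost of that check is multiplied by the cluster size.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration; not the Spring Data Redis ClusterTopology class.
final class TopologyLookupSketch {

    static final class Node {
        final String id;
        final boolean master;
        final BitSet slots = new BitSet();

        Node(String id, boolean master, int fromSlot, int toSlotInclusive) {
            this.id = id;
            this.master = master;
            this.slots.set(fromSlot, toSlotInclusive + 1); // upper bound is exclusive
        }

        boolean servesSlot(int slot) {
            return slot >= 0 && this.slots.get(slot);
        }
    }

    /** All nodes (masters and replicas) serving the slot, e.g. for reads. */
    static List<Node> nodesServingSlot(List<Node> nodes, int slot) {
        List<Node> result = new ArrayList<>();
        for (Node node : nodes) {
            if (node.servesSlot(slot)) {
                result.add(node);
            }
        }
        return result;
    }

    /** The master responsible for the slot, e.g. for writes. */
    static Optional<Node> masterForSlot(List<Node> nodes, int slot) {
        for (Node node : nodes) {
            if (node.master && node.servesSlot(slot)) {
                return Optional.of(node);
            }
        }
        return Optional.empty();
    }
}
```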

@jxblum
Contributor

jxblum commented Jun 15, 2023

My findings are consistent with the expected outcome: there is a sizable improvement in performance when using a BitSet to assess whether a Redis cluster node is managing a slot. By comparison (average time per operation; lower is better):

Using LinkedHashSet:

Benchmark                                                         (iterations)  Mode  Cnt   Score    Error  Units
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          2000  avgt    3   6.716 ± 23.909  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          3000  avgt    3  10.528 ± 11.847  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          5000  avgt    3  21.953 ± 64.360  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot         10000  avgt    3  58.908 ± 13.313  us/op

Using BitSet:

Benchmark                                                         (iterations)  Mode  Cnt  Score   Error  Units
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          2000  avgt    3  0.899 ± 0.266  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          3000  avgt    3  1.345 ± 0.041  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot          5000  avgt    3  2.327 ± 0.896  us/op
RedisClusterNodeSlotRangeBenchmarks.measureSlotRangeContainsSlot         10000  avgt    3  4.597 ± 4.279  us/op

@jxblum jxblum added this to the 3.2 M1 (2023.1.0) milestone Jun 15, 2023
jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
jxblum pushed a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
@jxblum
Contributor

jxblum commented Jun 15, 2023

Of course, using a BitSet should also improve the startup time of a Spring Data Redis application on average. The degree depends on environmental factors and, again, is a function of the cluster size and the number of configured slots.

jxblum pushed a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
jxblum added a commit to jxblum/spring-data-redis that referenced this pull request Jun 15, 2023
@jxblum jxblum closed this Jun 15, 2023