Commit 62d60ea

Benchmark duplicates implementation
1 parent 564b48f commit 62d60ea

2 files changed: +179 -0 lines changed

bench/duplicates.exs

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+defmodule Bench do
+  def naive_uniq(enum) do
+    Enum.uniq(enum -- Enum.uniq(enum))
+  end
+
+  def naive_freq(enum) do
+    for {x, c} when c > 1 <- Enum.frequencies(enum), do: x
+  end
+
+  def duplicates_body(enum) when is_list(enum) do
+    list_duplicates(enum, %{})
+  end
+
+  defp list_duplicates([], _map), do: []
+
+  defp list_duplicates([x | xs], map) do
+    case map do
+      %{^x => true} -> list_duplicates(xs, map)
+      %{^x => false} -> [x | list_duplicates(xs, Map.put(map, x, true))]
+      _ -> list_duplicates(xs, Map.put(map, x, false))
+    end
+  end
+
+  def duplicates_tail(enum) when is_list(enum) do
+    list_duplicates(enum, [], %{})
+  end
+
+  defp list_duplicates([], acc, _map), do: :lists.reverse(acc)
+
+  defp list_duplicates([x | xs], acc, map) do
+    case map do
+      %{^x => true} -> list_duplicates(xs, acc, map)
+      %{^x => false} -> list_duplicates(xs, [x | acc], Map.put(map, x, true))
+      _ -> list_duplicates(xs, acc, Map.put(map, x, false))
+    end
+  end
+end
+
+inputs = [
+  {"no duplicate", Enum.to_list(1..100)},
+  {"half duplicates", Enum.concat(1..50, 50..1//-1)},
+  {"many same duplicates", List.duplicate(1, 50) ++ List.duplicate(2, 50)},
+  {"big no duplicate", Enum.to_list(1..100_000)}
+]
+
+Benchee.run(
+  %{
+    "naive_uniq" => &Bench.naive_uniq/1,
+    "naive_frequencies" => &Bench.naive_freq/1,
+    "duplicates (body recursive)" => &Bench.duplicates_body/1,
+    "duplicates (tail recursive)" => &Bench.duplicates_tail/1
+  },
+  inputs: inputs,
+  memory_time: 0.5
+)
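
The benchmark only compares speed and memory, so it can help to confirm that the approaches return the same duplicates before reading the numbers. The following is a minimal sketch, not part of the commit: the module name `DuplicatesCheck` and the helper `walk/3` are hypothetical, and the logic is restated from the benchmark script above. The frequencies-based variant is sorted here because map iteration order is not guaranteed.

```elixir
# Hypothetical cross-check module; it restates the benchmarked approaches
# so that their results can be compared on a small input.
defmodule DuplicatesCheck do
  # Drop the first occurrence of every element, then deduplicate the rest.
  def naive_uniq(list), do: Enum.uniq(list -- Enum.uniq(list))

  # Keep every element counted more than once.
  # Sorted because map iteration order is not guaranteed.
  def naive_freq(list) do
    dups = for {x, c} when c > 1 <- Enum.frequencies(list), do: x
    Enum.sort(dups)
  end

  # Tail-recursive walk: the map holds `false` for an element seen once
  # and `true` once it has already been reported as a duplicate.
  def duplicates(list) when is_list(list), do: walk(list, [], %{})

  defp walk([], acc, _seen), do: :lists.reverse(acc)

  defp walk([x | xs], acc, seen) do
    case seen do
      %{^x => true} -> walk(xs, acc, seen)
      %{^x => false} -> walk(xs, [x | acc], Map.put(seen, x, true))
      _ -> walk(xs, acc, Map.put(seen, x, false))
    end
  end
end

sample = [1, 2, 1, 3, 2, 1]
IO.inspect(DuplicatesCheck.duplicates(sample))  # [1, 2]
IO.inspect(DuplicatesCheck.naive_uniq(sample))  # [1, 2]
IO.inspect(DuplicatesCheck.naive_freq(sample))  # [1, 2]
```

On this input all three agree. The walk and `naive_uniq/1` both report duplicates in order of their second occurrence, while the frequencies-based result has no inherent order.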

bench/duplicates.results.txt

Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
+Operating System: macOS
+CPU Information: Apple M1
+Number of Available Cores: 8
+Available memory: 16 GB
+Elixir 1.16.1
+Erlang 27.0-rc1
+
+Benchmark suite executing with the following configuration:
+warmup: 2 s
+time: 5 s
+memory time: 500 ms
+reduction time: 0 ns
+parallel: 1
+inputs: no duplicate, half duplicates, many same duplicates, big no duplicate
+Estimated total run time: 2 min
+
+Benchmarking duplicates (body recursive) with input no duplicate ...
+Benchmarking duplicates (body recursive) with input half duplicates ...
+Benchmarking duplicates (body recursive) with input many same duplicates ...
+Benchmarking duplicates (body recursive) with input big no duplicate ...
+Benchmarking duplicates (tail recursive) with input no duplicate ...
+Benchmarking duplicates (tail recursive) with input half duplicates ...
+Benchmarking duplicates (tail recursive) with input many same duplicates ...
+Benchmarking duplicates (tail recursive) with input big no duplicate ...
+Benchmarking naive_frequencies with input no duplicate ...
+Benchmarking naive_frequencies with input half duplicates ...
+Benchmarking naive_frequencies with input many same duplicates ...
+Benchmarking naive_frequencies with input big no duplicate ...
+Benchmarking naive_uniq with input no duplicate ...
+Benchmarking naive_uniq with input half duplicates ...
+Benchmarking naive_uniq with input many same duplicates ...
+Benchmarking naive_uniq with input big no duplicate ...
+
+##### With input no duplicate #####
+Name                                 ips      average     deviation       median       99th %
+duplicates (body recursive)     137.85 K      7.25 μs      ±215.59%      5.88 μs     35.33 μs
+duplicates (tail recursive)     131.41 K      7.61 μs      ±142.89%         6 μs     40.25 μs
+naive_frequencies                96.16 K     10.40 μs      ±260.72%      7.54 μs     64.83 μs
+naive_uniq                       75.68 K     13.21 μs      ±200.23%     10.71 μs     57.96 μs
+
+Comparison:
+duplicates (body recursive)     137.85 K
+duplicates (tail recursive)     131.41 K - 1.05x slower +0.36 μs
+naive_frequencies                96.16 K - 1.43x slower +3.15 μs
+naive_uniq                       75.68 K - 1.82x slower +5.96 μs
+
+Memory usage statistics:
+
+Name                            Memory usage
+duplicates (body recursive)          1.18 KB
+duplicates (tail recursive)          1.18 KB - 1.00x memory usage +0 KB
+naive_frequencies                    6.67 KB - 5.66x memory usage +5.49 KB
+naive_uniq                          21.34 KB - 18.09x memory usage +20.16 KB
+
+**All measurements for memory usage were the same**
+
+##### With input half duplicates #####
+Name                                 ips      average     deviation       median       99th %
+naive_frequencies               134.38 K      7.44 μs      ±210.18%      6.63 μs     25.71 μs
+duplicates (tail recursive)     125.87 K      7.94 μs      ±348.23%      6.08 μs     46.55 μs
+duplicates (body recursive)     122.51 K      8.16 μs      ±434.69%      6.21 μs     44.67 μs
+naive_uniq                       73.02 K     13.69 μs      ±191.92%      9.92 μs     78.25 μs
+
+Comparison:
+naive_frequencies               134.38 K
+duplicates (tail recursive)     125.87 K - 1.07x slower +0.50 μs
+duplicates (body recursive)     122.51 K - 1.10x slower +0.72 μs
+naive_uniq                       73.02 K - 1.84x slower +6.25 μs
+
+Memory usage statistics:
+
+Name                            Memory usage
+naive_frequencies                    5.50 KB
+duplicates (tail recursive)         12.49 KB - 2.27x memory usage +6.99 KB
+duplicates (body recursive)         11.90 KB - 2.16x memory usage +6.40 KB
+naive_uniq                          24.45 KB - 4.45x memory usage +18.95 KB
+
+**All measurements for memory usage were the same**
+
+##### With input many same duplicates #####
+Name                                 ips      average     deviation       median       99th %
+duplicates (tail recursive)       2.39 M      0.42 μs    ±12190.07%      0.29 μs      0.71 μs
+duplicates (body recursive)       1.81 M      0.55 μs    ±12732.96%      0.33 μs      0.79 μs
+naive_frequencies                 0.64 M      1.55 μs     ±2239.50%      1.13 μs      5.42 μs
+naive_uniq                        0.47 M      2.12 μs     ±1067.78%      1.33 μs     17.50 μs
+
+Comparison:
+duplicates (tail recursive)       2.39 M
+duplicates (body recursive)       1.81 M - 1.32x slower +0.135 μs
+naive_frequencies                 0.64 M - 3.71x slower +1.13 μs
+naive_uniq                        0.47 M - 5.07x slower +1.70 μs
+
+Memory usage statistics:
+
+Name                            Memory usage
+duplicates (tail recursive)          0.25 KB
+duplicates (body recursive)          0.22 KB - 0.88x memory usage -0.03125 KB
+naive_frequencies                    1.36 KB - 5.44x memory usage +1.11 KB
+naive_uniq                           1.06 KB - 4.25x memory usage +0.81 KB
+
+**All measurements for memory usage were the same**
+
+##### With input big no duplicate #####
+Name                                 ips      average     deviation       median       99th %
+duplicates (tail recursive)        32.56     30.71 ms       ±16.51%     30.20 ms     45.23 ms
+duplicates (body recursive)        23.97     41.71 ms       ±38.19%     40.87 ms    102.08 ms
+naive_frequencies                  21.51     46.49 ms       ±25.28%     46.34 ms     77.20 ms
+naive_uniq                         12.67     78.90 ms       ±21.18%     74.63 ms    162.62 ms
+
+Comparison:
+duplicates (tail recursive)        32.56
+duplicates (body recursive)        23.97 - 1.36x slower +11.00 ms
+naive_frequencies                  21.51 - 1.51x slower +15.78 ms
+naive_uniq                         12.67 - 2.57x slower +48.19 ms
+
+Memory usage statistics:
+
+Name                            Memory usage
+duplicates (tail recursive)         23.95 MB
+duplicates (body recursive)         23.95 MB - 1.00x memory usage +0 MB
+naive_frequencies                   39.04 MB - 1.63x memory usage +15.09 MB
+naive_uniq                          50.53 MB - 2.11x memory usage +26.59 MB
+
+**All measurements for memory usage were the same**
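
The "x slower" and "+... μs" figures in the Comparison blocks can be checked against the ips and average columns. The sketch below recomputes them for the "no duplicate" input from the rounded values printed in the table. `ComparisonCheck` is a hypothetical helper, not a Benchee function, and it assumes the ratio is reference ips divided by candidate ips and the offset is the difference of the averages. Benchee itself presumably works from unrounded measurements, so small rounding differences are possible on other rows.

```elixir
# Hypothetical helper; it only re-derives the printed comparison figures.
defmodule ComparisonCheck do
  # Each row is {name, ips, average}; ips and average must use the same
  # units for both rows (here: thousands per second and microseconds).
  def compare({_ref_name, ref_ips, ref_avg}, {name, ips, avg}) do
    {name, Float.round(ref_ips / ips, 2), Float.round(avg - ref_avg, 2)}
  end
end

# Rounded figures copied from the "no duplicate" table above.
reference = {"duplicates (body recursive)", 137.85, 7.25}

others = [
  {"duplicates (tail recursive)", 131.41, 7.61},
  {"naive_frequencies", 96.16, 10.40},
  {"naive_uniq", 75.68, 13.21}
]

for row <- others do
  {name, slower, delta} = ComparisonCheck.compare(reference, row)
  IO.puts("#{name}: #{slower}x slower, +#{delta} μs")
end
```

For these three rows the recomputed ratios (1.05, 1.43, 1.82) and offsets (0.36, 3.15, 5.96 μs) match the Comparison block.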
