Commit 2d59211

Merge pull request #832 from scalacenter/new-collections-performance
Add blog article about the new collections performance
2 parents 3b0fbab + d324021 commit 2d59211

4 files changed: 145 additions and 0 deletions

@@ -0,0 +1,145 @@
---
layout: blog-detail
post-type: blog
by: Julien Richard-Foy
title: On Performance of the New Collections
---

In a [previous blog post](/blog/2017/11/28/view-based-collections.html), I explained
how [Scala 2.13’s new collections](http://www.scala-lang.org/blog/2017/02/28/collections-rework.html)
have been designed so that the default implementations of transformation operations work
with both strict and non-strict types of collections. In essence, we abstract over
the evaluation mode (strict or non-strict) of concrete collection types.

After we published that blog post, the community
[raised concerns](https://www.reddit.com/r/scala/comments/7g52cy/let_them_be_lazy/dqgol36/)
about possible performance implications of having more levels of abstraction than before.

This blog article:

- gives more information about the overhead of the collections’
  view-based design and our solution to remove that overhead,
- argues that for correctness reasons it is still better to have
  view-based default implementations,
- shows that we should expect the new collections to be equally fast
  or faster than the old collections, and reports an average speedup
  of 35% in the case of `Vector`’s `filter`, `map` and `flatMap`.

For reference, the source code of the new collections is available in
[this GitHub repository](https://github.com/scala/collection-strawman).

## Overhead Of View Based Implementations

Let’s be clear: the view-based implementations are in general slower than their
builder-based versions. How much slower exactly varies with the type of collection
(e.g. `List`, `Vector`, `Set`), the operation (e.g. `map`, `flatMap`, `filter`)
and the number of elements in the collection. In my benchmark on `Vector`, on
the `map`, `filter` and `flatMap` operations, with 1 to 7 million
elements, I measured an average slowdown of 25%.
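To make the two strategies concrete, here is a minimal sketch written against the released Scala 2.13 standard library rather than the strawman sources. The helper names `mapViaView` and `mapViaBuilder` are hypothetical and only illustrate the shape of each approach; they are not the library's actual implementations, and the snippet is meant to be pasted into a REPL or script.

```scala
// Hypothetical helper (not library code): describe the transformation
// as a lazy view, then materialise that view into a Vector.
def mapViaView[A, B](xs: Vector[A])(f: A => B): Vector[B] =
  Vector.from(xs.view.map(f))

// Hypothetical helper (not library code): apply f eagerly and append
// each result to a builder.
def mapViaBuilder[A, B](xs: Vector[A])(f: A => B): Vector[B] = {
  val b = Vector.newBuilder[B]
  val it = xs.iterator
  while (it.hasNext) {
    b += f(it.next())
  }
  b.result()
}

val input = Vector(1, 2, 3)
val viaView = mapViaView(input)(_ * 2)
val viaBuilder = mapViaBuilder(input)(_ * 2)
```

Both helpers return the same `Vector`; they differ only in how the result is produced. The view version goes through an additional layer of wrapping before the elements reach the target collection, which is the kind of extra abstraction the concerns above were about.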

## How To Fix That Performance Regression?

Our solution is simply to go back to builder-based implementations for strict collections: we
override the default view-based implementations with more efficient builder-based
ones. We actually end up with the same implementations as in the old collections.

In practice these implementations are factored out into traits that can be mixed
into concrete collection types. The names of such traits are always prefixed with
`StrictOptimized`. For instance, here is an excerpt of the `StrictOptimizedIterableOps`
trait:

~~~ scala
trait StrictOptimizedIterableOps[+A, +CC[_], +C] extends IterableOps[A, CC, C] {

  // Builder-based override of the view-based default: elements are
  // transformed eagerly and appended to a builder.
  override def map[B](f: A => B): CC[B] = {
    val b = iterableFactory.newBuilder[B]()
    val it = iterator()
    while (it.hasNext) {
      b += f(it.next())
    }
    b.result()
  }

}
~~~

Then, to implement the `Vector` collection, we just mix in such a “strict optimized” trait:

~~~ scala
trait Vector[+A] extends IndexedSeq[A]
  with IndexedSeqOps[A, Vector, Vector[A]]
  with StrictOptimizedSeqOps[A, Vector, Vector[A]]
~~~

Here we use `StrictOptimizedSeqOps`, which is a specialization of `StrictOptimizedIterableOps`
for `Seq` collections.

## Is The View Based Design Worth It?

In my previous article, I explained a drawback of the old builder-based design.
On non-strict collections (e.g. `Stream` or `View`), we had to carefully override all the
default implementations of transformation operations to make them non-strict.

Now it seems that the situation is just reversed: the default implementations work well
with non-strict collections, but we have to override them in strict collections.

So, is the new design worth it? To answer this question I will quote a comment posted
by Stefan Zeiger [here](https://www.reddit.com/r/scala/comments/7g52cy/let_them_be_lazy/dqixt8d/):

> The lazy-by-default approach is mostly beneficial when you're implementing lazy
> collections because you don't have to override pretty much everything or get
> incorrect semantics. The reverse risk is smaller: If you don't override a lazy
> implementation for a strict collection type you only suffer a small performance
> impact but it's still correct.

In short, implementations are **correct first** in the new design, but you might want to
override them for performance reasons on strict collections.
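The correctness argument can be illustrated with a small sketch against the released Scala 2.13 standard library, where `LazyList` plays the role of a non-strict collection. The counter `evaluated` and the function `traced` are hypothetical names introduced only for this illustration, and the snippet is meant to be pasted into a REPL or script.

```scala
// Counts how many times the mapping function has run (illustrative only).
var evaluated = 0
def traced(x: Int): Int = { evaluated += 1; x * 2 }

// Non-strict: mapping an infinite LazyList must not force its elements.
val doubled = LazyList.from(1).map(traced)
val evaluatedBeforeForcing = evaluated

// Elements are computed only when something asks for them.
val firstThree = doubled.take(3).toList

// Strict: mapping a Vector runs the function on every element right away.
evaluated = 0
val strictDoubled = Vector(1, 2, 3).map(traced)
val evaluatedByStrictMap = evaluated
```

A builder-based `map` applied to the infinite `LazyList` above would try to traverse every element and never return, which is the kind of incorrect semantics the quote refers to. Applying a view-based `map` to the `Vector` would still give the right result, only with some overhead.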

## Performance Comparison With 2.12’s Collections

Speaking of performance, how do the new collections compare to the old ones?

Again, the answer depends on the type of collection, the operation and the number of elements.
My `Vector` benchmarks show a 35% speedup on average:

![](/resources/img/blog/new-collections-performance-filter.png)

![](/resources/img/blog/new-collections-performance-map.png)

![](/resources/img/blog/new-collections-performance-flatMap.png)

These charts show the speedup factor (vertically) of the `filter`, `map` and `flatMap`
operations compared to the old `Vector`, for various numbers of elements (horizontally).
The blue line shows the old `Vector`,
the red line shows the new `Vector` if it used only view-based
implementations, and the yellow line shows the actual new `Vector`
(with strict optimized implementations). Benchmark source code and numbers can be found
[here](https://gist.github.com/julienrf/f1cb2b062cd9783a35e2f35778959c76).

Since the operation implementations end up being the same, why do we get better performance
at all? These numbers are specific to `Vector` and the tested operations, and they
are due to the fact that
we more aggressively inlined a few critical methods. I don’t expect the new collections
to be *always* faster than the old collections. However, there is no reason for
them to be slower since the execution path, when calling an operation, can be made
exactly the same as in the old collections.

## Conclusion

This article studied the performance of the new collections. I’ve reported that view-based
operation implementations are about 25% slower than builder-based implementations,
and I’ve explained how we restored builder-based implementations on strict collections.
Last but not least, I’ve shown that defaulting to view-based implementations does
make sense for the sake of correctness.

I expect the new collections to be equally fast or slightly faster than the previous collections.
Indeed, we took advantage of the rewrite to apply some more optimizations here and
there.

More significant performance improvements can be achieved by using different
data structures. For instance, we recently
[merged](https://github.com/scala/collection-strawman/pull/342)
a completely new implementation of immutable `Set` and `Map` based on [compressed
hash-array mapped prefix-trees](https://michael.steindorfer.name/publications/oopsla15.pdf).
This data structure has a smaller memory footprint than the old `HashSet` and `HashMap`,
and some operations can be an order of magnitude faster (e.g. `==` is up to 7x faster).