Add code tabs for collections-2.13/sets, immutables and mutables #2572

Merged
merged 5 commits into from
Oct 4, 2022
208 changes: 151 additions & 57 deletions _overviews/collections-2.13/concrete-immutable-collection-classes.md
@@ -24,25 +24,42 @@ A [LazyList](https://www.scala-lang.org/api/{{ site.scala-version }}/scala/colle

Whereas lists are constructed with the `::` operator, lazy lists are constructed with the similar-looking `#::`. Here is a simple example of a lazy list containing the integers 1, 2, and 3:

scala> val lazyList = 1 #:: 2 #:: 3 #:: LazyList.empty
lazyList: scala.collection.immutable.LazyList[Int] = LazyList(<not computed>)
{% tabs LazyList_1 %}
{% tab 'Scala 2 and 3' for=LazyList_1 %}
~~~scala
scala> val lazyList = 1 #:: 2 #:: 3 #:: LazyList.empty
lazyList: scala.collection.immutable.LazyList[Int] = LazyList(<not computed>)
~~~
{% endtab %}
{% endtabs %}

The head of this lazy list is 1, and the tail of it has 2 and 3. None of the elements are printed here, though, because the list
hasn’t been computed yet! Lazy lists are specified to compute lazily, and the `toString` method of a lazy list is careful not to force any extra evaluation.
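
The laziness described above can be observed directly. The following is a minimal sketch, assuming Scala 2.13 or later; the object name `LazyListToStringDemo` and its fields are illustrative only and are not part of the original text.

```scala
object LazyListToStringDemo {
  val lazyList: LazyList[Int] = 1 #:: 2 #:: 3 #:: LazyList.empty

  // Rendering the list does not force any element.
  val before: String = lazyList.toString

  // `force` evaluates every element and returns the same lazy list.
  val after: String = lazyList.force.toString
}
```

Here `before` is captured while nothing has been computed, and `after` is captured once `force` has evaluated the whole list.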

Below is a more complex example. It computes a lazy list that contains a Fibonacci sequence starting with the given two numbers. A Fibonacci sequence is one where each element is the sum of the previous two elements in the series.


scala> def fibFrom(a: Int, b: Int): LazyList[Int] = a #:: fibFrom(b, a + b)
fibFrom: (a: Int,b: Int)LazyList[Int]
{% tabs LazyList_2 %}
{% tab 'Scala 2 and 3' for=LazyList_2 %}
~~~scala
scala> def fibFrom(a: Int, b: Int): LazyList[Int] = a #:: fibFrom(b, a + b)
fibFrom: (a: Int,b: Int)LazyList[Int]
~~~
{% endtab %}
{% endtabs %}

This function is deceptively simple. The first element of the sequence is clearly `a`, and the rest of the sequence is the Fibonacci sequence starting with `b` followed by `a + b`. The tricky part is computing this sequence without causing an infinite recursion. If the function used `::` instead of `#::`, then every call to the function would result in another call, thus causing an infinite recursion. Since it uses `#::`, though, the right-hand side is not evaluated until it is requested.
Here are the first few elements of the Fibonacci sequence starting with two ones:

scala> val fibs = fibFrom(1, 1).take(7)
fibs: scala.collection.immutable.LazyList[Int] = LazyList(<not computed>)
scala> fibs.toList
res9: List[Int] = List(1, 1, 2, 3, 5, 8, 13)
{% tabs LazyList_3 %}
{% tab 'Scala 2 and 3' for=LazyList_3 %}
~~~scala
scala> val fibs = fibFrom(1, 1).take(7)
fibs: scala.collection.immutable.LazyList[Int] = LazyList(<not computed>)
scala> fibs.toList
res9: List[Int] = List(1, 1, 2, 3, 5, 8, 13)
~~~
{% endtab %}
{% endtabs %}
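
Because the tail is only evaluated on demand, the same function can answer queries whose length is not known in advance. A small sketch, assuming Scala 2.13 or later; `FibDemo` and its field names are hypothetical:

```scala
object FibDemo {
  // Same definition as above: `#::` defers evaluation of its right-hand side.
  def fibFrom(a: Int, b: Int): LazyList[Int] = a #:: fibFrom(b, a + b)

  // The list is conceptually infinite; only the demanded prefix is computed.
  val firstSeven: List[Int] = fibFrom(1, 1).take(7).toList
  val below50: List[Int] = fibFrom(1, 1).takeWhile(_ < 50).toList
  val firstOver100: Int = fibFrom(1, 1).dropWhile(_ <= 100).head
}
```

Each query stops as soon as its condition is satisfied, so none of them runs forever even though `fibFrom` has no base case.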

## Immutable ArraySeqs

@@ -56,24 +73,32 @@ and thus they can be much more convenient to write.

ArraySeqs are built and updated just like any other sequence.

~~~
{% tabs ArraySeq_1 %}
{% tab 'Scala 2 and 3' for=ArraySeq_1 %}
~~~scala
scala> val arr = scala.collection.immutable.ArraySeq(1, 2, 3)
arr: scala.collection.immutable.ArraySeq[Int] = ArraySeq(1, 2, 3)
scala> val arr2 = arr :+ 4
arr2: scala.collection.immutable.ArraySeq[Int] = ArraySeq(1, 2, 3, 4)
scala> arr2(0)
res22: Int = 1
~~~
{% endtab %}
{% endtabs %}

ArraySeqs are immutable, so you cannot change an element in place. However, the `updated`, `appended` and `prepended`
operations create new ArraySeqs that differ from a given ArraySeq only in a single element:

~~~
{% tabs ArraySeq_2 %}
{% tab 'Scala 2 and 3' for=ArraySeq_2 %}
~~~scala
scala> arr.updated(2, 4)
res26: scala.collection.immutable.ArraySeq[Int] = ArraySeq(1, 2, 4)
scala> arr
res27: scala.collection.immutable.ArraySeq[Int] = ArraySeq(1, 2, 3)
~~~
{% endtab %}
{% endtabs %}

As the last line above shows, a call to `updated` has no effect on the original ArraySeq `arr`.
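
The other two operations mentioned above behave the same way. A brief sketch, assuming Scala 2.13 or later; the object and field names are illustrative:

```scala
import scala.collection.immutable.ArraySeq

object ArraySeqDemo {
  val arr: ArraySeq[Int] = ArraySeq(1, 2, 3)

  // Each call returns a new ArraySeq and leaves `arr` untouched.
  val withLast: ArraySeq[Int] = arr.appended(4)    // same as arr :+ 4
  val withFirst: ArraySeq[Int] = arr.prepended(0)  // same as 0 +: arr
  val replaced: ArraySeq[Int] = arr.updated(2, 4)
}
```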

@@ -91,67 +116,115 @@ but linear for `ArraySeq`, and, conversely, indexed access is constant for `Arra

Vectors are built and modified just like any other sequence.

scala> val vec = scala.collection.immutable.Vector.empty
vec: scala.collection.immutable.Vector[Nothing] = Vector()
scala> val vec2 = vec :+ 1 :+ 2
vec2: scala.collection.immutable.Vector[Int] = Vector(1, 2)
scala> val vec3 = 100 +: vec2
vec3: scala.collection.immutable.Vector[Int] = Vector(100, 1, 2)
scala> vec3(0)
res1: Int = 100
{% tabs Vector_1 %}
{% tab 'Scala 2 and 3' for=Vector_1 %}
~~~scala
scala> val vec = scala.collection.immutable.Vector.empty
vec: scala.collection.immutable.Vector[Nothing] = Vector()
scala> val vec2 = vec :+ 1 :+ 2
vec2: scala.collection.immutable.Vector[Int] = Vector(1, 2)
scala> val vec3 = 100 +: vec2
vec3: scala.collection.immutable.Vector[Int] = Vector(100, 1, 2)
scala> vec3(0)
res1: Int = 100
~~~
{% endtab %}
{% endtabs %}

Vectors are represented as trees with a high branching factor. (The branching factor of a tree or a graph is the number of children at each node.) The details of how this is accomplished [changed](https://github.com/scala/scala/pull/8534) in Scala 2.13.2, but the basic idea remains the same, as follows.
Vectors are represented as trees with a high branching factor (the branching factor of a tree or a graph is the number of children at each node). The details of how this is accomplished [changed](https://github.com/scala/scala/pull/8534) in Scala 2.13.2, but the basic idea remains the same, as follows.

Every tree node contains up to 32 elements of the vector or contains up to 32 other tree nodes. Vectors with up to 32 elements can be represented in a single node. Vectors with up to `32 * 32 = 1024` elements can be represented with a single indirection. Two hops from the root of the tree to the final element node are sufficient for vectors with up to 2<sup>15</sup> elements, three hops for vectors with 2<sup>20</sup>, four hops for vectors with 2<sup>25</sup> elements and five hops for vectors with up to 2<sup>30</sup> elements. So for all vectors of reasonable size, an element selection involves up to 5 primitive array selections. This is what we meant when we wrote that element access is "effectively constant time".
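
The arithmetic in this paragraph can be written down as a tiny model. This is only a sketch of the description above, not of `Vector`'s actual internals (which, as noted, changed in 2.13.2); `hopsFor` is a hypothetical helper.

```scala
object VectorDepthSketch {
  // Hypothetical helper: number of hops from the root to the element node
  // for a vector of the given size, following the 32-way model in the text.
  def hopsFor(size: Int): Int = {
    require(size >= 0, "size must be non-negative")
    var hops = 0
    var capacity = 32L // a single node holds up to 32 elements
    while (capacity < size) {
      capacity *= 32   // each extra level multiplies the capacity by 32
      hops += 1
    }
    hops
  }
}
```

Under this model, 32 elements need no indirection, 1024 elements need one hop, and 2<sup>30</sup> elements need five.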

Like selection, functional vector updates are also "effectively constant time". Updating an element in the middle of a vector can be done by copying the node that contains the element, and every node that points to it, starting from the root of the tree. This means that a functional update creates between one and five nodes that each contain up to 32 elements or subtrees. This is certainly more expensive than an in-place update in a mutable array, but still a lot cheaper than copying the whole vector.
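
A functional update can be tried out as follows. This sketch only shows the observable behaviour (the original vector is unchanged and exactly one position differs); it does not measure how many nodes are copied. The names are illustrative.

```scala
object VectorUpdateDemo {
  val original: Vector[Int] = Vector.tabulate(100000)(identity)

  // `updated` returns a new vector; `original` is left untouched.
  val changed: Vector[Int] = original.updated(50000, -1)

  // Number of positions at which the two vectors differ.
  val differences: Int = original.indices.count(i => original(i) != changed(i))
}
```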

Because vectors strike a good balance between fast random selections and fast random functional updates, they are currently the default implementation of immutable indexed sequences:

scala> collection.immutable.IndexedSeq(1, 2, 3)
res2: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3)
{% tabs Vector_2 %}
{% tab 'Scala 2 and 3' for=Vector_2 %}
~~~scala
scala> collection.immutable.IndexedSeq(1, 2, 3)
res2: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3)
~~~
{% endtab %}
{% endtabs %}

## Immutable Queues

A [Queue](https://www.scala-lang.org/api/{{ site.scala-version }}/scala/collection/immutable/Queue.html) is a first-in-first-out sequence. You enqueue an element onto a queue with `enqueue`, and dequeue an element with `dequeue`. Enqueueing takes constant time, and dequeueing takes amortized constant time.

Here's how you can create an empty immutable queue:

scala> val empty = scala.collection.immutable.Queue[Int]()
empty: scala.collection.immutable.Queue[Int] = Queue()
{% tabs Queue_1 %}
{% tab 'Scala 2 and 3' for=Queue_1 %}
~~~scala
scala> val empty = scala.collection.immutable.Queue[Int]()
empty: scala.collection.immutable.Queue[Int] = Queue()
~~~
{% endtab %}
{% endtabs %}

You can append an element to an immutable queue with `enqueue`:

scala> val has1 = empty.enqueue(1)
has1: scala.collection.immutable.Queue[Int] = Queue(1)
{% tabs Queue_2 %}
{% tab 'Scala 2 and 3' for=Queue_2 %}
~~~scala
scala> val has1 = empty.enqueue(1)
has1: scala.collection.immutable.Queue[Int] = Queue(1)
~~~
{% endtab %}
{% endtabs %}

To append multiple elements to a queue, call `enqueueAll` with a collection as its argument:

scala> val has123 = has1.enqueueAll(List(2, 3))
has123: scala.collection.immutable.Queue[Int]
= Queue(1, 2, 3)
{% tabs Queue_3 %}
{% tab 'Scala 2 and 3' for=Queue_3 %}
~~~scala
scala> val has123 = has1.enqueueAll(List(2, 3))
has123: scala.collection.immutable.Queue[Int]
= Queue(1, 2, 3)
~~~
{% endtab %}
{% endtabs %}

To remove an element from the head of the queue, you use `dequeue`:

scala> val (element, has23) = has123.dequeue
element: Int = 1
has23: scala.collection.immutable.Queue[Int] = Queue(2, 3)
{% tabs Queue_4 %}
{% tab 'Scala 2 and 3' for=Queue_4 %}
~~~scala
scala> val (element, has23) = has123.dequeue
element: Int = 1
has23: scala.collection.immutable.Queue[Int] = Queue(2, 3)
~~~
{% endtab %}
{% endtabs %}

Note that `dequeue` returns a pair consisting of the element removed and the rest of the queue.
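
Since `dequeue` throws a `NoSuchElementException` on an empty queue, a loop that empties a queue is often easier to write with `dequeueOption`, which returns the same pair wrapped in an `Option`. A minimal sketch, assuming Scala 2.13 or later; `drain` is a hypothetical helper, not a library method.

```scala
import scala.annotation.tailrec
import scala.collection.immutable.Queue

object QueueDrainDemo {
  // Hypothetical helper: removes elements one by one, in FIFO order.
  @tailrec
  def drain[A](queue: Queue[A], acc: List[A] = Nil): List[A] =
    queue.dequeueOption match {
      case Some((element, rest)) => drain(rest, element :: acc)
      case None                  => acc.reverse
    }

  val order: List[Int] = drain(Queue(1).enqueueAll(List(2, 3)))
}
```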

## Ranges

A [Range](https://www.scala-lang.org/api/{{ site.scala-version }}/scala/collection/immutable/Range.html) is an ordered sequence of integers that are equally spaced apart. For example, "1, 2, 3," is a range, as is "5, 8, 11, 14." To create a range in Scala, use the predefined methods `to` and `by`.

scala> 1 to 3
res2: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3)
scala> 5 to 14 by 3
res3: scala.collection.immutable.Range = Range(5, 8, 11, 14)
{% tabs Range_1 %}
{% tab 'Scala 2 and 3' for=Range_1 %}
~~~scala
scala> 1 to 3
res2: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3)
scala> 5 to 14 by 3
res3: scala.collection.immutable.Range = Range(5, 8, 11, 14)
~~~
{% endtab %}
{% endtabs %}

If you want to create a range that is exclusive of its upper limit, then use the convenience method `until` instead of `to`:

scala> 1 until 3
res2: scala.collection.immutable.Range = Range(1, 2)
{% tabs Range_2 %}
{% tab 'Scala 2 and 3' for=Range_2 %}
~~~scala
scala> 1 until 3
res2: scala.collection.immutable.Range = Range(1, 2)
~~~
{% endtab %}
{% endtabs %}

Ranges are represented in constant space, because they can be defined by just three numbers: their start, their end, and the stepping value. Because of this representation, most operations on ranges are extremely fast.
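
The three defining numbers are exposed as `start`, `end` and `step`. The sketch below, with illustrative names, also shows that a very large range can be queried without materializing its elements.

```scala
object RangeDemo {
  val r: Range = 5 to 14 by 3
  val parts: (Int, Int, Int) = (r.start, r.end, r.step)

  // A range with a billion elements still occupies constant space.
  val big: Range = 1 to 1000000000
  val size: Int = big.length
  val hit: Boolean = big.contains(999999999)
  val middle: Int = big(499999999) // element at index 499999999
}
```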

@@ -167,11 +240,16 @@ Red-black trees are a form of balanced binary tree where some nodes are designat

Scala provides implementations of immutable sets and maps that use a red-black tree internally. Access them under the names [TreeSet](https://www.scala-lang.org/api/{{ site.scala-version }}/scala/collection/immutable/TreeSet.html) and [TreeMap](https://www.scala-lang.org/api/{{ site.scala-version }}/scala/collection/immutable/TreeMap.html).


scala> scala.collection.immutable.TreeSet.empty[Int]
res11: scala.collection.immutable.TreeSet[Int] = TreeSet()
scala> res11 + 1 + 3 + 3
res12: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 3)
{% tabs Red-Black_1 %}
{% tab 'Scala 2 and 3' for=Red-Black_1 %}
~~~scala
scala> scala.collection.immutable.TreeSet.empty[Int]
res11: scala.collection.immutable.TreeSet[Int] = TreeSet()
scala> res11 + 1 + 3 + 3
res12: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 3)
~~~
{% endtab %}
{% endtabs %}

Red-black trees are the standard implementation of `SortedSet` in Scala, because they provide an efficient iterator that returns all elements in sorted order.
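
A short sketch of what sorted iteration gives you in practice, assuming Scala 2.13 or later; the object and field names are illustrative.

```scala
import scala.collection.immutable.{TreeMap, TreeSet}

object TreeDemo {
  val set: TreeSet[Int] = TreeSet(5, 1, 3, 7)

  // Iteration follows the ordering, not the insertion order.
  val sorted: List[Int] = set.toList

  // Elements from 3 (inclusive) up to 7 (exclusive).
  val slice: TreeSet[Int] = set.range(3, 7)

  // An explicit Ordering changes the iteration order.
  val descending: TreeSet[Int] = TreeSet(5, 1, 3)(Ordering[Int].reverse)

  // TreeMap keeps its entries sorted by key.
  val map: TreeMap[String, Int] = TreeMap("b" -> 2, "a" -> 1)
}
```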

@@ -183,22 +261,30 @@ Internally, bit sets use an array of 64-bit `Long`s. The first `Long` in the arr

Operations on bit sets are very fast. Testing for inclusion takes constant time. Adding an item to the set takes time proportional to the number of `Long`s in the bit set's array, which is typically a small number. Here are some simple examples of the use of a bit set:

scala> val bits = scala.collection.immutable.BitSet.empty
bits: scala.collection.immutable.BitSet = BitSet()
scala> val moreBits = bits + 3 + 4 + 4
moreBits: scala.collection.immutable.BitSet = BitSet(3, 4)
scala> moreBits(3)
res26: Boolean = true
scala> moreBits(0)
res27: Boolean = false
{% tabs BitSet_1 %}
{% tab 'Scala 2 and 3' for=BitSet_1 %}
~~~scala
scala> val bits = scala.collection.immutable.BitSet.empty
bits: scala.collection.immutable.BitSet = BitSet()
scala> val moreBits = bits + 3 + 4 + 4
moreBits: scala.collection.immutable.BitSet = BitSet(3, 4)
scala> moreBits(3)
res26: Boolean = true
scala> moreBits(0)
res27: Boolean = false
~~~
{% endtab %}
{% endtabs %}
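
The word layout can be inspected with `toBitMask`. The sketch below assumes the usual layout in which the first `Long` covers elements 0 to 63 and the second covers 64 to 127; `containsBit` is a hypothetical helper that mirrors that layout and is not the library's implementation.

```scala
import scala.collection.immutable.BitSet

object BitSetDemo {
  // Hypothetical helper: bit (n % 64) of word (n / 64) is set when n is a member.
  def containsBit(words: Array[Long], n: Int): Boolean = {
    val index = n >> 6
    n >= 0 && index < words.length && (words(index) & (1L << (n & 63))) != 0L
  }

  // 3 and 4 live in the first word: 2^3 + 2^4 = 24.
  val small: Array[Long] = BitSet(3, 4).toBitMask

  // 64 is the lowest bit of the second word.
  val wide: Array[Long] = BitSet(64).toBitMask
}
```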

## VectorMaps

A [VectorMap](https://www.scala-lang.org/api/{{ site.scala-version }}/scala/collection/immutable/VectorMap.html) represents
a map using both a `Vector` of keys and a `HashMap`. It provides an iterator that returns all the entries in their
insertion order.

~~~
{% tabs VectorMap_1 %}
{% tab 'Scala 2 and 3' for=VectorMap_1 %}
~~~scala
scala> val vm = scala.collection.immutable.VectorMap.empty[Int, String]
vm: scala.collection.immutable.VectorMap[Int,String] =
VectorMap()
@@ -211,6 +297,8 @@ vm2: scala.collection.immutable.VectorMap[Int,String] =
scala> vm2 == Map(2 -> "two", 1 -> "one")
res29: Boolean = true
~~~
{% endtab %}
{% endtabs %}

The first lines show that the content of the `VectorMap` keeps the insertion order, and the last line
shows that `VectorMap`s are comparable with other `Map`s and that this comparison does not take the
@@ -220,8 +308,14 @@ order of elements into account.
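
Both properties, insertion-ordered iteration and order-insensitive equality, can be checked in a few lines. A sketch assuming Scala 2.13 or later, with illustrative names:

```scala
import scala.collection.immutable.{HashMap, VectorMap}

object VectorMapDemo {
  val vm: VectorMap[Int, String] = VectorMap(3 -> "three", 1 -> "one", 2 -> "two")

  // Iteration follows insertion order, not key order.
  val keysInOrder: List[Int] = vm.toList.map(_._1)

  // Replacing the value of an existing key keeps its position.
  val renamed: VectorMap[Int, String] = vm.updated(1, "uno")

  // Equality with another Map ignores the order of entries.
  val sameAsHashMap: Boolean = vm == HashMap(1 -> "one", 2 -> "two", 3 -> "three")
}
```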

A [ListMap](https://www.scala-lang.org/api/{{ site.scala-version }}/scala/collection/immutable/ListMap.html) represents a map as a linked list of key-value pairs. In general, operations on a list map might have to iterate through the entire list, so they take time linear in the size of the map. In fact, there is little use for list maps in Scala because the standard immutable maps are almost always faster. The only possible exception is a map that is constructed in such a way that the first elements in the list are selected much more often than the other elements.

scala> val map = scala.collection.immutable.ListMap(1->"one", 2->"two")
map: scala.collection.immutable.ListMap[Int,java.lang.String] =
Map(1 -> one, 2 -> two)
scala> map(2)
res30: String = "two"
{% tabs ListMap_1 %}
{% tab 'Scala 2 and 3' for=ListMap_1 %}
~~~scala
scala> val map = scala.collection.immutable.ListMap(1->"one", 2->"two")
map: scala.collection.immutable.ListMap[Int,java.lang.String] =
Map(1 -> one, 2 -> two)
scala> map(2)
res30: String = "two"
~~~
{% endtab %}
{% endtabs %}
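
A slightly longer sketch of the same API, assuming Scala 2.13 or later; the object and field names are illustrative. Every lookup walks the underlying list, which is why the cost is linear in the size of the map.

```scala
import scala.collection.immutable.ListMap

object ListMapDemo {
  val map: ListMap[Int, String] = ListMap(1 -> "one", 2 -> "two")

  // Adding an entry returns a new ListMap; iteration keeps insertion order.
  val extended: ListMap[Int, String] = map.updated(3, "three")

  val found: String = map(2)
  val missing: Option[String] = map.get(4)
}
```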