@@ -6,13 +6,15 @@ title: The Architecture of Scala Collections
 **Martin Odersky and Lex Spoon**

 These pages describe the architecture of the Scala collections
-framework in detail. Compared to the Scala 2.8 Collections API you
+framework in detail. Compared to
+[the Scala 2.8 Collections API](http://scala.github.com/2.9.1/overviews/collections.html) you
 will find out more about the internal workings of the framework. You
 will also learn how this architecture helps you define your own
 collections in a few lines of code, while reusing the overwhelming
 part of collection functionality from the framework.

-The Scala 2.8 Collections API contains a large number of collection
+[The Scala 2.8 Collections API](http://scala.github.com/2.9.1/overviews/collections.html)
+contains a large number of collection
 operations, which exist uniformly on many different collection
 implementations. Implementing every collection operation anew for
 every collection type would lead to an enormous amount of code, most
@@ -30,7 +32,7 @@ templates and other classes and traits that constitute the "building
 blocks" of the framework, as well as the construction principles they
 support.

-## Builders
+## Builders ##

 An outline of the `Builder` class:

@@ -80,9 +82,9 @@ of `buf` is computed, which yields the array buffer `buf` itself. This
 array buffer is then mapped with `_.toArray` to an array. So the end
 result is that `bldr` is a builder for arrays.

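As an illustration of the context lines above (an aside, not a line of the patched file), here is a minimal sketch of a builder whose result is post-processed with `mapResult`. It assumes the standard `ArrayBuffer.newBuilder` and `Builder.mapResult` methods; the object name `BuilderSketch` and its methods `arrayBuilder` and `demo` are hypothetical.

```scala
import scala.collection.mutable.{ArrayBuffer, Builder}

// Hypothetical helper object, used only for this illustration.
object BuilderSketch {
  // Start from a builder that accumulates into an ArrayBuffer, then
  // convert the finished buffer to an array when result() is called.
  def arrayBuilder(): Builder[Int, Array[Int]] =
    ArrayBuffer.newBuilder[Int].mapResult(_.toArray)

  // Add two elements and ask the builder for the final array.
  def demo(): Array[Int] = {
    val bldr = arrayBuilder()
    bldr += 1
    bldr += 2
    bldr.result()
  }
}
```

Callers only ever see a `Builder[Int, Array[Int]]`; the intermediate buffer stays an implementation detail of the wrapped builder.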
-## Factoring out common operations
+## Factoring out common operations ##

-Implementation of `filter` in `TraversableLike`:
+### Outline of class TraversableLike ###

 package scala.collection

@@ -134,8 +136,8 @@ collection implementation trait.

 Taking `filter` as an example, this operation is defined once for all
 collection classes in the trait `TraversableLike`. An outline of the
-relevant code is shown in the above outline of class
-`TraversableLike`. The trait declares two abstract methods, `newBuilder`
+relevant code is shown in the above [outline of class
+`TraversableLike`](#outline_of_class_traversablelike). The trait declares two abstract methods, `newBuilder`
 and `foreach`, which are implemented in concrete collection classes. The
 `filter` operation is implemented in the same way for all collections
 using these methods. It first constructs a new builder for the
@@ -152,7 +154,7 @@ instance, if `f` is a function from `String` to `Int`, and `xs` is a
 if `ys` is an `Array[String]`, then `ys map f` should give an
 `Array[Int]`. The problem is how to achieve that without duplicating
 the definition of the `map` method in lists and arrays. The
-`newBuilder`/`foreach` framework shown in class `TraversableLike` is
+`newBuilder`/`foreach` framework shown in [class `TraversableLike`](#outline_of_class_traversablelike) is
 not sufficient for this because it only allows creation of new
 instances of the same collection *type* whereas `map` needs an
 instance of the same collection *type constructor*, but possibly with
@@ -230,8 +232,9 @@ Implementation of `map` in `TraversableLike`:
 }

 The listing above shows trait `TraversableLike`'s implementation of
-`map`. It's quite similar to the implementation of `filter` shown in class
-`TraversableLike`. The principal difference is that where `filter` used
+`map`. It's quite similar to the implementation of `filter` shown in [class
+`TraversableLike`](#outline_of_class_traversablelike).
+The principal difference is that where `filter` used
 the `newBuilder` method, which is abstract in class `TraversableLike`, `map`
 uses a *builder factory* that's passed as an additional implicit
 parameter of type `CanBuildFrom`.
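To make the context lines above more concrete, the following is a simplified, self-contained sketch of a `map` that obtains its builder from an implicit factory. It is an illustrative aside, not the `TraversableLike` code from the patched file: `BuilderFactory` is a hypothetical, cut-down stand-in for `CanBuildFrom`, and `MapSketch.mapVia` and the two implicit instances are likewise assumptions made for this example.

```scala
import scala.collection.mutable.Builder

// Hypothetical stand-in for CanBuildFrom: given a source collection of
// type From, create a builder that turns Elem values into a To.
trait BuilderFactory[From, Elem, To] {
  def apply(from: From): Builder[Elem, To]
}

object BuilderFactory {
  // Mapping over a List yields a List again.
  implicit def listFactory[A, B]: BuilderFactory[List[A], B, List[B]] =
    new BuilderFactory[List[A], B, List[B]] {
      def apply(from: List[A]): Builder[B, List[B]] = List.newBuilder[B]
    }

  // Mapping over a Vector yields a Vector again.
  implicit def vectorFactory[A, B]: BuilderFactory[Vector[A], B, Vector[B]] =
    new BuilderFactory[Vector[A], B, Vector[B]] {
      def apply(from: Vector[A]): Builder[B, Vector[B]] = Vector.newBuilder[B]
    }
}

object MapSketch {
  // map is written once; the result type To is fixed by whichever
  // implicit factory matches the source collection type.
  def mapVia[Coll[X] <: Iterable[X], A, B, To](xs: Coll[A])(f: A => B)(
      implicit bf: BuilderFactory[Coll[A], B, To]): To = {
    val b = bf(xs)
    xs.foreach(x => b += f(x))
    b.result()
  }
}
```

With these definitions, `MapSketch.mapVia(List("a", "bb"))(_.length)` is typed as a `List[Int]`, while the same call on a `Vector` is typed as a `Vector`, because a different implicit instance is selected in each case.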
@@ -290,14 +293,14 @@ resolution to resolve constraints on the types of `map`, and virtual
 dispatch to pick the best dynamic type that corresponds to these
 constraints.

-## Integrating new collections
+## Integrating new collections ##

 What needs to be done if you want to integrate a new collection class,
 so that it can profit from all predefined operations at the right
 types? On the next few pages you'll be walked through two examples
 that do this.

-### Integrating sequences
+### Integrating sequences ###

 RNA Bases:

@@ -335,7 +338,7 @@ two-bit values in an integer. The idea, then, is to construct a
 specialized subclass of `Seq[Base]`, which uses this packed
 representation.

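The packed representation mentioned in the context lines above can be sketched roughly as follows. This is an illustrative aside rather than the `RNA1` code from the patched file: it stores plain `Int` codes in the range 0 to 3 instead of `Base` values, and the names `PackingSketch`, `BitsPerValue`, `ValuesPerInt`, `Mask`, `pack`, and `unpack` are hypothetical.

```scala
// Hypothetical helper object, used only for this illustration.
object PackingSketch {
  private val BitsPerValue = 2                  // each value needs two bits
  private val ValuesPerInt = 32 / BitsPerValue  // sixteen values fit in one Int
  private val Mask = (1 << BitsPerValue) - 1    // binary 11, isolates one value

  // Pack values in the range 0..3 into an array of Ints, sixteen per element.
  def pack(values: Seq[Int]): Array[Int] = {
    val groups = new Array[Int]((values.length + ValuesPerInt - 1) / ValuesPerInt)
    for (i <- values.indices) {
      require((values(i) & ~Mask) == 0, "value does not fit in two bits")
      groups(i / ValuesPerInt) |= values(i) << (i % ValuesPerInt * BitsPerValue)
    }
    groups
  }

  // Read the value at position idx back out with a right shift and a mask.
  def unpack(groups: Array[Int], idx: Int): Int =
    groups(idx / ValuesPerInt) >> (idx % ValuesPerInt * BitsPerValue) & Mask
}
```

Twenty values occupy two array elements in this scheme, compared with twenty references in an unpacked sequence.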
-RNA strands class, first version:
+#### First version of RNA strands class ####

 import collection.IndexedSeqLike
 import collection.mutable.{Builder, ArrayBuffer}
@@ -374,7 +377,7 @@ RNA strands class, first version:
 def apply(bases: Base*) = fromSeq(bases)
 }

-The RNA strands class listing above presents the first version of this
+The [RNA strands class listing](#first_version_of_rna_strands_class) above presents the first version of this
 class. It will be refined later. The class `RNA1` has a constructor that
 takes an array of `Int`s as its first argument. This array contains the
 packed RNA data, with sixteen bases in each element, except for the
@@ -385,8 +388,8 @@ argument, `length`, specifies the total number of bases on the array
 defines two abstract methods, `length` and `apply`. These need to be
 implemented in concrete subclasses. Class `RNA1` implements `length`
 automatically by defining a parametric field of the same name. It
-implements the indexing method `apply` with the code given in class
-`RNA1`. Essentially, `apply` first extracts an integer value from the
+implements the indexing method `apply` with the code given in [class
+`RNA1`](#first_version_of_rna_strands_class). Essentially, `apply` first extracts an integer value from the
 `groups` array, then extracts the correct two-bit number from that
 integer using right shift (`>>`) and mask (`&`). The private constants `S`,
 `N`, and `M` come from the `RNA1` companion object. `S` specifies the size of
@@ -429,7 +432,7 @@ creation schemes in action:
 scala> val rna1 = RNA1(A, U, G, G, T)
 rna1: RNA1 = RNA1(A, U, G, G, T)

-## Adapting the result type of `RNA` methods
+## Adapting the result type of RNA methods ##

 Here are some more interactions with the `RNA1` abstraction:

@@ -446,14 +449,14 @@ The first two results are as expected, but the last result of taking
 the first three elements of `rna1` might not be. In fact, you see a
 `IndexedSeq[Base]` as static result type and a `Vector` as the dynamic
 type of the result value. You might have expected to see an `RNA1` value
-instead. But this is not possible because all that was done in class
-`RNA1` was making `RNA1` extend `IndexedSeq`. Class `IndexedSeq`, on the other
+instead. But this is not possible because all that was done in [class
+`RNA1`](#first_version_of_rna_strands_class) was making `RNA1` extend `IndexedSeq`. Class `IndexedSeq`, on the other
 hand, has a `take` method that returns an `IndexedSeq`, and that's
 implemented in terms of `IndexedSeq`'s default implementation,
 `Vector`. So that's what you were seeing on the last line of the
 previous interaction.

-RNA strands class, second version:
+### Second version of RNA strands class ###

 final class RNA2 private (
 val groups: Array[Int],
@@ -525,13 +528,13 @@ method `newBuilder` with result type `Builder[Base, RNA2]` needed to be
 defined, but a method `newBuilder` with result type
 `Builder[Base,IndexedSeq[Base]]` was found. The latter does not override
 the former. The first method, whose result type is `Builder[Base, RNA2]`, is an abstract method that got instantiated at this type in
-class `RNA2` by passing the `RNA2` type parameter to `IndexedSeqLike`. The
+[class `RNA2`](#second_version_of_rna_strands_class) by passing the `RNA2` type parameter to `IndexedSeqLike`. The
 second method, of result type `Builder[Base,IndexedSeq[Base]]`, is
 what's provided by the inherited `IndexedSeq` class. In other words, the
 `RNA2` class is invalid without a definition of `newBuilder` with the
 first result type.

-With the refined implementation of the `RNA2` class, methods like `take`,
+With the refined implementation of the [`RNA2` class](#second_version_of_rna_strands_class), methods like `take`,
 `drop`, or `filter` work now as expected:

 scala> val rna2 = RNA2(A, U, G, G, T)
@@ -543,7 +546,7 @@ With the refined implementation of the `RNA2` class, methods like `take`,
 scala> rna2 filter (U !=)
 res6: RNA2 = RNA2(A, G, G, T)

-## Dealing with map and friends
+## Dealing with map and friends ##

 However, there is another class of methods in collections that are not
 dealt with yet. These methods do not always return the collection type
@@ -588,7 +591,7 @@ yield a general sequence, but it cannot yield another RNA strand.
 Vector(A, U, G, G, T, missing, data)

 This is what you'd expect in the ideal case. But this is not what the
-`RNA2` class provides. In fact, if you ran the first two examples above
+[`RNA2` class](#second_version_of_rna_strands_class) provides. In fact, if you ran the first two examples above
 with instances of this class you would obtain:

 scala> val rna2 = RNA2(A, U, G, G, T)
@@ -626,7 +629,7 @@ collection classes. In essence, an implicit value of type
 of type `From`, to build with elements of type `Elem` a collection of type
 `To`."

-RNA strands class, final version:
+### Final version of RNA strands class ###

 final class RNA private (val groups: Array[Int], val length: Int)
 extends IndexedSeq[Base] with IndexedSeqLike[Base, RNA] {
@@ -657,7 +660,7 @@ RNA strands class, final version:
 }
 }

-RNA companion object--final version:
+### Final version of RNA companion object ###

 object RNA {

@@ -696,8 +699,9 @@ of `CanBuildFrom` in the companion object of the RNA class. That
 instance should have type `CanBuildFrom[RNA, Base, RNA]`. Hence, this
 instance states that, given an RNA strand and a new element type `Base`,
 you can build another collection which is again an RNA strand. The two
-listings above on class `RNA` and its companion object show the
-details. Compared to class `RNA2` there are two important
+listings above on [class `RNA`](#final_version_of_rna_strands_class) and
+[its companion object](#final_version_of_rna_companion_object) show the
+details. Compared to [class `RNA2`](#second_version_of_rna_strands_class) there are two important
 differences. First, the `newBuilder` implementation has moved from the
 RNA class to its companion object. The `newBuilder` method in class `RNA`
 simply forwards to this definition. Second, there is now an implicit
@@ -713,14 +717,14 @@ is a final class, so any receiver of static type `RNA` also has `RNA` as
 its dynamic type. That's why `apply(from)` also simply calls `newBuilder`,
 ignoring its argument.

-That is it. The final `RNA` class implements all collection methods at
+That is it. The final [`RNA` class](#final_version_of_rna_strands_class) implements all collection methods at
 their natural types. Its implementation requires a little bit of
 protocol. In essence, you need to know where to put the `newBuilder`
 factories and the `canBuildFrom` implicits. On the plus side, with
 relatively little code you get a large number of methods automatically
 defined. Also, if you don't intend to do bulk operations like `take`,
 `drop`, `map`, or `++` on your collection you can choose to not go the extra
-length and stop at the implementation shown in for class `RNA1`.
+length and stop at the implementation shown in for [class `RNA1`](#first_version_of_rna_strands_class).

 The discussion so far centered on the minimal amount of definitions
 needed to define new sequences with methods that obey certain
@@ -741,7 +745,7 @@ immediately applies the given function to all bases contained in
 it. So the effort for array selection and bit unpacking is much
 reduced.

-## Integrating new sets and maps
+## Integrating new sets and maps ##

 As a second example you'll learn how to integrate a new kind of map
 into the collection framework. The idea is to implement a mutable map
@@ -973,7 +977,7 @@ map `res0` and produces pairs of strings. The result of the `map` is a
 the `canBuildFrom` implicit in `PrefixMap` the result would just have been
 a general mutable map, not a prefix map.

-## Summary
+## Summary ##

 To summarize, if you want to fully integrate a new collection class
 into the framework you need to pay attention to the following points:
@@ -991,7 +995,7 @@ build new kinds of collections. Because of Scala's rich support for
 abstraction, each new collection type can have a large number of
 methods without having to reimplement them all over again.

-### Acknowledgement
+### Acknowledgement ###

 These pages contain material adapted from the 2nd edition of
 [Programming in Scala](http://www.artima.com/shop/programming_in_scala) by