Scala Parallel Collections 
Aleksandar Prokopec 
EPFL
Scala Parallel Collections
Scala collections 
for { 
s <- surnames 
n <- names 
if s endsWith n 
} yield (n, s) 
McDonald
Scala collections 
for { 
s <- surnames 
n <- names 
if s endsWith n 
} yield (n, s) 
1040 ms
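The query above runs as-is on small data; the lists below are hypothetical stand-ins for the talk's much larger benchmark inputs:

```scala
// Hypothetical sample data; the 1040 ms benchmark used much larger lists.
val surnames = List("McDonald", "Prokopec", "Johnson")
val names    = List("Donald", "son")

// Cross every surname with every name, keeping suffix matches.
val matches = for {
  s <- surnames
  n <- names
  if s endsWith n
} yield (n, s)
```

matches is List((Donald,McDonald), (son,Johnson)); adding .par to surnames and names, as on the next slides, distributes the same computation across cores.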
Scala Parallel Collections
Scala parallel collections 
for { 
s <- surnames.par 
n <- names.par 
if s endsWith n 
} yield (n, s) 
cores: 2 4 
time: 575 ms 305 ms
for comprehensions nested parallelized bulk operations 
surnames.par.flatMap { s => 
names.par 
.filter(n => s endsWith n) 
.map(n => (n, s)) 
}
Nested parallelism
Nested parallelism parallel within parallel 
composition 
surnames.par.flatMap { s => 
surnameToCollection(s) 
// may invoke parallel ops 
}
Nested parallelism going recursive – recursive algorithms 
def vowel(c: Char): Boolean = ... 
def gen(n: Int, acc: Seq[String]): Seq[String] = 
if (n == 0) acc 
else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield 
if (s.length == 0) s + c 
else if (vowel(s.last) && !vowel(c)) s + c 
else if (!vowel(s.last) && vowel(c)) s + c 
else s 
gen(5, Array("")) 
1545 ms
Nested parallelism going recursive 
def vowel(c: Char): Boolean = ... 
def gen(n: Int, acc: ParSeq[String]): ParSeq[String] = 
if (n == 0) acc 
else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield 
if (s.length == 0) s + c 
else if (vowel(s.last) && !vowel(c)) s + c 
else if (!vowel(s.last) && vowel(c)) s + c 
else s 
gen(5, ParArray("")) 
cores: 1 2 4 
time: 1575 ms 809 ms 530 ms
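A runnable sketch of the sequential version, with a hypothetical vowel implementation (the slides leave its body elided). Swapping Seq/Array for ParSeq/ParArray parallelizes every level of the recursion, given the parallel collections (bundled with the standard library up to Scala 2.12, the separate scala-parallel-collections module since 2.13):

```scala
// Hypothetical vowel test; the slides elide its body.
def vowel(c: Char): Boolean = "aeiou".contains(c)

// Sequential version from the slides: each level extends every candidate
// string by one character, keeping only vowel/consonant alternations.
def gen(n: Int, acc: Seq[String]): Seq[String] =
  if (n == 0) acc
  else for (s <- gen(n - 1, acc); c <- 'a' to 'z') yield
    if (s.length == 0) s + c
    else if (vowel(s.last) && !vowel(c)) s + c
    else if (!vowel(s.last) && vowel(c)) s + c
    else s

// Each level multiplies the candidate count by 26.
val words = gen(2, Array(""))
```

Using a small depth here; the benchmark used gen(5, Array("")).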
So, I just use par and I’m home free?
How to think parallel
Character count use case for foldLeft 
val txt: String = ... 
txt.foldLeft(0) { 
case (a, ' ') => a 
case (a, c) => a + 1 
}
Character count use case for foldLeft 
txt.foldLeft(0) { 
case (a, ' ') => a 
case (a, c) => a + 1 
} 
going left to right - not parallelizable! 
[diagram: the accumulator is threaded 0 → 6 through the characters A B C D E F, one _ + 1 step at a time]
Character count use case for foldLeft 
txt.foldLeft(0) { 
case (a, ' ') => a 
case (a, c) => a + 1 
} 
going left to right – not really necessary 
[diagram: A B C and D E F are folded independently (each 0 → 3 via _ + 1), then the partial results are merged with _ + _ into 6]
Character count in parallel 
txt.fold(0) { 
case (a, ' ') => a 
case (a, c) => a + 1 
}
Character count in parallel 
txt.fold(0) { 
case (a, ' ') => a 
case (a, c) => a + 1 
} 
[diagram: each chunk A B C is folded with _ + 1, an operator of type (Int, Char) => Int]
Character count fold not applicable 
txt.fold(0) { 
case (a, ' ') => a 
case (a, c) => a + 1 
} 
[diagram: merging the partial results 3 and 3 needs an operator of type (Int, Int) => Int, but fold takes only a single operator – so fold is not applicable]
Character count use case for aggregate 
txt.aggregate(0)({ 
case (a, ' ') => a 
case (a, c) => a + 1 
}, _ + _)
Character count use case for aggregate 
txt.aggregate(0)({ 
case (a, ' ') => a 
case (a, c) => a + 1 
}, _ + _) 
[diagram: aggregate takes two operators – the first folds an element into an aggregation (_ + 1), the second merges two aggregations (_ + _)]
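What the parallel runtime does can be simulated by hand: split the input, fold each chunk with the element operator, then merge with the aggregation operator. A minimal sketch (the input string is a made-up example):

```scala
val txt = "ABC DEF" // hypothetical input

// Element operator: fold one character into a running count.
val seqop: (Int, Char) => Int = {
  case (a, ' ') => a
  case (a, c)   => a + 1
}

// Simulate two workers: fold each half, then merge with _ + _.
val (left, right) = txt.splitAt(txt.length / 2)
val total = left.foldLeft(0)(seqop) + right.foldLeft(0)(seqop)
```

total equals txt.foldLeft(0)(seqop): the _ + _ merge loses nothing because addition of partial counts is associative.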
Word count another use case for foldLeft 
txt.foldLeft((0, true)) { 
case ((wc, _), ' ') => (wc, true) // last seen character is a space 
case ((wc, true), x) => (wc + 1, false) // last seen character was a space – a new word 
case ((wc, false), x) => (wc, false) // last seen character wasn’t a space – no new word 
} 
initial accumulator (0, true): 0 words so far, last character was a space 
“Folding me softly.”
Word count in parallel 
P1 takes “Folding me “ and computes wc = 2; rs = 1 
P2 takes “softly.“ and computes wc = 1; ls = 0 
merging the two partial results gives wc = 3
Word count must assume arbitrary partitions 
P1 takes “Foldin“ and computes wc = 1; rs = 0 
P2 takes “g me softly.“ and computes wc = 3; ls = 0 
the word at the boundary was counted on both sides, so merging gives wc = 1 + 3 - 1 = 3
Word count initial aggregation 
txt.par.aggregate((0, 0, 0)) 
the accumulator tracks (# spaces on the left, # words, # spaces on the right) 
(0, 0, 0) describes the empty string ””
Word count merging two aggregations 
... 
}, { 
case ((0, 0, 0), res) => res // one side is empty: ““ + “Folding me“ 
case (res, (0, 0, 0)) => res // “softly.“ + ““ 
case ((lls, lwc, 0), (0, rwc, rrs)) => 
(lls, lwc + rwc - 1, rrs) // a word straddles the boundary: “Folding m“ + “e softly.“ 
case ((lls, lwc, _), (_, rwc, rrs)) => 
(lls, lwc + rwc, rrs) // the boundary falls on a space: “Folding me” + “ softly.“ 
})
Word count aggregating an element 
txt.par.aggregate((0, 0, 0))({ 
case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) // ”_”: 0 words and a space – add one more space on each side 
case ((ls, 0, _), c) => (ls, 1, 0) // ” m”: 0 words and a non-space – one word, no spaces on the right side 
case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) // ” me_”: nonzero words and a space – one more space on the right side 
case ((ls, wc, 0), c) => (ls, wc, 0) // ” me sof”: nonzero words, last non-space and current non-space – no change 
case ((ls, wc, rs), c) => (ls, wc + 1, 0) // ” me s”: nonzero words, last space and current non-space – one more word 
}, ...
Word count in parallel 
txt.par.aggregate((0, 0, 0))({ 
case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) 
case ((ls, 0, _), c) => (ls, 1, 0) 
case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) 
case ((ls, wc, 0), c) => (ls, wc, 0) 
case ((ls, wc, rs), c) => (ls, wc + 1, 0) 
}, { 
case ((0, 0, 0), res) => res 
case (res, (0, 0, 0)) => res 
case ((lls, lwc, 0), (0, rwc, rrs)) => 
(lls, lwc + rwc - 1, rrs) 
case ((lls, lwc, _), (_, rwc, rrs)) => 
(lls, lwc + rwc, rrs) 
})
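The operator pair can be checked without the parallel runtime by simulating an arbitrary partition by hand; wordCount below is a hypothetical test harness, not part of the slides:

```scala
// Element operator: state is (left spaces, words, right spaces).
val seqop: ((Int, Int, Int), Char) => (Int, Int, Int) = {
  case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1)
  case ((ls, 0, _), c)     => (ls, 1, 0)
  case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)
  case ((ls, wc, 0), c)    => (ls, wc, 0)
  case ((ls, wc, rs), c)   => (ls, wc + 1, 0)
}

// Aggregation operator: merge the results of two adjacent chunks.
val combop: ((Int, Int, Int), (Int, Int, Int)) => (Int, Int, Int) = {
  case ((0, 0, 0), res) => res
  case (res, (0, 0, 0)) => res
  case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs)
  case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
}

// Split txt at an arbitrary position, fold both halves, merge.
def wordCount(txt: String, cut: Int): Int = {
  val (l, r) = txt.splitAt(cut)
  combop(l.foldLeft((0, 0, 0))(seqop), r.foldLeft((0, 0, 0))(seqop))._2
}
```

wordCount("Folding me softly.", cut) is 3 for every cut position, which is exactly the partition-independence the parallel runtime relies on.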
Word count using parallel strings? 
txt.par.aggregate((0, 0, 0))({ ... }, { ... })
Word count string not really parallelizable 
scala> (txt: String).par 
collection.parallel.ParSeq[Char] = ParArray(…) 
different internal representation! 
ParArray – copy string contents into an array
Conversions going parallel 
// `par` is efficient for... 
mutable.{Array, ArrayBuffer, ArraySeq} 
mutable.{HashMap, HashSet} 
immutable.{Vector, Range} 
immutable.{HashMap, HashSet} 
most other collections construct a new parallel collection!
Conversions going parallel 
sequential → parallel 
Array, ArrayBuffer, ArraySeq → mutable.ParArray 
mutable.HashMap → mutable.ParHashMap 
mutable.HashSet → mutable.ParHashSet 
immutable.Vector → immutable.ParVector 
immutable.Range → immutable.ParRange 
immutable.HashMap → immutable.ParHashMap 
immutable.HashSet → immutable.ParHashSet
Conversions going parallel 
// `seq` is always efficient 
ParArray(1, 2, 3).seq 
List(1, 2, 3, 4).seq 
ParHashMap(1 -> 2, 3 -> 4).seq 
”abcd”.seq 
// `par` may not be... 
”abcd”.par
Custom collections
Custom collection 
class ParString(val str: String) 
extends parallel.immutable.ParSeq[Char] { 
def apply(i: Int) = str.charAt(i) 
def length = str.length 
def seq = new WrappedString(str) 
def splitter = 
new ParStringSplitter(0, str.length)
Custom collection splitter definition 
class ParStringSplitter(var i: Int, len: Int) 
extends Splitter[Char] {
Custom collection splitters are iterators 
class ParStringSplitter(var i: Int, len: Int) 
extends Splitter[Char] { 
def hasNext = i < len 
def next = { 
val r = str.charAt(i) 
i += 1 
r 
}
Custom collection splitters must be duplicated 
... 
def dup = new ParStringSplitter(i, len)
Custom collection splitters know how many elements remain 
... 
def dup = new ParStringSplitter(i, len) 
def remaining = len - i
Custom collection splitters can be split 
... 
def psplit(sizes: Int*): Seq[ParStringSplitter] = { 
val splitted = new ArrayBuffer[ParStringSplitter] 
for (sz <- sizes) { 
val next = (i + sz) min len 
splitted += new ParStringSplitter(i, next) 
i = next 
} 
splitted 
}
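The splitter contract can be sketched without the library traits; StringSplitter below is a hypothetical stand-alone version showing what a splitter adds to an iterator: it knows how many elements remain and can split itself into independent halves.

```scala
// A library-free sketch of the splitter idea: an iterator over a
// string range [i, end) that knows its size and can split itself.
class StringSplitter(val str: String, var i: Int, val end: Int) {
  def hasNext: Boolean = i < end
  def next(): Char = { val r = str.charAt(i); i += 1; r }
  def remaining: Int = end - i
  // Split the remaining range into two independent halves.
  def split: (StringSplitter, StringSplitter) = {
    val mid = i + remaining / 2
    (new StringSplitter(str, i, mid), new StringSplitter(str, mid, end))
  }
}
```

Because each half owns a disjoint index range over the shared string, two workers can consume them without any synchronization.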
Word count now with parallel strings 
new ParString(txt).aggregate((0, 0, 0))({ 
case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) 
case ((ls, 0, _), c) => (ls, 1, 0) 
case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) 
case ((ls, wc, 0), c) => (ls, wc, 0) 
case ((ls, wc, rs), c) => (ls, wc + 1, 0) 
}, { 
case ((0, 0, 0), res) => res 
case (res, (0, 0, 0)) => res 
case ((lls, lwc, 0), (0, rwc, rrs)) => 
(lls, lwc + rwc - 1, rrs) 
case ((lls, lwc, _), (_, rwc, rrs)) => 
(lls, lwc + rwc, rrs) 
})
Word count performance 
txt.foldLeft((0, true)) { 
case ((wc, _), ' ') => (wc, true) 
case ((wc, true), x) => (wc + 1, false) 
case ((wc, false), x) => (wc, false) 
} 
new ParString(txt).aggregate((0, 0, 0))({ 
case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) 
case ((ls, 0, _), c) => (ls, 1, 0) 
case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) 
case ((ls, wc, 0), c) => (ls, wc, 0) 
case ((ls, wc, rs), c) => (ls, wc + 1, 0) 
}, { 
case ((0, 0, 0), res) => res 
case (res, (0, 0, 0)) => res 
case ((lls, lwc, 0), (0, rwc, rrs)) => 
(lls, lwc + rwc - 1, rrs) 
case ((lls, lwc, _), (_, rwc, rrs)) => 
(lls, lwc + rwc, rrs) 
}) 
sequential: 100 ms 
cores: 1 2 4 
time: 137 ms 70 ms 35 ms
Hierarchy 
GenTraversable → GenIterable → GenSeq 
sequential: Traversable → Iterable → Seq 
parallel: ParIterable → ParSeq 
the Gen* traits abstract over both
Hierarchy 
def nonEmpty(sq: Seq[String]) = { 
val res = new mutable.ArrayBuffer[String]() 
for (s <- sq) { 
if (s.nonEmpty) res += s 
} 
res 
}
Hierarchy 
def nonEmpty(sq: ParSeq[String]) = { 
val res = new mutable.ArrayBuffer[String]() 
for (s <- sq) { 
if (s.nonEmpty) res += s 
} 
res 
} 
side-effects! 
ArrayBuffer is not synchronized!
Hierarchy 
def nonEmpty(sq: GenSeq[String]) = { 
val res = new mutable.ArrayBuffer[String]() 
for (s <- sq) { 
if (s.nonEmpty) res.synchronized { 
res += s 
} 
} 
res 
}
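The synchronized buffer works, but for this particular method the side effect is avoidable altogether; a side-effect-free sketch (written against Seq here, since GenSeq moved out of the standard library after 2.12):

```scala
// filter lets the collection build its own result, so the same code
// is safe for sequential and parallel inputs alike - no shared buffer.
def nonEmptyStrings(sq: Seq[String]): Seq[String] = sq.filter(_.nonEmpty)
```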
Accessors vs. transformers some methods need more than just splitters 
accessors: foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … 
transformers: map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … 
these return collections! 
Sequential collections – builders 
Parallel collections – combiners
Builders building a sequential collection 
[diagram: elements 1–7 are fed to a ListBuilder with +=; result returns the assembled list ending in Nil]
How to build parallel?
Combiners building parallel collections 
trait Combiner[-Elem, +To] 
extends Builder[Elem, To] { 
def combine[N <: Elem, NewTo >: To] 
(other: Combiner[N, NewTo]): 
Combiner[N, NewTo] 
} 
combine merges two combiners into one 
Should be efficient – O(log n) worst case 
How to implement this combine?
Parallel arrays 
[diagram: workers filter chunks (1, 2, 3, 4), (5, 6, 7, 8), (3, 1, 8, 0), (2, 2, 1, 9) down to (2, 4), (6, 8), (8, 0), (2, 2); the combiners merge the chunk lists, allocate the final array, and copy the elements in]
Parallel hash tables 
e.g. calling filter on a ParHashMap with keys 0, 1, 2, 4, 5, 7, 8, 9 
[diagram: each worker traverses its part of the table and adds the surviving elements into its own ParHashCombiner – e.g. 0, 1, 4 in one and 5, 7, 9 in the other]
Parallel hash tables 
How to merge two ParHashCombiners? buckets! 
elements are pre-sorted into buckets by hashcode prefix, e.g. 0 = 0000₂, 1 = 0001₂, 4 = 0100₂
Parallel hash tables 
combine: the two ParHashCombiners simply exchange bucket lists – no copying! 
result: the buckets are written into the final ParHashMap table
Custom combiners for methods returning custom collections 
new ParString(txt).filter(_ != ' ') 
What is the return type here? It creates a ParVector! 
to get a ParString back, declare the representation types and override newCombiner: 
class ParString(val str: String) 
extends immutable.ParSeq[Char] 
with ParSeqLike[Char, ParString, WrappedString] 
{ 
def apply(i: Int) = str.charAt(i) 
... 
protected[this] override def newCombiner = 
new ParStringCombiner
Custom combiners for methods returning custom collections 
class ParStringCombiner 
extends Combiner[Char, ParString] { 
var size = 0 
val chunks = ArrayBuffer(new StringBuilder) 
var lastc = chunks.last 
def +=(elem: Char) = { 
lastc += elem 
size += 1 
this 
}
Custom combiners for methods returning custom collections 
... 
def combine[U <: Char, NewTo >: ParString] 
(other: Combiner[U, NewTo]) = other match { 
case psc: ParStringCombiner => 
size += psc.size 
chunks ++= psc.chunks 
lastc = chunks.last 
this 
}
Custom combiners for methods returning custom collections 
... 
def result = { 
val rsb = new StringBuilder 
for (sb <- chunks) rsb.append(sb) 
new ParString(rsb.toString) 
} 
...
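Putting the pieces together, here is a library-free sketch of the combiner (dropping the Combiner[Char, ParString] supertype so it runs without the parallel-collections module):

```scala
import scala.collection.mutable.ArrayBuffer

// Chunks of StringBuilders: += appends to the last chunk, combine
// merges chunk lists without copying characters, result concatenates once.
class StringCombiner {
  var size = 0
  val chunks = ArrayBuffer(new StringBuilder)
  var lastc = chunks.last

  def +=(elem: Char): this.type = {
    lastc += elem
    size += 1
    this
  }

  def combine(that: StringCombiner): this.type = {
    size += that.size      // merged size is the sum of both
    chunks ++= that.chunks // adopt the other combiner's chunks
    lastc = chunks.last
    this
  }

  def result: String = {
    val rsb = new StringBuilder
    for (sb <- chunks) rsb.append(sb)
    rsb.toString
  }
}
```

Because combine only moves chunk references, its cost is proportional to the number of chunks, not the number of characters; all character copying is deferred to result.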
Custom combiners for methods expecting implicit builder factories 
// only for big boys 
... 
with GenericParTemplate[T, ParColl] 
... 
object ParColl extends ParFactory[ParColl] { 
implicit def canCombineFrom[T] = 
new GenericCanCombineFrom[T] 
...
Custom combiners performance measurement 
txt.filter(_ != ' ') 
new ParString(txt).filter(_ != ' ') 
sequential: 106 ms 
cores: 1 2 4 
time: 125 ms 81 ms 56 ms
Custom combiners performance measurement 
[plot: time vs. processors – 125 ms on 1 core, 81 ms on 2, 56 ms on 4; the curve flattens because def result is not parallelized]
Custom combiners tricky! 
• two-step evaluation 
– parallelize the result method in combiners 
• efficient merge operation 
– binomial heaps, ropes, etc. 
• concurrent data structures 
– non-blocking scalable insertion operation 
– we’re working on this
Future work coming up 
• concurrent data structures 
• more efficient vectors 
• custom task pools 
• user defined scheduling 
• parallel bulk in-place modifications
Thank you! 
Examples at: 
git://github.com/axel22/sd.git

  • 57. Word count aggregation  element txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) ” me s” nonzero words, last space and current non-space – one more word
  • 58. Word count in parallel txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) })
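The whole scheme can be checked without a parallel runtime: split the string at an arbitrary point, fold each half with the element function, then merge with the combination function. A sketch under that assumption (the names `ParWordCount`, `seqop`, `combop` are ours):

```scala
object ParWordCount {
  type Acc = (Int, Int, Int) // (spaces on the left, word count, spaces on the right)
  val zero: Acc = (0, 0, 0)

  def seqop(acc: Acc, c: Char): Acc = (acc, c) match {
    case ((ls, 0, _), ' ')   => (ls + 1, 0, ls + 1) // only spaces seen so far
    case ((ls, 0, _), _)     => (ls, 1, 0)          // first word starts
    case ((ls, wc, rs), ' ') => (ls, wc, rs + 1)    // one more trailing space
    case ((ls, wc, 0), _)    => (ls, wc, 0)         // still inside a word
    case ((ls, wc, _), _)    => (ls, wc + 1, 0)     // new word after spaces
  }

  def combop(l: Acc, r: Acc): Acc = (l, r) match {
    case ((0, 0, 0), res) => res
    case (res, (0, 0, 0)) => res
    case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) // word spans the cut
    case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs)
  }

  // simulate two workers on an arbitrary partition, then merge
  def count(txt: String, split: Int): Int = {
    val (a, b) = txt.splitAt(split)
    combop(a.foldLeft(zero)(seqop), b.foldLeft(zero)(seqop))._2
  }
}
```

Every split point must give the same answer — that is exactly the "arbitrary partitions" requirement from the slides.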
  • 59. Word count using parallel strings? txt.par.aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) })
  • 60. Word count string not really parallelizable scala> (txt: String).par
  • 61. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…)
  • 62. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…) different internal representation!
  • 63. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…) different internal representation! ParArray
  • 64. Word count string not really parallelizable scala> (txt: String).par collection.parallel.ParSeq[Char] = ParArray(…) different internal representation! ParArray  copy string contents into an array
  • 65. Conversions going parallel // `par` is efficient for... mutable.{Array, ArrayBuffer, ArraySeq} mutable.{HashMap, HashSet} immutable.{Vector, Range} immutable.{HashMap, HashSet}
  • 66. Conversions going parallel // `par` is efficient for... mutable.{Array, ArrayBuffer, ArraySeq} mutable.{HashMap, HashSet} immutable.{Vector, Range} immutable.{HashMap, HashSet} most other collections construct a new parallel collection!
  • 67. Conversions going parallel sequential parallel Array, ArrayBuffer, ArraySeq mutable.ParArray mutable.HashMap mutable.ParHashMap mutable.HashSet mutable.ParHashSet immutable.Vector immutable.ParVector immutable.Range immutable.ParRange immutable.HashMap immutable.ParHashMap immutable.HashSet immutable.ParHashSet
  • 68. Conversions going parallel // `seq` is always efficient ParArray(1, 2, 3).seq List(1, 2, 3, 4).seq ParHashMap(1 -> 2, 3 -> 4).seq "abcd".seq // `par` may not be... "abcd".par
  • 70. Custom collection class ParString(val str: String)
  • 71. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] {
  • 72. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length
  • 73. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length def seq = new WrappedString(str)
  • 74. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length def seq = new WrappedString(str) def splitter: Splitter[Char]
  • 75. Custom collection class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) def length = str.length def seq = new WrappedString(str) def splitter = new ParStringSplitter(0, str.length)
  • 76. Custom collection splitter definition class ParStringSplitter(var i: Int, len: Int) extends Splitter[Char] {
  • 77. Custom collection splitters are iterators class ParStringSplitter(var i: Int, len: Int) extends Splitter[Char] { def hasNext = i < len def next = { val r = str.charAt(i) i += 1 r }
  • 78. Custom collection splitters must be duplicated ... def dup = new ParStringSplitter(i, len)
  • 79. Custom collection splitters know how many elements remain ... def dup = new ParStringSplitter(i, len) def remaining = len - i
  • 80. Custom collection splitters can be split ... def psplit(sizes: Int*): Seq[ParStringSplitter] = { val splitted = new ArrayBuffer[ParStringSplitter] for (sz <- sizes) { val next = (i + sz) min len splitted += new ParStringSplitter(i, next) i = next } splitted }
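A self-contained sketch of the same splitter, written against plain `Iterator` so it runs outside the `ParSeq` machinery — the class name `StringSplitter` is ours, and `str` is passed explicitly since the slides' version closes over the enclosing `ParString`:

```scala
import scala.collection.mutable.ArrayBuffer

// Minimal stand-alone version of the ParString splitter: it iterates the
// substring [i, len) of str and can carve itself into smaller splitters.
class StringSplitter(val str: String, var i: Int, val len: Int)
    extends Iterator[Char] {
  def hasNext = i < len
  def next() = { val r = str.charAt(i); i += 1; r }
  def dup = new StringSplitter(str, i, len)
  def remaining = len - i

  // carve off prefixes of the given sizes; the receiver is consumed
  def psplit(sizes: Int*): Seq[StringSplitter] = {
    val splitted = new ArrayBuffer[StringSplitter]
    for (sz <- sizes) {
      val next = (i + sz) min len
      splitted += new StringSplitter(str, i, next)
      i = next
    }
    splitted.toSeq
  }
}
```

Note that `psplit` never copies characters — each sub-splitter only holds a pair of indices into the shared string.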
  • 81. Word count now with parallel strings new ParString(txt).aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) })
  • 82. Word count performance txt.foldLeft((0, true)) { case ((wc, _), ' ') => (wc, true) case ((wc, true), x) => (wc + 1, false) case ((wc, false), x) => (wc, false) } new ParString(txt).aggregate((0, 0, 0))({ case ((ls, 0, _), ' ') => (ls + 1, 0, ls + 1) case ((ls, 0, _), c) => (ls, 1, 0) case ((ls, wc, rs), ' ') => (ls, wc, rs + 1) case ((ls, wc, 0), c) => (ls, wc, 0) case ((ls, wc, rs), c) => (ls, wc + 1, 0) }, { case ((0, 0, 0), res) => res case (res, (0, 0, 0)) => res case ((lls, lwc, 0), (0, rwc, rrs)) => (lls, lwc + rwc - 1, rrs) case ((lls, lwc, _), (_, rwc, rrs)) => (lls, lwc + rwc, rrs) }) 100 ms cores: 1 2 4 time: 137 ms 70 ms 35 ms
  • 83. Hierarchy GenTraversable GenIterable GenSeq Traversable Iterable Seq ParIterable ParSeq
  • 84. Hierarchy def nonEmpty(sq: Seq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res }
  • 85. Hierarchy def nonEmpty(sq: ParSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res }
  • 86. Hierarchy def nonEmpty(sq: ParSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res } side-effects! ArrayBuffer is not synchronized!
  • 87. Hierarchy def nonEmpty(sq: ParSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res += s } res } side-effects! ArrayBuffer is not synchronized! ParSeq Seq
  • 88. Hierarchy def nonEmpty(sq: GenSeq[String]) = { val res = new mutable.ArrayBuffer[String]() for (s <- sq) { if (s.nonEmpty) res.synchronized { res += s } } res }
  • 89. Accessors vs. transformers some methods need more than just splitters foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, …
  • 90. Accessors vs. transformers some methods need more than just splitters foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … These return collections!
  • 91. Accessors vs. transformers some methods need more than just splitters foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … Sequential collections – builders
  • 92. Accessors vs. transformers some methods need more than just splitters foreach, reduce, find, sameElements, indexOf, corresponds, forall, exists, max, min, sum, count, … map, flatMap, filter, partition, ++, take, drop, span, zip, patch, padTo, … Sequential collections – builders Parallel collections – combiners
  • 93. Builders building a sequential collection 1 2 3 4 5 6 7 Nil Nil ListBuilder += += += result
  • 94. How to build parallel?
  • 95. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] }
  • 96. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] } Combiner Combiner Combiner
  • 97. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] } Should be efficient – O(log n) worst case
  • 98. Combiners building parallel collections trait Combiner[-Elem, +To] extends Builder[Elem, To] { def combine[N <: Elem, NewTo >: To] (other: Combiner[N, NewTo]): Combiner[N, NewTo] } How to implement this combine?
  • 99. Parallel arrays 1, 2, 3, 4 5, 6, 7, 8 4 6, 8 3, 1, 8, 0 2, 2, 1, 9 8, 0 2, 2 merge merge merge copy allocate 2 4 6 8 8 0 2 2
  • 100. Parallel hash tables ParHashMap
  • 101. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 e.g. calling filter
  • 102. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner ParHashCombiner e.g. calling filter
  • 103. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner 0 1 4 ParHashCombiner 5 7 9
  • 104. Parallel hash tables ParHashMap 0 1 2 4 5 7 8 9 ParHashCombiner 0 1 4 ParHashCombiner 5 9 5 7 0 1 4 7 9
  • 105. Parallel hash tables ParHashMap ParHashCombiner ParHashCombiner How to merge? 5 7 0 1 4 9
  • 106. 5 7 8 9 1 4 0 Parallel hash tables buckets! ParHashCombiner ParHashCombiner ParHashMap 2 0 = 00002 1 = 00012 4 = 01002
  • 107. Parallel hash tables ParHashCombiner ParHashCombiner 0 1 4 9 7 5 combine
  • 108. Parallel hash tables ParHashCombiner ParHashCombiner 9 7 5 0 1 4 ParHashCombiner no copying!
  • 109. Parallel hash tables 9 7 5 0 1 4 ParHashCombiner
  • 110. Parallel hash tables 9 7 5 0 1 4 ParHashMap
  • 111. Custom combiners for methods returning custom collections new ParString(txt).filter(_ != ' ') What is the return type here?
  • 112. Custom combiners for methods returning custom collections new ParString(txt).filter(_ != ' ') creates a ParVector!
  • 113. Custom combiners for methods returning custom collections new ParString(txt).filter(_ != ' ') creates a ParVector! class ParString(val str: String) extends parallel.immutable.ParSeq[Char] { def apply(i: Int) = str.charAt(i) ...
  • 114. Custom combiners for methods returning custom collections class ParString(val str: String) extends immutable.ParSeq[Char] with ParSeqLike[Char, ParString, WrappedString] { def apply(i: Int) = str.charAt(i) ...
  • 115. Custom combiners for methods returning custom collections class ParString(val str: String) extends immutable.ParSeq[Char] with ParSeqLike[Char, ParString, WrappedString] { def apply(i: Int) = str.charAt(i) ... protected[this] override def newCombiner : Combiner[Char, ParString]
  • 116. Custom combiners for methods returning custom collections class ParString(val str: String) extends immutable.ParSeq[Char] with ParSeqLike[Char, ParString, WrappedString] { def apply(i: Int) = str.charAt(i) ... protected[this] override def newCombiner = new ParStringCombiner
  • 117. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] {
  • 118. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0
  • 119. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 size
  • 120. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) size
  • 121. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) size chunks
  • 122. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last size chunks
  • 123. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last size lastc chunks
  • 124. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last def +=(elem: Char) = { lastc += elem size += 1 this }
  • 125. Custom combiners for methods returning custom collections class ParStringCombiner extends Combiner[Char, ParString] { var size = 0 val chunks = ArrayBuffer(new StringBuilder) var lastc = chunks.last def +=(elem: Char) = { lastc += elem size += 1 this } size lastc chunks +1
  • 126. Custom combiners for methods returning custom collections ... def combine[U <: Char, NewTo >: ParString] (other: Combiner[U, NewTo]) = other match { case psc: ParStringCombiner => size += psc.size chunks ++= psc.chunks lastc = chunks.last this }
  • 127. Custom combiners for methods returning custom collections ... def combine[U <: Char, NewTo >: ParString] (other: Combiner[U, NewTo]) lastc chunks lastc chunks
  • 128. Custom combiners for methods returning custom collections ... def result = { val rsb = new StringBuilder for (sb <- chunks) rsb.append(sb) new ParString(rsb.toString) } ...
  • 129. Custom combiners for methods returning custom collections ... def result = ... lastc chunks StringBuilder
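The chunk-based idea can be sketched stand-alone, without the `Combiner` trait (the class name `StringCombiner` is ours): `+=` appends to the last chunk, `combine` concatenates the chunk lists without copying any characters, and only `result` performs the single final copy:

```scala
import scala.collection.mutable.ArrayBuffer

// Stand-alone sketch of the slides' ParStringCombiner: a list of
// StringBuilder chunks that can be merged in O(1) per combine.
class StringCombiner {
  var size = 0
  val chunks = ArrayBuffer(new StringBuilder)
  var lastc = chunks.last

  def +=(elem: Char): this.type = { lastc += elem; size += 1; this }

  // merge another combiner into this one: just splice its chunk list in
  def combine(that: StringCombiner): this.type = {
    size += that.size
    chunks ++= that.chunks
    lastc = chunks.last
    this
  }

  // the only copying step: concatenate all chunks into one string
  def result: String = {
    val rsb = new StringBuilder
    for (sb <- chunks) rsb.append(sb)
    rsb.toString
  }
}
```

This is what makes `combine` cheap while `result` stays O(n) — which is why the slides note that the `result` step is the one worth parallelizing.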
  • 130. Custom combiners for methods expecting implicit builder factories // only for big boys ... with GenericParTemplate[T, ParColl] ... object ParColl extends ParFactory[ParColl] { implicit def canCombineFrom[T] = new GenericCanCombineFrom[T] ...
  • 131. Custom combiners performance measurement txt.filter(_ != ' ') new ParString(txt).filter(_ != ' ')
  • 132. txt.filter(_ != ' ') new ParString(txt).filter(_ != ' ') 106 ms Custom combiners performance measurement
  • 133. txt.filter(_ != ' ') new ParString(txt).filter(_ != ' ') 106 ms 1 core 125 ms Custom combiners performance measurement
  • 134. txt.filter(_ != ' ') new ParString(txt).filter(_ != ' ') 106 ms 1 core 125 ms 2 cores 81 ms Custom combiners performance measurement
  • 135. txt.filter(_ != ' ') new ParString(txt).filter(_ != ' ') 106 ms 1 core 125 ms 2 cores 81 ms 4 cores 56 ms Custom combiners performance measurement
  • 136. 1 core 125 ms 2 cores 81 ms 4 cores 56 ms t/ms proc 125 ms 1 2 4 81 ms 56 ms Custom combiners performance measurement
  • 137. 1 core 125 ms 2 cores 81 ms 4 cores 56 ms t/ms proc 125 ms 1 2 4 81 ms 56 ms def result (not parallelized) Custom combiners performance measurement
  • 138. Custom combiners tricky! •two-step evaluation –parallelize the result method in combiners •efficient merge operation –binomial heaps, ropes, etc. •concurrent data structures –non-blocking scalable insertion operation –we’re working on this
  • 139. Future work coming up •concurrent data structures •more efficient vectors •custom task pools •user defined scheduling •parallel bulk in-place modifications
  • 140. Thank you! Examples at: git://github.com/axel22/sd.git