Spark Deep Dive
Corey Nolet
Tetra Concepts
Design Philosophies
● Akka
  ● Remote actor model
  ● Designing for scalability
  ● Distributed / concurrent processing across threads, processes, and machines
● Scala
  ● Functional / closure-based
  ● Lazily evaluated
  ● Immutable
  ● Type inference
  ● Terse but safe
Hadoop-based
● Integration with HDFS
● Preserves data locality
● Shuffles for all-to-all communications
● Integrates natively with resource negotiators like YARN
● Can use existing Hadoop input/output formats
Spark Deep Dive
New concepts for the community
● Dependency graph instead of map/combine/reduce
  ● Can be narrow or wide depending on communication model
● Reprocessing partitions instead of restarting entire tasks
● Dataset appears like a local collection, but actions cause distributed computation
● Memory can be used to cache data for reuse across different transformations & actions
Spark Deep Dive
RDD
● API similar to Scala's collections API
● Provides lazy transformations like map() and flatMap(), plus actions like reduce() and collect() (see the sketch below)
● Transformation lineage is tracked
● Partitions can be rebuilt in the case of failure
● Broken up into partitions that get scheduled as tasks
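A minimal sketch of that laziness, assuming an existing SparkContext sc:

val rdd = sc.parallelize(1 to 100)  // distribute a local collection
val doubled = rdd.map(_ * 2)        // lazy: only records lineage
val sum = doubled.reduce(_ + _)     // action: runs the distributed job
println(sum)                        // 10100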
Jobs, Stages, Tasks
● SparkContext can be a long-running object, and we can submit many jobs to it
● Job: a sequence of transformations and actions on an RDD (see the sketch below)
● Stage: a specific transformation or action on an RDD that gets scheduled on the executors
● Tasks: the actual closures executing on executors to process stages
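As an illustration (hypothetical input path), one action triggers one job, and the shuffle introduced by reduceByKey() splits it into two stages:

val counts = sc.textFile("hdfs://...")  // stage 1: read, flatMap, map
  .flatMap(_.split(" "))
  .map((_, 1))
  .reduceByKey(_ + _)                   // shuffle boundary => stage 2
counts.collect()                        // action: submits the job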
What are partitions?
● Chunks of data that make up an RDD
● Distributed across the cluster and control the parallelism of processing
● Often start in a job from an input format
  ● Similar to input splits in MapReduce
● Number can change throughout the stages that make up a job
● Default can be set using spark.default.parallelism (see the sketch below)
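A small sketch with hypothetical values:

val rdd = sc.parallelize(1 to 1000, 8)      // explicitly request 8 partitions
println(rdd.partitions.length)              // 8
val defaultRdd = sc.parallelize(1 to 1000)  // spark.default.parallelism decides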
Partitions
Partition Locality
● Can carry a set of “preferred locations” for which tasks should be scheduled
● Like splits in MapReduce
● Locality levels are lowered when the preferred executors stay too busy
● Process, Node, Rack, Any, or No Pref
  ● Process is most preferred
● Set through spark.locality.wait.[process,node,rack] (see the sketch below)
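An illustrative configuration sketch (the 3s values are assumptions; each property bounds how long the scheduler waits at that locality level before falling back to the next):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.locality.wait", "3s")          // base wait for all levels
  .set("spark.locality.wait.process", "3s")  // PROCESS_LOCAL
  .set("spark.locality.wait.node", "3s")     // NODE_LOCAL
  .set("spark.locality.wait.rack", "3s")     // RACK_LOCAL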
Partition Sizes
● Can be changed manually using rdd.coalesce()
● Low overhead in deserializing tasks to process partitions
● Unlike MapReduce, many small partitions are recommended over a few large ones
  ● Generally 2-3 partitions per core
  ● Tasks can be small enough to run in 200ms and still be efficient
Changing Partition Sizes

rdd.coalesce(numPartitions: Int, shuffle: Boolean = false)

rdd.repartition(numPartitions: Int)
Changing Partition Sizes

rdd.repartition(numParts) actually calls rdd.coalesce(numParts, shuffle = true)
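A brief usage sketch (hypothetical partition counts), assuming an existing RDD rdd:

val narrowed = rdd.coalesce(4)        // shrink without a shuffle (narrow dependency)
val rebalanced = rdd.repartition(32)  // grow or rebalance via a shuffle (wide dependency)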
Coalesce (not shuffled)
● Results in a narrow dependency
● For reducing the number of partitions
  ● A drastic decrease (e.g. 1000 → 1) usually benefits more from shuffling
● Final number of partitions will never be greater than the specified amount
  ● Could be less if the number of parent partitions is less
Coalesce (not shuffled)
● Groups final partitions so they map to the same number of parent partitions
● When parents have locality information:
  ● Attempts to group parent partitions on their local nodes
● When parents don't have locality information:
  ● Creates groups by chunking parents that are close in the array of partitions
Coalesce (shuffled)
● Results in a wide dependency
● Allows the number of partitions to be increased, at the expense of a shuffle
● Evens out the distribution of data using a hash partitioner
Memory in Spark
Executor Memory
● Divided among cache and processing
● 60% used for cached objects:
  spark.storage.memoryFraction=0.6
  spark.storage.safetyFraction=0.9
● 20% used for shuffles:
  spark.shuffle.memoryFraction=0.2
  spark.shuffle.safetyFraction=0.8
● What's left over is for task execution
● Usable memory is defined as follows (worked example below):
  (max memory allocated to the JVM - overhead memory used in the JVM) * memoryFraction * safetyFraction
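A worked example with hypothetical numbers (4 GB of heap, with JVM overhead already subtracted for simplicity):

val heapMb    = 4096.0               // memory left after JVM overhead
val storageMb = heapMb * 0.6 * 0.9   // ~2212 MB usable for cached RDDs
val shuffleMb = heapMb * 0.2 * 0.8   // ~655 MB usable for shuffles
// the remaining ~1229 MB is left for task execution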
Executor Memory
● High JVM overhead can significantly reduce the amount of memory available for caching, shuffles, and task execution
● Default amount allocated for YARN executors used to be 7%; raised to 10% in 1.3
● Dependent on choices of data structures and the overhead of classes used
● spark.yarn.executor.memoryOverhead
RDD Caching
● Useful when multiple downstream transformations depend on a single upstream RDD (see the sketch below):
  inputRdd.map(...).saveAsTextFile(...)
  inputRdd.map(...).saveAsTextFile(...)
● Done through rdd.persist()
● LRU eviction of memory-cached RDDs when memory is full (automatic cleanup)
● Can be manually evicted using rdd.unpersist()
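A minimal sketch (parseA, parseB, and the paths are hypothetical) of caching the shared upstream RDD so the second job reads from memory instead of recomputing:

val inputRdd = sc.textFile("hdfs://...")
inputRdd.persist()                              // defaults to MEMORY_ONLY
inputRdd.map(parseA).saveAsTextFile(outPathA)   // first job also populates the cache
inputRdd.map(parseB).saveAsTextFile(outPathB)   // second job reuses cached partitions
inputRdd.unpersist()                            // manual eviction when done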
RDD Caching
● Deserialized / raw
  ● Generally faster: no cost of serializing data
  ● Larger data sets put pressure on the garbage collector
● Serialized
  ● Can take 2-4x less memory
  ● Can be slower to process than raw data as long as the garbage collector is keeping up
Storage Levels
● MEMORY_ONLY
● MEMORY_AND_DISK
● MEMORY_ONLY_SER
● MEMORY_AND_DISK_SER
● DISK_ONLY
● MEMORY_ONLY_SER_2
● MEMORY_AND_DISK_SER_2
● MEMORY_ONLY_2
● MEMORY_AND_DISK_2
● OFF_HEAP
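The _SER suffix stores partitions in serialized form, and the _2 suffix replicates each partition to two nodes. A one-line usage sketch, assuming an existing RDD rdd:

import org.apache.spark.storage.StorageLevel

rdd.persist(StorageLevel.MEMORY_AND_DISK_SER)  // cache() is shorthand for persist(MEMORY_ONLY)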
Tachyon
● Uses a ramdisk (in-memory file system) to expose an HDFS API
● Asynchronously writes to HDFS
● Allows off-heap caching to put less pressure on the garbage collector
● Data can be shared by multiple executors
● Cached data is not lost when an executor dies
● Still experimental as of Spark 1.4.0
Project Tungsten
● Designs for three major optimizations to Spark
  ● One provides off-heap memory management to lower object overheads and bypass garbage collection
  ● Another provides cache-aware data structures that can minimize memory lookups
● https://databricks.com/blog/2015/04/28/project-tungsten-bringing-spark-closer-to-bare-metal.html
Shared Memory
● Broadcast variables
  ● Read-only memory cached on each executor and shared across tasks
  ● Can be used like the distributed cache in MapReduce to share large lookup tables across tasks
● Accumulators
  ● Can be used like counters in MapReduce
  ● Can also perform any generic associative operation
Broadcast & Accumulators

// Using a broadcast variable
val valueToWrap = "fubar"
val broadcastVal = sc.broadcast(valueToWrap)
...
rdd.filter(_ == broadcastVal.value)

// Using an accumulator
val accumulator = sc.accumulator(0)
rdd.map(it => {
  accumulator += 1
  it
})
Serialization in Spark
● Two different types
  ● Closures
  ● Data
Closures
● Scala can be a little confusing
  ● Functions vs. methods
  ● Objects vs. classes
● A closure is just an anonymous implementation of the FunctionX trait in Scala
● A closure will always contain a reference to its outer object
  ● Any objects used inside the closure will be serialized
Functions vs. Methods

class MyClass {
  // compiles down to a Java method
  def myMethod(): Unit = {}
}

class MyClass {
  // compiles to an impl of the FunctionX trait
  val myFunction: () => Unit = () => {}
}

Methods can also be coerced into functions, allowing them to be passed around like functions (see the sketch below).
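A small sketch of that coercion (eta-expansion; the trailing underscore forces it):

class MyClass {
  def myMethod(): Unit = {}
  // eta-expansion: the method becomes a Function0 value
  val asFunction: () => Unit = myMethod _
}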
Objects vs. Classes

object MyObject {
  // compiles to a static member of MyObject
  val myVal: Boolean = true
  // compiles to a Java static method
  def myMethod(): Unit = {}
}

class MyClass {
  // compiles to an instance value
  val myVal: Boolean = true
  // compiles to a method on MyClass
  def myMethod(): Unit = {}
}
Closure Serialization
● The primary way code makes it from the driver to executors
  ● No more extends Mapper/Reducer
  ● Closures can be shipped at runtime
● Currently only supports Java serialization
● The closure cleaner attempts to prune unused references from the object graph
  ● Can still use unnecessary memory if not careful
Closure Cleaner

class MyProcessor {
  def process(rdd: RDD[String]) {
    rdd.filter(_ == "good")
    ...
  }
}

The filter() closure's reference to the outer class MyProcessor gets pruned by the ClosureCleaner because it is not used.
Closure Cleaner

class MyProcessor(
  filterWord: String
) {
  def process(rdd: RDD[String]) {
    rdd.filter(_ == filterWord)
    ...
  }
}

The whole class gets serialized, but it doesn't extend Serializable, so execution will fail.
Closure Cleaner

object MyProcessor {
  val filterWord = ...
  def process(rdd: RDD[String]) {
    rdd.filter(_ == filterWord)
    ...
  }
}

process() compiles to a Java static method, so only filter()'s closure gets serialized.
Closure Cleaner

class MyProcessor(
  filterWord: String
) {
  def process(rdd: RDD[String]) {
    val filterWord2 = filterWord
    rdd.filter(_ == filterWord2)
    ...
  }
}

The filter() closure serializes cleanly because the local filterWord2 separates the value from the MyProcessor instance.
Data Serialization
● Kryo & Java are both supported
● Kryo is faster and more compact than Java:
  spark.serializer = org.apache.spark.serializer.KryoSerializer
● Kryo requires object serializers to be registered (see the sketch below)
  ● Native Scala classes are supported
● Serialization errors will not be noticed until the data leaves the JVM
● Used both in memory and on disk
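A hedged registration sketch (MyRecord is a stand-in for an application class):

import org.apache.spark.SparkConf

case class MyRecord(id: Int, name: String)  // stand-in application class

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .registerKryoClasses(Array(classOf[MyRecord]))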
Shuffling in Spark
Shuffling
● Required for all-to-all communications
  ● reduceByKey(), aggregateByKey(), sortByKey(), etc.
● Always a bottleneck
  ● Network & disk IO
  ● Serialization
  ● Compression
● Receiving lots of attention
Spark vs MapReduce
● The reduce phase does not overlap with the map phase as it does in MapReduce
  ● Spark reducers pull shuffle data from mappers
  ● MapReduce pushes it during a concurrent copy phase
● Map and reduce tasks all run on the same executor JVMs
  ● MapReduce uses different JVMs for these tasks
First there was a hash-based shuffle...
● Originally required M * R intermediate files (# of mappers * # of reducers)
● Concurrently opened files are C * R (# of cores * # of reducers)
● Enabling shuffle spilling created even more temporary files
● Many random writes/reads meant CPU time in the reducers was mostly spent waiting on disk I/O
Original hash-based shuffle
Then they consolidated files...
● Introduced an extra merge phase
● All map tasks running on the same core write to the same set of files in tandem
● File consolidation reduces the total number of files to C * R
● Each reducer fetches a smaller number of files
● Still bad for high numbers of reducers
  ● Concurrently opened files are still C * R
● spark.shuffle.consolidateFiles=true
Consolidated files
And along came sort-based shuffle
1) Records are sorted in memory by partition ID and merged into a single file per core, along with an index file
   ● If there is a map-side combine, buckets are sorted by key & partition and run through the combiner
   ● Otherwise, they are sorted only by partition
2) Ranges of buckets in each file are served to reducers upon request
3) Each segment is merged together on the reducer
4) Records are deserialized and passed through the all-to-all function (e.g. aggregateByKey(), reduceByKey()) to complete the stage
   ● In the case of sortByKey() and other ordered functions, the partitions are sorted before being run through the all-to-all function
5) When there are <= 200 reducers and no sort or aggregation is needed, hash-based shuffle is used instead
   ● spark.shuffle.sort.bypassMergeThreshold
And along came a sort-based shuffle
Shuffle Evolution
● Shuffle write consolidation in 0.9
● Pluggable shuffle managers in Spark 1.0
  ● Hash-based (default pre-1.2)
  ● Sort-based (introduced in 1.1, default in 1.2+)
● NettyBlockTransferService introduced in 1.2 for transferring shuffle “blocks”
● External shuffle service introduced in 1.2
● In 1.5+, the community is working on a tiered merge strategy
Shuffle Durability
● Failure of an executor will lose its shuffle files unless the aux shuffle service is configured on the YARN NodeManager.

SparkConf:
  spark.shuffle.service.enabled = true
In yarn-site.xml, add spark_shuffle to:
  yarn.nodemanager.aux-services
In yarn-site.xml, add:
  yarn.nodemanager.aux-services.spark_shuffle.class =
    org.apache.spark.network.yarn.YarnShuffleService
Perhaps we could establish some best practices
● Consider the parallelism at each stage of your jobs based on your data and number of cores
● Executor memory should be fine-tuned for expected cache and shuffle sizes
● Minimize the footprint of closures
● Use broadcasts for large values
● Use Kryo to serialize data
● Know your communication patterns (one-to-all, all-to-all, etc.) and optimize accordingly
● Use the aux shuffle service
Shuffle Optimization
● A couple of properties that affect shuffle performance:
  ● spark.akka.threads
  ● spark.reducer.maxMbInFlight
● By default, shuffles will use only 20% of the memory allocated to the executor
● Increase spark.shuffle.memoryFraction at the expense of spark.storage.memoryFraction (see the sketch below)
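An illustrative rebalancing sketch (the values are assumptions, not recommendations):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.shuffle.memoryFraction", "0.4")  // up from the 0.2 default
  .set("spark.storage.memoryFraction", "0.4")  // down from the 0.6 default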
Questions?
corey@tetraconcepts.com


Editor's Notes

  1. I think a great place to start would be with the core design philosophies that the architecture was designed around. First, we have Akka. This is a framework that, similar to many other distributed frameworks today, has you thinking about breaking your problems into their atomic particles first, so that those particles can be designed on a single node and then scaled out as needed to run on clusters of machines. At its heart, it focuses on Actors, which know how to intercept, process, and create new messages that are sent to other Actors. Actors are nothing more than objects which get serialized, deployed onto nodes, and then start doing their magic, processing the messages they choose to accept. Then there's Scala, which is the backbone that supports the expressive nature of the Akka framework. I could go on for hours about why Scala is such a useful language for processing data, but I'll opt for the basic bullet points since we're going to be strapped for time. Scala blends together functional and imperative programming in the JVM by using Java objects to define closures, or first-class functions that remember any variables available in the environment in which they were created. Similar to Java, Scala has a rich collections API. Unlike Java, however, Scala promotes immutability and uses shared structural state to create new lightweight objects out of older objects for many operations that would normally require mutating state. Of course it does have mutable objects for those who need them, but it recommends sticking with immutability. When operations like add, remove, concatenate, etc. are performed on many of the collection objects, they need not be copied; instead, the inner structure is shared. Similar to Java 8's streams API and Guava's Iterables, Scala has lazily evaluated collection views.
  2. It turns out, this was no coincidence. A guy by the name of Matei Zaharia wrote Hadoop's FairScheduler in 2008 while interning at Facebook. While getting his PhD at UC Berkeley, he created Apache Spark and Apache Mesos. Clearly this guy is no stranger to large-scale scheduling of computations on distributed systems. He is one of the co-founders of Databricks and is currently their CTO, while also an assistant professor of Computer Science at MIT. We love this guy, even though many of us don't know who he is.