Alluxio
About me
christophe.marchal@ilegra.com
@toff63
http://github.com/toff63
http://francesbagual.net
Berkeley Data Analytics Stack
Supported Storages and Framework
Memory-Centric distributed storage system
Storage (hdfs, s3, ...)
Alluxio: Block 1, Block 2, Block 3
Spark Job A (Spark Memory): Block 1, Block 2
Spark Job B (Spark Memory): Block 1, Block 3
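The diagram above can be read as two jobs sharing one in-memory copy of each block. A minimal sketch of that idea follows. It is a toy model and not the Alluxio client API: the class names `UnderStorage` and `SharedBlockCache` and the block contents are made up for illustration.

```python
# Toy model only: these classes are hypothetical, not part of Alluxio.
class UnderStorage:
    """Stands in for slow persistent storage such as HDFS or S3."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.reads = 0  # counts how often the slow store is hit

    def read(self, block_id):
        self.reads += 1
        return self.blocks[block_id]


class SharedBlockCache:
    """In-memory block cache shared by every job (the role Alluxio plays in the diagram)."""

    def __init__(self, under_storage):
        self.under_storage = under_storage
        self.memory = {}

    def read(self, block_id):
        # Go to the under storage only on the first access to a block.
        if block_id not in self.memory:
            self.memory[block_id] = self.under_storage.read(block_id)
        return self.memory[block_id]


def run_job(cache, block_ids):
    """A 'job' here simply reads its blocks through the shared cache."""
    return [cache.read(b) for b in block_ids]


storage = UnderStorage({1: "data-1", 2: "data-2", 3: "data-3"})
cache = SharedBlockCache(storage)
job_a = run_job(cache, [1, 2])  # Spark Job A uses Block 1 and Block 2
job_b = run_job(cache, [1, 3])  # Spark Job B uses Block 1 and Block 3
```

Block 1 is needed by both jobs but is fetched from the under storage once, so four block reads cost only three trips to slow storage.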
Architecture
Master
● Metadata
● Workflow Manager
Worker Worker Worker
Tiered Storage
Master
Worker Worker Worker
Memory SSD HDD
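One way to picture tiered storage is a worker that keeps blocks in the fastest tier and pushes the least recently used block down when a tier fills up. The sketch below assumes LRU eviction and promotion on read; the slide does not state the policy, and `TieredStore` is a hypothetical class, not Alluxio code.

```python
from collections import OrderedDict


class TieredStore:
    """Toy worker store: tiers ordered fastest first, LRU eviction to the next tier."""

    def __init__(self, capacities):
        # capacities: list of (tier_name, max_blocks), fastest tier first
        self.tiers = [(name, cap, OrderedDict()) for name, cap in capacities]

    def put(self, block_id, data, level=0):
        if level >= len(self.tiers):
            return  # fell off the slowest tier: the block leaves the cache
        _, cap, blocks = self.tiers[level]
        blocks[block_id] = data
        blocks.move_to_end(block_id)
        if len(blocks) > cap:
            # Evict the least recently used block to the next (slower) tier.
            victim_id, victim = blocks.popitem(last=False)
            self.put(victim_id, victim, level + 1)

    def get(self, block_id):
        for _, _, blocks in self.tiers:
            if block_id in blocks:
                data = blocks.pop(block_id)
                self.put(block_id, data, 0)  # promote to the fastest tier
                return data
        return None

    def location(self, block_id):
        for name, _, blocks in self.tiers:
            if block_id in blocks:
                return name
        return None


# One block per tier keeps the example small.
store = TieredStore([("MEM", 1), ("SSD", 1), ("HDD", 1)])
store.put("b1", "x")
store.put("b2", "y")
store.put("b3", "z")
```

After the three writes, the newest block sits in memory and the oldest has been pushed down to HDD. Reading `b1` again would promote it back to memory and push the others down one tier.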
Unified and Transparent Namespace
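A unified namespace can be thought of as a mount table that maps paths in a single tree to different under storages. The sketch below is an illustration under that assumption. The `Namespace` class, the host `namenode:9000` and the bucket `example-bucket` are hypothetical.

```python
class Namespace:
    """Toy mount table: maps paths in one namespace to under-storage URIs."""

    def __init__(self):
        self.mounts = {}

    def mount(self, path, ufs_uri):
        self.mounts[path.rstrip("/") or "/"] = ufs_uri.rstrip("/")

    def resolve(self, path):
        # Pick the longest mount point that is a prefix of the path.
        best = None
        for mount_point in self.mounts:
            prefix = mount_point if mount_point.endswith("/") else mount_point + "/"
            if path == mount_point or path.startswith(prefix):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise KeyError(path)
        rest = path[len(best):].lstrip("/")
        return self.mounts[best] + ("/" + rest if rest else "")


ns = Namespace()
ns.mount("/", "hdfs://namenode:9000/data")   # hypothetical HDFS root
ns.mount("/logs", "s3://example-bucket/logs")  # hypothetical S3 bucket
logs_uri = ns.resolve("/logs/2016/app.log")
table_uri = ns.resolve("/tables/t1")
```

Clients only see paths such as `/logs/2016/app.log`; which storage system actually holds the bytes is decided by the mount table.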
Resiliency: Master
[Diagram: an active Master and a passive Master with their Workers; the active Master writes the Journal and the passive Master reads it.]
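The active/passive setup can be sketched as two masters sharing one journal: the active one appends every metadata change, and the passive one replays the entries so it can take over. This is a toy model with a Python list standing in for the journal. The `Master` class and the operation names `create` and `delete` are assumptions made for illustration.

```python
class Master:
    """Toy master: metadata is a dict kept in sync by replaying a shared journal."""

    def __init__(self, journal):
        self.journal = journal  # shared list standing in for the durable journal
        self.applied = 0        # number of journal entries already applied
        self.metadata = {}

    def _apply(self, entry):
        op, path, value = entry
        if op == "create":
            self.metadata[path] = value
        elif op == "delete":
            self.metadata.pop(path, None)

    def write(self, op, path, value=None):
        # Active master: log first, then apply, so the journal is never behind.
        entry = (op, path, value)
        self.journal.append(entry)
        self._apply(entry)
        self.applied = len(self.journal)

    def catch_up(self):
        # Passive master: replay the entries it has not seen yet.
        for entry in self.journal[self.applied:]:
            self._apply(entry)
        self.applied = len(self.journal)


journal = []
active = Master(journal)
passive = Master(journal)

active.write("create", "/a", {"blocks": [1, 2]})
active.write("create", "/b", {"blocks": [3]})
active.write("delete", "/a")

# If the active master fails, the passive one replays the journal and takes over.
passive.catch_up()
```

Because every change reaches the journal before it is applied, the passive master ends up with the same metadata as the active one.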
Resiliency: Lineage
[Diagram: lineage graph linking File Set A, File Set B, File Set C, File Set D and File Set E through two Spark jobs and one MapReduce job; one element is crossed out (X).]
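Lineage-based resiliency means a lost file set is rebuilt by re-running the job that produced it instead of being restored from a replica. The sketch below assumes a particular dependency graph (A produces C, B produces D, C and D together produce E), because the exact edges of the slide's graph are not recoverable. `LineageStore` and the jobs are hypothetical.

```python
class LineageStore:
    """Toy store that remembers how each file set was produced."""

    def __init__(self):
        self.data = {}        # file set name -> contents (entries may be lost)
        self.lineage = {}     # file set name -> (job function, input names)
        self.recomputed = []  # names rebuilt from lineage, in order

    def put_source(self, name, contents):
        self.data[name] = contents

    def run_job(self, output, job, inputs):
        # Record the lineage before storing the result.
        self.lineage[output] = (job, inputs)
        self.data[output] = job(*[self.get(i) for i in inputs])

    def lose(self, name):
        self.data.pop(name, None)

    def get(self, name):
        if name not in self.data:
            if name not in self.lineage:
                raise KeyError(f"{name} is lost and has no lineage")
            # Re-run the producing job, recursively rebuilding lost inputs.
            job, inputs = self.lineage[name]
            self.data[name] = job(*[self.get(i) for i in inputs])
            self.recomputed.append(name)
        return self.data[name]


store = LineageStore()
store.put_source("A", [1, 2, 3])
store.put_source("B", [10, 20])
store.run_job("C", lambda a: [x * 2 for x in a], ["A"])        # "Spark job"
store.run_job("D", lambda b: [x + 1 for x in b], ["B"])        # "Spark job"
store.run_job("E", lambda c, d: sum(c) + sum(d), ["C", "D"])   # "MapReduce job"

store.lose("C")
store.lose("E")
result = store.get("E")  # rebuilds C from A, then E from C and D
```

Only the lost file sets are recomputed; D is still present and is reused as-is.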
Code
Bigger Case
● Mem+HDD
● 100+ nodes
● 1 PB+ managed space
● 30x Perf improvement
Where is the code?
Thanks!
