© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Breaking IO Performance Barriers:
Scalable Parallel File System for AWS
Paresh G. Pattani, Ph.D.
Sr. Director, High Performance Data Solutions
Intel Corporation
July 10, 2014
The need for parallel storage
Parallel Storage Needs
Performance
• Time spent storing and retrieving data is time not spent on compute. Fast storage maximizes processing utilization.
Scalability
• Growing datasets require greater amounts of storage and the ability to expand existing storage.
Reliability
• Large clusters and critical workloads require a comprehensive focus on data availability.
Scale Out Storage Using Lustre*
• Purpose-built for HPC
• Distributed, Parallel, Vast Global Namespace
• Linux server based
• Linux, Windows and Mac client support
• Support for 100,000+ Clients
• Designed for Reliable Storage
• Now available on AWS Marketplace
lustre.intel.com/cloudedition
* Some names and brands may be claimed as the property of others.
Intel Strategy for Lustre* Storage
Extend core Lustre* for use across HPC and enterprise applications
Intel Enhanced Lustre* – HPC Clouds
• Extend core Lustre* with key features for new markets and use cases
• Push Lustre* onto HPC cloud infrastructure
Open-source innovation driving performance at scale
Open Source - Powerful storage foundation for exascale applications
• Increased scale and streaming bandwidth
• Accelerate maturity, lower risk and grow the ecosystem
Use Models: Cloud Resources for HPC
1 Augment: burst peak workloads and supplement resources
2 Transition: move on-premises HPC to cloud infrastructure
3 Deploy: launch new applications exclusively to the cloud
Key HPC Markets Using Lustre* Today
• Large-scale Manufacturing
• Weather and Climate
• Life Sciences
• Energy
• Finance
What Does Intel® Cloud Edition for Lustre* Software Look Like?
Lustre* Components
Management: MGS (server), MGT (target)
• Lustre* mount service
• Initial point of contact for clients
Metadata: MDS (servers), MDT (target)
• Namespace of file system
• File layouts, no data
• Scalable
Storage: OSS (servers), OST (targets)
• File content stored as objects
• Striped across targets
• Scales to 100+
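The "striped across targets" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain round-robin layout, with a stripe size of 1 MiB and a stripe count of 2 chosen for the example (neither value comes from the slides), and `locate_stripe` is a hypothetical helper name, not a Lustre API.

```python
MIB = 1024 * 1024


def locate_stripe(offset, stripe_size, stripe_count):
    """Map a byte offset in a file to (target index, offset inside that target's object).

    Assumes a simple round-robin layout; illustrative only.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and stripe_count must be > 0")
    stripe_number = offset // stripe_size          # which stripe-sized chunk of the file
    target_index = stripe_number % stripe_count    # chunks rotate over the targets
    # Full rounds already placed on this target, plus the position inside the chunk.
    object_offset = (stripe_number // stripe_count) * stripe_size + offset % stripe_size
    return target_index, object_offset


if __name__ == "__main__":
    for off in (0, MIB, 2 * MIB, 5 * MIB + 10):
        print(off, locate_stripe(off, stripe_size=MIB, stripe_count=2))
```

Because consecutive chunks land on different targets, a large sequential transfer can be served by several storage servers at once, which is what lets bandwidth grow with the number of OSSs.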
Deploying a Storage Cluster
Monitoring & Command Line Interface
Performance….
Large File Benchmark
Comparing 3 Lustre* cluster configurations
• Increase the number of OSSs: 4 OSS, 8 OSS, 16 OSS
• Configurations of MGS and MDS are the same
• We use 32 clients
MGS
• m1.medium
• 94 MB/sec
MDS
• m3.2xlarge, EBS Optimized
• RAID0, 8x 40GB Standard
• 110 MB/sec
OSS
• m3.2xlarge, EBS Optimized
• 8x 100GB Standard
• 110 MB/sec
Client
• m3.2xlarge
• 110 MB/sec
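A rough way to read the benchmark configuration is to estimate the aggregate throughput ceiling for each cluster size. The sketch below assumes that the 110 MB/sec figure is a per-instance network limit for both clients and OSSs, which the slides suggest but do not state outright; `throughput_ceiling` is a hypothetical helper written for this illustration.

```python
def throughput_ceiling(n_clients, n_oss, client_mb_s=110.0, oss_mb_s=110.0):
    """Return (ceiling in MB/sec, limiting side) for a client/OSS combination.

    Assumption: every instance is capped by its own network link and nothing
    else (disks, metadata, contention) limits the run.
    """
    client_side = n_clients * client_mb_s
    server_side = n_oss * oss_mb_s
    limit = min(client_side, server_side)
    # On a tie the server side is reported as the limit.
    bottleneck = "client network" if client_side < server_side else "OSS network"
    return limit, bottleneck


if __name__ == "__main__":
    for n_oss in (4, 8, 16):
        for n_clients in (1, 2, 4, 8, 16, 32):
            limit, side = throughput_ceiling(n_clients, n_oss)
            print(f"{n_oss:>2} OSS, {n_clients:>2} clients: {limit:7.0f} MB/sec ({side})")
```

Under these assumptions, runs with few clients are limited by the clients, and runs with many clients are limited by the number of OSSs.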
IOR Sequential Read FPP
[Figure: throughput in MB/sec (axis 0 to 1600) vs. N. Clients (1, 2, 4, 8, 16, 32) for 4 OSS, 8 OSS and 16 OSS. Annotations: "Client's network bottleneck", "OSS's network bottleneck" (twice), "Close to the OSS network".]
IOR Sequential Write FPP
[Figure: throughput in MB/sec (axis 0 to 1600) vs. N. Clients (1, 2, 4, 8, 16, 32) for 4 OSS, 8 OSS and 16 OSS. Annotations: "Client's network bottleneck", "OSS's network bottleneck" (twice), "Ops….".]
Aggregate Performance During Run
• LTOP is available, and we use it to record OST activity during the IOR run.
• With a simple Python script we create this graph, "aggregate performance vs. time", to analyze the problem.
[Figure: aggregate performance in MB/sec (1920 marked on the axis) vs. time, with a "Long tail" annotation.]
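The "aggregate performance vs. time" analysis could look roughly like the sketch below. It is not the script used for the slides: the input format (tuples of timestamp, OST name, MB/sec) is an assumption, since the LTOP output format is not shown, and the 50% threshold used to flag a long tail is an arbitrary choice for illustration.

```python
from collections import defaultdict


def aggregate_by_time(samples):
    """Sum per-OST rates into one aggregate rate per timestamp.

    samples: iterable of (timestamp_s, ost_name, mb_per_s) tuples (assumed format).
    Returns a list of (timestamp_s, total_mb_per_s) sorted by time.
    """
    totals = defaultdict(float)
    for timestamp, _ost, rate in samples:
        totals[timestamp] += rate
    return sorted(totals.items())


def long_tail_start(series, fraction=0.5):
    """Return the timestamp where the final below-threshold stretch begins.

    The threshold is fraction * peak aggregate rate. Returns None if the run
    ends at or above the threshold.
    """
    if not series:
        return None
    threshold = fraction * max(rate for _, rate in series)
    tail_start = None
    for timestamp, rate in series:
        if rate < threshold:
            if tail_start is None:
                tail_start = timestamp
        else:
            tail_start = None  # throughput recovered, so this was not the tail
    return tail_start


if __name__ == "__main__":
    samples = [
        (0, "OST0000", 100.0), (0, "OST0001", 100.0),
        (1, "OST0000", 100.0), (1, "OST0001", 100.0),
        (2, "OST0000", 0.0), (2, "OST0001", 60.0),
        (3, "OST0001", 40.0),
    ]
    series = aggregate_by_time(samples)
    print(series)
    print("long tail starts at t =", long_tail_start(series))
```

A long tail of this kind means a few targets are still busy after the others have finished, so the aggregate rate drops well before the run ends.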
Compare Lustre* and NFS
Small File Benchmark
Simulated EDA Benchmark
• Simulate workload by compiling a package
• untar; configure; make;
• Python wrapper parallelizes on cluster using MPI
• Calculate score based on (total workload/runtime)
32 Clients
• Linux, c3.xlarge
Compare with NFS
• Linux, i2.4xlarge
• 4x EBS RAID0
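The score formula "(total workload/runtime)" can be sketched as follows. The slides do not say how workload is counted or which units the score uses, so this example assumes workload is the number of completed builds and runtime is the wall-clock time from the first process start to the last process end; both function names are hypothetical.

```python
def wall_clock_runtime(intervals):
    """Wall-clock runtime of a parallel run.

    intervals: list of (start_s, end_s) pairs, one per process.
    """
    if not intervals:
        raise ValueError("at least one process interval is required")
    starts = [start for start, _ in intervals]
    ends = [end for _, end in intervals]
    return max(ends) - min(starts)


def edabench_score(builds_completed, runtime_seconds):
    """Score as total workload divided by runtime (builds per second, by assumption)."""
    if runtime_seconds <= 0:
        raise ValueError("runtime_seconds must be positive")
    return builds_completed / runtime_seconds


if __name__ == "__main__":
    # Four processes, each running one untar/configure/make cycle (made-up timings).
    intervals = [(0.0, 50.0), (0.0, 40.0), (1.0, 45.0), (0.0, 48.0)]
    runtime = wall_clock_runtime(intervals)
    print("runtime:", runtime, "s")
    print("score:", edabench_score(len(intervals), runtime))
```

With this definition the slowest process sets the runtime, so the score only rises with more processes if the file system keeps up with the added load.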
Lustre* Configuration
1 MGT
• m3.medium
1 - 4 MDTs
• m3.2xlarge
• 8x 40GB EBS
4 OSTs
• c3.xlarge
• 8x 40GB EBS
EDABench – Lustre* vs. NFS
[Figure: EDABench score (compile), axis 0 to 12000, vs. number of processes (1, 2, 4, 8, 16, 32, 64, 128) on 32 clients, for 1 MDT, 2 MDTs, 4 MDTs and NFS.]
Storage Instance Cost Comparison
• EBS Optimized for all storage instances
• Global Support for Lustre*
• Does not include EBS cost
Cluster Option               Total Cost / Hour
Lustre* – 1xMDT + 4xOSS      $2.00
Lustre* – 2xMDT + 4xOSS      $2.69
Lustre* – 4xMDT + 4xOSS      $4.07
NFS – i2.4xlarge             $3.51
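To put the hourly figures side by side, the short sketch below expresses each Lustre* option as a fraction of the NFS instance cost. It uses only the prices listed on the slide, which exclude EBS cost; the dictionary keys and the `relative_to` helper are names chosen for this example.

```python
COST_PER_HOUR = {
    "Lustre 1xMDT + 4xOSS": 2.00,
    "Lustre 2xMDT + 4xOSS": 2.69,
    "Lustre 4xMDT + 4xOSS": 4.07,
    "NFS i2.4xlarge": 3.51,
}


def relative_to(baseline, costs):
    """Return each option's hourly cost divided by the baseline option's cost."""
    base = costs[baseline]
    return {name: cost / base for name, cost in costs.items()}


if __name__ == "__main__":
    ratios = relative_to("NFS i2.4xlarge", COST_PER_HOUR)
    for name, ratio in ratios.items():
        print(f"{name:<22} ${COST_PER_HOUR[name]:.2f}/hour  {ratio:.2f}x NFS")
```

On these instance prices alone, the 1xMDT and 2xMDT clusters cost less per hour than the NFS instance, and the 4xMDT cluster costs more.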
Intel® Cloud Edition for Lustre* software
Status Today
• Available on AWS Marketplace
• Setup in less than 10 minutes
• Try for yourself
lustre.intel.com/cloudedition
lustre.intel.com/contactus
Thank You.
