Azure + DSE Powers O365
Per-User Store
© 2015. All Rights Reserved.
Overview
1. Introduction
2. What We Built
3. What to Pay Close Attention To
4. Deployment
5. Wrap Up
Introduction
Sean Usher
Office 365
Email: seusher@microsoft.com
Twitter: @seanushermsft
Mahesh Thiagarajan
Microsoft Azure
Email: mahthi@microsoft.com
Twitter: @_cloudguy
Ben Lackey
DataStax
Email: ben.lackey@datastax.com
Introduction – Office 365
Email
Collaboration
Document Authoring
Social Networking
Calendaring
File Storage
Business Intelligence
Etc…
Introduction – Azure
Azure is Microsoft’s cloud computing platform, a growing collection of
integrated services—analytics, computing, database, mobile, networking,
storage, and web—for moving faster, achieving more, and saving money.
What We Built - Overview
A way to understand our users and organizations at a deeper level!
Questions:
• Are users happy with the service they are receiving?
• Are users fully utilizing the services they are paying us for?
• Are users hitting issues that we can proactively help them with?
• How has a user’s experience been over their lifetime?
• Can we discover insights that we aren’t even aware of?
This requires ingesting and storing a lot of data. We need to be able to perform fast, scalable analytics on that data, or we will discover issues too late!
What We Built – Why Cassandra
The Good
• Low Latency ✓
• Linear Scale ✓
• Highly Available ✓
• Aggregations (Spark/Spark Streaming) ✓
• Machine Learning (Spark ML) ✓
• No Enforcement of Full Consistency ✓ ✓ ✓
The Not-So-Good
• No Hosted Option in Azure ✗
• Have to Install and Configure it Ourselves ✗
What We Built – DSE Clusters
Cluster 1:
• Cassandra: 12 Nodes
• Analytics: 12 Nodes
• VM Size: G4
• Heap Size: 30 GB
• GC: G1
• Ingestion: 20k – 50k events/sec
• Data on ephemeral SSD drives.
• RF = 3 in both DCs
Cluster 2:
• Cassandra: 30 Nodes
• Analytics: 15 Nodes (30 within 1 month)
• VM Size: G4
• Heap Size: 30 GB
• GC: G1
• Ingestion: 200k+ events/sec
• Data on ephemeral SSD drives.
• RF = 3 in both DCs
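As a rough sizing aid, the ingestion rates and replication factor listed for the two clusters can be turned into per-day and per-node figures. This is only a sketch: it assumes a sustained rate and evenly distributed token ownership, and the function names are illustrative, not part of any DSE tooling.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(events_per_sec: int) -> int:
    """Events ingested in one day at a sustained rate."""
    return events_per_sec * SECONDS_PER_DAY


def replica_share(replication_factor: int, nodes_in_dc: int) -> float:
    """Fraction of a DC's data held by each node, assuming even token ownership."""
    return replication_factor / nodes_in_dc


print(events_per_day(50_000))    # Cluster 1, upper end of 20k-50k events/sec
print(events_per_day(200_000))   # Cluster 2, lower bound of 200k+ events/sec
print(replica_share(3, 12))      # Cluster 1 Cassandra nodes, RF = 3
print(replica_share(3, 30))      # Cluster 2 Cassandra nodes, RF = 3
```

With RF = 3, each node in the 12-node group holds about a quarter of that DC's data, while each node in the 30-node group holds about a tenth.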
What We Built - Pipeline Evolution
[Diagram: the ingestion pipeline for each cluster. Components as labeled on the slide:]
Cluster 1: REST API, O365, Event Hub, Ingestion Worker (Azure worker role using DataStax C# driver), C*, Analytics
Cluster 2: REST API, O365, Kafka, C*/Spark Streaming, Analytics
VM sizes shown: G4 – Local SSD; Kafka: G4 – Data Disk; ZooKeeper: A7 – Data Disk; PaaS Small; G4 – Local SSD
What to Pay Close Attention To – Azure Disks
VHD Storage: No more than about 40 heavily used VHDs (roughly 40 single-disk VMs) per storage account
“… and for a Standard Tier VM, it is about 40 (20,000/500 IOPS per disk)…..”
https://azure.microsoft.com/en-us/documentation/articles/azure-subscription-service-limits/
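The "about 40" figure in the quote is just the account-level IOPS limit divided by the per-disk IOPS. A minimal sketch of that arithmetic follows; the helper names and the disks-per-VM parameter are illustrative assumptions.

```python
def max_disks_per_account(account_iops_limit: int, iops_per_disk: int) -> int:
    """How many fully utilized disks fit under one storage account's IOPS limit."""
    return account_iops_limit // iops_per_disk


def max_vms_per_account(account_iops_limit: int, iops_per_disk: int,
                        disks_per_vm: int) -> int:
    """VMs per account if every VM keeps disks_per_vm busy disks in that account."""
    return max_disks_per_account(account_iops_limit, iops_per_disk) // disks_per_vm


print(max_disks_per_account(20_000, 500))   # figures from the quoted limit
print(max_vms_per_account(20_000, 500, 2))  # hypothetical: two busy disks per VM
```

Attaching more than one busy data disk per VM lowers the number of VMs that should share an account accordingly.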
Disk Choice:
1. Local SSD (Ephemeral) – Fast, but allows data loss.
2. Data Disk (Standard Storage) – No data loss; network-attached, which can add latency. 20k IOPS limit per storage account.
3. Data Disk (Premium Storage) – No data loss; network-attached, which can add latency. IOPS limit applies per disk.
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-how-to-attach-disk/
https://azure.microsoft.com/en-us/documentation/articles/azure-subscription-service-limits/#storage-limits
[Diagram: a VM with its local SSD at /dev/sdb, its OS disk at /dev/sda backed by one storage account (OS Disk), and its data disks backed by a separate storage account (Data Disk).]
What to Pay Close Attention To – Azure VM Size
VM Size: We chose G4 nodes, but are investigating moving to D14 nodes. Having a larger number of smaller
nodes will allow for faster rebuild which can reduce recovery time.
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-size-specs/
What to Pay Close Attention To – Azure Networking
Networking: Virtual Network (VNet) vs Public IP
1. Public IPs – Default limit of 5 per subscription. Allows geo-redundant replication over the Internet.
2. VNet – Define your own subnets and IP ranges. Allows geo-redundant replication via Gateways/ExpressRoute. No bandwidth limit within the VNet.
   • Standard Gateway – Max 100 Mbps.
   • High-Performance Gateway – Max 200 Mbps.
   • ExpressRoute – Max 10 Gbps.
https://azure.microsoft.com/en-us/documentation/articles/virtual-networks-instance-level-public-ip/
https://azure.microsoft.com/en-us/documentation/articles/vpn-gateway-vnet-vnet-rm-ps/
https://msdn.microsoft.com/en-us/library/azure/mt586720.aspx
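To see what the gateway limits mean for cross-region replication or a remote rebuild, it helps to convert them into transfer times. The sketch below uses the three maximums listed on this slide; the 1 TB data volume is a hypothetical example, and the calculation ignores protocol overhead and contention, so real transfers will be slower.

```python
LINK_MBPS = {
    "Standard Gateway": 100,
    "High-Performance Gateway": 200,
    "ExpressRoute": 10_000,
}


def transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Hours to move data_gb (decimal gigabytes) over a link of link_mbps megabits/sec."""
    megabits = data_gb * 1000 * 8
    return megabits / link_mbps / 3600


for name, mbps in LINK_MBPS.items():
    # 1000 GB (1 TB) is an illustrative amount, not a figure from the talk.
    print(f"{name}: {transfer_hours(1000, mbps):.1f} hours per TB")
```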
What to Pay Close Attention To – Azure Networking
Test performance of every dependency and see if it meets the expectations of your application.
Network Performance: iperf (https://iperf.fr/) – Test bandwidth between two VMs within various DCs
Test setup (two VMs in the same VNet):
• Server VM (10.1.0.10): iperf -s
• Client VM (10.1.0.11): iperf -c 10.1.0.10
user@machine:~$ iperf -c 10.1.0.10
------------------------------------------------------------
Client connecting to 10.1.0.10, TCP port 5001
TCP window size: 2.50 MByte (default)
------------------------------------------------------------
[ 3] local 10.1.0.10 port 42892 connected with 10.1.0.10 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 45.7 GBytes 39.2 Gbits/sec
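The Transfer and Bandwidth columns in the iperf summary can be cross-checked against each other. This sketch assumes iperf reports GBytes in binary units (2^30 bytes) and Gbits in decimal units (10^9 bits); the function name is illustrative.

```python
def gbits_per_sec(transfer_gbytes: float, seconds: float) -> float:
    """Convert an iperf 'Transfer' figure in GBytes (2**30 bytes) to Gbits/sec (10**9 bits)."""
    bits = transfer_gbytes * 2**30 * 8
    return bits / seconds / 1e9


# Values from the summary line: 45.7 GBytes in the 0.0-10.0 sec interval.
print(f"{gbits_per_sec(45.7, 10.0):.2f} Gbits/sec")
```

The result lands close to the reported 39.2 Gbits/sec; the small difference is consistent with the transfer figure being rounded to one decimal place.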
What to Pay Close Attention To – Azure Storage
Test performance of every dependency and see if it meets the expectations of your application.
Disk: SysBench (https://wiki.gentoo.org/wiki/Sysbench) – Test write throughput and IOPS
user@machine:/mnt$ sysbench --test=fileio --file-total-size=1000G --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run
sysbench 0.4.12: multi-threaded system evaluation benchmark
<….. Excess Logging Removed….>
Operations performed: 402240 Read, 268160 Write, 858065 Other = 1528465 Total
Read 6.1377Gb Written 4.0918Gb Total transferred 10.229Gb (34.917Mb/sec)
2234.67 Requests/sec executed
Test execution summary:
total time: 300.0002s
total number of events: 670400
total time taken by event execution: 16.1526
per-request statistics:
min: 0.00ms
avg: 0.02ms
max: 2.20ms
approx. 95 percentile: 0.05ms
Threads fairness:
events (avg/stddev): 670400.0000/0.00
execution time (avg/stddev): 16.1526/0.00
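The headline numbers in the sysbench output can be re-derived from the operation counts, which is a useful sanity check when comparing runs across disk types. This sketch assumes the sysbench fileio default block size of 16 KiB (no --file-block-size was passed above) and that the "Gb" and "Mb" labels in the output are binary units; the function name is illustrative.

```python
BLOCK_SIZE_BYTES = 16 * 1024  # assumed: sysbench fileio default block size


def summarize(reads: int, writes: int, total_seconds: float,
              block_size: int = BLOCK_SIZE_BYTES) -> dict:
    """Derive IOPS and throughput from sysbench read/write operation counts."""
    ops = reads + writes                   # "Other" operations (fsyncs etc.) excluded
    mib = ops * block_size / 2**20
    return {
        "iops": ops / total_seconds,
        "total_gib": mib / 1024,
        "mib_per_sec": mib / total_seconds,
    }


# Counts and total time taken from the run above.
stats = summarize(reads=402240, writes=268160, total_seconds=300.0002)
for key, value in stats.items():
    print(f"{key}: {value:.3f}")
```

Under those assumptions the derived values agree with the reported 2234.67 requests/sec, 10.229Gb transferred and 34.917Mb/sec.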
What to Pay Close Attention To – Cassandra
Metrics!
Need to tune? Al Tobey can help - https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
What to Pay Close Attention To – Cassandra
SSTable Count
• Too many SSTables can lead to OOM errors and nodes becoming unavailable.
• Watch count and balance compaction throughput with system limits.
• SSTable count may spike during repairs if data is inconsistent.
Dropped Mutations
• Dropped mutations mean more repairs need to be done.
• Impact of dropped mutations can be controlled by tuning write consistency.
• Check iostat to see if disk queue is building up or write latency is high:
   iostat -x /dev/sdb 1 5
• Do drops only happen when Spark jobs batch write? Tune Spark write throughput (https://github.com/datastax/spark-cassandra-connector/blob/v1.2.5/doc/FAQ.md)
See memtables & flushing in Al’s Tuning Guide.
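One way to keep an eye on dropped mutations is to scrape the dropped-message table that `nodetool tpstats` prints and alert on non-zero counts. The sketch below is an assumption-heavy illustration: the sample text is made up, it only mimics the two-column "Message type / Dropped" layout, and the parser name is hypothetical. Check the layout against your own Cassandra version before relying on it.

```python
def dropped_counts(tpstats_output: str) -> dict[str, int]:
    """Parse the 'Message type / Dropped' table from nodetool tpstats-style text."""
    counts: dict[str, int] = {}
    in_dropped_table = False
    for line in tpstats_output.splitlines():
        parts = line.split()
        if parts[:2] == ["Message", "type"]:
            in_dropped_table = True      # rows after this header are message types
            continue
        if in_dropped_table and len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


# Made-up sample in the assumed layout; not output from the clusters in this talk.
SAMPLE = """\
Message type           Dropped
READ                         0
MUTATION                   128
"""

dropped = dropped_counts(SAMPLE)
if dropped.get("MUTATION", 0) > 0:
    print(f"dropped mutations: {dropped['MUTATION']}")
```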
What to Pay Close Attention To – Cassandra
Pending Compactions
• If you aren’t keeping up with compactions, performance will suffer.
• Too many SSTables impact read speed, but also can lead to hitting OS limits. See:
• /etc/sysctl.conf - vm.max_map_count
• /etc/security/limits.d/cassandra.conf – nofile
• /etc/init.d/dse – Certain DSE versions overwrite nofile with: FD_LIMIT=100000
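The OS limits named above can be checked programmatically before a node gets into trouble. The following is a minimal sketch for Linux, with two loudly stated assumptions: the nofile target of 100000 mirrors the FD_LIMIT value on this slide, and the vm.max_map_count target of 131072 is a placeholder to replace with whatever your Cassandra/DSE version recommends. Note that it inspects the limits of the Python process itself; for a running Cassandra process, /proc/<pid>/limits is the place to look.

```python
import resource
from pathlib import Path

WANT_NOFILE = 100_000          # assumed target, mirrors FD_LIMIT on the slide
WANT_MAX_MAP_COUNT = 131_072   # hypothetical placeholder target


def current_limits(proc_path: str = "/proc/sys/vm/max_map_count"):
    """Return (vm.max_map_count or None, soft nofile limit) for this process."""
    path = Path(proc_path)
    max_map = int(path.read_text()) if path.exists() else None
    soft_nofile, _hard_nofile = resource.getrlimit(resource.RLIMIT_NOFILE)
    return max_map, soft_nofile


def limit_warnings(max_map, nofile,
                   want_map=WANT_MAX_MAP_COUNT, want_nofile=WANT_NOFILE):
    """List human-readable warnings for limits that fall below the targets."""
    warnings = []
    if max_map is not None and max_map < want_map:
        warnings.append(f"vm.max_map_count={max_map} is below {want_map}")
    if nofile != resource.RLIM_INFINITY and nofile < want_nofile:
        warnings.append(f"nofile={nofile} is below {want_nofile}")
    return warnings


if __name__ == "__main__":
    for warning in limit_warnings(*current_limits()):
        print("WARNING:", warning)
```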
Heap Used
• Heap usage changes over time. What works in week 1 may not work in week 10.
• We used a 20 GB heap until nodes started hitting OOM errors once they needed 25 GB.
• Use G1 if at all possible to see GC times decrease, and use a large (25 – 30 GB) heap.
• Let G1 tune your young generation heap size rather than setting it manually.
What to Pay Close Attention To – Spark
We are still learning!
[Screenshots: Scheduler Output (labeled "NOT CRON!"), Spark UI, Spark Job Logs]
If you don’t enable Spark UI for security reasons, ship your Spark logs off box for analysis.
You may also find that jobs fail to read data because partitions are missing or nodes are timing out. This can indicate you are overwhelming Cassandra.
Deployment
Use the Azure/DataStax Template
Azure will be investing in building more features into the Azure template, and you will get those more easily if you use the existing template.
https://www.youtube.com/watch?v=vacp267zLBA&noredirect=1
https://github.com/DSPN/azure-resource-manager-dse
We Didn’t Use the Template because it wasn’t ready yet. We had to write our own logic to deploy nodes and need to
transition to the template so we can get all of these new features. We are scheduling time to do this because it will
save us a lot of work!
Consider Security and Compliance: This will influence how you deploy (VNet vs Public IP), what Cassandra configuration
you use (internode encryption, require_client_auth: true), and what OS configuration you use (CIS standards).
C* Hardening: http://thelastpickle.com/blog/2015/09/30/hardening-cassandra-step-by-step-part-1-server-to-server.html
CIS Standards: https://benchmarks.cisecurity.org/downloads/show-single/?file=ubuntu1404.100
Power of Repeatability
Azure Templates can:
• Ensure Idempotency
• Simplify Orchestration
• Simplify Roll-back
• Provide Cross-Resource Configuration and Update Support
Azure Templates are:
• Source file, checked-in
• Specifies resources and dependencies (VMs, WebSites, DBs) and connections (config, LB sets)
• Parameterized input/output
Instantiation of repeatable config: Configuration → Resource Group
[Diagram: a template describing SQL-A, a Website, and Virtual Machines (2x). The Website and the VMs each depend on SQL and receive the SQL config.]
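The dependency idea in this slide (the website and the VMs both depend on SQL) amounts to deploying resources in topological order. The sketch below illustrates that ordering with Python's standard library; it is not how Azure Resource Manager is implemented, and the resource names are hypothetical stand-ins for the boxes in the diagram.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on (hypothetical names).
DEPENDS_ON = {
    "sql": set(),
    "website": {"sql"},
    "vm-1": {"sql"},
    "vm-2": {"sql"},
}


def deployment_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return an order in which every resource comes after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = deployment_order(DEPENDS_ON)
print(order)
```

Because the order is derived from the declared dependencies rather than hand-written steps, re-running the same declaration produces an equivalent plan, which is the repeatability the slide is pointing at.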
Azure VM Extensions
Extending the power of your VM
• Enable easier management
• Support partner ecosystem
• Full control still with you!
[Diagram labels: Curated Extensions, Agent]
Thank you
Sean Usher
Office 365
Email: seusher@microsoft.com
Twitter: @seanushermsft
Mahesh Thiagarajan
Microsoft Azure
Email: mahthi@microsoft.com
Twitter: @_cloudguy
Ben Lackey
DataStax
Email: ben.lackey@datastax.com


Editor's Notes

  1. Premium – p10, p20, p30