PERFORMANCE BENCHMARKING OF CLOUDS
EVALUATING OPENSTACK
Pradeep Kumar Surisetty
#WHOAMI
Pradeep Kumar Surisetty
Associate Engineering Manager
Performance and Scale Engineering, Red Hat
psuriset@redhat.com
Believe in Open source
Collaborate or Die
RED HAT PERFORMANCE & SCALE TEAM
TOPICS
CLOUD CHARACTERISTICS
PERFORMANCE MEASURING TOOLS
SPEC CLOUD IAAS 2016 BENCHMARK
PERFORMANCE MONITORING TOOLS
TUNING TIPS
CLOUD CHARACTERISTICS
SPEC RESEARCH GROUP - CLOUD WORKING GROUP
https://research.spec.org/working-groups/rg-cloud-working-group.html
READY FOR RAIN? A VIEW FROM SPEC RESEARCH ON
THE FUTURE OF CLOUD METRICS
https://research.spec.org/fileadmin/user_upload/documents/rg_cloud/endorsed_publications/SPEC-RG-2016-01_CloudMetrics.pdf
ELASTICITY
- THE DEGREE TO WHICH A SYSTEM IS ABLE TO ADAPT TO WORKLOAD CHANGES BY
PROVISIONING AND DE-PROVISIONING RESOURCES IN AN AUTONOMIC MANNER, SUCH
THAT AT EACH POINT IN TIME THE AVAILABLE RESOURCES MATCH THE CURRENT
DEMAND AS CLOSELY AS POSSIBLE
Source: READY FOR RAIN? A VIEW FROM SPEC RESEARCH ON THE FUTURE OF CLOUD METRICS, SPEC RG Cloud Working Group
Source: http://content.time.com/time/specials/packages/article/0,28804,2049243_2048657_2049165,00.html
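One way to make "match the current demand as closely as possible" concrete is to compare a demand series with a supply series sampled at the same instants. The sketch below is illustrative only: the function name, the sample trace, and the choice to average the shortfall and the surplus per sample are assumptions, not the SPEC RG metric definitions.

```python
def provisioning_accuracy(demand, supply):
    """Return (avg_under, avg_over) in resource units per sample.

    demand[i] and supply[i] are the resource units needed and provisioned
    at sample i. The averaging scheme is an illustrative assumption.
    """
    if len(demand) != len(supply) or not demand:
        raise ValueError("demand and supply must be non-empty and equal length")
    # Shortfall: demand that the provisioned resources did not cover.
    under = sum(max(d - s, 0) for d, s in zip(demand, supply))
    # Surplus: provisioned resources that were not needed.
    over = sum(max(s - d, 0) for d, s in zip(demand, supply))
    n = len(demand)
    return under / n, over / n


# Hypothetical trace: demand ramps up and down, supply lags one step behind.
demand = [2, 4, 6, 4, 2]
supply = [2, 2, 4, 6, 4]
avg_under, avg_over = provisioning_accuracy(demand, supply)
print(avg_under, avg_over)
```

A perfectly elastic system under this sketch would report zero for both values; a system that reacts late shows shortfall on the way up and surplus on the way down.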
ELASTICITY
Source: http://www.today.com/news/remember-stretch-armstrong-how-buy-your-favorite-retro-toys-your-1D80377927
HOW FAR WILL HE STRETCH?
AS YOU STRETCH HIM DOES IT GET HARDER TO STRETCH HIM MORE?
WHEN I LET GO DOES HE RETURN TO HIS ORIGINAL SHAPE?
WILL HE BREAK WHEN STRETCHED?
HOW LONG DOES HE TAKE TO RETURN TO HIS NORMAL SHAPE?
SCALABILITY
- THE ABILITY OF THE SYSTEM TO SUSTAIN INCREASING WORKLOADS BY MAKING USE
OF ADDITIONAL RESOURCES, AND THEREFORE, IN CONTRAST TO ELASTICITY, IT IS NOT
DIRECTLY RELATED TO HOW WELL THE ACTUAL RESOURCE DEMANDS ARE MATCHED BY
THE PROVISIONED RESOURCES AT ANY POINT IN TIME.
Source: READY FOR RAIN? A VIEW FROM SPEC RESEARCH ON THE FUTURE OF CLOUD METRICS, SPEC RG Cloud Working Group
PERFORMANCE MEASURING TOOLS
RALLY
RALLY IS A FAMILIAR OPENSTACK PROJECT
HTTPS://GITHUB.COM/OPENSTACK/RALLY
AN AUTOMATED BENCHMARK TOOL FOR OPENSTACK
BENCHMARKING
MULTIPLE USE CASES
DEVELOPMENT AND QA
DEVOPS
CI/CD
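A Rally run is driven by a task file that names a scenario and describes how to execute it. The sketch below builds such a task as a Python dict and prints it as JSON. The scenario name `NovaServers.boot_and_delete_server` is a standard Rally scenario; the flavor name, image name, and the helper function are placeholders chosen for illustration.

```python
import json


def boot_and_delete_task(flavor, image, times, concurrency):
    """Build a Rally task dict for the NovaServers.boot_and_delete_server scenario."""
    return {
        "NovaServers.boot_and_delete_server": [
            {
                # Scenario arguments: which flavor and image to boot.
                "args": {"flavor": {"name": flavor}, "image": {"name": image}},
                # Runner: how many iterations, and how many run in parallel.
                "runner": {
                    "type": "constant",
                    "times": times,
                    "concurrency": concurrency,
                },
                # Context: temporary tenants and users created for the run.
                "context": {"users": {"tenants": 1, "users_per_tenant": 1}},
            }
        ]
    }


# Placeholder flavor and image names; substitute ones that exist in your cloud.
task = boot_and_delete_task("m1.tiny", "cirros", times=10, concurrency=2)
print(json.dumps(task, indent=2))
```

Saved to a file, the printed JSON can be passed to `rally task start`.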
RALLY
Source: https://github.com/OpenStack/rally/blob/master/doc/source/images/Rally-Actions.png
BROWBEAT
BROWBEAT
SCALE AND PERFORMANCE AUTOMATION
ANSIBLE PLAYBOOKS FOR AUTOMATION
PROVIDES AUTOMATION WRAPPER AROUND EXISTING TOOLING
RALLY - CONTROL PLANE TESTS
SHAKER - DATA PLANE NETWORK TESTS
PERFKIT - DATA PLANE TESTS
CBTOOL - DATA PLANE TESTS
LEVERAGES EXISTING UPSTREAM TEST FRAMEWORKS RATHER THAN
REPLACING THEM
PERFORMANCE MONITORING
COLLECTD/GRAPHITE/GRAFANA
RESULTS CAPTURE AND STORAGE
ELK STACK
ALLOWS RESULTS COMPARISON IN ELASTICSEARCH
BASED ON CAPTURED METADATA SUCH AS NUMBER OF API WORKERS, NEUTRON
CONFIGURATION, ETC.
BROWBEAT
WEB PRESENCE
LOTS OF GREAT INFORMATION ABOUT BROWBEAT
INSTALLING GRAFANA AND GRAPHITE-WEB + CARBON-CACHE AS DOCKER
IMAGES
BROWBEAT IS NOW AN OPENSTACK PROJECT
BROWBEAT HAS NOW MOVED TO THE OPENSTACK.ORG NAMESPACE
NOW ABLE TO USE THE UPSTREAM OPENSTACK INFRASTRUCTURE AND CI
SEEING INTEREST PICK UP
BROWBEATPROJECT.ORG
HTTPS://GITHUB.COM/OPENSTACK/BROWBEAT
BROWBEAT
Installs and configures:
all of our workloads,
ELK (or ES, Fluentd, and Kibana),
the undercloud/overcloud with collectd, Graphite and Grafana,
OpenStack-specific Grafana dashboards that we push to Grafana based on your deployment.
BROWBEAT
REPEATABLE AUTOMATED TESTING
BROWBEAT
PERFKIT BENCHMARKER
Source: Introduction to Perfkit Benchmark and How to Extend it, https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/wiki/Tech-Talks
CLOUDBENCH
FRAMEWORK THAT AUTOMATES CLOUD-SCALE EVALUATION AND
BENCHMARKING
BENCHMARK HARNESS
REQUESTS THE CLOUD MANAGER TO CREATE AN INSTANCE(S)
SUBMIT CONFIGURATION PLAN AND STEPS TO THE CLOUD
MANAGER ON HOW THE TEST WILL BE PERFORMED
AT THE END OF THE TEST, COLLECT AND LOG APPLICABLE
PERFORMANCE DATA AND LOGS
DESTROY INSTANCES NO LONGER NEEDED FOR THE TEST.
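The four harness steps above can be sketched as a small driver. This is not CBTOOL code: `FakeCloudManager` and all of its methods are hypothetical stand-ins for a real cloud manager API, used only to show the create, run, collect, destroy order and that cleanup happens even when a step fails.

```python
class FakeCloudManager:
    """Stand-in for a cloud manager API; every method here is hypothetical."""

    def __init__(self):
        self.live = set()
        self.next_id = 0

    def create_instance(self):
        self.next_id += 1
        name = f"vm-{self.next_id}"
        self.live.add(name)
        return name

    def run_plan(self, instances, plan):
        # A real harness would execute the workload steps on the instances;
        # here we only record how many steps each instance was given.
        return {name: {"steps_run": len(plan)} for name in instances}

    def destroy_instance(self, name):
        self.live.discard(name)


def run_benchmark(cloud, count, plan):
    """Create instances, run the plan, return collected results, always clean up."""
    instances = []
    try:
        for _ in range(count):
            instances.append(cloud.create_instance())
        return cloud.run_plan(instances, plan)
    finally:
        # Destroy instances no longer needed, even if an earlier step raised.
        for name in instances:
            cloud.destroy_instance(name)


cloud = FakeCloudManager()
results = run_benchmark(cloud, count=2, plan=["load", "run"])
print(results)
print(len(cloud.live))
```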
BENCHMARK HARNESS
HARNESS AND WORKLOAD CONTROL
Benchmark Harness
Comprises CloudBench (CBTOOL), baseline/elasticity drivers, and
report generators.
For white-box clouds the benchmark harness is outside the
SUT. For black-box clouds, it can be in the same location or
campus.
Cloud SUT
Group of boxes represents an
application instance
SUPPORTED WORKLOADS
SPEC CLOUD IAAS 2016 BENCHMARK
SPEC CLOUD IAAS 2016 BENCHMARK
MEASURES PERFORMANCE OF INFRASTRUCTURE-AS-A-SERVICE
(IAAS) CLOUDS.
MEASURES BOTH CONTROL AND DATA PLANE
CONTROL: MANAGEMENT OPERATIONS, E.G., INSTANCE
PROVISIONING TIME
DATA: VIRTUALIZATION, NETWORK PERFORMANCE, RUNTIME
PERFORMANCE
USES WORKLOADS THAT
RESEMBLE “REAL” CUSTOMER APPLICATIONS
BENCHMARKS THE CLOUD, NOT THE APPLICATION
PRODUCES METRICS (“ELASTICITY”, “SCALABILITY”, “PROVISIONING
TIME”) WHICH ALLOW COMPARISON.
SPEC CLOUD IAAS BENCHMARKING: DELL LEADS THE WAY
HTTP://EN.COMMUNITY.DELL.COM/TECHCENTER/CLOUD/B/DELL-CLOUD-BLOG/ARCHIVE/2016/06/24/SPEC-CLOUD-IAAS-BENCHMARKING-DELL-LEADS-THE-WAY
SPEC CLOUD WORKLOADS
YCSB
FRAMEWORK AND COMMON SET OF
WORKLOADS FOR EVALUATING THE
PERFORMANCE OF DIFFERENT KEY-VALUE
AND CLOUD SERVING STORES.
KMEANS
- HADOOP-BASED CPU INTENSIVE WORKLOAD
- CHOSE INTEL HIBENCH IMPLEMENTATION
WHAT IS MEASURED
MEASURES THE NUMBER OF AIS THAT CAN BE LOADED
ONTO A CLUSTER BEFORE SLA VIOLATIONS OCCUR
MEASURES THE SCALABILITY AND ELASTICITY OF THE
CLOUD UNDER TEST (CUT)
NOT A MEASURE OF INSTANCE DENSITY
SPEC CLOUD WORKLOADS CAN INDIVIDUALLY BE USED TO
STRESS THE CUT:
KMEANS – CPU/MEMORY
YCSB - IO
BENCHMARK STOPPING CONDITIONS
20% AIS FAIL TO PROVISION
10% AIS HAVE ERRORS IN ANY RUN
MAX NUMBER OF AIS SET BY CLOUD PROVIDER
50% AIS HAVE QOS VIOLATIONS; AN AI MEETS QOS WHEN:
- KMEANS COMPLETION TIME ≤ 3.33 X BASELINE PHASE
- YCSB THROUGHPUT ≥ BASELINE THROUGHPUT / 3
- YCSB READ RESPONSE TIME ≤ 20 X BASELINE READ RESPONSE TIME
- YCSB INSERT RESPONSE TIME ≤ 20 X BASELINE INSERT RESPONSE TIME
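The stopping conditions and QoS limits on this slide can be written down as a few checks. This is a sketch, not the benchmark's implementation: the function names, the dictionary keys, the order in which conditions are tested, and the reading of each percentage as "at or above" are all assumptions.

```python
def kmeans_qos_ok(completion_time, baseline_time):
    """KMeans meets QoS if it finishes within 3.33x the baseline time."""
    return completion_time <= 3.33 * baseline_time


def ycsb_qos_ok(throughput, read_rt, insert_rt, base):
    """YCSB meets QoS if throughput and both response times stay in bounds."""
    return (
        throughput >= base["throughput"] / 3
        and read_rt <= 20 * base["read_rt"]
        and insert_rt <= 20 * base["insert_rt"]
    )


def should_stop(total_ais, failed_provision, with_errors, qos_violations, max_ais):
    """Return the first stopping condition that is met, or None to continue."""
    if total_ais <= 0:
        return None
    if total_ais >= max_ais:
        return "max number of AIs reached"
    if failed_provision >= 0.20 * total_ais:
        return "20% of AIs failed to provision"
    if with_errors >= 0.10 * total_ais:
        return "10% of AIs have errors"
    if qos_violations >= 0.50 * total_ais:
        return "50% of AIs have QoS violations"
    return None


# Hypothetical run: 10 AIs so far, 1 failed to provision, 5 violate QoS.
print(should_stop(10, failed_provision=1, with_errors=0,
                  qos_violations=5, max_ais=100))
```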
HIGH LEVEL REPORT SUMMARY
RESULTS COMPARED
PUBLISHED RESULTS WEBSITE
https://www.spec.org/cloud_iaas2016/results/cloudiaas2016.html
PERFORMANCE MONITORING TOOLS
CEILOMETER
ANOTHER FAMILIAR OPENSTACK PROJECT
GOAL IS TO EFFICIENTLY COLLECT, NORMALIZE AND TRANSFORM
DATA PRODUCED BY OPENSTACK SERVICES
INTERACTS DIRECTLY WITH THE OPENSTACK SERVICES THROUGH
DEFINED INTERFACES
MANY TOOLS UTILIZE CEILOMETER TO GATHER OPENSTACK
PERFORMANCE DATA
HTTPS://GITHUB.COM/OPENSTACK/CEILOMETER
CEILOMETER
Source: http://docs.OpenStack.org/developer/ceilometer/architecture.html
COLLECTD/GRAPHITE/GRAFANA
COLLECTD
DAEMON TO COLLECT SYSTEM PERFORMANCE STATISTICS
CPU, MEMORY, DISK, NETWORK, PER PROCESS STATS (REGEX),
POSTGRESQL AND MORE
GRAPHITE/CARBON
CARBON RECEIVES METRICS, AND FLUSHES THEM TO WHISPER
DATABASE FILES
GRAPHITE IS THE WEBAPP FRONTEND TO CARBON
GRAFANA
VISUALIZE METRICS FROM MULTIPLE BACKENDS.
DASHBOARDS SAVED IN JSON AND CUSTOMIZED BY ANSIBLE DURING
DEPLOYMENT
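Carbon accepts metrics over a simple plaintext protocol: one line per data point containing the metric path, the value, and a Unix timestamp. The sketch below formats such a line and shows how it could be sent over TCP. The metric path, helper names, and host are illustrative assumptions; port 2003 is the usual default for the plaintext listener but may differ in your deployment.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one data point as '<path> <value> <timestamp>' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(host, port, path, value):
    """Send a single data point to a Carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(carbon_line(path, value).encode("ascii"))


# Hypothetical metric path; only the formatting is exercised here.
print(carbon_line("overcloud.controller0.cpu.idle", 93.5, 1500000000), end="")
```

Calling `send_metric("graphite.example.com", 2003, ...)` would require a reachable Carbon listener, so the example only prints the formatted line.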
COLLECTD/GRAPHITE/GRAFANA
Example Grafana dashboards
GANGLIA
SCALABLE DISTRIBUTED MONITORING SYSTEM FOR
HIGH-PERFORMANCE COMPUTING
WIDELY USED IN UNIVERSITIES, PRIVATE AND
GOVERNMENT LABORATORIES.
GREAT TOOL FOR MONITORING HARDWARE
COMPONENT UTILIZATION AND GATHERING STATS.
GANGLIA
TUNING TIPS
HARDWARE/OS TUNING
Latest BIOS and Firmware revs
Appropriate BIOS settings
RAID/JBOD
Disk controller
NIC driver: interrupt coalescing and affinitization
NIC bonding
NIC jumbo frames
OS configuration settings
INSTANCE CONFIGURATION
Performance is impacted by
Instance type (flavor)
Number of instances
OVER-SUBSCRIPTION
Beware of over-subscription!
LOCAL STORAGE
Use of local storage instead of shared storage like Ceph could improve performance
by over 50%, depending on Ceph replication.
Source: OpenStack: Install and configure a storage node - OpenStack Kilo.
http://docs.OpenStack.org/kilo/install-guide/install/yum/content/cinder-install-storage-node.html (2015)
NUMA NODES
Pinning instance CPUs to physical CPUs (NUMA nodes) on local storage further
improves performance.
Source: Red Hat: CPU pinning and NUMA topology awareness in OpenStack Compute.
http://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-OpenStack-compute/ (2015)
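In Nova, CPU pinning is typically requested through flavor extra specs. The sketch below builds the `openstack flavor set` command that would apply them. `hw:cpu_policy=dedicated` and `hw:numa_nodes` are standard Nova extra specs; the flavor name and helper functions are hypothetical, and the compute hosts also need matching configuration that is not shown here.

```python
def pinning_properties(numa_nodes=1):
    """Flavor extra specs asking for dedicated pCPUs on a given number of NUMA nodes."""
    return {"hw:cpu_policy": "dedicated", "hw:numa_nodes": str(numa_nodes)}


def flavor_set_command(flavor, properties):
    """Build the argument list for 'openstack flavor set' with the given properties."""
    args = ["openstack", "flavor", "set"]
    for key, value in sorted(properties.items()):
        args += ["--property", f"{key}={value}"]
    args.append(flavor)
    return args


# "m1.pinned" is a placeholder flavor name.
print(" ".join(flavor_set_command("m1.pinned", pinning_properties())))
```

Instances booted from such a flavor are scheduled only onto hosts that can satisfy the pinning request.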
DISK PINNING
Source: OpenStack: OpenStack Cinder multi-backend.
https://wiki.OpenStack.org/wiki/Cinder-multi-backend (2015)
Disk pinning shows a 15% performance improvement
UNEVEN CONTROLLER USAGE
One controller had more cores available than the other two and ended up with all the jobs.
This scenario was identified easily because the correct dashboarding was in place.
HEAT MEMORY USAGE
Heat used about 1 GB of memory for every 10 compute nodes deployed. Size your
controller memory appropriately.
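As a rough sizing aid, the rule of thumb above can be turned into a one-line estimate. The linear scaling and the function name are assumptions drawn only from the observation on this slide; treat the result as a starting point, not a guarantee.

```python
def heat_memory_gb(compute_nodes, gb_per_10_nodes=1.0):
    """Estimate Heat memory use, assuming it grows linearly with node count."""
    return compute_nodes * gb_per_10_nodes / 10


# Example: a hypothetical 200-node deployment.
print(heat_memory_gb(200))
```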
DEPLOYMENT TIMINGS
Saw many instance reschedules with the default scheduler. Deployment time dropped dramatically after
setting up node assignments via Ironic.
CONCLUSION
DEFINE WHAT YOU ARE TRYING TO MEASURE
DEFINE A CLOUD
DEFINE WHAT METRICS ARE IMPORTANT
USE THE CORRECT TOOLS
RALLY
PERFKIT BENCHMARKER
CLOUDBENCH
SPEC CLOUD IAAS 2016 BENCHMARK
CEILOMETER
COLLECTD/GRAPHITE/GRAFANA
GANGLIA
GATHER AND ANALYZE DATA
APPLY TUNING TIPS BASED ON THE DATA
THANKS
Thanks to Andy Bond, Douglas Shakshober, and Joe Talerico for some of the content
ADDITIONAL INFORMATION
GUIDELINES AND CONSIDERATIONS FOR PERFORMANCE AND SCALING YOUR
RED HAT ENTERPRISE LINUX OPENSTACK PLATFORM 6 CLOUD
HTTPS://ACCESS.REDHAT.COM/ARTICLES/1507893
GUIDELINES AND CONSIDERATIONS FOR PERFORMANCE AND SCALING YOUR
RED HAT ENTERPRISE LINUX OPENSTACK PLATFORM 7 CLOUD
HTTPS://ACCESS.REDHAT.COM/ARTICLES/2165131
RED HAT OPENSTACK BLOG
HTTP://REDHATSTACKBLOG.REDHAT.COM/
RED HAT DEVELOPER BLOG
HTTP://DEVELOPERBLOG.REDHAT.COM/
RED HAT ENTERPRISE LINUX BLOG
HTTP://RHELBLOG.REDHAT.COM/
