Goal Driven Performance Optimization
Highload++, October 25-26, 2010
Moscow, Russia
Peter Zaitsev
Percona Inc
What is this all about?
• The first step to successful performance optimization is
  setting the right goals
• In most cases goals are not set (or are unclear) and a lot
  of resources are wasted on unimportant things
• This presentation is about setting the right goals and
  using them to optimize the performance of an existing
  system




When is it Applicable?
• Optimizing performance for existing applications
• Can be used with load testing when scaling an application
  and testing new features
• A way to implement monitoring and spot problems
  before users start to complain




Understanding Performance
• Latency/Response Time
      – Always Important
      – Tolerance can be very different
             • 50ms for an Ajax request
             • 30 minutes for a report
• Throughput
      – Often important for multi-user systems
      – System can do 1000 transactions/second




Throughput/Latency Relation
• Response time tends to increase with throughput
      – When the system is overloaded, response time goes to infinity
• Call Center analogy
      – Fewer people servicing calls = better utilization
             • Same as throughput per person
      – More people servicing calls = better response time
             • Calls spend less time waiting in the queue
• Classical Performance Optimization Goal
      – Maximizing throughput/utilization while keeping
        response time within set guidelines
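
The relation between throughput and response time can be illustrated with a simple queueing formula. The sketch below assumes an M/M/1 queue (a single server with random arrivals), which is a textbook model chosen here for illustration and not something the slides specify; the function name and the numbers are hypothetical.

```python
def mm1_response_time(service_time, arrival_rate):
    """Mean response time of an M/M/1 queue: S / (1 - utilization)."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        return float("inf")  # overloaded: the queue grows without bound
    return service_time / (1.0 - utilization)


# 10 ms of service time per request, so capacity is 100 requests/second
for rate in (10, 50, 90, 99):
    millis = mm1_response_time(0.010, rate) * 1000
    print(rate, "req/s ->", round(millis, 1), "ms")
```

Under this model response time stays close to the service time at low utilization and climbs steeply as throughput approaches capacity, which is why the goal is stated as throughput subject to a response time limit.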


Response Time Metrics
• Average/Median Response Time
      – Not a good metric for judging whether performance is adequate
      – Like the average patient temperature in a hospital
      – Can be helpful for historical trending
• Maximum Response Time
      – Good in theory: we want no requests taking longer than X
      – Hard to work with in practice: some requests will always take too long
• Define Percentile Response Time
      – 95% of requests serviced within 500ms
      – 99% of requests serviced within 1000ms
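
A minimal sketch of how percentile response time differs from the average. It uses the nearest-rank definition of a percentile, which is one of several common definitions and is an assumption here; the sample data is made up.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical sample: 95 fast requests and 5 very slow ones (seconds)
times = [0.1] * 95 + [5.0] * 5
average = sum(times) / len(times)

print("average:", round(average, 3))
print("95%:", percentile(times, 95))
print("99%:", percentile(times, 99))
```

In this sample the average and the 95% figure both look acceptable, while the 99% figure exposes the slow requests, so a goal usually needs more than one percentile.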

Alternative Measurements
• 95th percentile response time is hard/expensive to
  compute in SQL
      – Can use other metrics
• APDEX
      – http://en.wikipedia.org/wiki/Apdex
• Portion of requests where response time is within the target response time
      – SUM(response_time<0.5)/count(*)
      – A result of 0.95 is the same as a 95% response time of 0.5 sec
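
Both alternatives can be sketched in a few lines. The example below uses an in-memory SQLite database as a stand-in for the real log table (the table name, column name and sample data are hypothetical), and computes Apdex with the standard definition: satisfied requests plus half of the tolerating ones (up to 4x the threshold), divided by the total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (response_time REAL)")
samples = [0.1] * 90 + [0.7] * 6 + [3.0] * 4  # 100 hypothetical requests
conn.executemany("INSERT INTO requests VALUES (?)", [(t,) for t in samples])

# AVG of a 0/1 comparison is the fraction of rows meeting the goal.
# SQLite would do integer division for SUM(...)/COUNT(*), hence AVG here;
# in MySQL the SUM(...)/COUNT(*) form from the slide gives the same ratio.
(fraction_ok,) = conn.execute(
    "SELECT AVG(response_time < 0.5) FROM requests").fetchone()


def apdex(times, threshold):
    """Apdex score: (satisfied + tolerating / 2) / total."""
    satisfied = sum(1 for t in times if t <= threshold)
    tolerating = sum(1 for t in times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2.0) / len(times)


print("fraction within 0.5 sec:", fraction_ok)
print("apdex(T=0.5):", apdex(samples, 0.5))
```

Either number can be computed with a single pass over the data, which is what makes them cheaper than an exact percentile.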




Even Response Time
• Measured over a whole day, a 95% response time goal still allows
  your system to be unresponsive for about an hour every day
  (5% of 24 hours is 72 minutes)
      – E.g. extremely bad performance while taking a backup
• You want to ensure there are no stalls or performance
  dips.
• If a page loads slowly, the user presses reload and it
  loads quickly, that is OK: there are always network
  glitches.
• Define your performance goals over short intervals.
      – Goals should be met in ALL 5-minute intervals.
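
A minimal sketch of checking a goal per 5-minute interval instead of over the whole period. The function name, the `(timestamp, response_time)` input format and the sample traffic are assumptions made for illustration.

```python
from collections import defaultdict


def failing_windows(requests, limit=0.5, goal=0.95, window=300):
    """Return start times of windows where fewer than `goal` of requests met `limit`.

    `requests` is an iterable of (unix_timestamp, response_time_seconds).
    """
    buckets = defaultdict(list)
    for ts, rt in requests:
        buckets[int(ts // window) * window].append(rt)
    failing = []
    for start in sorted(buckets):
        times = buckets[start]
        ok = sum(1 for rt in times if rt <= limit)
        if ok / len(times) < goal:
            failing.append(start)
    return failing


# Hypothetical hour of traffic: one request per second, with a 2 minute stall
log = [(ts, 3.0 if 1200 <= ts < 1320 else 0.1) for ts in range(3600)]

print("5 minute windows missing the goal:", failing_windows(log))
print("1 hour windows missing the goal:", failing_windows(log, window=3600))
```

With this sample the stall is caught by the 5-minute check but disappears in the hourly figure, because 2 slow minutes are under 5% of an hour.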

Even Response Time math
• If you can only work with long intervals, you can
  define stricter performance goals
      – With a 99.9% goal measured per day, 2 min of slow responses will violate it
             • 86400/1000 ~= 86 (sec), assuming uniform traffic
• The longer the acceptable response time, the larger the intervals
  you can have
      – If a 1 min response time is acceptable in 99% of cases, a 1 hour
        check interval should be enough
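
The arithmetic above can be written out as a small helper. This is only a sketch under the same uniform-traffic assumption the slide states; the function name is hypothetical.

```python
def slow_seconds_allowed(interval_seconds, percentile):
    """Seconds of slow responses that fit in the allowance, assuming uniform traffic."""
    return interval_seconds * (100.0 - percentile) / 100.0


print("99.9% over a day:", round(slow_seconds_allowed(86400, 99.9), 1), "sec")
print("95% over a day:", slow_seconds_allowed(86400, 95) / 60, "min")
print("99% over an hour:", slow_seconds_allowed(3600, 99), "sec")
```

A stall longer than the returned value is enough to break the goal for that interval, so a stricter percentile compensates for a longer measurement interval.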




Response Time and an Object
• Not all pages are created equal
• Complexity and user requirements differ
• Ajax Pop Ups
      – 50ms
• Profile Page Generation
      – 150ms
• Search
      – 300ms
• Site Usage Report
      – 1000ms
Responses by Type of Client
• Human Being
      – Actual Human waiting and being impatient
      – Response Time critical
• Bots
      – On some systems bots are over 80% of the traffic
      – Bot response time is less critical
             • Though it should be good enough for the site to be indexed
• Interactive Web Services
      – Can be used to generate pages on other sites
      – Low Response time is even more critical

Different kinds of Slowness
• System “randomly” responds slowly
      – OK as long as rare enough.
      – Users will write it off as Internet/computer slowness
• Sustained Slowness is bad
      – A search request which is always slow
      – A user with many friends whose pages are “always” slow
• Are these users/cases important?
      – If so, track them separately, e.g. performance per customer.
        They may be invisible in the overall 99% figure.
      – Otherwise consider firing the users/blocking the cases
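
A minimal sketch of tracking the goal per customer, showing how one consistently slow user can hide behind a healthy overall figure. The function names, the `(customer_id, response_time)` input format and the sample data are assumptions for illustration.

```python
from collections import defaultdict


def fraction_within(times, limit):
    """Fraction of response times at or below the limit."""
    return sum(1 for t in times if t <= limit) / len(times)


def customers_missing_goal(requests, limit=0.5, goal=0.99):
    """Return customers whose own requests miss the goal, sorted by id.

    `requests` is an iterable of (customer_id, response_time_seconds).
    """
    by_customer = defaultdict(list)
    for customer, rt in requests:
        by_customer[customer].append(rt)
    return sorted(c for c, times in by_customer.items()
                  if fraction_within(times, limit) < goal)


# Hypothetical data: 995 fast requests from 50 users, 5 slow ones from one heavy user
requests = [("user%d" % (i % 50), 0.1) for i in range(995)]
requests += [("heavy_user", 4.0)] * 5

overall = fraction_within([rt for _, rt in requests], 0.5)
print("overall fraction within goal:", overall)
print("customers missing the goal:", customers_missing_goal(requests))
```

The overall number meets a 99% goal here even though every request from the heavy user is slow.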

Where to measure performance
• Client Side (the actual data)
      – http://code.google.com/p/jiffy-web/
      – Firebug etc. (but only for development)
• External Performance Monitoring
      – Gomez, Keynote etc.
      – Selected pages from selected locations
• Web Server Performance Analysis
      – Focused on the response time of a single dynamic request
      – http://code.google.com/p/instrumentation-for-php/
      – mk-query-digest; tcprstat

Summary of the Goal
• Define 95%, 99% etc. response time goals
• For each user interaction/class, and each application
  instance/user
• Measured/monitored every 5 minutes
• From both front-end and back-end observation
• Avoiding performance holes
      – Some actions or users which are rare but often slow




Performance Black Swans
• Queries can be intrinsically slow or made
  slow by side load (queueing)
• You can ignore outliers only if their impact on system
  performance is limited.
• Discover such queries
      – mk-query-digest will report outliers by default
      – Check SHOW PROCESSLIST for queries that never complete
      – Optimize; build protection to kill overly slow queries.
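
One possible shape for such protection, as a sketch only. It assumes the rows of SHOW PROCESSLIST have already been fetched into dictionaries keyed by the usual column names (Id, Command, Time, Info); the function name, the 600 second threshold and the sample rows are hypothetical, and a real version would also need rules for which users or statements are exempt.

```python
def queries_to_kill(processlist, max_seconds=600):
    """Return KILL QUERY statements for queries running longer than max_seconds.

    `processlist` is a list of dicts using SHOW PROCESSLIST column names;
    fetching the rows from MySQL is left out of this sketch.
    """
    statements = []
    for row in processlist:
        # Idle connections show Command = "Sleep" and are not running a query
        if row["Command"] == "Query" and row["Time"] > max_seconds:
            statements.append("KILL QUERY %d" % row["Id"])
    return statements


rows = [
    {"Id": 11, "Command": "Sleep", "Time": 9000, "Info": None},
    {"Id": 12, "Command": "Query", "Time": 2, "Info": "SELECT ..."},
    {"Id": 13, "Command": "Query", "Time": 4200, "Info": "SELECT ..."},
]
print(queries_to_kill(rows))
```

Generating the statements separately from executing them makes it easy to log or review what would be killed before enabling the protection.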



Production Instrumentation
• Many people instrument only the test system
      – Option to print out queries/web service requests
      – Great for debugging/testing
      – Will not show a lot of performance problems
             • Cold vs hot requests
             • Contention happening in production
             • Special user cases
• Run the instrumented app in production and store the data
      – Can instrument only one of the web servers if the overhead is
        large.
      – Can log only 1% of user sessions if you can't handle all the data
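
A minimal sketch of logging only a fixed share of user sessions. Hashing the session id, rather than drawing a random number per request, keeps every request of a chosen session in the log. The function name and the session id format are assumptions.

```python
import hashlib


def should_log(session_id, percent=1):
    """Deterministically select about `percent`% of sessions by hashing the id."""
    digest = hashlib.sha256(session_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


sampled = sum(should_log("session-%d" % i) for i in range(10000))
print("sessions selected out of 10000:", sampled)
```

The same session id always gives the same answer, so the decision can be made independently on every web server without shared state.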

What to Instrument
• Total Response Time
• CPU Time
• “Wait Time”
      – Connections/Database Queries
      – MemCache
      – Web Service Requests
      – Other Network Requests
• Additional Information
      – Number and Nature of different queries
      – Hits/Misses for Queries
      – Options which can affect performance
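
A sketch of what collecting these numbers for one request could look like. The class name, category labels and the `time.sleep` stand-ins are hypothetical; the slides reference instrumentation-for-php for the real thing, and this is not that library's API.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class RequestProfile:
    """Collects total, CPU and per-category wait time for one request."""

    def __init__(self):
        self.waits = defaultdict(float)   # seconds spent waiting, by category
        self.counts = defaultdict(int)    # number of calls, by category
        self._wall_start = time.perf_counter()
        self._cpu_start = time.process_time()

    @contextmanager
    def wait(self, category):
        """Time the enclosed block and add it to the given category."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.waits[category] += time.perf_counter() - start
            self.counts[category] += 1

    def summary(self):
        return {
            "total": time.perf_counter() - self._wall_start,
            "cpu": time.process_time() - self._cpu_start,
            "waits": dict(self.waits),
            "counts": dict(self.counts),
        }


profile = RequestProfile()
with profile.wait("mysql"):
    time.sleep(0.01)    # stand-in for a database query
with profile.wait("memcache"):
    time.sleep(0.001)   # stand-in for a cache lookup
report = profile.summary()
print(report)
```

Total time minus CPU time and the recorded waits shows how much of the request is still unaccounted for, which is a hint that another kind of wait needs instrumenting.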
Where to Store
• Plain old log files
      – Or directly to the database for smaller systems
• Load them into the database
• Or into Hadoop at larger scale
• Generate standard reports
• Provide an ad-hoc way to do deep data analysis
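
The log-then-load flow could look roughly like this. The sketch assumes one JSON object per log line and uses an in-memory SQLite database in place of the real reporting database; the field names, table name and sample lines are all made up.

```python
import io
import json
import sqlite3

# Hypothetical log: one JSON object per line, as an instrumented app might write it
log_file = io.StringIO(
    '{"page": "search", "total": 0.31, "mysql": 0.22}\n'
    '{"page": "profile", "total": 0.12, "mysql": 0.03}\n'
    '{"page": "search", "total": 0.95, "mysql": 0.80}\n'
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE request_log (page TEXT, total REAL, mysql REAL)")
rows = [json.loads(line) for line in log_file if line.strip()]
conn.executemany(
    "INSERT INTO request_log VALUES (:page, :total, :mysql)", rows)

# A "standard report": request count and worst response time per page
report = conn.execute(
    "SELECT page, COUNT(*), MAX(total) FROM request_log "
    "GROUP BY page ORDER BY page").fetchall()
print(report)
```

Once the data is in a table, ad-hoc questions are ordinary queries against the same rows.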




Start from what is most important
• Optimize the most important user interactions first
• Pick which cases to focus on
      – Queries which do not meet the response time goal
      – But not the worst-case scenario
             • Unless outliers kill your system
             • There are always going to be outliers
• Do not analyze only the queries above the response time
  threshold
      – It is much easier to reach 95% within 1 second if 50% of the
        queries are below 500ms.


Benefits of Such Approach
• Direct connection to the business goals
• High Priority problems targeted first
• Focus on real stuff
      – No guesswork like “is my buffer pool hit ratio bad?” or “am
        I doing too many full table scans?”
      – If these are the issues you will find and fix them anyway.
• Understandable and predictable result
      – If MySQL contributes 15% of the response time I can't
        possibly double performance by focusing on MySQL
        optimization.
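
The 15% example can be made concrete with Amdahl's law, which is the framing used for this sketch; the slides do not name it, and the function name is hypothetical.

```python
def max_speedup(fraction, component_speedup=float("inf")):
    """Overall speedup when a component taking `fraction` of response time gets faster.

    With the default, the component is assumed to become infinitely fast.
    """
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)


print("MySQL at 15%, made free:", round(max_speedup(0.15), 3))
print("MySQL at 15%, made 2x faster:", round(max_speedup(0.15, 2), 3))
```

Even removing MySQL time entirely gives less than a 1.2x improvement in this case, so the remaining 85% of the response time is where a doubling would have to come from.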


Final Notes
• Spikes and special cases should not be discarded
      – They are the most interesting/challenging ones
• Understand what you're trying to achieve
      – The method works best for optimizing, at its current scale, a
        system already in production.
• Check out the goal driven performance optimization
  whitepaper
      – http://www.percona.com/files/white-papers/goal-driven-performance-optimization.pdf





Thanks for Coming
• Questions? Follow-up?
      – pz@percona.com
• Yes, we do MySQL and Web Scaling Consulting
      – http://www.percona.com
• Check out our book
      – Complete rewrite of 1st edition
      – Available in Russian Too
• And Yes we're hiring
      – http://www.percona.com/contact/careers/


