Chirp 2010: Scaling Twitter
Billions of Hits:
Scaling Twitter
John Adams
Twitter Operations
#chirpscale
John Adams                            @netik
•   Early Twitter employee (mid-2008)

•   Lead engineer: Outward Facing Services (Apache,
    Unicorn, SMTP), Auth, Security

•   Keynote Speaker: O’Reilly Velocity 2009

•   O’Reilly Web 2.0 Speaker (2008, 2010)

•   Previous companies: Inktomi, Apple, c|net

•   Working on Web Operations book with John Allspaw
    (Flickr, Etsy), out in June
Growth.
752%
                       2008 Growth
source: comscore.com - (based only on www traffic, not API)
1358%
                       2009 Growth
source: comscore.com - (based only on www traffic, not API)
12th
                    most popular
source: alexa.com
55M
                   Tweets per day
                  (640 tweets/sec average, 1,000 tweets/sec peak)
source: twitter.com internal
600M
                   Searches/Day
source: twitter.com internal
Traffic split: 25% Web, 75% API
Operations
•   What do we do?

    •   Site Availability

    •   Capacity Planning (metrics-driven)

    •   Configuration Management

    •   Security

    •   Much more than basic Sysadmin
What have we done?
•   Improved response time, reduced latency

•   Fewer errors during deploys (Unicorn!)

•   Faster performance

•   Lower MTTD (Mean time to Detect)

•   Lower MTTR (Mean time to Recovery)
Operations Mantra

   Find Weakest Point  →  Take Corrective Action  →  Move to Next Weakest Point

   (Metrics + Logs + Science = Analysis)  →  Process  →  Repeatability
Make an attack plan.

 Symptom          Bottleneck   Vector         Solution
 Bandwidth        Network      HTTP Latency   Servers++
 Timeline Delay   Database     Update Delay   Better algorithm
 Status Growth    Database     Delays         Flock, Cassandra
 Updates          Algorithm    Latency        Algorithms
Finding Weakness
•   Metrics + Graphs

    •   Individual metrics are irrelevant

    •   We aggregate metrics to find knowledge

•   Logs

•   SCIENCE!
Monitoring
•   Twitter graphs and reports critical metrics in
    as near real time as possible

•   If you build tools against our API, you should
    too.

    •   RRD, other Time-Series DB solutions

    •   Ganglia + custom gmetric scripts

•   dev.twitter.com - API availability
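
A minimal sketch of how a custom metric might be fed into Ganglia from a cron job or daemon, assuming the stock gmetric CLI is on the path; the metric name and value here are invented for illustration:

```python
# Hypothetical sketch: push a custom application metric into Ganglia
# by shelling out to the stock `gmetric` CLI. The metric name and the
# way the value is obtained are made up for illustration.
import subprocess

def push_gmetric(name, value, units, metric_type="uint32"):
    # gmetric ships with Ganglia; --name/--value/--type/--units are its
    # standard long options.
    subprocess.run([
        "gmetric",
        "--name", name,
        "--value", str(value),
        "--type", metric_type,
        "--units", units,
    ], check=True)

if __name__ == "__main__":
    # e.g. report a queue depth sampled elsewhere
    push_gmetric("kestrel_queue_depth", 1234, "jobs")
```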
Analyze
•   Turn data into information

    •   Where is the code base going?

    •   Are things worse than they were?

        •   Understand the impact of the last software
            deploy

        •   Run check scripts during and after deploys

•   Capacity Planning, not Fire Fighting!
Data Analysis
•   Instrumenting the world pays off.

•   “Data analysis, visualization, and other
    techniques for seeing patterns in data are
    going to be an increasingly valuable skill set.
    Employers take notice!”
          “Web Squared: Web 2.0 Five Years On”, Tim O’Reilly, Web 2.0 Summit, 2009
Forecasting
•   Curve-fitting for capacity planning
    (R, fityk, Mathematica, CurveFit)

[Chart: status_id growth over time, with the signed and unsigned 32-bit
 integer limits marked as "Twitpocalypse" thresholds; curve fit r² = 0.99]
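
The deck names R, fityk, Mathematica, and CurveFit for the fit; as a rough illustration of the same idea, a Python sketch that fits a curve to (invented) status_id samples and estimates when it would cross the signed 32-bit boundary:

```python
# Rough illustration of the forecasting idea in Python (the deck used
# R / fityk / Mathematica); the sample data points are fabricated.
import numpy as np

INT32_MAX = 2**31 - 1  # signed 32-bit boundary ("Twitpocalypse")

# (day, max status_id seen that day) -- fabricated example data
days = np.array([0, 30, 60, 90, 120], dtype=float)
ids = np.array([1.2e9, 1.35e9, 1.55e9, 1.8e9, 2.05e9])

# Fit a quadratic: growth is accelerating, so a straight line underfits.
poly = np.poly1d(np.polyfit(days, ids, deg=2))

# Solve poly(day) == INT32_MAX and take the first future crossing.
roots = (poly - INT32_MAX).roots
future = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > days[-1]]
print("Estimated day status_id crosses 2^31 - 1:", min(future))
```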
Internal Dashboard

External API Dashboard

   http://dev.twitter.com/status
What’s a Robot?
•   Actual error in the Rails stack (HTTP 500)

•   Uncaught Exception

•   Code problem, or failure / nil result

•   Increases our exception count

•   Shows up in Reports
What’s a Whale?
•   HTTP Error 502, 503

•   Twitter has a hard and fast five second timeout

•   We’d rather fail fast than block on requests

•   We also kill long-running queries (mkill)

•   Timeout
Whale Watcher
•   Simple shell script,

    •   MASSIVE WIN by @ronpepsi

•   Whale = HTTP 503 (timeout)

•   Robot = HTTP 500 (error)

•   Examines last 60 seconds of
    aggregated daemon / www logs

•   “Whales per Second” > W_threshold

    •   Thar be whales! Call in ops.
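
The real Whale Watcher is a shell script; a hypothetical Python rendering of the same idea, with the log path, line format, and threshold assumed for illustration:

```python
# Hypothetical Python rendering of the Whale Watcher idea (the real one
# is a shell script). Log path, line format, and threshold are assumptions.
import re
import time

LOG = "/var/log/aggregated/www.log"      # assumed aggregated www/daemon log
WHALE_THRESHOLD = 0.5                    # whales/sec that pages ops
STATUS_RE = re.compile(r'" (\d{3}) ')    # HTTP status in a combined-style line

def whales_per_second(window=60):
    cutoff = time.time() - window
    whales = robots = 0
    with open(LOG) as f:
        for line in f:
            # assume the log shipper prepends an epoch timestamp to each line
            try:
                ts, rest = line.split(" ", 1)
                if float(ts) < cutoff:
                    continue
            except ValueError:
                continue
            m = STATUS_RE.search(rest)
            if not m:
                continue
            if m.group(1) == "503":
                whales += 1      # whale: timeout
            elif m.group(1) == "500":
                robots += 1      # robot: uncaught exception
    return whales / window, robots / window

if __name__ == "__main__":
    wps, rps = whales_per_second()
    if wps > WHALE_THRESHOLD:
        print(f"Thar be whales! {wps:.2f}/sec -- call in ops.")
```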
Deploy Watcher
Sample window: 300.0 seconds

First start time:
Mon Apr 5 15:30:00 2010 (Mon Apr   5 08:30:00 PDT 2010)
Second start time:
Tue Apr 6 02:09:40 2010 (Mon Apr   5 19:09:40 PDT 2010)

PRODUCTION APACHE: ALL OK
PRODUCTION OTHER: ALL OK
WEB0049 CANARY APACHE: ALL OK
WEB0049 CANARY BACKEND SERVICES: ALL OK
DAEMON0031 CANARY BACKEND SERVICES: ALL OK
DAEMON0031 CANARY OTHER: ALL OK
Feature “Darkmode”
•   Per-feature site controls to enable and disable
    computationally or I/O-heavy site functions

•   The “Emergency Stop” button

•   Changes logged and reported to all teams

•   Around 60 switches we can throw

•   Static / Read-only mode
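
The deck doesn't show how the switches are implemented; a made-up sketch of the idea, where heavy features consult a named flag before doing work and every change is logged:

```python
# Invented sketch of a "darkmode" switch registry: named flags that
# expensive features consult before doing work, with every change logged.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("darkmode")

_switches = {
    "search": True,            # hypothetical switch names
    "timeline_fanout": True,
    "read_only_mode": False,
}

def set_switch(name, enabled, who):
    _switches[name] = enabled
    # every change is logged (and, in the real system, reported to all teams)
    log.info("darkmode %s -> %s by %s at %s", name, enabled, who, time.ctime())

def enabled(name):
    return _switches.get(name, False)

def search(query):
    # a computationally heavy endpoint checks the flag before doing work
    if not enabled("search"):
        return {"error": "search temporarily disabled"}
    ...  # run the expensive query
```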
request flow
           Load Balancers

         Apache mod_proxy

           Rails (Unicorn)

 Flock      memcached        Kestrel

         MySQL      Cassandra

             Daemons
Servers
•   Co-located, dedicated machines at NTT America

•   No cloud; used only for monitoring, not for serving

    •   Need raw processing power, latency too high
        in existing cloud offerings

•   Frees us to deal with real, intellectual, computer
    science problems.

•   Moving to our own data center soon
unicorn
•   A single-socket Rails application server (Rack)

•   Zero Downtime Deploys (!)

    •   Controlled, shuffled transfer to new code

•   Less memory, 30% less CPU

•   Shift from mod_proxy_balancer to
    mod_proxy_pass

    •   HAProxy and Nginx weren’t any better. Really.
Rails
•   Mostly only for front-end.

•   Back end is mostly Scala and pure Ruby

•   Not to blame for our issues. Analysis found:

    •   Caching + Cache invalidation problems

    •   Bad queries generated by ActiveRecord, resulting in
        slow queries against the db

    •   Queue Latency

•   Replication Lag
memcached
•   memcached isn’t perfect.

    •   Memcached SEGVs hurt us early on.

•   Evictions make the cache unreliable for
    important configuration data
    (loss of darkmode flags, for example)

•   Network Memory Bus isn’t infinite

•   Segmented into pools for better performance
Loony
•   Central machine database (MySQL)

    •   Python, Django, Paramiko SSH

        •   Paramiko - Twitter OSS (@robey)

    •   Ties into LDAP groups

•   When data center sends us email, machine
    definitions built in real-time
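
The deck only names the ingredients (Python, Django, Paramiko); a minimal sketch of the Paramiko side, running one command across a host list that would normally come from the machine database (hosts and command are hypothetical):

```python
# Minimal sketch of the Paramiko side of a Loony-style tool: run one
# command on every host pulled from a central machine database.
# The host list and command here are hypothetical.
import paramiko

HOSTS = ["web0001.example.com", "web0002.example.com"]  # would come from MySQL/LDAP

def run_everywhere(command, username="ops"):
    for host in HOSTS:
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(host, username=username)  # assumes SSH keys / agent auth
        _, stdout, _ = client.exec_command(command)
        print(host, stdout.read().decode().strip())
        client.close()

if __name__ == "__main__":
    run_everywhere("uptime")
```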
Murder
•   @lg rocks!

•   BitTorrent-based replication for deploys

•   ~30-60 seconds to update >1k machines

•   P2P - Legal, valid, Awesome.
Kestrel
•   @robey

•   Works like memcache (same protocol)

•   SET = enqueue | GET = dequeue

•   No strict ordering of jobs

•   No shared state between servers

•   Written in Scala.
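
Because Kestrel speaks the memcache text protocol, any memcache client can act as producer or consumer; a hedged sketch with pymemcache, where the host, port, and queue name are assumptions:

```python
# Kestrel speaks the memcache text protocol, so an ordinary memcache
# client can enqueue (SET) and dequeue (GET). Host, port, and queue
# name below are assumptions for illustration.
from pymemcache.client.base import Client

queue = Client(("kestrel.example.com", 22133))

# Producer: SET onto the "jobs" queue enqueues a payload.
queue.set("jobs", b"deliver_tweet:12345")

# Consumer: GET from the same queue name dequeues the next item
# (or returns None when the queue is empty).
job = queue.get("jobs")
if job:
    print("dequeued:", job)
```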
Asynchronous Requests
•   Inbound traffic consumes a unicorn worker

•   Outbound traffic consumes a unicorn worker

•   The request pipeline should not be used to
    handle 3rd party communications or
    back-end work.

•   Reroute traffic to daemons
Daemons
•   Daemons touch every tweet

•   Many different daemon types at Twitter

•   Old way: One daemon per type (Rails)

    •   New way: Fewer Daemons (Pure Ruby)

•   Daemon Slayer - A Multi Daemon that could
    do many different jobs, all at once.
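
The Daemon Slayer's internals aren't shown in the deck; a hypothetical sketch of the idea, one process draining several Kestrel queues and dispatching each job type to a handler:

```python
# Hypothetical sketch of a "Daemon Slayer"-style worker: one process
# that drains several queues and dispatches each job type to a handler.
# Queue names and handlers are invented for illustration.
import time
from pymemcache.client.base import Client

kestrel = Client(("kestrel.example.com", 22133))

def deliver_tweet(payload):  print("deliver", payload)
def send_email(payload):     print("email", payload)
def update_search(payload):  print("index", payload)

HANDLERS = {
    "deliveries": deliver_tweet,
    "emails": send_email,
    "search_index": update_search,
}

def run():
    while True:
        idle = True
        for queue_name, handler in HANDLERS.items():
            job = kestrel.get(queue_name)   # GET dequeues one item
            if job is not None:
                handler(job)
                idle = False
        if idle:
            time.sleep(0.1)                 # back off when all queues are empty

if __name__ == "__main__":
    run()
```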
Disk is the new Tape.
•   Social Networking application profile has
    many O(n^y) operations.

•   Page requests have to happen in < 500 ms or
    users start to notice. Goal: 250-300 ms

•   Web 2.0 isn’t possible without lots of RAM

•   SSDs? What to do?
Caching
•   We’re the real-time web, but lots of caching
    opportunity. You should cache what you get from us.

•   Most caching strategies rely on long TTLs (>60 s)

•   Separate memcache pools for different data types to
    prevent eviction

•   Optimized the Ruby gem to use libmemcached + FNV hash
    instead of pure Ruby + MD5

•   Twitter now largest contributor to libmemcached
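
A hedged sketch of the pool separation, using pylibmc (a libmemcached binding) with an FNV hash as the deck describes; hosts, pool names, and TTLs are assumptions:

```python
# Hedged sketch: separate memcached pools per data type (so hot data
# can't evict critical data), using pylibmc (a libmemcached binding)
# with an FNV hash. Hosts and TTLs are assumptions.
import pylibmc

behaviors = {"hash": "fnv1a_32", "distribution": "consistent"}

timeline_pool = pylibmc.Client(["mc-timeline-01", "mc-timeline-02"],
                               behaviors=behaviors)
config_pool = pylibmc.Client(["mc-config-01"], behaviors=behaviors)

# Cache rendered timelines with a short TTL in their own pool...
timeline_pool.set("timeline:12345", "<rendered timeline>", time=60)

# ...while configuration flags live in a pool that timeline churn
# can never evict them from.
config_pool.set("darkmode:search", "on")
```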
MySQL
•   Sharding large volumes of data is hard

•   Replication delay and cache eviction produce
    inconsistent results to the end user.

•   Locks create resource contention for popular
    data
MySQL Challenges
•   Replication Delay

    •   Single threaded. Slow.

•   Social Networking not good for RDBMS

    •   N x N relationships and social graph / tree
        traversal

    •   Disk issues (FS Choice, noatime, scheduling
        algorithm)
Relational Databases
not a Panacea
•   Good for:

    •   Users, Relational Data, Transactions

•   Bad for:

    •   Queues. Polling operations. Social Graph.

•   You don’t need ACID for everything.
Database Replication
•   Major issues around users and statuses tables

•   Multiple functional masters (FRP, FWP)

•   Make sure your code sends reads and writes to the
    right DBs. Reading from the master = slow death

    •   Monitor the DB. Find slow / poorly designed
        queries

•   Kill long running queries before they kill you
    (mkill)
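
mkill itself isn't shown; a hypothetical sketch of the idea with PyMySQL, listing queries that have run longer than the five-second budget and killing them (connection details are assumptions):

```python
# Hypothetical sketch of the mkill idea using PyMySQL: find queries that
# have been running longer than a threshold and kill them. Connection
# parameters are assumptions; the 5-second limit mirrors the deck's
# hard timeout.
import pymysql

MAX_SECONDS = 5

conn = pymysql.connect(host="db-master.example.com", user="ops", password="secret")
with conn.cursor() as cur:
    cur.execute("SHOW FULL PROCESSLIST")
    for row in cur.fetchall():
        # columns: Id, User, Host, db, Command, Time, State, Info
        pid, command, seconds, info = row[0], row[4], row[5], row[7]
        if command == "Query" and seconds > MAX_SECONDS and info:
            print(f"killing {pid}: {seconds}s  {info[:80]}")
            cur.execute("KILL %s", (pid,))
conn.close()
```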
Flock
•   Scalable Social Graph Store

•   Sharding via Gizzard

•   MySQL backend (many shards)

•   13 billion edges,
    100K reads/second

•   Open Source!

[Diagram: Flock → Gizzard → many MySQL shards]
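
Not Gizzard's actual scheme, but a minimal illustration of the sharding idea: hash a user id to pick which of many MySQL shards holds that user's edges (shard list and hashing are invented):

```python
# Minimal illustration of the sharding idea behind Flock/Gizzard: route
# each user's edges to one of many MySQL shards by hashing the user id.
# Not Gizzard's actual scheme -- shard list and hashing are invented.
import hashlib

SHARDS = ["edges-db-%02d" % i for i in range(16)]   # hypothetical shard hosts

def shard_for(user_id: int) -> str:
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

def add_follow(follower_id: int, followee_id: int):
    # forward edge lives on the follower's shard, reverse edge on the followee's
    print(f"INSERT forward edge on {shard_for(follower_id)}")
    print(f"INSERT reverse edge on {shard_for(followee_id)}")

add_follow(12345, 67890)
```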
Cassandra
•   Originally written by Facebook

•   Distributed Data Store

•   @rk’s changes to Cassandra Open Sourced

•   Currently double-writing into it

•   Transitioning to 100% soon.
Lessons Learned
•   Instrument everything. Start graphing early.

•   Cache as much as possible

•   Start working on scaling early.

•   Don’t rely on memcache, and don’t rely on the
    database

•   Don’t use Mongrel. Use Unicorn.
Join Us!
@jointheflock
Q&A
Thanks!
•   @jointheflock

•   http://twitter.com/jobs

•   Download our work

    •   http://twitter.com/about/opensource
