SlideShare a Scribd company logo
Big Bird.
(scaling twitter)
Rails Scales.
(but not out of the box)
First, Some Facts
• 600 requests per second. Growing fast.
• 180 Rails Instances (Mongrel). Growing fast.
• 1 Database Server (MySQL) + 1 Slave.
• 30-odd Processes for Misc. Jobs
• 8 Sun X4100s
• Many users, many updates.
Scaling Twitter
Scaling Twitter
Scaling Twitter
Joy          Pain




Oct   Nov   Dec    Jan   Feb     March   Apr
IM IN UR RAILZ




     MAKIN EM GO FAST
It’s Easy, Really.
1. Realize Your Site is Slow
2. Optimize the Database
3. Cache the Hell out of Everything
4. Scale Messaging
5. Deal With Abuse
It’s Easy, Really.
1. Realize Your Site is Slow
2. Optimize the Database
3. Cache the Hell out of Everything
4. Scale Messaging
5. Deal With Abuse
6. Profit
the
     more
      you
        know

{ Part the First }
We Failed at This.
Don’t Be Like Us

• Munin
• Nagios
• AWStats & Google Analytics
• Exception Notifier / Exception Logger
• Immediately add reporting to track problems.
Test Everything

•   Start Before You Start

•   No Need To Be Fancy

•   Tests Will Save Your Life

•   Agile Becomes
    Important When Your
    Site Is Down

 
 



                  Benchmarks?
                       let your users do it.
The Database
  { Part the Second }
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
“The Next Application I Build is Going
to Be Easily Partitionable” - S. Butterfield
Too Late.
Index Everything
class AddIndex < ActiveRecord::Migration
     def self.up
       add_index :users, :email
     end

     def self.down
       remove_index :users, :email
     end
   end


Repeat for any column that appears in a WHERE clause

             Rails won’t do this for you.
Denormalize A Lot
class DenormalizeFriendsIds < ActiveRecord::Migration
  def self.up
    add_column "users", "friends_ids", :text
  end

  def self.down
    remove_column "users", "friends_ids"
  end
end
class Friendship < ActiveRecord::Base
  belongs_to :user
  belongs_to :friend

 after_create :add_to_denormalized_friends
 after_destroy :remove_from_denormalized_friends

  def add_to_denormalized_friends
    user.friends_ids << friend.id
    user.friends_ids.uniq!
    user.save_without_validation
  end

  def remove_from_denormalized_friends
    user.friends_ids.delete(friend.id)
    user.save_without_validation
  end
end
Don’t be Stupid
bob.friends.map(&:email)
     Status.count()
“email like ‘%#{search}%’”
That’s where we are.
                  Seriously.
  If your Rails application is doing anything more
complex than that, you’re doing something wrong*.



        * or you observed the First Rule of Butterfield.
Partitioning Comes Later.
   (we’ll let you know how it goes)
The Cache
 { Part the Third }
MemCache
MemCache
MemCache
!
class Status < ActiveRecord::Base
  class << self
    def count_with_memcache(*args)
      return count_without_memcache unless args.empty?
      count = CACHE.get(“status_count”)
      if count.nil?
        count = count_without_memcache
        CACHE.set(“status_count”, count)
      end
      count
    end
    alias_method_chain :count, :memcache
  end
  after_create :increment_memcache_count
  after_destroy :decrement_memcache_count
  ...
end
class User < ActiveRecord::Base
  def friends_statuses
    ids = CACHE.get(“friends_statuses:#{id}”)
    Status.find(:all, :conditions => [“id IN (?)”, ids])
  end
end

class Status < ActiveRecord::Base
  after_create :update_caches
  def update_caches
    user.friends_ids.each do |friend_id|
      ids = CACHE.get(“friends_statuses:#{friend_id}”)
      ids.pop
      ids.unshift(id)
      CACHE.set(“friends_statuses:#{friend_id}”, ids)
    end
  end
end
The Future


            ve d
          ti r
         co
         Ac
           e
         R
90% API Requests
     Cache Them!
“There are only two hard things in CS:
 cache invalidation and naming things.”

             – Phil Karlton, via Tim Bray
Messaging
{ Part the Fourth }
You Already Knew All
That Other Stuff, Right?
Producer             Consumer
           Message
Producer             Consumer
           Queue
Producer             Consumer
DRb
• The Good:
 • Stupid Easy
 • Reasonably Fast
• The Bad:
 • Kinda Flaky
 • Zero Redundancy
 • Tightly Coupled
ejabberd


            Jabber Client
                (drb)




           Incoming         Outgoing
Presence
           Messages         Messages


              MySQL
Server
     DRb.start_service ‘druby://localhost:10000’, myobject




                         Client
myobject = DRbObject.new_with_uri(‘druby://localhost:10000’)
Rinda

• Shared Queue (TupleSpace)
• Built with DRb
• RingyDingy makes it stupid easy
• See Eric Hodel’s documentation
• O(N) for take(). Sigh.
Timestamp: 12/22/06 01:53:14 (4 months ago)
      Author: lattice
      Message: Fugly. Seriously. Fugly.




        SELECT * FROM messages WHERE
substring(truncate(id,0),-2,1) = #{@fugly_dist_idx}
It Scales.
(except it stopped on Tuesday)
Options

• ActiveMQ (Java)
• RabbitMQ (erlang)
• MySQL + Lightweight Locking
• Something Else?
erlang?


What are you doing?
 Stabbing my eyes out with a fork.
Starling

• Ruby, will be ported to something faster
• 4000 transactional msgs/s
• First pass written in 4 hours
• Speaks MemCache (set, get)
Use Messages to
Invalidate Cache
   (it’s really not that hard)
Abuse
{ Part the Fifth }
The Italians
9000 friends in 24 hours
        (doesn’t scale)
http://flickr.com/photos/heather/464504545/
http://flickr.com/photos/curiouskiwi/165229284/
http://flickr.com/photo_zoom.gne?id=42914103&size=l
http://flickr.com/photos/madstillz/354596905/
http://flickr.com/photos/laughingsquid/382242677/
http://flickr.com/photos/bng/46678227/

More Related Content

What's hot

Serverless integration with Knative and Apache Camel on Kubernetes
Serverless integration with Knative and Apache Camel on KubernetesServerless integration with Knative and Apache Camel on Kubernetes
Serverless integration with Knative and Apache Camel on Kubernetes
Claus Ibsen
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds Architecture
Dan McKinley
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
Taras Matyashovsky
 
MySQL GTID 시작하기
MySQL GTID 시작하기MySQL GTID 시작하기
MySQL GTID 시작하기
I Goo Lee
 
Elasticsearch in Netflix
Elasticsearch in NetflixElasticsearch in Netflix
Elasticsearch in Netflix
Danny Yuan
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
Gleb Kanterov
 
Introduction to big data and apache spark
Introduction to big data and apache sparkIntroduction to big data and apache spark
Introduction to big data and apache spark
Mohammed Guller
 
ClickHouse Introduction by Alexander Zaitsev, Altinity CTO
ClickHouse Introduction by Alexander Zaitsev, Altinity CTOClickHouse Introduction by Alexander Zaitsev, Altinity CTO
ClickHouse Introduction by Alexander Zaitsev, Altinity CTO
Altinity Ltd
 
ORC File & Vectorization - Improving Hive Data Storage and Query Performance
ORC File & Vectorization - Improving Hive Data Storage and Query PerformanceORC File & Vectorization - Improving Hive Data Storage and Query Performance
ORC File & Vectorization - Improving Hive Data Storage and Query Performance
DataWorks Summit
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
Masahiko Sawada
 
The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialThe Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication Tutorial
Jean-François Gagné
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
pflueras
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
Alexander Korotkov
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
Yoshinori Matsunobu
 
The Dream Stream Team for Pulsar and Spring
The Dream Stream Team for Pulsar and SpringThe Dream Stream Team for Pulsar and Spring
The Dream Stream Team for Pulsar and Spring
Timothy Spann
 
Cloud arch patterns
Cloud arch patternsCloud arch patterns
Cloud arch patterns
Corey Huinker
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
DataStax Academy
 
Outrageous Performance: RageDB's Experience with the Seastar Framework
Outrageous Performance: RageDB's Experience with the Seastar FrameworkOutrageous Performance: RageDB's Experience with the Seastar Framework
Outrageous Performance: RageDB's Experience with the Seastar Framework
ScyllaDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Mike Dirolf
 
Unique ID generation in distributed systems
Unique ID generation in distributed systemsUnique ID generation in distributed systems
Unique ID generation in distributed systems
Dave Gardner
 

What's hot (20)

Serverless integration with Knative and Apache Camel on Kubernetes
Serverless integration with Knative and Apache Camel on KubernetesServerless integration with Knative and Apache Camel on Kubernetes
Serverless integration with Knative and Apache Camel on Kubernetes
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds Architecture
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
 
MySQL GTID 시작하기
MySQL GTID 시작하기MySQL GTID 시작하기
MySQL GTID 시작하기
 
Elasticsearch in Netflix
Elasticsearch in NetflixElasticsearch in Netflix
Elasticsearch in Netflix
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Introduction to big data and apache spark
Introduction to big data and apache sparkIntroduction to big data and apache spark
Introduction to big data and apache spark
 
ClickHouse Introduction by Alexander Zaitsev, Altinity CTO
ClickHouse Introduction by Alexander Zaitsev, Altinity CTOClickHouse Introduction by Alexander Zaitsev, Altinity CTO
ClickHouse Introduction by Alexander Zaitsev, Altinity CTO
 
ORC File & Vectorization - Improving Hive Data Storage and Query Performance
ORC File & Vectorization - Improving Hive Data Storage and Query PerformanceORC File & Vectorization - Improving Hive Data Storage and Query Performance
ORC File & Vectorization - Improving Hive Data Storage and Query Performance
 
Introduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparoundIntroduction VAUUM, Freezing, XID wraparound
Introduction VAUUM, Freezing, XID wraparound
 
The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialThe Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication Tutorial
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
Solving PostgreSQL wicked problems
Solving PostgreSQL wicked problemsSolving PostgreSQL wicked problems
Solving PostgreSQL wicked problems
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
The Dream Stream Team for Pulsar and Spring
The Dream Stream Team for Pulsar and SpringThe Dream Stream Team for Pulsar and Spring
The Dream Stream Team for Pulsar and Spring
 
Cloud arch patterns
Cloud arch patternsCloud arch patterns
Cloud arch patterns
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
Outrageous Performance: RageDB's Experience with the Seastar Framework
Outrageous Performance: RageDB's Experience with the Seastar FrameworkOutrageous Performance: RageDB's Experience with the Seastar Framework
Outrageous Performance: RageDB's Experience with the Seastar Framework
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Unique ID generation in distributed systems
Unique ID generation in distributed systemsUnique ID generation in distributed systems
Unique ID generation in distributed systems
 

Similar to Scaling Twitter

Hiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret Sauce
Jesse Vincent
 
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Jesse Vincent
 
Microblogging via XMPP
Microblogging via XMPPMicroblogging via XMPP
Microblogging via XMPP
Stoyan Zhekov
 
Aprendendo solid com exemplos
Aprendendo solid com exemplosAprendendo solid com exemplos
Aprendendo solid com exemplos
vinibaggio
 
Socket applications
Socket applicationsSocket applications
Socket applications
João Moura
 
From crash to testcase
From crash to testcaseFrom crash to testcase
From crash to testcase
Roel Van de Paar
 
Dynomite at Erlang Factory
Dynomite at Erlang FactoryDynomite at Erlang Factory
Dynomite at Erlang Factory
moonpolysoft
 
Performance Optimization of Rails Applications
Performance Optimization of Rails ApplicationsPerformance Optimization of Rails Applications
Performance Optimization of Rails Applications
Serge Smetana
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
MongoDB
 
WebPerformance: Why and How? – Stefan Wintermeyer
WebPerformance: Why and How? – Stefan WintermeyerWebPerformance: Why and How? – Stefan Wintermeyer
WebPerformance: Why and How? – Stefan Wintermeyer
Elixir Club
 
NPW2009 - my.opera.com scalability v2.0
NPW2009 - my.opera.com scalability v2.0NPW2009 - my.opera.com scalability v2.0
NPW2009 - my.opera.com scalability v2.0
Cosimo Streppone
 
Fisl - Deployment
Fisl - DeploymentFisl - Deployment
Fisl - Deployment
Fabio Akita
 
SD, a P2P bug tracking system
SD, a P2P bug tracking systemSD, a P2P bug tracking system
SD, a P2P bug tracking system
Jesse Vincent
 
RubyEnRails2007 - Dr Nic Williams - Keynote
RubyEnRails2007 - Dr Nic Williams - KeynoteRubyEnRails2007 - Dr Nic Williams - Keynote
RubyEnRails2007 - Dr Nic Williams - Keynote
Dr Nic Williams
 
Sinatra for REST services
Sinatra for REST servicesSinatra for REST services
Sinatra for REST services
Emanuele DelBono
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
Server Density
 
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
PROIDEA
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
adunne
 
How to avoid hanging yourself with Rails
How to avoid hanging yourself with RailsHow to avoid hanging yourself with Rails
How to avoid hanging yourself with Rails
Rowan Hick
 
Monkeybars in the Manor
Monkeybars in the ManorMonkeybars in the Manor
Monkeybars in the Manor
martinbtt
 

Similar to Scaling Twitter (20)

Hiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret SauceHiveminder - Everything but the Secret Sauce
Hiveminder - Everything but the Secret Sauce
 
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret SauceBeijing Perl Workshop 2008 Hiveminder Secret Sauce
Beijing Perl Workshop 2008 Hiveminder Secret Sauce
 
Microblogging via XMPP
Microblogging via XMPPMicroblogging via XMPP
Microblogging via XMPP
 
Aprendendo solid com exemplos
Aprendendo solid com exemplosAprendendo solid com exemplos
Aprendendo solid com exemplos
 
Socket applications
Socket applicationsSocket applications
Socket applications
 
From crash to testcase
From crash to testcaseFrom crash to testcase
From crash to testcase
 
Dynomite at Erlang Factory
Dynomite at Erlang FactoryDynomite at Erlang Factory
Dynomite at Erlang Factory
 
Performance Optimization of Rails Applications
Performance Optimization of Rails ApplicationsPerformance Optimization of Rails Applications
Performance Optimization of Rails Applications
 
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
Ensuring High Availability for Real-time Analytics featuring Boxed Ice / Serv...
 
WebPerformance: Why and How? – Stefan Wintermeyer
WebPerformance: Why and How? – Stefan WintermeyerWebPerformance: Why and How? – Stefan Wintermeyer
WebPerformance: Why and How? – Stefan Wintermeyer
 
NPW2009 - my.opera.com scalability v2.0
NPW2009 - my.opera.com scalability v2.0NPW2009 - my.opera.com scalability v2.0
NPW2009 - my.opera.com scalability v2.0
 
Fisl - Deployment
Fisl - DeploymentFisl - Deployment
Fisl - Deployment
 
SD, a P2P bug tracking system
SD, a P2P bug tracking systemSD, a P2P bug tracking system
SD, a P2P bug tracking system
 
RubyEnRails2007 - Dr Nic Williams - Keynote
RubyEnRails2007 - Dr Nic Williams - KeynoteRubyEnRails2007 - Dr Nic Williams - Keynote
RubyEnRails2007 - Dr Nic Williams - Keynote
 
Sinatra for REST services
Sinatra for REST servicesSinatra for REST services
Sinatra for REST services
 
MongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & AnalyticsMongoDB: Optimising for Performance, Scale & Analytics
MongoDB: Optimising for Performance, Scale & Analytics
 
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
JDD2015: Sharding with Akka Cluster: From Theory to Production - Krzysztof Ot...
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
 
How to avoid hanging yourself with Rails
How to avoid hanging yourself with RailsHow to avoid hanging yourself with Rails
How to avoid hanging yourself with Rails
 
Monkeybars in the Manor
Monkeybars in the ManorMonkeybars in the Manor
Monkeybars in the Manor
 

More from Blaine

Social Privacy for HTTP over Webfinger
Social Privacy for HTTP over WebfingerSocial Privacy for HTTP over Webfinger
Social Privacy for HTTP over Webfinger
Blaine
 
Social Software for Robots
Social Software for RobotsSocial Software for Robots
Social Software for Robots
Blaine
 
OAuth
OAuthOAuth
OAuth
Blaine
 
Building the Real Time Web
Building the Real Time WebBuilding the Real Time Web
Building the Real Time Web
Blaine
 
You & Me & Everyone We Know
You & Me & Everyone We KnowYou & Me & Everyone We Know
You & Me & Everyone We Know
Blaine
 
Social Software for Robots
Social Software for RobotsSocial Software for Robots
Social Software for Robots
Blaine
 

More from Blaine (6)

Social Privacy for HTTP over Webfinger
Social Privacy for HTTP over WebfingerSocial Privacy for HTTP over Webfinger
Social Privacy for HTTP over Webfinger
 
Social Software for Robots
Social Software for RobotsSocial Software for Robots
Social Software for Robots
 
OAuth
OAuthOAuth
OAuth
 
Building the Real Time Web
Building the Real Time WebBuilding the Real Time Web
Building the Real Time Web
 
You & Me & Everyone We Know
You & Me & Everyone We KnowYou & Me & Everyone We Know
You & Me & Everyone We Know
 
Social Software for Robots
Social Software for RobotsSocial Software for Robots
Social Software for Robots
 

Recently uploaded

Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
Vijayananda Mohire
 
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Bert Blevins
 
WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
Lidia A.
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
ScyllaDB
 
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Bert Blevins
 
Password Rotation in 2024 is still Relevant
Password Rotation in 2024 is still RelevantPassword Rotation in 2024 is still Relevant
Password Rotation in 2024 is still Relevant
Bert Blevins
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
ishalveerrandhawa1
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
ScyllaDB
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
Stephanie Beckett
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
Larry Smarr
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
RaminGhanbari2
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
Matthew Sinclair
 
20240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 202420240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
Matthew Sinclair
 
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdfINDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
jackson110191
 
Recent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS InfrastructureRecent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS Infrastructure
KAMAL CHOUDHARY
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
Eric D. Schabell
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
Mark Billinghurst
 
20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf
Sally Laouacheria
 
20240705 QFM024 Irresponsible AI Reading List June 2024
20240705 QFM024 Irresponsible AI Reading List June 202420240705 QFM024 Irresponsible AI Reading List June 2024
20240705 QFM024 Irresponsible AI Reading List June 2024
Matthew Sinclair
 

Recently uploaded (20)

Quantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLMQuantum Communications Q&A with Gemini LLM
Quantum Communications Q&A with Gemini LLM
 
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
 
WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
 
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
 
Password Rotation in 2024 is still Relevant
Password Rotation in 2024 is still RelevantPassword Rotation in 2024 is still Relevant
Password Rotation in 2024 is still Relevant
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
 
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing SystemsMitigating the Impact of State Management in Cloud Stream Processing Systems
Mitigating the Impact of State Management in Cloud Stream Processing Systems
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
 
20240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 202420240704 QFM023 Engineering Leadership Reading List June 2024
20240704 QFM023 Engineering Leadership Reading List June 2024
 
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdfINDIAN AIR FORCE FIGHTER PLANES LIST.pdf
INDIAN AIR FORCE FIGHTER PLANES LIST.pdf
 
Recent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS InfrastructureRecent Advancements in the NIST-JARVIS Infrastructure
Recent Advancements in the NIST-JARVIS Infrastructure
 
Observability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetryObservability For You and Me with OpenTelemetry
Observability For You and Me with OpenTelemetry
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
 
20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf20240702 Présentation Plateforme GenAI.pdf
20240702 Présentation Plateforme GenAI.pdf
 
20240705 QFM024 Irresponsible AI Reading List June 2024
20240705 QFM024 Irresponsible AI Reading List June 202420240705 QFM024 Irresponsible AI Reading List June 2024
20240705 QFM024 Irresponsible AI Reading List June 2024
 

Scaling Twitter

  • 2. Rails Scales. (but not out of the box)
  • 3. First, Some Facts • 600 requests per second. Growing fast. • 180 Rails Instances (Mongrel). Growing fast. • 1 Database Server (MySQL) + 1 Slave. • 30-odd Processes for Misc. Jobs • 8 Sun X4100s • Many users, many updates.
  • 7. Joy Pain Oct Nov Dec Jan Feb March Apr
  • 8. IM IN UR RAILZ MAKIN EM GO FAST
  • 9. It’s Easy, Really. 1. Realize Your Site is Slow 2. Optimize the Database 3. Cache the Hell out of Everything 4. Scale Messaging 5. Deal With Abuse
  • 10. It’s Easy, Really. 1. Realize Your Site is Slow 2. Optimize the Database 3. Cache the Hell out of Everything 4. Scale Messaging 5. Deal With Abuse 6. Profit
  • 11. the more you know { Part the First }
  • 12. We Failed at This.
  • 13. Don’t Be Like Us • Munin • Nagios • AWStats & Google Analytics • Exception Notifier / Exception Logger • Immediately add reporting to track problems.
  • 14. Test Everything • Start Before You Start • No Need To Be Fancy • Tests Will Save Your Life • Agile Becomes Important When Your Site Is Down
  • 15. <!-- served to you through a copper wire by sampaati at 22 Apr 15:02 in 343 ms (d 102 / r 217). thank you, come again. --> <!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:02 in 235 ms (d 87 / r 130). thank you, come again. --> <!-- served to you through a copper wire by raven.twitter.com at 22 Apr 15:01 in 450 ms (d 96 / r 337). thank you, come again. --> Benchmarks? let your users do it. <!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:00 in 409 ms (d 88 / r 307). thank you, come again. --> <!-- served to you through a copper wire by firebird at 22 Apr 15:03 in 2094 ms (d 643 / r 1445). thank you, come again. --> <!-- served to you through a copper wire by quetzal at 22 Apr 15:01 in 384 ms (d 70 / r 297). thank you, come again. -->
  • 16. The Database { Part the Second }
  • 17. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  • 18. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  • 19. “The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
  • 22. class AddIndex < ActiveRecord::Migration def self.up add_index :users, :email end def self.down remove_index :users, :email end end Repeat for any column that appears in a WHERE clause Rails won’t do this for you.
  • 24. class DenormalizeFriendsIds < ActiveRecord::Migration def self.up add_column "users", "friends_ids", :text end def self.down remove_column "users", "friends_ids" end end
  • 25. class Friendship < ActiveRecord::Base belongs_to :user belongs_to :friend after_create :add_to_denormalized_friends after_destroy :remove_from_denormalized_friends def add_to_denormalized_friends user.friends_ids << friend.id user.friends_ids.uniq! user.save_without_validation end def remove_from_denormalized_friends user.friends_ids.delete(friend.id) user.save_without_validation end end
  • 27. bob.friends.map(&:email) Status.count() “email like ‘%#{search}%’”
  • 28. That’s where we are. Seriously. If your Rails application is doing anything more complex than that, you’re doing something wrong*. * or you observed the First Rule of Butterfield.
  • 29. Partitioning Comes Later. (we’ll let you know how it goes)
  • 30. The Cache { Part the Third }
  • 34. !
  • 35. class Status < ActiveRecord::Base class << self def count_with_memcache(*args) return count_without_memcache unless args.empty? count = CACHE.get(“status_count”) if count.nil? count = count_without_memcache CACHE.set(“status_count”, count) end count end alias_method_chain :count, :memcache end after_create :increment_memcache_count after_destroy :decrement_memcache_count ... end
  • 36. class User < ActiveRecord::Base def friends_statuses ids = CACHE.get(“friends_statuses:#{id}”) Status.find(:all, :conditions => [“id IN (?)”, ids]) end end class Status < ActiveRecord::Base after_create :update_caches def update_caches user.friends_ids.each do |friend_id| ids = CACHE.get(“friends_statuses:#{friend_id}”) ids.pop ids.unshift(id) CACHE.set(“friends_statuses:#{friend_id}”, ids) end end end
  • 37. The Future ve d ti r co Ac e R
  • 38. 90% API Requests Cache Them!
  • 39. “There are only two hard things in CS: cache invalidation and naming things.” – Phil Karlton, via Tim Bray
  • 41. You Already Knew All That Other Stuff, Right?
  • 42. Producer Consumer Message Producer Consumer Queue Producer Consumer
  • 43. DRb • The Good: • Stupid Easy • Reasonably Fast • The Bad: • Kinda Flaky • Zero Redundancy • Tightly Coupled
  • 44. ejabberd Jabber Client (drb) Incoming Outgoing Presence Messages Messages MySQL
  • 45. Server DRb.start_service ‘druby://localhost:10000’, myobject Client myobject = DRbObject.new_with_uri(‘druby://localhost:10000’)
  • 46. Rinda • Shared Queue (TupleSpace) • Built with DRb • RingyDingy makes it stupid easy • See Eric Hodel’s documentation • O(N) for take(). Sigh.
  • 47. Timestamp: 12/22/06 01:53:14 (4 months ago) Author: lattice Message: Fugly. Seriously. Fugly. SELECT * FROM messages WHERE substring(truncate(id,0),-2,1) = #{@fugly_dist_idx}
  • 48. It Scales. (except it stopped on Tuesday)
  • 49. Options • ActiveMQ (Java) • RabbitMQ (erlang) • MySQL + Lightweight Locking • Something Else?
  • 50. erlang? What are you doing? Stabbing my eyes out with a fork.
  • 51. Starling • Ruby, will be ported to something faster • 4000 transactional msgs/s • First pass written in 4 hours • Speaks MemCache (set, get)
  • 52. Use Messages to Invalidate Cache (it’s really not that hard)
  • 55. 9000 friends in 24 hours (doesn’t scale)