Scalable web apps
execution time vs development time

Piotr Pelczar
Types of scaling
Vertical scaling

Horizontal scaling

scale up

scale out
Think about your app as a worker
not single instance

Load balancer


Server #1

App #1

App #2

Server #2
App #3

App #4

Server #3

App #5
Think about your app as a worker
not single instance
Load balancer

Server #1
App #1

Server #3
Load balancer

App #2

Server #2
App #3

App #4

App #5

Server #n
We need:
• Common
• Fast
• Persistent

Storage for sessions.


Load balancer


Server #1

App #1

App #2

Server #2
App #3

Session storage

App #4

Server #3

App #5
Sessions - Redis


• In-memory key-value database (hash-table based)
• Scalable up to 1k nodes
• Partitioning with query routing
• Non-blocking master-slave replication between nodes
• Clustered (currently not production ready)
Redis - Partitioning with Query routing


Node #1

Hit, abort

Node #2

Node #3

Also supported:
• Client-side partitioning (app calls appropriate
node)
• Proxy assisted partitioning (proxy selects
appropriate node)
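Client-side partitioning can be sketched in a few lines. This is a minimal illustration, not the Redis client API: the node addresses and the `node_for_key` helper are hypothetical, and CRC32 with a modulo is just one simple way to map a key to a node.

```python
import zlib

# Hypothetical node addresses; replace with your own Redis instances.
NODES = ["redis-1:6379", "redis-2:6379", "redis-3:6379"]


def node_for_key(key: str, nodes=NODES) -> str:
    """Pick the node responsible for a key using a stable CRC32 hash."""
    digest = zlib.crc32(key.encode("utf-8"))
    return nodes[digest % len(nodes)]


if __name__ == "__main__":
    for session_id in ("sess:alice", "sess:bob", "sess:carol"):
        print(session_id, "->", node_for_key(session_id))
```

Every app instance that uses the same node list and the same hash reaches the same node for a given session key. Note that a plain modulo remaps most keys when the number of nodes changes; consistent hashing is the usual way to reduce that.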
Centralized Logging
• Logs should be centralized so you don't have to
check each node separately
• Approaches:
– File replication (rsync + cron)
– syslog (easy to integrate with log4j)
• syslogd over UDP, port 514
• rsyslog over TCP, stores data in db
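The slide mentions log4j; as a stand-in, here is a minimal Python sketch of the same idea using the standard library's `SysLogHandler`. The logger name, the default host and the message format are assumptions made for illustration.

```python
import logging
import logging.handlers


def build_syslog_logger(name: str, host: str = "127.0.0.1", port: int = 514) -> logging.Logger:
    """Return a logger that ships records to a syslog daemon over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Given a (host, port) tuple, SysLogHandler sends datagrams over UDP by default.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_syslog_logger("app1")
    log.info("request handled")
```

Each app instance would point `host` at the central log server, so no node keeps logs that exist only locally.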
Common storage, no local changes!
• Keep storage available to all nodes
– Symfony2 Gaufrette Bundle

Amazon S3

Load balancer


Server #1
App #1

App #2


Session storage

Server #2
App #3

App #4

Server #3
App #5

Files storage abstraction

Centralized logging
Continuous Integration
• To keep all nodes up-to-date, you need CI
• Automate disabling nodes, building, deploying
– Jenkins CI
Continuous Integration
1. Disable service on node
2. Deploy/build app
   1. Copy files
   2. Update db schema (Liquibase, ORM schema update)
   3. Execute scripts
3. Re-run service
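The steps above can be sketched as a rolling deploy loop. This is only an outline under stated assumptions: `disable`, `deploy` and `enable` are hypothetical callables that a CI job (for example in Jenkins) would supply; they are not part of any real tool's API.

```python
from typing import Callable, Iterable, List


def rolling_deploy(
    nodes: Iterable[str],
    disable: Callable[[str], None],
    deploy: Callable[[str], None],
    enable: Callable[[str], None],
) -> List[str]:
    """Update nodes one at a time so the remaining nodes keep serving traffic."""
    updated = []
    for node in nodes:
        disable(node)  # 1. take the node out of rotation
        deploy(node)   # 2. copy files, update db schema, execute scripts
        enable(node)   # 3. start the service again
        updated.append(node)
    return updated


if __name__ == "__main__":
    log = []
    rolling_deploy(
        ["server1", "server2"],
        disable=lambda n: log.append(f"disable {n}"),
        deploy=lambda n: log.append(f"deploy {n}"),
        enable=lambda n: log.append(f"enable {n}"),
    )
    print("\n".join(log))
```

If `deploy` raises, the loop stops and the failing node stays disabled, so a broken build is not put back behind the load balancer.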
Balance the payload - HAProxy
Yeah guys, this is logo :)
But no schema is needed
just imagine how it works.

• Very, very fast proxy!
• Software TCP/HTTP load balancer
• Different node selecting algorithms:
– roundrobin (limit 4128)
– static-rr
– leastconn (lowest number of connections)
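To make the difference between the selection strategies concrete, here is a toy sketch of round-robin and least-connections selection. It illustrates the ideas only and is not how HAProxy implements them; the class names are made up for this example.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConn:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties are resolved in favour of the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


if __name__ == "__main__":
    rr = RoundRobin(["webA", "webB"])
    print([rr.pick() for _ in range(4)])
    lc = LeastConn(["webA", "webB"])
    print(lc.pick(), lc.pick())
```

Round-robin ignores how long requests take, while least-connections favours nodes that finish their work quickly, which tends to suit long-lived connections.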
Balance the payload - HAProxy
• You can check node’s status by pinging
• Dead node is excluded from balancing strategy
vi /etc/haproxy/haproxy.cfg
option httpchk HEAD /check.txt HTTP/1.0
server webA check
server webB check
Balance the payload - HAProxy
• Monitor node status by reading stats from the
socket via socat.

echo "show stat" | socat /tmp/haproxy.sock stdio
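The `show stat` command returns CSV whose header line starts with `# `. A minimal sketch of turning that output into per-server values follows; the `SAMPLE` text is an illustrative, heavily shortened stand-in for real output, which has many more columns.

```python
import csv
import io


def parse_show_stat(raw: str):
    """Parse HAProxy 'show stat' CSV output into a list of dicts."""
    text = raw.lstrip()
    # Strip the leading "# " so DictReader sees clean column names.
    if text.startswith("# "):
        text = text[2:]
    return list(csv.DictReader(io.StringIO(text)))


# Illustrative sample only; real output contains many more columns.
SAMPLE = (
    "# pxname,svname,qcur,qmax,scur\n"
    "web,webA,0,2,5\n"
    "web,webB,3,7,9\n"
)

if __name__ == "__main__":
    for row in parse_show_stat(SAMPLE):
        print(row["svname"], "queued:", row["qcur"])
```

Reading columns by name avoids depending on their position in the line.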
Balance the payload - HAProxy
• Monitor node status with the built-in stats web console
Nodes Monitoring - Zabbix
• Zabbix, centralized server monitoring
Zabbix + HAProxy
• UserParameter=haproxy.qcur[*], echo "show stat" | socat /tmp/haproxy.sock stdio | grep -i '$1' | sed 's/,/ /g' | awk '{print $$3}'
Reverse Proxy and Varnish cache
• Global virtual user = global cache
Reverse Proxy – Expiration model
Reverse Proxy – Expiration model
Reverse Proxy – Validation model
Reverse Proxy – Validation model
Reverse Proxy and Varnish cache



Reverse Proxy and Varnish cache



Reverse Proxy and Varnish cache




Varnish and ESI
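<!DOCTYPE html>
<html>
<body>
<!-- ... some content -->
<!-- Embed the content of another page here -->
<esi:include src="http://..." />
<!-- ... more content -->
</body>
</html>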

Scaling databases - Master slave



• Full data redundancy: every node holds all the data


MongoDB scaling
• Common models to spread data over nodes:
– range keys
– hash keys

• Many nodes on cheap machines
• No full data redundancy: each node holds only part of the data
MongoDB – range-based keys

• Awesome for range queries (data is fetched from a minimal number of
nodes: query isolation)
• Distributes data poorly across nodes when keys are
monotonically increasing
MongoDB – hash-based keys

• Note: not good for range queries, which need a merge-sort
across nodes; no query isolation in this case
• Write scaling – writes go to many nodes simultaneously (mind
the readers-writer lock, where a write is exclusive)
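The trade-off between the two key models can be shown with a toy simulation. This is a sketch under assumptions: the shard boundaries are invented, and MD5 stands in for whatever hash function the database applies; it is not MongoDB's API.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical boundaries: shard 0 holds keys < 1000, shard 1 holds 1000-1999, shard 2 the rest.
RANGE_BOUNDS = [1000, 2000]


def range_shard(key: int) -> int:
    """Range-based placement: neighbouring keys land on the same shard."""
    return bisect.bisect_right(RANGE_BOUNDS, key)


def hash_shard(key: int, shards: int = 3) -> int:
    """Hash-based placement: neighbouring keys are scattered across shards."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shards


if __name__ == "__main__":
    new_keys = range(2000, 2300)  # monotonically increasing ids
    print("range:", Counter(range_shard(k) for k in new_keys))
    print("hash: ", Counter(hash_shard(k) for k in new_keys))
```

With range keys all new inserts hit the last shard, while a range query touches only the shards covering the range. With hash keys the inserts spread out, but a range query has to ask every shard.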
MongoDB sharding and clustering
CQRS
• Command Query Responsibility Segregation
– separate application service layers for writing to and
reading from the DB (possibility to use different data
sources like RAM or DB)
• Examples
– post-insert cache population
• all SELECTs are served from cache (even if stale)
• consider LFU instead of LRU for cache eviction

– pre-insert into memory
• dump results periodically

In both approaches it is convenient to use
queues or a data bus!
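A minimal sketch of the post-insert cache population variant, with plain dicts standing in for the database and the cache. The class and method names are hypothetical and chosen only to show the split between the command side and the query side.

```python
class WriteService:
    """Command side: persists data, then refreshes the read cache."""

    def __init__(self, db: dict, cache: dict):
        self.db = db
        self.cache = cache

    def save_article(self, article_id: int, title: str) -> None:
        self.db[article_id] = title
        self.cache[article_id] = title  # post-insert cache population


class ReadService:
    """Query side: answers only from the cache, never from the DB."""

    def __init__(self, cache: dict):
        self.cache = cache

    def get_article(self, article_id: int):
        return self.cache.get(article_id)


if __name__ == "__main__":
    db, cache = {}, {}
    writer, reader = WriteService(db, cache), ReadService(cache)
    writer.save_article(1, "Scalable web apps")
    print(reader.get_article(1))
```

In a real system the cache refresh would typically be triggered through a queue, so the write path does not wait for it.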
Queues, RabbitMQ
• RabbitMQ is based on AMQP (Advanced
Message Queuing Protocol)
– point-to-point
– publish-and-subscribe
– queueing, routing

• AMQP is not JMS (Java Message Service is an
API, not protocol)
• A happy Rabbit is an empty Rabbit
– to keep HA, do not try to store data (messages) in the queue
system in persistent mode
Queues, RabbitMQ
• Simple queue
• Work queues
(one consumer)

• Publish/Subscribe
(many consumers)
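The difference between the delivery patterns can be illustrated with an in-memory simulation. This is not RabbitMQ client code; the function names are invented, and the round-robin dispatch is an assumption that mirrors how a broker shares one queue among several workers.

```python
from itertools import cycle


def work_queue(messages, consumers):
    """Each message is handled by exactly one consumer (round-robin dispatch)."""
    inbox = {consumer: [] for consumer in consumers}
    for message, consumer in zip(messages, cycle(consumers)):
        inbox[consumer].append(message)
    return inbox


def publish_subscribe(messages, consumers):
    """Every consumer receives its own copy of every message."""
    return {consumer: list(messages) for consumer in consumers}


if __name__ == "__main__":
    msgs = ["m1", "m2", "m3"]
    print(work_queue(msgs, ["c1", "c2"]))
    print(publish_subscribe(msgs, ["c1", "c2"]))
```

Work queues spread load across workers, while publish/subscribe fans the same event out to independent consumers.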
Box vs spread architecture.
• Box architecture
– no scaling
– easy to maintain





Box vs spread architecture.
• Spread architecture
– High availability
– more integrations, more administration
Server #1



Server #2

Server #3



DB shard


DB shard

Scalable web apps
execution time
development time

Piotr Pelczar

Scalable Web Apps

  • 1. Scalable web apps execution time vs development time Piotr Pelczar
  • 2. Types of scaling Vertical scaling Horizontal scaling scale up scale out
  • 3. Think about your app as a worker not single instance OS Load balancer App Server #1 App #1 App #2 Server #2 App #3 App #4 Server #3 App #5
  • 4. Think about your app as a worker not single instance Load balancer Server #1 App #1 Server #3 Load balancer App #2 Server #2 App #3 App #4 App #5 Server #n
  • 5. Sessions We need: • Common • Fast • Persistent Storage for sessions.
  • 6. Sessions OS Load balancer App Server #1 App #1 App #2 Server #2 App #3 Session storage App #4 Server #3 App #5
  • 7. Sessions - Redis • In-memory key-value database (hash-table based) • Scalable up to 1k nodes • Partitioning with query routing • Non-blocking master-slave replication between nodes • Clustered (currently not production ready)
  • 8. Redis - Partitioning with Query routing Query random node Miss Node #1 Hit, abort Node #2 Node #3 Also supported: • Client-side partitioning (app calls appropriate node) • Proxy assisted partitioning (proxy selects appropriate node)
  • 9. Centralized Logging • Logs should be centralized so you don't have to check each node separately • Approaches: – File replication (rsync + cron) – syslog (easy to integrate with log4j) • syslogd over UDP, port 514 • rsyslog over TCP, stores data in db
  • 10. Common storage, no local changes! • Keep storage available to all nodes – Symfony2 Gaufrette Bundle • FTP • Amazon S3 • OpenCloud • AzureBlobStorage • Rackspace
  • 11. Architecture OS Load balancer App Server #1 App #1 App #2 OS Session storage Server #2 App #3 App #4 Server #3 App #5 Files storage abstraction Centralized logging
  • 12. Continuous Integration • To keep all nodes up-to-date, you need CI • Automate disabling nodes, building, deploying – Jenkins CI
  • 13. Continuous Integration 1. Disable service on node 2. Deploy/build app 1. Copy files 2. Update db schema (Liquibase, ORM schema update) 3. Execute scripts 3. Re-run service
  • 14. Balance the payload - HAProxy Yeah guys, this is logo :) But no schema is needed just imagine how it works. • Very, very fast proxy! • Software TCP/HTTP load balancer • Different node selecting algorithms: – roundrobin (limit 4128) – static-rr – leastconn (lowest number of connections)
  • 15. Balance the payload - HAProxy • You can check node’s status by pinging • Dead node is excluded from balancing strategy vi /etc/haproxy/haproxy.cfg option httpchk HEAD /check.txt HTTP/1.0 server webA check server webB check
  • 16. Balance the payload - HAProxy • Monitor node status by reading stats from the socket via socat. echo "show stat" | socat /tmp/haproxy.sock stdio
  • 17. Balance the payload - HAProxy • Monitor node status with the built-in stats web console
  • 18. Nodes Monitoring - Zabbix • Zabbix, centralized server monitoring
  • 19. Zabbix + HAProxy • UserParameter=haproxy.qcur[*], echo "show stat" | socat /tmp/haproxy.sock stdio | grep -i '$1' | sed 's/,/ /g' | awk '{print $$3}'
  • 20. Reverse Proxy and Varnish cache • Global virtual user = global cache
  • 21. Reverse Proxy – Expiration model
  • 22. Reverse Proxy – Expiration model
  • 23. Reverse Proxy – Validation model
  • 24. Reverse Proxy – Validation model
  • 25. Reverse Proxy and Varnish cache Apache :81 Varnish :80 App
  • 26. Reverse Proxy and Varnish cache Apache :8081 Varnish :8080 App HAProxy :80 Apache :8083 Varnish :8082 App
  • 27. Reverse Proxy and Varnish cache Apache :8081 App Varnish :80 HAProxy :81 Apache :8082 App
  • 28. Varnish and ESI <!DOCTYPE html> <html> <body> <!-- ... some content --> <!-- Embed the content of another page here --> <esi:include src="http://..." /> <!-- ... more content --> </body> </html>
  • 29. Scaling databases - Master slave Write Master Slave Read • All data redundancy Slave Slave
  • 30. MongoDB scaling • Common models to spread data over nodes: – range keys – hash keys • Many nodes on cheap machines • No full data redundancy: each node holds only part of the data
  • 31. MongoDB – range-based keys • Awesome for range queries (data is fetched from a minimal number of nodes: query isolation) • Distributes data poorly across nodes when keys are monotonically increasing
  • 32. MongoDB – hash-based keys • Note: not good for range queries, which need a merge-sort across nodes; no query isolation in this case • Write scaling – writes go to many nodes simultaneously (mind the readers-writer lock, where a write is exclusive)
  • 33. MongoDB sharding and clustering
  • 34. CQRS • Command Query Responsibility Segregation – separate application service layers for writing to and reading from the DB (possibility to use different data sources like RAM or DB)
  • 35. CQRS • Examples – post-insert cache population • all SELECTs are served from cache (even if stale) • consider LFU instead of LRU for cache eviction – pre-insert into memory • dump results periodically In both approaches it is convenient to use queues or a data bus!
  • 36. Queues, RabbitMQ • RabbitMQ is based on AMQP (Advanced Message Queuing Protocol) – point-to-point – publish-and-subscribe – queueing, routing • AMQP is not JMS (Java Message Service is an API, not a protocol) • A happy Rabbit is an empty Rabbit – to keep HA, do not try to store data (messages) in the queue system in persistent mode
  • 37. Queues, RabbitMQ • Simple queue • Work queues (one consumer) • Publish/Subscribe (many consumers)
  • 38. Box vs spread architecture. • Box architecture – no scaling – easy to maintain Server Webapp Redis RabbitMQ Varnish DB
  • 39. Box vs spread architecture. • Spread architecture – High availability – more integrations, more administration Server #1 RabbitMQ Redis HAProxy Server #2 Server #3 Webapp Webapp DB shard Varnish DB shard Varnish
  • 40. Scalable web apps execution time vs development time Piotr Pelczar