e-Commerce web app Architecture and Scalability
So, what’s in an E-Commerce Store?
• Catalogs
• State
• Discounts
• Inventory
• Fulfillment
• Social Integration
• Payment
• Analytics
• Cross-sell
• Content
• Search
• Seasonality
• Ads
• Personalization
• Customer Service
Note: This image is for illustrative purposes only. MindTree does not recommend or associate itself with either this site, or the products displayed here. Any copyrights belong to their respective owners.
Conceptual View [diagram]
Application View [diagram; surviving label: Integration Middleware]
The Key to Scalability…
… is to understand what it is
• The ability of a system to handle, or be enlarged to handle, growing amounts of work gracefully
… is to keep principles simple
• Identify bottlenecks, and mitigate
• Identify load sources, and minimize
… is to understand what’s expected
• “How scalable do you want it?” (Do you have the non-functional requirements, the NFRs?)
… is to exploit system characteristics
• If my look-to-book ratio is 90:10, can I scale differently?
• Do I always need to depend on the scalability of my partners?
• Do all of my subsystems need to scale equally?
• If my load is seasonal, should I be safe and overprovision?
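The look-to-book question can be made concrete with some rough arithmetic. The sketch below is illustrative only: the function name `size_tiers`, the peak request rate, and the per-node throughput figures are all assumptions, not numbers from the slides. It shows why a 90:10 mix suggests sizing the browse (read) tier and the booking (write) tier separately.

```python
import math

def size_tiers(peak_rps, look, book, read_node_rps, write_node_rps):
    """Estimate node counts for the browse and booking tiers.

    look:book is the look-to-book ratio (for example 90:10).
    All throughput figures are hypothetical inputs, not measurements.
    """
    look_rps = peak_rps * look / (look + book)   # browse traffic
    book_rps = peak_rps - look_rps               # order traffic
    return {
        "read_nodes": max(1, math.ceil(look_rps / read_node_rps)),
        "write_nodes": max(1, math.ceil(book_rps / write_node_rps)),
    }

# Assumed: 2000 requests/s at peak, a read node serves 300 rps,
# a transactional node serves 100 rps.
plan = size_tiers(peak_rps=2000, look=90, book=10,
                  read_node_rps=300, write_node_rps=100)
print(plan)
```

Under these assumed figures, most of the capacity goes to the read tier, which is usually the cheaper one to scale out.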
So what’s Different?
• Lots of integration - and dependencies
• Combo of content and data
• Complicated rules spanning entities
• Rich and useful data to be analyzed and mined
• The breadwinner!
• Many different channels - and load
• Lots of these guys (hopefully!)
• Some typical usage patterns
• Lots of potential plus lots of load
Mapping to Scalability Considerations
Multiple channels
User behavior
Social Content
Content + Data
Rules and analytics
Integration – and dependencies
Scalability Strategies: Multiple Channels
Frontends
• Scale out
• Minimize state for better load balancing, and to reduce memory footprint
• Use CDNs to farm out traffic to other sites, or consider active/active data centers (A/A DCs)
• Exploit client capability, and minimize traffic
• Leverage device profile to serve appropriate content
Services
• Use lightweight protocols
• Split between presentation services and interface services, and choose granularity
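One way to minimize server-side state is to hand the session to the client as a signed token, so any frontend node can serve any request. The following is a minimal sketch under stated assumptions: the names `issue_token` and `read_token` and the hard-coded `SECRET_KEY` are hypothetical, and a production system would also need expiry, key rotation, and size limits.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET_KEY = b"demo-only-secret"  # assumption: shared by all frontend nodes

def issue_token(session: dict) -> str:
    """Serialize the session and append an HMAC so the client can hold it."""
    payload = base64.urlsafe_b64encode(
        json.dumps(session, sort_keys=True).encode()
    ).decode()
    signature = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + signature

def read_token(token: str) -> Optional[dict]:
    """Return the session if the signature checks out, otherwise None."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))

token = issue_token({"user": "u42", "cart": ["sku-1"]})
session = read_token(token)
```

Because no node keeps the session in memory, the load balancer does not need sticky routing.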
Scalability Strategies: Content, Social Traffic
Content
• Content and data scale differently, so scale them independently
• Mash up content, user-generated content (UGC), and catalog data in a portal
• Leverage the CMS’s caching as far as possible
• Consider publishing static HTML
• Organize CMS structures appropriately
• Use CDNs or servers appropriate to the purpose
UGC
• Separate onto other servers
• Leverage existing social network platforms
Scalability Strategies: User Behavior
Data
• Separate reads and writes; scale out read nodes using replication / master-slave, …
• Split functionality between database instances
• Use sharding (carefully) to partition and scale out writes
• Choose optimistic reads (carefully) to minimize locks
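The read/write split and sharding points can be sketched as a small routing layer. This is an illustration, not a prescribed design: the `Router` class, the node names, and the hash-modulo shard choice are assumptions. Hash-modulo sharding in particular makes adding shards expensive, which is one reason for the "carefully" above.

```python
import hashlib
import itertools

# Hypothetical topology: each shard has one primary and two read replicas.
SHARDS = [
    {"primary": "db0-primary", "replicas": ["db0-r1", "db0-r2"]},
    {"primary": "db1-primary", "replicas": ["db1-r1", "db1-r2"]},
]

class Router:
    """Send writes to a shard's primary and spread reads over its replicas."""

    def __init__(self, shards):
        self._shards = shards
        # One round-robin iterator per shard for read traffic.
        self._replica_cycles = [itertools.cycle(s["replicas"]) for s in shards]

    def shard_index(self, key: str) -> int:
        # A stable hash keeps the same key on the same shard across processes.
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def for_write(self, key: str) -> str:
        return self._shards[self.shard_index(key)]["primary"]

    def for_read(self, key: str) -> str:
        return next(self._replica_cycles[self.shard_index(key)])

router = Router(SHARDS)
write_node = router.for_write("customer-42")
read_nodes = [router.for_read("customer-42") for _ in range(2)]
```

Reads served from replicas may lag the primary, so flows that must see their own writes (checkout, for instance) may still need to read from the primary.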
Minimize I/O
• A fair bit of data changes slowly, so cache aggressively: catalogs, content, …
• Try to cache inventory levels as well, based on thresholds
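Threshold-based inventory caching might look like the sketch below: while stock is comfortably above a threshold, a cached level is good enough, and once it drops to the threshold or below, every lookup goes back to the inventory system. The class name `InventoryCache`, the threshold of 20 units, and the 60-second lifetime are assumptions chosen for illustration.

```python
import time

class InventoryCache:
    """Cache stock levels, but only trust the cache when stock is plentiful."""

    def __init__(self, fetch_level, threshold=20, ttl_seconds=60.0,
                 clock=time.monotonic):
        self._fetch = fetch_level      # callable: sku -> current stock level
        self._threshold = threshold
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # sku -> (level, fetched_at)

    def level(self, sku):
        now = self._clock()
        cached = self._entries.get(sku)
        if cached is not None:
            level, fetched_at = cached
            # Serve from cache only if stock is above the threshold and fresh.
            if level > self._threshold and now - fetched_at < self._ttl:
                return level
        level = self._fetch(sku)
        self._entries[sku] = (level, now)
        return level

# Usage with a stand-in inventory system that records each backend call.
stock = {"plentiful": 500, "scarce": 3}
calls = []

def fetch(sku):
    calls.append(sku)
    return stock[sku]

cache = InventoryCache(fetch)
for sku in ("plentiful", "plentiful", "scarce", "scarce"):
    cache.level(sku)
```

In this run the plentiful item reaches the backend once, while the scarce item reaches it on every lookup.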
Elasticity
• Create private clouds or leverage public / hybrid clouds
Scalability Strategies: Rules and Analytics
Rules
• Classify into cross-sell rules and transaction-processing rules
• Cross-sell rules are more heavily used, and use more data
• Pre-compute results - not everything needs to be real-time
• Reduce target datasets - not everything has to run against the universe of data
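Pre-computing cross-sell results can be as simple as a batch job that builds a lookup table, leaving the storefront with a dictionary read at request time. The sketch below uses plain co-occurrence counts over past orders. That technique, the function name `precompute_cross_sell`, and the sample orders are assumptions for illustration, not the rules engine the slides have in mind.

```python
from collections import Counter, defaultdict
from itertools import permutations

def precompute_cross_sell(orders, top_n=2):
    """Build sku -> list of the items most often bought together with it."""
    co_counts = defaultdict(Counter)
    for order in orders:
        # Count each ordered pair of distinct items in the same order.
        for a, b in permutations(set(order), 2):
            co_counts[a][b] += 1
    table = {}
    for sku, counts in co_counts.items():
        # Highest count first; ties broken alphabetically for stable output.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        table[sku] = [other for other, _ in ranked[:top_n]]
    return table

# Hypothetical order history; a real job might read only a recent window,
# which is one way to reduce the target dataset.
orders = [
    ["tent", "stove", "lamp"],
    ["tent", "stove"],
    ["tent", "lamp"],
    ["stove", "fuel"],
]
cross_sell = precompute_cross_sell(orders)
```

The table can be rebuilt on a schedule and cached, so request handling never runs the counting logic.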
Analytics
• Separate analytics from transactional systems
• Run them off their own instance
Scalability Strategies: Integration
Dependencies
• Use synchronous calls only when needed (for example, payments)
• Use queuing for durability and throttling
• Have a fallback mechanism:
  • The previous day’s inventory if the inventory system isn’t coping
  • Backend payment if the payment gateway isn’t coping
• Consider batch-mode integration instead of always integrating in real time
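The fallback and throttling ideas can be sketched in a few lines. Everything named here is hypothetical: `InventoryUnavailable`, `stock_level`, and `drain` are illustrative helpers, and the standard-library `queue.Queue` is in-memory only, so it shows throttling but not durability, which would need a persistent message broker.

```python
import queue

class InventoryUnavailable(Exception):
    """Raised by the (hypothetical) live inventory client when it cannot cope."""

def stock_level(sku, live_lookup, snapshot):
    """Prefer the live system; fall back to the previous day's snapshot."""
    try:
        return live_lookup(sku), "live"
    except InventoryUnavailable:
        return snapshot[sku], "snapshot"

def drain(pending, send, max_batch):
    """Forward at most max_batch queued messages, leaving the rest queued."""
    sent = 0
    while sent < max_batch:
        try:
            message = pending.get_nowait()
        except queue.Empty:
            break
        send(message)
        sent += 1
    return sent

def overloaded_lookup(sku):
    raise InventoryUnavailable(sku)

level, source = stock_level("sku-1", overloaded_lookup, {"sku-1": 12})

# Orders are queued instead of being pushed synchronously to the partner.
pending = queue.Queue()
for order_id in range(5):
    pending.put(order_id)
delivered = []
sent = drain(pending, delivered.append, max_batch=2)
```

Calling `drain` on a timer caps the rate at which the partner system sees traffic, regardless of how fast orders arrive.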
Strategies for Commerce Engines
• Use products (commercial | open source) for commerce, content, search, etc.
• Choose taxonomy and design carefully
  • Catalogs, virtual catalogs, materialized catalogs, catalog sets, page snapshots, …
  • Content taxonomy
• Cache appropriately
  • Content caching at browser, CDN, CMS, commerce engine, …
  • Commerce caching for catalogs, ad rules, pricing and inventory, …
• Scale appropriately
  • Consider data volumes (catalog sizes, user base, …) for sharding or partitioning
  • Follow vendor best practices for scaling
General Scalability Strategies
• Use statelessness to scale better
  • Choose shared-nothing models to scale best
  • Choose shared-(something) or externalized state models with due consideration
• Reduce HTTP requests and payload
  • Use techniques like file versioning, sprites, inline images, compression, …
• Be asynchronous where possible
  • When calling backend systems, for example
• Choose faster (or less) I/O to minimize latency
  • Cache aggressively
  • Use faster I/O where possible
  • Keep data small and archive aggressively to scale I/O and DBs
  • Keep data close to use the network better and to reduce latency
• Write good code!
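As an illustration of being asynchronous when calling backend systems, the sketch below issues three backend calls concurrently with `asyncio.gather`, so the page waits roughly for the slowest call and not for the sum of all three. The backend names and the 50 ms delays are assumptions, and `asyncio.sleep` stands in for real network I/O.

```python
import asyncio

async def call_backend(name, delay):
    """Stand-in for a network call to a backend system."""
    await asyncio.sleep(delay)
    return f"{name}: ok"

async def load_product_page():
    # The three calls overlap in time; gather returns results in call order.
    results = await asyncio.gather(
        call_backend("pricing", 0.05),
        call_backend("inventory", 0.05),
        call_backend("reviews", 0.05),
    )
    return list(results)

page = asyncio.run(load_product_page())
print(page)
```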
In Summary
• Know what you are looking to achieve
  • NFRs, business outlook
• Exploit usage characteristics
  • Read/write ratios, acceptable lags, functional separation
• Design for scale
  • Statelessness, service granularity, protocols
• Reduce load
  • Pre-compute data, cache aggressively, offload
• Identify subsystem scalability needs
  • Scale what’s needed, not everything
• Leverage others!
  • Product capabilities, CDNs, cloud providers, social networking platforms
