Design principles for Azure applications
Masashi Narumoto
Principal lead PM
AzureCAT patterns&practices
Traditional vs. Modern application
Traditional on-premises | Modern cloud
Relational database | Polyglot persistence
Strong consistency | Eventual consistency
Design for predictable scalability | Design for unbounded scalability
Serial and synchronized processing | Parallel and asynchronous processing
Monolithic, centralized | Decomposed, decentralized
Snowflake servers | Immutable infrastructure
Integrated authentication | Federated authentication
Design to keep app running (MTBF) | Design for failure (MTTR)
One-time big update | Frequent small updates
Manual management | Automated self-management
Process of Software development
[Diagram: Functional & non-functional requirements → Choose architecture style → Choose technology → Apply design patterns & best practices; label: Design principles]
Cloud design principles
Design principles for Azure applications
• Use managed services
• Minimize coordination
• Partition around limits
• Design to scale out
• Design for self-healing
• Make all things redundant
• Use the best data store for the job
• Design for evolution
• Design for operations
• Build for the needs of business
Use managed services
• Managed services significantly reduce management tasks
  • Patching, versioning, resource tuning, cluster management
  • Setting up Elasticsearch yourself vs. using Azure Search
• Managed services can be used even in IaaS workloads
  • Cache, messaging, storage, etc.
• If the version, scalability limits, cost, or portability don’t meet your requirements, consider a pure IaaS approach
Minimize coordination - Silence is golden
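One common way to reduce coordination is optimistic concurrency control: writers do not hold locks, they state the version they read and the store rejects stale writes. The following is a minimal in-memory sketch of that idea; `VersionedStore`, `Record`, and `StaleWriteError` are hypothetical names for illustration, not an Azure API.

```python
from dataclasses import dataclass


class StaleWriteError(Exception):
    """Raised when the writer's expected version no longer matches the store."""


@dataclass
class Record:
    value: str
    version: int


class VersionedStore:
    """Hypothetical in-memory store that checks versions instead of taking locks."""

    def __init__(self):
        self._records = {}

    def read(self, key):
        # Unknown keys read as an empty record at version 0.
        return self._records.get(key, Record(value="", version=0))

    def write(self, key, value, expected_version):
        current = self.read(key)
        if current.version != expected_version:
            # Another writer got there first; the caller re-reads and retries.
            raise StaleWriteError(
                f"{key}: expected v{expected_version}, found v{current.version}"
            )
        new_version = current.version + 1
        self._records[key] = Record(value=value, version=new_version)
        return new_version
```

A caller that receives `StaleWriteError` re-reads the record and decides whether to reapply its change, so no writer ever blocks another.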
Partition around limits
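A common way to partition around a size or throughput limit is horizontal partitioning (sharding), where a shard key decides which partition holds each row. The sketch below assumes a simple hash-modulo scheme; `shard_for` and `distribution` are hypothetical helpers, and a real system would also need a plan for changing the shard count.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a shard key to a shard index using a stable hash."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer so the result is stable across runs.
    return int.from_bytes(digest[:8], "big") % shard_count


def distribution(keys, shard_count):
    """Count how many keys land on each shard, to spot hot spots early."""
    counts = [0] * shard_count
    for key in keys:
        counts[shard_for(key, shard_count)] += 1
    return counts
```

Hashing spreads keys evenly but does not help when one key is itself hot, and a plain modulo moves most keys when `shard_count` changes.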
Design to scale out
• Avoid instance stickiness
• Find the bottleneck and resolve it instead of blindly scaling up/out
  • The stateful part of the system is most likely to become the bottleneck
• Use built-in auto-scaling features
  • Schedule-based for predictable load, parameter-based for unpredictable load
• Design for scale-in so that in-flight work is not dropped
• Consider aggressive auto-scaling for critical workloads
Auto-scale
[Diagram: SPA & mobile clients, CDN, web frontend, services, SQL, NoSQL, remote service, messaging, backend jobs; auto-scale signals: CPU, queue length]
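The two signals named on the auto-scale slide, CPU and queue length, can be combined into a parameter-based scaling decision. This is a rough sketch only: `ScaleRule`, `desired_instances`, and every threshold are assumptions for illustration, and a real deployment would normally rely on the platform's built-in auto-scale rules.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleRule:
    """Hypothetical thresholds; derive real ones from load tests."""
    target_cpu_percent: float = 60.0
    target_messages_per_instance: int = 100
    min_instances: int = 2
    max_instances: int = 20


def desired_instances(current, avg_cpu_percent, queue_length, rule=ScaleRule()):
    # Instances needed to bring average CPU back down to the target.
    by_cpu = math.ceil(current * avg_cpu_percent / rule.target_cpu_percent)
    # Instances needed to keep the backlog per instance at the target.
    by_queue = math.ceil(queue_length / rule.target_messages_per_instance)
    wanted = max(by_cpu, by_queue)
    # Clamp so a metric spike or lull cannot scale without bounds.
    return max(rule.min_instances, min(rule.max_instances, wanted))
```

Taking the larger of the two estimates scales out on whichever resource is the bottleneck, and the lower bound keeps redundancy during scale-in.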
Design for self-healing
• Retry operations on transient faults
• Protect failing remote services (Circuit breaker)
• Compensate failed transactions
• Bulkhead
• Throttling
• Fallback operations
• Service degradation
• Load leveling
• Leader election
• Fault injection
• Chaos engineering
• Checkpoint long-running transactions and restart from where they failed
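The first two items in the list above, retry and circuit breaker, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a production library: `TransientError`, `retry`, and `CircuitBreaker` are hypothetical names, and the thresholds and delays are placeholders.

```python
import time


class TransientError(Exception):
    """Stand-in for a fault worth retrying (timeout, throttling, and so on)."""


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are rejected immediately."""


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry on TransientError with exponential backoff; re-raise the last failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


class CircuitBreaker:
    """Opens after consecutive failures; allows a trial call after reset_timeout seconds."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, operation):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Fail fast so the struggling remote service gets room to recover.
                raise CircuitOpenError("remote service is being protected")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The `sleep` and `clock` parameters exist only so the behavior can be tested without waiting; retries suit short-lived faults, while the breaker covers faults that last long enough for retries to make things worse.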
Make all things redundant
• Load balancing
• Availability set
• Paired region
• Auto-Failover / Manual-failback
• Synchronize front and backend
• Redundant Traffic Manager
• Geo-replica
• Partition for availability
• Active/active vs. active/passive topology
• Point-in-time backup/restore
• RTO/RPO
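To see why redundancy matters for availability, the usual back-of-the-envelope arithmetic helps. The sketch below assumes failures are independent, which is rarely exactly true in practice (shared regions, shared dependencies); the function names are hypothetical.

```python
def parallel_availability(instance_availability, replicas):
    """Availability of N independent replicas when any one of them can serve traffic."""
    return 1.0 - (1.0 - instance_availability) ** replicas


def serial_availability(*component_availabilities):
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for availability in component_availabilities:
        result *= availability
    return result
```

Under the independence assumption, two replicas at 99% each give roughly 99.99% together, while chaining components in series always lowers the combined figure.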
Use the best data store for the job
• Don’t use SQL for everything (monolithic persistence)
  • Logging, blobs, documents
• How to choose the right storage
  • Data type, use case, others
• Microservices architecture encourages use of polyglot storage
  • Each service owns its private data in the best format
• Shift from ACID to BASE transactions
  • Eventual consistency
  • Compensating transactions
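When data is spread across several stores, a compensating transaction replaces a single ACID rollback: each completed step is undone by an explicit counter-action. The following is a minimal in-process sketch; `run_with_compensation` and the step names in the usage are hypothetical, and a real implementation would persist progress and make each compensation idempotent and retryable.

```python
def run_with_compensation(steps):
    """Run (name, action, compensation) steps in order.

    On the first failure, run the compensations of the completed steps in
    reverse order. Returns (succeeded, log).
    """
    log = []
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Undo in reverse so later steps are reversed before earlier ones.
            for done_name, done_compensate in reversed(completed):
                done_compensate()
                log.append(f"compensated:{done_name}")
            return False, log
        completed.append((name, compensate))
        log.append(f"done:{name}")
    return True, log
```

For example, if reserving a drone and charging an account succeed but scheduling the delivery fails, the charge is refunded first and the drone is released second, leaving the system eventually consistent rather than atomically rolled back.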
Design for evolution
• Key for continuous innovation (independent deployment)
• Keep high cohesion and loose coupling
• Capture domain knowledge in one place
• Compose tightly coupled features together
• Use asynchronous messaging to avoid waiting
• Avoid a fat gateway (GW); it should be a dumb pipe
• Expose open, standard interfaces
• Design and test against service contracts
• Abstract infrastructure away from domain logic
• Offload common tasks to a separate service
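Two of the points above, abstracting infrastructure away from domain logic and using asynchronous messaging, can be illustrated with a small port-and-adapter sketch. All names here (`DeliveryEventPublisher`, `InMemoryPublisher`, `DeliveryService`, the `DeliveryCompleted` event) are hypothetical; a production adapter would wrap whatever messaging service is chosen.

```python
from typing import List, Protocol, Tuple


class DeliveryEventPublisher(Protocol):
    """Port the domain depends on; infrastructure adapters implement it."""

    def publish(self, event_type: str, delivery_id: str) -> None: ...


class InMemoryPublisher:
    """Test adapter that records events instead of sending them anywhere."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, str]] = []

    def publish(self, event_type: str, delivery_id: str) -> None:
        self.events.append((event_type, delivery_id))


class DeliveryService:
    """Domain logic knows only the port, never the messaging technology."""

    def __init__(self, publisher: DeliveryEventPublisher) -> None:
        self._publisher = publisher

    def complete(self, delivery_id: str) -> None:
        # Fire-and-forget: the caller does not wait for subscribers to react.
        self._publisher.publish("DeliveryCompleted", delivery_id)
```

Because the service depends only on the `publish` contract, the messaging technology can be swapped without touching domain code, and the contract itself is what gets tested.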
Design for operations
• Make things observable
• Instrument for both monitoring and root cause analysis
• Use distributed tracing and correlation
• Automate management tasks
• Track and version configuration
• (Aggregate logs and metrics)
• Standardize logs and metrics
• Involve operations teams in design and planning
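Correlation and standardized logs can be sketched with the Python standard library alone. The example assumes a correlation ID is passed between services (for instance in a request header); the field names, `build_logger`, and `handle_request` are hypothetical choices, not a prescribed schema.

```python
import contextvars
import json
import logging
import uuid

# Holds the ID of the request currently being handled in this context.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    def filter(self, record):
        # Attach the current correlation ID to every record.
        record.correlation_id = correlation_id.get()
        return True


class JsonFormatter(logging.Formatter):
    def format(self, record):
        # One JSON object per line keeps logs easy to aggregate and query.
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": record.correlation_id,
        })


def build_logger(name, stream):
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    handler.addFilter(CorrelationFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger, incoming_id=""):
    # Reuse the caller's ID when present so one trace spans several services.
    request_id = incoming_id or str(uuid.uuid4())
    token = correlation_id.set(request_id)
    try:
        logger.info("delivery scheduled")
        return request_id
    finally:
        correlation_id.reset(token)
```

Every log line emitted while a request is being handled carries the same ID, so entries from different services can be joined during root cause analysis.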
Build for the needs of business
• Functional - DDD, DCA
  • Bounded context leads to service boundary
  • Context map leads to service dependency
  • Aggregate, domain service/event lead to microservices and inter-service communication
• Non-functional - RTO/RPO/MTO, SLO/SLA
  • RTO leads to failover period
  • RPO leads to backup interval
  • SLA leads to choice of services with level of redundancy
  • Throughput/latency leads to choice of SKU with partitioning
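The mapping from RTO to failover period and from RPO to backup interval can be written down as a simple check. This is a sketch with hypothetical types and names; the durations would come from the business requirements and from measured failover drills.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    rto: timedelta  # maximum tolerable time to restore service
    rpo: timedelta  # maximum tolerable window of data loss


@dataclass(frozen=True)
class RecoveryDesign:
    backup_interval: timedelta
    failover_duration: timedelta  # ideally measured in a drill, not estimated


def violations(objectives, design):
    """List the ways a design fails to meet the stated objectives."""
    problems = []
    if design.backup_interval > objectives.rpo:
        problems.append("backup interval exceeds RPO")
    if design.failover_duration > objectives.rto:
        problems.append("failover duration exceeds RTO")
    return problems
```

A check like this keeps the link between a business requirement and the setting that implements it explicit, so a changed objective points directly at what must be reconfigured.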
Traceability from business to software
[Diagram labels: Business modeling; Business domain; Core domain; Bounded context & context map; Further breakdown per service characteristics; Group of highly cohesive services talking to each other via loosely coupled APIs]
[Diagram: business domain and bounded contexts: Accounts, Drone management, 3rd party transportation, Call center, Video surveillance, Drone sharing, Shipping]
Shipping domain with aggregates
[Diagram: Shipping containing Drone, Package, Delivery, DeliveryScheduler, DeliverySupervisor]
Microservices in Shipping BC
[Diagram: Mobile app, GW, DeliveryScheduler, DeliverySupervisor, Package, Drone, Delivery, Status, event sourcing (DeliveryEvents, RequestEvents), 3rd party Service, Account Service, DroneMgmt Service, AAD, Auth Service, 3rd party transportation, Account]


Editor's Notes

  1. I’m trying to compare the common characteristics of each. These common characteristics raise questions that you need to answer:
     • How to choose the right storage? (Polyglot cheat sheet)
     • How to deal with eventual consistency issues? (Data consistency primer)
     • How to make apps scalable? (Auto-scaling guidance)
     • How to control concurrent access? (Concurrent access guidance, WIP)
     • How to decompose a monolith to distributed components? (Data/Compute partitioning guidance)
     • How to make apps immutable?
     • How to choose the right authentication model? (Identity guidance)
     • How to design multi-tenant apps? (Multi-tenant guidance)
     • How to deal with transient/non-transient faults? (Retry guidance)
     https://dzone.com/articles/martin-fowler-snowflake
  2. Add practical examples for each bullet:
     • Minimize coordination - concurrency control
     • HCLC - encapsulate domain knowledge, contact
     • Scale-out/in - avoid instance stickiness, deal with scale-in
     • Decomposition - decompose per functional / non-functional reqs
     • API - REST vs. RPC
     • Redundancy - different levels of redundancy
     • Self-healing - CB, retry, compensation, throttling, fallback
     • Polyglot -
     • Observable - correlating transactions
  3. Don’t write your own OS!! Running it yourself means dealing with master/client nodes, avoiding split-brain issues, perf tuning, patching and version upgrades, etc. Managed options: SQL DB, Azure Redis, DocumentDB, AAD, Azure Search, HDI.
  4. https://www.youtube.com/watch?v=EYJnWttrC9k
     • CouchDB supports MVCC
     • Optimistic vs. pessimistic concurrency control
     • MVCC
     • Data partitioning
     • Event sourcing
     • Exactly-once operation (causes coordination)
     • MapReduce
     • Idempotent operations
     • Leader election
  5. Partition for scalability, query performance, and size limits.
     • Three partitioning strategies (V, H, F): vertical, horizontal, functional
     • Hybrid approach (V & H)
     • Design the shard key to avoid hot spots
     • Partition at different levels of the envelope (DB, node, account, subscription)
     • Partition different parts of the application (DB, storage, cache, queue, cluster, LB)
  6. This is often referred to as sharding. Store different sets of rows in different partitions. Each partition has the same schema. Choose the shard key for even distribution to avoid hot spots.
  7. Store different columns in different partitions. Group the columns that are commonly used together so you don’t need to join. Separate critical from non-critical data, or sensitive from non-sensitive data, so you can manage them separately.
  8. More often than not, you take the hybrid approach. Store structured data in an RDBMS and binary files in a NoSQL store. Then horizontally partition the RDBMS.
  9. A tax accounting app has a huge spike in Mar/Apr. A single rock star can turn a partition into a hot spot. Use load testing and monitoring to find the bottleneck!! Auto-scaling guidance
  10. Consider aggressive auto-scaling for critical workloads. Service Fabric doesn't support auto-scale-in.
  11. Resiliency guidance
  12. Resiliency guidance
  13. Average, count, etc.
      • Choosing storage guidance
      • When you need different storage
      • Design considerations
      • Transactions and consistency integrity across multiple storages
      • CAP theorem
      • Compensating transaction using queue, supervisors
      • High-level selection criteria (data type, skillset, other trade-offs)
      • CQRS and event sourcing with microservices
      • Polyglot persistence is becoming a natural solution for microservices
  14. Microservices guidance
  15. Make things observable
      • Automate management tasks
      • Secret management
      • Expose a health endpoint to check system internals
      • Make all things traceable
      • Logging, tracing
      • Instrument your app
      • Correlate service interactions within a transaction
      • Collect five key metrics: business, client, app, system, service
      • Use APM tools
      • Look for outliers
  16. Capture business intent and trace it through the software design so that when the intent changes, you can identify where to modify.
  17. Domain represents the problem space (business). BC represents the solution space (software). One BC can have multiple different architecture styles, infrastructures, etc.?
  18. How does the delivery service know its status? Is it coming from the delivery mgmt service? (pull or push)
      Do we want to merge requestHandler and GW? GW does only token checking and delegates auth to the auth service in the account BC.
      Why does it have Package, Drone, and Delivery as services but no service for account and 3rd party? Do we need them?
      Why doesn’t the delivery service contain the drone and package aggregates?
      Does the drone need persistent storage or a cache? What is the best API style? Depending on the responsibility and latency requirements of the drone service in this context, it can just cache status.
      Does every event from the drone come via EventHub to DroneMgmt only, or also to the Delivery service?
      The Account service subscribes to delivery events and does the following once a delivery is completed: collect ratings, send emails, schedule payment.