Azure Cosmos DB by Mohammed Gadi AUG April 2019
- 3. What is Azure Cosmos DB?
A fully managed, globally distributed database service built to guarantee extremely low latency and massive scale for modern apps
- 4. Azure Cosmos DB
A globally distributed, massively scalable, multi-model database service
Turnkey global distribution
Elastic scale-out of storage & throughput
Guaranteed low latency at the 99th percentile
Comprehensive SLAs
Five well-defined consistency models
- 6. Azure Cosmos DB
A globally distributed, massively scalable, multi-model database service
Data models: Key-value, Column-family, Document, Graph
APIs: Table API, Cosmos DB’s API for MongoDB
- 8. Elastically Scale Storage and Throughput
Independently and elastically scale storage and throughput across regions – even during unpredictable traffic bursts – with a database that adapts to your app’s needs.
Database throughput is the number of reads and writes that your database can perform in a single second.
Elastically scale throughput from 10 to 100s of millions of requests/sec across multiple regions
Pay only for the throughput and storage you need
- 10. Guaranteed Low Latency
Provide users around the world with fast access to data
Serve < 10 ms read and < 10 ms write requests at the 99th percentile from the region nearest to users, while delivering data globally.
- 13. Turnkey Global Distribution
Put your data where your users are in minutes
Manual and automatic failover.
Available in all Azure regions.
Automatic & synchronous multi-region replication.
Configure multiple write regions to further reduce latency and increase availability
- 14. Five Well-Defined Consistency Models
Choose the best consistency model for your app
Offers five consistency models: Strong, Bounded staleness, Session, Consistent prefix, Eventual
Provides control over performance-consistency tradeoffs, backed by comprehensive SLAs.
An intuitive programming model offering low latency and high availability for your planet-scale app.
- 20. Top 10 Reasons Why Customers Use Azure Cosmos DB
Different types of data
Multi-tenancy and enterprise-grade security
Global distribution turnkey capability
Mission critical
Massive storage/throughput scalability
To optimize for speed and cost
5 well-defined consistency models
Analytics-ready
Event-driven architectures
Single-digit millisecond latency at the 99th percentile worldwide
Big data
High availability and reliability
- 22. Billing Model
2 components: Storage + Throughput
You are billed on consumed storage and provisioned throughput
Collections in a database can share throughput
Unit: Price (for most Azure regions)
SSD Storage (per GB): $0.25 per month
Provisioned Throughput (single region writes): $0.008/hour per 100 RU/s
Provisioned Throughput (multi-region writes): $0.016/hour per 100 multi-master RU/s
* pricing may vary by region; for up-to-date pricing, see: https://azure.microsoft.com/pricing/details/cosmos-db/
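As a rough illustration of the two billing components, the sketch below combines the storage and throughput prices listed on this slide into a monthly estimate. The function name, the 730-hour month, and the choice to ignore per-region multipliers are assumptions made for the example; actual bills depend on region and current pricing.

```python
# Prices taken from the slide (most Azure regions, April 2019).
STORAGE_USD_PER_GB_MONTH = 0.25
SINGLE_REGION_USD_PER_100_RU_HOUR = 0.008
MULTI_REGION_USD_PER_100_RU_HOUR = 0.016

# Assumption: an average month has about 730 hours.
HOURS_PER_MONTH = 730


def estimate_monthly_cost(storage_gb, provisioned_ru_per_sec, multi_region_writes=False):
    """Hypothetical helper: storage cost plus provisioned-throughput cost in USD."""
    if multi_region_writes:
        hourly_rate = MULTI_REGION_USD_PER_100_RU_HOUR
    else:
        hourly_rate = SINGLE_REGION_USD_PER_100_RU_HOUR
    throughput_cost = (provisioned_ru_per_sec / 100) * hourly_rate * HOURS_PER_MONTH
    storage_cost = storage_gb * STORAGE_USD_PER_GB_MONTH
    return round(storage_cost + throughput_cost, 2)


if __name__ == "__main__":
    # 100 GB of data with 1000 RU/s provisioned, single region writes.
    print(estimate_monthly_cost(100, 1000))
```

Note that the throughput term depends on what is provisioned, not on what is consumed, which is why right-sizing RU/s matters more than trimming storage in most workloads.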
- 23. Request Units
Request Units (RUs) are a rate-based currency, e.g. 1000 RU/second.
A single request unit, 1 RU, is equal to the approximate cost of performing a single GET request on a 1-KB document using the document's ID. Performing a GET by using a document's ID is an efficient means of retrieving a document, and thus the cost is small. Creating, replacing, or deleting the same item requires additional processing by the service, and therefore requires more request units.
RUs abstract the physical resources needed to perform requests: % CPU, % IOPS, % memory.
- 24. Request Units
Each request (GET, POST, PUT, Query, …) consumes a number of RUs
Approx. 1 RU = 1 read of a 1 KB document
Approx. 5 RU = 1 write of a 1 KB document
Query: depends on the query and the documents involved
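To make the approximate charges above concrete, here is a minimal sizing sketch for a workload of 1 KB documents. The function name is hypothetical, and the estimate deliberately leaves out queries, since their cost depends on the query and the documents involved.

```python
# Approximate charges from the slide, valid for 1 KB documents only.
RU_PER_READ = 1
RU_PER_WRITE = 5


def estimate_ru_per_sec(reads_per_sec, writes_per_sec):
    """Hypothetical helper: rough RU/s needed for point reads and writes of 1 KB documents."""
    return reads_per_sec * RU_PER_READ + writes_per_sec * RU_PER_WRITE


if __name__ == "__main__":
    # 500 reads/s and 100 writes/s of 1 KB documents.
    print(estimate_ru_per_sec(500, 100))
```

Larger documents, more indexed fields, or stronger consistency would raise the real charge, so a figure like this is a starting point rather than a guarantee.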
- 25. Request Units - Provisioned Throughput
Provisioned in terms of RU/sec, e.g. 1000 RU/s
Billed for the highest RU/s in each hour
Easy to increase and decrease on demand
Rate limiting is based on the amount of throughput provisioned
Background processes like TTL expiration and index transformations are scheduled when quiescent
Storage: 40 RU per 1 GB of data
Diagram: incoming requests plotted against the min and max RU/sec levels
Below min RU/sec: no rate limiting, process background operations
Between min and max RU/sec: no rate limiting
Above max RU/sec: rate limiting, SDK retry
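The rate-limiting behaviour can be pictured with a small simulation. This is a simplified model, not the service's actual algorithm: it assumes a fixed RU budget that resets every whole second, and the class and function names are hypothetical. In the real service a rate-limited request is rejected with HTTP 429 and the SDK retries after a server-suggested delay; here the delay is a fixed parameter.

```python
class ProvisionedThroughput:
    """Simplified model: a fixed RU budget that resets every whole second."""

    def __init__(self, ru_per_sec):
        self.ru_per_sec = ru_per_sec
        self.window = None  # the one-second window currently being tracked
        self.used = 0.0     # RUs consumed in that window

    def try_request(self, ru_charge, now):
        """Return True if the request fits in the budget, False if it is rate limited."""
        window = int(now)
        if window != self.window:
            self.window = window
            self.used = 0.0
        if self.used + ru_charge > self.ru_per_sec:
            return False
        self.used += ru_charge
        return True


def send_with_retry(budget, ru_charge, now, max_retries=3, retry_after=1.0):
    """Retry a rate-limited request; return the time it succeeded, or None."""
    for _ in range(max_retries + 1):
        if budget.try_request(ru_charge, now):
            return now
        now += retry_after  # wait before retrying, as an SDK would
    return None


if __name__ == "__main__":
    budget = ProvisionedThroughput(ru_per_sec=10)
    print(budget.try_request(5, 0.0))        # fits in the budget
    print(budget.try_request(5, 0.25))       # uses the rest of this second
    print(send_with_retry(budget, 5, 0.5))   # rate limited once, then succeeds
```

Retries hide rate limiting from the caller but add latency, which is why sustained rate limiting is usually a sign that more RU/s should be provisioned.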
- 27. Partitioning - Why Do We Do It in the First Place?
As data size grows, instead of buying bigger machines (scaling up), we distribute our data across multiple machines (scaling out).
Each machine is responsible for serving a subset of the data.
Analogy: Working in a team
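The idea of each machine serving a subset of the data can be sketched with generic hash partitioning. This is an illustrative scheme only; the slides do not describe how Cosmos DB hashes partition keys internally, and the function name and the use of MD5 are assumptions for the example.

```python
import hashlib


def partition_for(key, num_partitions):
    """Map a partition key to one of num_partitions machines using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    # Group some hypothetical customer IDs by the partition that would serve them.
    placement = {}
    for i in range(12):
        key = f"customer-{i}"
        placement.setdefault(partition_for(key, 4), []).append(key)
    for partition, keys in sorted(placement.items()):
        print(partition, keys)
```

Because the hash is stable, the same key always lands on the same partition, so a request that carries the key can be routed to a single machine.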
- 33. Walmart Labs (aka jet.com) ensures reliable app experience for customers on Black Friday, Cyber Monday, and other high traffic periods
Order & Inventory Management Systems
• Event-sourcing architecture, with Cosmos DB Change Feed
• Moved from IaaS to PaaS for inventory system
• Chosen to handle high write-ingest for events & low latency guarantees
• Scaled for Black Friday: 1 trillion RUs over 24 hours
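To put the Black Friday figure in perspective, the short calculation below converts 1 trillion RUs over 24 hours into an average rate. It assumes the load was spread evenly across the day, which real traffic would not be, so peak throughput was presumably higher.

```python
TOTAL_RUS = 10**12            # 1 trillion RUs, from the slide
SECONDS_PER_DAY = 24 * 60 * 60

# Average request units consumed per second over the 24-hour period.
average_ru_per_sec = TOTAL_RUS / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"{average_ru_per_sec:,.0f} RU/s on average")
```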
- 36. Azure Cosmos DB: Value to Customer
Save money
Global Business
Store
Supplier
Partner
Become more productive
Become more flexible
Become more responsive
Become more innovative
- 43. The Data Migration tool is an open-source solution that imports data
to Azure Cosmos DB from a variety of sources, including:
Editor's Notes
- Azure Cosmos DB offers the first globally distributed, multi-model database service for building planet scale apps. It’s been powering Microsoft’s internet-scale services for years, and now it’s ready to launch yours.
Only Azure Cosmos DB makes global distribution turn-key.
You can add Azure locations to your database anywhere across the world, at any time, with a single click. Cosmos DB will seamlessly replicate your data and make it highly available.
Cosmos DB allows you to scale throughput and storage elastically, and globally! You only pay for the throughput and storage you need – anywhere in the world, at any time.
- Elastic Scale out -> Tunable Consistency
Small storage – large throughput (e.g. notification broadcast/poll)
Large storage – small throughput (e.g. classic data/log store)
- Single digit latency -> SLA
- Tunable Consistency -> Single digit latency
Instead of forcing you to choose between eventual and strong consistency, Cosmos DB gives you several additional useful options.
Bounded Staleness: consistent prefix; reads lag behind writes by at most k prefixes or a time interval t
Session: consistent prefix; monotonic reads, monotonic writes, read-your-writes, write-follows-reads
Consistent Prefix: the updates returned are some prefix of all the updates, with no gaps
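The consistent prefix guarantee in the last line can be stated as a small check: whatever a reader observes must be a leading slice of the committed write sequence. The function name and the example writes are hypothetical, and this models only the guarantee itself, not how the service enforces it.

```python
def is_consistent_prefix(observed, committed):
    """Return True if observed is a prefix of committed, i.e. has no gaps or reordering."""
    return observed == committed[:len(observed)]


if __name__ == "__main__":
    writes = ["A", "B", "C"]
    print(is_consistent_prefix(["A", "B"], writes))  # a valid prefix
    print(is_consistent_prefix(["A", "C"], writes))  # skips B, so there is a gap
```

A reader may be behind (it sees fewer writes than were committed), but it never sees a later write without the earlier ones.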
- The number of RUs each operation consumes depends on many factors, including:
Document size
Number of indexed fields
Type of indexes
Consistency model choice
Not all queries consume the same number of RUs. Some operations are more computationally complex or require scans through more documents, and therefore use more RUs.
- We want to avoid throttling (rate limiting)
- https://blogs.msdn.microsoft.com/azurecat/2018/05/17/azure-cosmos-db-customer-profile-jet-com/