Azure Cosmos DB
Mohammed S. Gadi
Twitter/Instagram/Facebook: @mgadirocks
I’m super excited to be back at AUGAzureBootcamp!
A FULLY-MANAGED GLOBALLY DISTRIBUTED DATABASE SERVICE BUILT TO GUARANTEE
EXTREMELY LOW LATENCY AND MASSIVE SCALE FOR MODERN APPS
What is Azure Cosmos DB?
Azure Cosmos DB
A globally distributed, massively scalable, multi-model database service
Data models: Key-value, Column-family, Document, Graph
APIs: Table API, Cosmos DB’s API for MongoDB
Turnkey global distribution
Elastic scale out of storage & throughput
Guaranteed low latency at the 99th percentile
Comprehensive SLAs
Five well-defined consistency models
Throughput & Latency
Elastically Scale Storage and Throughput
Independently and elastically scale storage and
throughput across regions – even during
unpredictable traffic bursts – with a database that
adapts to your app’s needs.
Database throughput is the number of reads and
writes that your database can perform in a single
second.
• Elastically scale throughput from 10 to 100s of millions of requests/sec across multiple regions
• Pay only for the throughput and storage you need
HOW’S THE THROUGHPUT?
Guaranteed Low Latency
Provide users around the world with fast
access to data
Serve < 10 ms read and < 10 ms write
requests at the 99th percentile from the
region nearest to users, while delivering
data globally.
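A latency guarantee "at the 99th percentile" can be checked against measured samples. The sketch below uses the nearest-rank percentile definition; the function names and the 10 ms default are illustrative assumptions, not part of any Cosmos DB SDK.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

def meets_latency_target(samples_ms, target_ms=10.0, pct=99):
    """True if the pct-th percentile latency is below target_ms.
    The 10 ms default mirrors the '< 10 ms' figure on the slide."""
    return percentile(samples_ms, pct) < target_ms
```

With 100 samples, one slow outlier does not break a p99 target, but two do, since the 99th-ranked sample is then one of the slow ones.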
Azure Cosmos DB by Mohammed Gadi AUG April 2019
Turnkey Global Distribution
Put your data where your users are in minutes
• Manual and automatic failover.
• Available in all Azure regions.
• Automatic & synchronous multi-region replication.
• Configure multiple write regions to further reduce latency and increase availability.
Strong, Bounded staleness, Session, Consistent prefix, Eventual
Five Well-Defined Consistency Models
Choose the best consistency model for your app
Offers five consistency models
Provides control over performance-consistency tradeoffs,
backed by comprehensive SLAs.
An intuitive programming model offering low latency and
high availability for your planet-scale app.
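Two of the weaker guarantees can be stated as simple predicates. The sketch below is only an illustration of the definitions (consistent prefix: a reader sees some prefix of the writes with no gaps; bounded staleness: reads lag by at most k versions); the function names are hypothetical and this is not how the service implements them.

```python
def is_consistent_prefix(writes, observed):
    """True if `observed` equals the first len(observed) writes,
    in the same order and with no gaps."""
    n = len(observed)
    return n <= len(writes) and list(observed) == list(writes[:n])

def within_bounded_staleness(latest_version, read_version, max_lag_versions):
    """True if a read is at most `max_lag_versions` behind the latest write.
    Only the version-count bound is modeled here, not the time bound."""
    lag = latest_version - read_version
    return 0 <= lag <= max_lag_versions
```

For writes `a, b, c`, a reader that sees `a, b` is consistent-prefix, while one that sees `a, c` is not, because `b` was skipped.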
Top 10 Reasons Why Customers Use
Azure Cosmos DB
• different types of data
• multi-tenancy and enterprise-grade security
• global distribution turnkey capability
• mission critical
• massive storage/throughput scalability
• to optimize for speed and cost
• 5 well-defined consistency models
• analytics-ready
• event-driven architectures
• single digit millisecond latency at 99th percentile worldwide
• big data
• high availability and reliability
Request Units & Billing
Billing Model
2 components: Storage + Throughput
You are billed on consumed storage and provisioned throughput
Collections in a database can share throughput
Unit: Price (for most Azure regions)
SSD Storage (per GB): $0.25 per month
Provisioned Throughput (single region writes): $0.008/hour per 100 RU/s
Provisioned Throughput (multi-region writes): $0.016/hour per 100 multi-master RU/s
* pricing may vary by region; for up-to-date pricing, see: https://azure.microsoft.com/pricing/details/cosmos-db/
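As a rough worked example of the two-component bill, the sketch below multiplies the listed unit prices out to a monthly estimate. The 730-hour month, the function name, and the single-account scope are assumptions for illustration; the prices are the 2019 figures from the slide and vary by region.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month

STORAGE_PRICE_PER_GB_MONTH = 0.25   # USD, from the slide
PRICE_PER_100_RU_HOUR = {           # USD, from the slide
    "single": 0.008,  # single-region writes
    "multi": 0.016,   # multi-region writes
}

def estimate_monthly_cost(storage_gb, provisioned_ru_per_sec, write_mode="single"):
    """Return the storage, throughput, and total monthly cost in USD."""
    storage = storage_gb * STORAGE_PRICE_PER_GB_MONTH
    throughput = (provisioned_ru_per_sec / 100
                  * PRICE_PER_100_RU_HOUR[write_mode]
                  * HOURS_PER_MONTH)
    return {"storage": storage, "throughput": throughput,
            "total": storage + throughput}

if __name__ == "__main__":
    cost = estimate_monthly_cost(storage_gb=100, provisioned_ru_per_sec=1000)
    print(f"estimated total: ${cost['total']:.2f} per month")
```

Under these assumptions, 100 GB of storage with 1000 RU/s provisioned comes to about $25 for storage plus about $58.40 for throughput.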
Request Units
Request Units (RUs) are a rate-based currency, e.g. 1000 RU/second.
A single request unit, 1 RU, is equal to the approximate cost of performing a single GET request on a 1-KB
document using the document's ID. A GET by document ID is an efficient way to retrieve a
document, so its cost is small. Creating, replacing, or deleting the same item requires additional processing
by the service, and therefore more request units.
RUs abstract the physical resources needed to perform requests: % IOPS, % CPU, % memory.
Request Units
Each request consumes some number of RUs
Approx. 1 RU = 1 read of a 1 KB document
Approx. 5 RU = 1 write of a 1 KB document
Query: Depends on query & documents involved
[Figure: GET, POST, PUT, and Query requests each mapped to an RU cost]
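The approximate per-request costs above give a quick way to size provisioned throughput. This is a minimal sketch that assumes a workload of only point reads and writes on 1 KB documents; the function names are hypothetical, and query costs are deliberately left out because they depend on the query.

```python
import math

# Approximate per-request costs quoted on the slide, for 1 KB documents.
READ_COST_RU = 1
WRITE_COST_RU = 5

def estimate_ru_per_sec(reads_per_sec, writes_per_sec):
    """Rough RU/s needed for point reads and writes of 1 KB documents."""
    return reads_per_sec * READ_COST_RU + writes_per_sec * WRITE_COST_RU

def round_up_to_hundred(ru_per_sec):
    """Round up to a multiple of 100 RU/s, the unit the pricing is quoted in."""
    return math.ceil(ru_per_sec / 100) * 100
```

For example, 520 reads/sec and 100 writes/sec need roughly 1020 RU/s, which rounds up to 1100 RU/s.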
Request Units - Provisioned throughput
Provisioned in terms of RU/sec, e.g. 1000 RU/s
Billed for the highest RU/s provisioned in each hour
Easy to increase and decrease on demand
Rate limiting based on amount of throughput provisioned
Background processes like TTL expiration and index transformations are scheduled when quiescent
Storage: 40 RU per 1 GB of data
[Figure: incoming requests plotted against Min RU/sec and Max RU/sec, with three zones labeled "No rate limiting, process background operations", "Rate limiting – SDK retry", and "No rate limiting"]
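The rate-limiting behavior can be pictured with a toy model: each one-second window has a budget equal to the provisioned RU/s, a request that would exceed it is rejected with a retry-after hint, and the caller waits and retries. Everything below (class names, the fixed-window budget, the simulated clock) is an assumption for illustration and not the real service or SDK logic.

```python
class RateLimited(Exception):
    """Raised when a request would exceed the RU budget of the current window."""
    def __init__(self, retry_after_s):
        super().__init__(f"rate limited, retry after {retry_after_s:.3f}s")
        self.retry_after_s = retry_after_s

class ProvisionedContainer:
    """Toy model: a budget of `provisioned_ru_per_sec` RUs per one-second window."""
    def __init__(self, provisioned_ru_per_sec):
        self.provisioned = provisioned_ru_per_sec
        self.window = 0   # index of the current one-second window
        self.used = 0     # RUs consumed in the current window

    def execute(self, now_s, cost_ru):
        window = int(now_s)
        if window != self.window:          # a new second: reset the budget
            self.window, self.used = window, 0
        if self.used + cost_ru > self.provisioned:
            # Hint: wait until the start of the next window.
            raise RateLimited(retry_after_s=(window + 1) - now_s)
        self.used += cost_ru

def execute_with_retry(container, cost_ru, now_s, max_retries=3):
    """Retry on RateLimited, advancing a simulated clock instead of sleeping.
    Returns the simulated time at which the request succeeded."""
    for attempt in range(max_retries + 1):
        try:
            container.execute(now_s, cost_ru)
            return now_s
        except RateLimited as err:
            if attempt == max_retries:
                raise
            now_s += err.retry_after_s
```

A request that alone costs more than the provisioned budget can never succeed in this model, so the retry loop gives up after `max_retries` attempts instead of spinning forever.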
Partitioning
Partitioning - Why Do We Do It In The First Place?
As data size grows, instead of buying bigger machines (scaling up),
we distribute our data across multiple machines (scaling out)
Each machine is responsible for serving a subset of the data.
Analogy: Working in a team
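One common way to give each machine its own subset of the data is to hash a partition key and take the result modulo the number of partitions. The sketch below assumes SHA-256 and a fixed partition count purely for illustration; it is not a description of Cosmos DB's internal partitioning scheme.

```python
import hashlib

def partition_for(key, num_partitions):
    """Map a partition key to a partition index in [0, num_partitions)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_partitions

def distribute(keys, num_partitions):
    """Group keys by the partition responsible for them."""
    buckets = {p: [] for p in range(num_partitions)}
    for key in keys:
        buckets[partition_for(key, num_partitions)].append(key)
    return buckets
```

The same key always lands on the same partition, so a read by key only needs to contact one machine.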
Azure Cosmos DB – MongoDB API
Who is using Cosmos DB?
Serving Industry-Leading Enterprise Customers
Walmart Labs (aka jet.com) ensures a reliable app experience for
customers on Black Friday, Cyber Monday, and other high-traffic periods
Order & Inventory Management Systems
• Event-sourcing architecture, with Cosmos DB Change Feed
• Moved from IaaS to PaaS for inventory system
• Chosen to handle high write-ingest for events & low latency guarantees
• Scaled for Black Friday: 1 trillion RUs over 24 hours
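In an event-sourcing design, the current state is not stored directly; it is rebuilt by replaying an append-only stream of events. The sketch below shows that idea for inventory counts. The event shape (`sku`, `type`, `quantity`) and the two event types are hypothetical, and a plain list stands in for whatever feed delivers the events.

```python
from collections import defaultdict

def apply_event(stock, event):
    """Apply one inventory event to the running stock counts, in place."""
    sku, kind, qty = event["sku"], event["type"], event["quantity"]
    if kind == "received":
        stock[sku] += qty
    elif kind == "sold":
        stock[sku] -= qty
    else:
        raise ValueError(f"unknown event type: {kind}")
    return stock

def replay(events):
    """Rebuild current stock levels by folding over the events in order."""
    stock = defaultdict(int)
    for event in events:
        apply_event(stock, event)
    return dict(stock)
```

Because every change is an appended event, this kind of workload is write-heavy, which matches the "high write-ingest" reason given in the list above.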
Azure Cosmos DB: Value to Customer
Save money
Global Business
Store
Supplier
Partner
Become more productive
Become more flexible
Become more responsive
Become more innovative
http://microsoft.com/learn
https://github.com/MicrosoftDocs/azure-docs/blob/master/articles/cosmos-db/import-data.md
The Data Migration tool is an open-source solution that imports data
to Azure Cosmos DB from a variety of sources, including:
Data Tool
Thank you
You can follow me on:
@mgadirocks mgadirocks /mgadirocks


Editor's Notes

  1. Azure Cosmos DB offers the first globally distributed, multi-model database service for building planet-scale apps. It’s been powering Microsoft’s internet-scale services for years, and now it’s ready to launch yours. Only Azure Cosmos DB makes global distribution turnkey. You can add Azure locations to your database anywhere across the world, at any time, with a single click. Cosmos DB will seamlessly replicate your data and make it highly available. Cosmos DB allows you to scale throughput and storage elastically and globally. You only pay for the throughput and storage you need, anywhere in the world, at any time.
  4. Elastic scale out -> Tunable consistency. Small storage, large throughput (e.g. notification broadcast/poll); large storage, small throughput (e.g. classic data/log store).
  5. Single digit latency -> SLA
  7. Tunable consistency -> Single digit latency. Instead of forcing you to choose between eventual and strong consistency, Cosmos DB gives you additional useful options. Bounded staleness: consistent prefix; reads lag behind writes by at most k versions or a time interval t. Session: consistent prefix, with monotonic reads, monotonic writes, read-your-writes, and write-follows-reads. Consistent prefix: the updates returned are some prefix of all the updates, with no gaps.
  8. The number of RUs each operation consumes depends on many factors, including: document size, number of indexed fields, type of indexes, and consistency model choice. Not all queries consume equal numbers of RUs. Some operations are more computationally complex or require scans through more documents and therefore use more RUs.
  9. We want to avoid throttling (rate limiting)
  10. https://blogs.msdn.microsoft.com/azurecat/2018/05/17/azure-cosmos-db-customer-profile-jet-com/