Quick trip around the Cosmos - Things every astronaut supposed to know
Rafał Hryniewski
@r_hryniewski
fb.me/hryniewskinet
Agenda
Cosmos DB origin and (very brief) story
Why Cosmos DB
Available data models
Consistency levels
Scaling, Pricing and Georeplication
Service Level Agreements
Getting started (for free)
Glossary - ACID
A – Atomicity
C – Consistency
I – Isolation
D - Durability
Glossary – CAP Theorem
C – Consistency
A – Availability
P – Partition tolerance
In the beginning there was only chaos...
2010 – Project Florence goals
Scale elastically
Low read and write latency
At least 99.99% availability
Intuitive and predictable concurrency
Comprehensive SLA
No schema/index management
Multiple models
Low costs
https://azure.microsoft.com/en-us/blog/a-technical-overview-of-azure-cosmos-db/
May 2017 – Cosmos DB
What already lives under the clouds?
Do we need another database?
https://db-engines.com/en/ranking
Why don’t we stick with MS SQL?
PaaS databases on Azure
Azure SQL
Redis
PostgreSQL
MySQL
Cosmos DB
http://hryniewski.net/2017/05/28/paas-databases-available-on-azure/
Any other database that can be installed on VM
What if I don’t love SQL anymore?
So... Should we all abandon Earth?
Embrace the Cosmos
What is Cosmos DB?
Industry’s first globally distributed, multi-model database service
Multi-model database?!
Cosmos DB structure
Database tools we all love
Stored Procedures
Triggers
User Defined Functions (can be used in SQL syntax)
All in JavaScript
Four models, Four APIs
Document database with the Document DB SQL-like API
Document database with the Mongo DB API
Key-value database with the Azure Table Storage API
Graph database with the (TinkerPop) Gremlin API
One database to rule them all
Document DB – data format sample
[
  {
    "id": "59be9f01b7cb3b8c0fffd52b",
    "firstName": "Camacho",
    "lastName": "Castro",
    "birthday": "2016-10-14T03:53:19 -02:00",
    "project": "in",
    "_rid": "DJdRAIh6aAAGAAAAAAAAAA==",
    "_self": "dbs/DJdRAA==/colls/DJdRAIh6aAA=/docs/DJdRAIh6aAAGAAAAAAAAAA==/",
    "_etag": "\"4200c424-0000-0000-0000-59bea13d0000\"",
    "_attachments": "attachments/",
    "_ts": 1505665341
  },
  ...
]
Document DB – query sample
SELECT * FROM Collection c
SELECT * FROM Collection c WHERE c.gender = 'female'
SELECT c["firstName"] AS FirstName, c["lastName"] AS LastName FROM Collection c
SELECT {"FirstName": c.firstName, "LastName": c.lastName} AS PersonalData FROM Collection c
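To make the result shapes of these four queries concrete, here is a rough Python equivalent over an in-memory list. It is only an illustration: the two documents and their `gender` values are made up, and nothing here calls Cosmos DB.

```python
# Hypothetical in-memory stand-in for the collection; no Cosmos DB call is made.
collection = [
    {"id": "1", "firstName": "Camacho", "lastName": "Castro", "gender": "male"},
    {"id": "2", "firstName": "Ivy", "lastName": "Peters", "gender": "female"},
]

# SELECT * FROM Collection c
everything = list(collection)

# SELECT * FROM Collection c WHERE c.gender = 'female'
women = [c for c in collection if c.get("gender") == "female"]

# SELECT c["firstName"] AS FirstName, c["lastName"] AS LastName FROM Collection c
names = [{"FirstName": c["firstName"], "LastName": c["lastName"]} for c in collection]

# SELECT {"FirstName": c.firstName, "LastName": c.lastName} AS PersonalData FROM Collection c
# The object literal ends up nested under the alias.
personal = [
    {"PersonalData": {"FirstName": c["firstName"], "LastName": c["lastName"]}}
    for c in collection
]
```

The last two queries return the same values; the difference is only whether they sit at the top level of each result or inside a `PersonalData` object.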
Table Storage – data format sample
Table Storage - query sample
Table Storage - query sample (portal)
Table Storage - query sample (SDK)
Gremlin – data format sample (GraphSON)
{"id": "33843a67-0da8-4dde-81ff-eff607882f23",
"label": "59be9f016f12f58cd5eb0cac",
"type": "vertex",
"properties": {
"firstName": [
{
"id": "742b60bd-cd75-4ec1-b288-c694d88c14ed",
"value": "Santana"
}
],
"lastName": [
{
"id": "70ed7300-cc4d-41c4-8af9-ba65445898ee",
"value": "Peters"
}
],
"gender": [
{
"id": "78b124d0-73af-4ff7-8f93-881377086568",
"value": "male"
}
],
"birthday": [
{
"id": "9140a304-a073-4c55-a86f-f3206d4ecbd9",
"value": "2016-08-15T10:10:17 -02:00"
}
],
"project": [
{
"id": "6cc7ffd1-6427-49a0-85c2-e1f5b1bbcd42",
"value": "elit"
}
]
}
}
Gremlin – data format
Gremlin – query sample
g.V()
g.V().has('gender', 'female')
g.V().valueMap('firstName', 'lastName')
Gremlin – everything can relate with everything
Gremlin – adding edges 101
g.V().has('project', 'magna') //WHO
.addE('InLoveIn') //RELATES HOW
.to(V().has('gender', 'female')) //TO WHOM
Gremlin – traversing graphs 101
g.V().has('firstName', 'Ivy').in('InLoveIn').values('birthday')
g.V().has('firstName', 'Davenport').out('InLoveIn').values('project')
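The direction of an edge decides whether `in` or `out` finds it. A small Python sketch of the same idea on a toy in-memory graph; the vertices, birthdays and helper names are all made up for illustration and are not part of any Gremlin client library.

```python
# Toy graph: vertices keyed by id, edges stored as (from_id, label, to_id).
vertices = {
    1: {"firstName": "Davenport", "gender": "male", "project": "magna", "birthday": "1990-01-01"},
    2: {"firstName": "Ivy", "gender": "female", "project": "elit", "birthday": "1992-02-02"},
}
edges = []

def add_edge(from_id, label, to_id):
    """Rough counterpart of addE(label).to(...): one directed, labelled edge."""
    edges.append((from_id, label, to_id))

def out(vertex_id, label):
    """Vertices reached by following outgoing edges with this label."""
    return [dst for src, lbl, dst in edges if src == vertex_id and lbl == label]

def in_(vertex_id, label):
    """Vertices that have an edge with this label pointing at vertex_id."""
    return [src for src, lbl, dst in edges if dst == vertex_id and lbl == label]

add_edge(1, "InLoveIn", 2)  # Davenport -InLoveIn-> Ivy

# g.V().has('firstName', 'Ivy').in('InLoveIn').values('birthday')
admirer_birthdays = [vertices[v]["birthday"] for v in in_(2, "InLoveIn")]

# g.V().has('firstName', 'Davenport').out('InLoveIn').values('project')
beloved_projects = [vertices[v]["project"] for v in out(1, "InLoveIn")]
```

Starting from Ivy and walking `out` finds nothing here, because the only edge points at her rather than away from her.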
Mongo DB – data format sample
[
{
"_id" : ObjectId("59beaaa47d85940950b876fd"),
"id" : "59be9f016f12f58cd5eb0cac",
"firstName" : "Santana",
"lastName" : "Peters",
"gender" : "male",
"birthday" : "2016-08-15T10:10:17 -02:00",
"project" : "elit"
},
...
]
Mongo DB – query sample
db.mycollection.find()
db.mycollection.find({gender : "female"})
db.mycollection.find({}, {firstName: 1, lastName: 1, _id:0})
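The third call combines an empty filter with a projection. As a hedged sketch of what that means, here is a tiny Python stand-in for `find` that supports only equality filters and include-style projections; the `find` helper and the sample documents are hypothetical and far simpler than the real query engine.

```python
def find(collection, query=None, projection=None):
    """Tiny stand-in for find(): equality filter plus include-style projection."""
    query = query or {}
    results = []
    for doc in collection:
        if all(doc.get(k) == v for k, v in query.items()):
            if projection is None:
                results.append(dict(doc))
                continue
            # Fields flagged 1 are kept; _id is kept unless explicitly set to 0.
            keep = {k for k, flag in projection.items() if flag}
            if projection.get("_id", 1):
                keep.add("_id")
            results.append({k: v for k, v in doc.items() if k in keep})
    return results

people = [
    {"_id": "a1", "firstName": "Santana", "lastName": "Peters", "gender": "male"},
    {"_id": "b2", "firstName": "Ivy", "lastName": "Castro", "gender": "female"},
]

all_docs = find(people)
women = find(people, {"gender": "female"})
names_only = find(people, {}, {"firstName": 1, "lastName": 1, "_id": 0})
```

Without `_id: 0` in the projection, each result would still carry its `_id` next to the two name fields.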
Everything is JSON
Available SDKs
https://docs.microsoft.com/en-us/azure/cosmos-db/
Non-Azure Equivalents
Document DB – Mongo DB
Mongo DB – Mongo DB (Captain Obvious to the rescue!)
Azure Table Storage – Cassandra, HBase
Graph API – Neo4j, Titan
Choose the right tool for the job
Future
Consistency
Strong consistency & eventual consistency
Strong Consistency – always reading current data
Eventual Consistency – data will be consistent... eventually
There are other consistency levels? Like what?
Somewhat consistent?
5 of them in Cosmos DB
https://docs.microsoft.com/en-us/azure/cosmos-db/consistency-levels
Strong consistency
[Diagram: replica states as writes + B and + C are applied to A]
Strong consistency
You can use only one Azure region with strong consistency
Bounded-staleness
[Diagram: replica states as writes + B and + C are applied to A]
Bounded-staleness
Allows reads to lag behind writes by a bounded amount of time or number of item versions
Ordering is consistent except within the staleness window
Geo-replication to other Azure regions is available if the staleness window is at least 100,000 operations or 5 minutes
Session
[Diagram: replica states as writes + B and + C are applied to A]
Session
Scoped to a client session
A client reads its own writes, in a consistent order, within its own session
Better throughput and latency than the strong and bounded-staleness levels, while keeping strong guarantees inside the session
Consistent prefix
[Diagram: replica states as writes + B, + C and + D are applied to A]
Consistent prefix
Reads may return stale data, but they never see writes out of order
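One way to read the consistent prefix guarantee: if the writes were A, B, C, D in that order, a reader may see A, AB, ABC or ABCD, but never a view with a gap such as AC. A minimal Python check of that property, with the write history modelled as a plain string purely for illustration:

```python
def is_consistent_prefix(observed, write_order):
    """True if a read returned some prefix of the write history (no gaps, no reordering)."""
    return list(observed) == list(write_order)[: len(observed)]

writes = "ABCD"  # A was written first, then B, C and D

stale_but_ordered = is_consistent_prefix("AB", writes)
has_a_gap = is_consistent_prefix("AC", writes)
```

Here `stale_but_ordered` is true even though C and D are missing, while `has_a_gap` is false because B was skipped.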
Eventual consistency
[Diagram: replica states as writes + B, + C and + D are applied to A]
Eventual consistency
Cheapest reads (in Request Units)
Best in terms of latency and throughput
A read may return data older than what you saw a moment ago
But data will be consistent... eventually
Consistency can be changed per request
Just add the x-ms-consistency-level header to your request
Or look for this option in your SDK
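A per-request override can only relax the account's default level, not strengthen it. The sketch below builds the header under that rule. The level names in `LEVELS` are an assumption for this example, so confirm the exact header values in the REST API reference; `consistency_headers` is a hypothetical helper, not an SDK function.

```python
# Strongest to weakest. The exact header strings are an assumption for this
# sketch; confirm them in the REST API reference before relying on them.
LEVELS = ["Strong", "Bounded", "Session", "Eventual"]

def consistency_headers(account_default, requested):
    """Build the override header, refusing anything stronger than the account default."""
    if requested not in LEVELS or account_default not in LEVELS:
        raise ValueError("unknown consistency level")
    if LEVELS.index(requested) < LEVELS.index(account_default):
        raise ValueError("a request can only relax consistency, not strengthen it")
    return {"x-ms-consistency-level": requested}

headers = consistency_headers("Session", "Eventual")
```

The returned dictionary would be merged into the other request headers; asking for `Strong` on a `Session` account raises instead.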
Scalability
Request Units
Every request costs a certain number of Request Units
The cost depends on the consistency level and on the request itself
You pay for provisioning n RU per second
A Request Unit is a throughput metric
How much is a Request Unit?
~1 KB read – 1 RU
~1 KB query by id – 2.5 RU
~1 KB create – 15 RU
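These per-operation figures can be turned into a rough throughput estimate. The sketch below simply multiplies them by an operation mix; the workload numbers are hypothetical, and real costs vary with document size, indexing and consistency level.

```python
# Approximate per-operation costs for ~1 KB documents, taken from the figures above.
RU_COST = {"read": 1.0, "query_by_id": 2.5, "create": 15.0}

def required_ru_per_second(ops_per_second):
    """Sum the RU cost of a per-second operation mix, e.g. {"read": 100, "create": 10}."""
    return sum(RU_COST[op] * count for op, count in ops_per_second.items())

workload = {"read": 100, "query_by_id": 20, "create": 10}  # hypothetical traffic
needed = required_ru_per_second(workload)  # 100*1 + 20*2.5 + 10*15
```

Note how ten creates per second cost more than a hundred reads: write-heavy workloads dominate the RU budget.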
~1kb worth of JSON data
{
  "id": "59b413f6112b5af91a117944",
  "guid": "cf9aa26f-f863-4adf-b6bd-f4419427219c",
  "isActive": false,
  "balance": "$1,053.74",
  "picture": "http://placehold.it/32x32.jpg",
  "age": 32,
  "eyeColor": "blue",
  "name": "Lynette Campos",
  "gender": "female",
  "company": "XURBAN",
  "email": "lynettecampos@xurban.com",
  "secondaryEmail": "lynettecampos@secondary.com",
  "phone": "+1 (826) 428-2753",
  "address": "511 Rodney Street, Libertytown, Guam, 5768",
  "about": "Sit sunt est Lorem dolore id magna. Irure non proident culpa dolor enim. Ex veniam laborum consectetur pariatur mollit elit non commodo incididunt Lorem labore. Qui labore ut excepteur id laboris adipisicing ullamco et nulla irure nostrud exercitation adipisicing excepteur. Commodo duis incididunt ea in anim veniam eiusmod deserunt. In incididunt duis laborum in.\r\n",
  "registered": "2015-12-15T08:48:18 -01:00",
  "tags": ["lorem", "ipsum", "dolor", "nisi", "magna", "culpa", "pariatur", "tempor"],
  "favoriteFruit": "banana",
  "favouriteProgrammingLanguage": "C#"
}
How much does it cost?
100 RU/s – ~5.02 EUR monthly
1 GB of storage (SSD) – ~0.21 EUR monthly
Minimum 400 RU/s (~20.08 EUR monthly)
Each replica multiplies the amount!
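A back-of-the-envelope calculator for these prices might look like the sketch below. It assumes throughput is billed in whole blocks of 100 RU/s and that every replica region repeats both the throughput and the storage charge; both are simplifications for illustration, so check the current pricing page before budgeting.

```python
import math

EUR_PER_100_RU = 5.02   # monthly, from the figures above
EUR_PER_GB = 0.21       # monthly, from the figures above
MIN_RU = 400            # smallest amount that can be provisioned

def monthly_cost_eur(ru_per_second, storage_gb, regions=1):
    """Rough monthly estimate; assumes billing in whole 100 RU/s blocks per region."""
    ru = max(ru_per_second, MIN_RU)
    blocks = math.ceil(ru / 100)
    per_region = blocks * EUR_PER_100_RU + storage_gb * EUR_PER_GB
    return round(per_region * regions, 2)

smallest = monthly_cost_eur(400, 0)               # the minimum from the slide
two_regions = monthly_cost_eur(1000, 10, regions=2)
```

With these assumptions, 1,000 RU/s and 10 GB replicated to two regions costs twice the single-region amount.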
What if I use all of my Request Units?
HTTP Status 429
Status Line: RequestRateTooLarge
x-ms-retry-after-ms: 100
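A client is expected to back off for the advertised number of milliseconds and try again. A minimal retry loop, assuming a hypothetical `send` callable that returns `(status_code, headers, body)`; the official SDKs typically handle this for you, so treat it as a sketch of the idea rather than required code.

```python
import time

def send_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it stops returning 429 or the attempts run out.

    send is a hypothetical callable returning (status_code, headers, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts:
            break
        # The service says how long to back off, in milliseconds.
        wait_ms = int(headers.get("x-ms-retry-after-ms", 100))
        sleep(wait_ms / 1000.0)
    raise RuntimeError("still throttled after %d attempts" % max_attempts)
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.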
What’s the limit?
The minimum throughput that can be provisioned is 400 RU/s
Above 2,500 RU/s, a partition key is required
Above 10,000 RU/s, you need to contact support
Partitioning
What partition key should we choose?
{
"id": 9,
"firstName": "Rafał",
"lastName": "Hryniewski",
"birthday": "16-05-1988",
"project": "Hello world!"
}
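Documents are placed by hashing the partition key value, so a good key has many distinct values that spread evenly. The sketch below compares two candidate properties on made-up data; the MD5-modulo mapping and the four partitions are stand-ins for illustration, not the service's actual hashing.

```python
import hashlib
from collections import Counter

def partition_of(value, partitions=4):
    """Map a key value to a partition with a stable hash (a stand-in for the service's own hashing)."""
    digest = hashlib.md5(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions

def spread(docs, key, partitions=4):
    """Count how many documents land on each partition for a candidate key."""
    return Counter(partition_of(d[key], partitions) for d in docs)

# Hypothetical data: unique ids, but everyone works on the same project.
docs = [{"id": i, "project": "Hello world!"} for i in range(100)]
by_id = spread(docs, "id")
by_project = spread(docs, "project")
```

With this data, `project` sends every document to a single partition, while `id` spreads them across several.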
Let’s go global
~725 documents
West US: ~0.75-1.10 s
West Europe: 0.20-0.30 s
Azure regions
SLA
https://azure.microsoft.com/en-us/support/legal/sla/cosmos-db/v1_0/
SLA – Service Level Agreement
Service Credit = Service Fee × Service Credit Percent
Availability SLA
Measured as the percentage of failed requests
Below 99.99% availability: 10% Service Credit
Below 99% availability: 25% Service Credit
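Combined with the formula Service Credit = Service Fee × Service Credit Percent, the two tiers can be written as a small function. This is a sketch of the arithmetic only, with a hypothetical monthly fee; the SLA document defines how the attained percentage is actually measured.

```python
def service_credit(monthly_fee, attained_percent):
    """Credit owed under the two-tier scheme: below 99.99% gives 10%, below 99% gives 25%."""
    if attained_percent < 99.0:
        rate = 0.25
    elif attained_percent < 99.99:
        rate = 0.10
    else:
        rate = 0.0
    return monthly_fee * rate

credit = service_credit(200.0, 99.5)  # hypothetical 200 EUR bill, 99.5% attained
```

The same tiers appear in the throughput, consistency and latency SLAs, so the function applies to each of them with the corresponding attainment figure.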
Throughput SLA
You will be able to use the RU/s that you’ve provisioned
Below 99.99% throughput: 10% Service Credit
Below 99% throughput: 25% Service Credit
Consistency SLA
Your reads and writes will be executed within the chosen consistency level
Below 99.99% Consistency Attainment: 10% Service Credit
Below 99% Consistency Attainment: 25% Service Credit
Latency SLA
Guarantees 10 ms latency for reading a document of up to 1 KB in the same region
Guarantees 15 ms latency for writing a document of up to 1 KB in the same region
Below 99.99% Latency Attainment: 10% Service Credit
Below 99% Latency Attainment: 25% Service Credit
Getting started
Try for free
https://azure.microsoft.com/en-us/try/cosmosdb/
Free Cosmos DB for up to 48h
Dev essentials
https://www.visualstudio.com/dev-essentials/
25 EUR of Azure credit monthly
Cosmos DB Emulator
https://docs.microsoft.com/en-us/azure/cosmos-db/local-emulator
Free tool for offline Cosmos DB development
It’s so awesome! Isn’t it?
Links
Cosmos DB technical overview
• https://azure.microsoft.com/en-us/blog/a-technical-overview-of-azure-cosmos-db/
Database popularity ranking
• https://db-engines.com/en/ranking
PaaS databases on Azure
• http://hryniewski.net/2017/05/28/paas-databases-available-on-azure/
Cosmos DB documentation
• https://docs.microsoft.com/en-us/azure/cosmos-db/
Cosmos DB SLAs
• https://azure.microsoft.com/en-us/support/legal/sla/cosmos-db/v1_0/
Questions?
Thank you!
@r_hryniewski
fb.me/hryniewskinet
