JSON Data Modeling
Matthew D. Groves, @mgroves
David Segleau, @dsegleau
©2017 Couchbase Inc. 2
Agenda
Why NoSQL?
JSON Data Modeling
Accessing data
Migrating data
Where am I?
• Pittsburgh Tech Fest
• http://www.pghtechfest.com/
Who am I?
• Matthew D. Groves
• Developer Advocate for Couchbase
• @mgroves on Twitter
• Podcast and blog: http://crosscuttingconcerns.com
• “I am not an expert, but I am an enthusiast.” – Alan Stevens
Major Enterprises Across Industries are Adopting NoSQL
• Communications
• Technology
• Travel & Hospitality
• Media & Entertainment
• E-Commerce & Digital Advertising
• Retail & Apparel
• Games & Gaming
• Finance & Business Services
Why NoSQL?
NoSQL Landscape
Document
• Couchbase
• MongoDB
• DynamoDB
• DocumentDB
Graph
• OrientDB
• Neo4J
• DEX
• GraphBase
Key-Value
• Couchbase
• Riak
• BerkeleyDB
• Redis
• …
Wide Column
• HBase
• Cassandra
• Hypertable
NoSQL Landscape
Document
• Couchbase
• MongoDB
• DynamoDB
• DocumentDB
• Get by key(s)
• Set by key(s)
• Replace by key(s)
• Delete by key(s)
• Map/Reduce
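The by-key operations listed above can be sketched with a toy in-memory store. This is an illustration of the semantics only, written in Python for brevity; `KeyValueStore` and its method names are hypothetical and are not the API of Couchbase or any other product named on the slide.

```python
class KeyValueStore:
    """Toy in-memory stand-in for a document bucket (not a real SDK)."""

    def __init__(self):
        self._docs = {}

    def get(self, key):
        # Get by key: raises KeyError when the key is absent.
        return self._docs[key]

    def set(self, key, doc):
        # Set by key: create the document or overwrite it.
        self._docs[key] = doc

    def replace(self, key, doc):
        # Replace by key: only succeeds when the key already exists.
        if key not in self._docs:
            raise KeyError(key)
        self._docs[key] = doc

    def delete(self, key):
        # Delete by key: raises KeyError when the key is absent.
        del self._docs[key]


store = KeyValueStore()
store.set("CBL2015", {"Name": "Jane Smith", "DOB": "1990-01-30"})
store.replace("CBL2015", {"Name": "Jane Smith", "DOB": "1990-01-30", "Billing": []})
print(store.get("CBL2015")["Name"])
```

Every operation addresses exactly one document by its key, which is why key design gets its own slides later in the deck.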
Why NoSQL? Scalability
Why NoSQL? Flexibility
Why NoSQL? Performance
Why NoSQL? Availability
JSON Data Modeling
Models for Representing Data
Data Concern | Relational Model | JSON Document Model
Rich Structure
Relationships
Value Evolution
Structure Evolution
Properties of Real-World Data
Modeling Data in Relational World
Billing
Connections
Purchases
Contacts
Customer
Table: Customer
CustomerID | Name | DOB
CBL2015 | Jane Smith | 1990-01-30
Customer DocumentKey: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30"
}
Table: Customer
CustomerID | Name | DOB
CBL2015 | Jane Smith | 1990-01-30
Table: Billing
CustomerID | Type | Cardnum | Expiry
CBL2015 | visa | 5827… | 2019-03
Customer DocumentKey: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5827-2842-2847-3909",
      "expiry" : "2019-03"
    }
  ]
}
Table: Customer
CustomerID | Name | DOB
CBL2015 | Jane Smith | 1990-01-30
Table: Billing
CustomerID | Type | Cardnum | Expiry
CBL2015 | visa | 5827… | 2019-03
CBL2015 | master | 6274… | 2018-12
Customer DocumentKey: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5827-2842-2847-3909",
      "expiry" : "2019-03"
    },
    {
      "type" : "master",
      "cardnum" : "6274-2542-5847-3949",
      "expiry" : "2018-12"
    }
  ]
}
Table: Connections
CustomerID | ConnId | Name
CBL2015 | XYZ987 | Joe Smith
CBL2015 | SKR007 | Sam Smith
Customer DocumentKey: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5827-2842-2847-3909",
      "expiry" : "2019-03"
    },
    {
      "type" : "master",
      "cardnum" : "6274-2542-5847-3949",
      "expiry" : "2018-12"
    }
  ],
  "Connections" : [
    {
      "ConnId" : "XYZ987",
      "Name" : "Joe Smith"
    },
    {
      "ConnId" : "SKR007",
      "Name" : "Sam Smith"
    }
  ]
}
DocumentKey: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5827-2842-2847-3909",
      "expiry" : "2019-03"
    },
    {
      "type" : "master",
      "cardnum" : "6274-2842-2847-3909",
      "expiry" : "2019-03"
    }
  ],
  "Connections" : [
    {
      "CustId" : "XYZ987",
      "Name" : "Joe Smith"
    },
    {
      "CustId" : "PQR823",
      "Name" : "Dylan Smith"
    }
  ],
  "Purchases" : [
    { "id" : 12, "item" : "mac", "amt" : 2823.52 },
    { "id" : 19, "item" : "ipad2", "amt" : 623.52 }
  ]
}
Relational tables: Contacts, Customer, Billing, Connections, Purchases
Table: Customer
CustomerID | Name | DOB
CBL2015 | Jane Smith | 1990-01-30
Table: Billing
CustomerID | Type | Cardnum | Expiry
CBL2015 | visa | 5827… | 2019-03
CBL2015 | master | 6274… | 2018-12
Table: Connections
CustomerID | ConnId | Name
CBL2015 | XYZ987 | Joe Smith
CBL2015 | SKR007 | Sam Smith
Table: Purchases
CustomerID | item | amt
CBL2015 | mac | 2823.52
CBL2015 | ipad2 | 623.52
Models for Representing Data
Data Concern | Relational Model | JSON Document Model (NoSQL)
Rich Structure | Multiple flat tables; constant assembly / disassembly | Documents; no assembly required!
Relationships | Represented; queried (SQL) | Represented; queried with N1QL (SQL for JSON)
Value Evolution | Data can be updated | Data can be updated
Structure Evolution | Uniform and rigid; manual change (disruptive) | Flexible; dynamic change
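To make "constant assembly / disassembly" concrete, here is a minimal Python sketch that joins flat rows into the single customer document shown on the earlier slides. The row values come from the deck; the in-memory lists and the `assemble_customer` helper are assumptions for illustration, not part of any database API.

```python
import json

# Rows as they might come back from three flat tables (values from the slides).
customer_rows = [{"CustomerID": "CBL2015", "Name": "Jane Smith", "DOB": "1990-01-30"}]
billing_rows = [
    {"CustomerID": "CBL2015", "Type": "visa", "Cardnum": "5827-2842-2847-3909", "Expiry": "2019-03"},
    {"CustomerID": "CBL2015", "Type": "master", "Cardnum": "6274-2542-5847-3949", "Expiry": "2018-12"},
]
connection_rows = [
    {"CustomerID": "CBL2015", "ConnId": "XYZ987", "Name": "Joe Smith"},
    {"CustomerID": "CBL2015", "ConnId": "SKR007", "Name": "Sam Smith"},
]


def assemble_customer(customer_id):
    """Join the flat rows for one customer into (key, document)."""
    customer = next(r for r in customer_rows if r["CustomerID"] == customer_id)
    doc = {
        "Name": customer["Name"],
        "DOB": customer["DOB"],
        "Billing": [
            {"type": r["Type"], "cardnum": r["Cardnum"], "expiry": r["Expiry"]}
            for r in billing_rows
            if r["CustomerID"] == customer_id
        ],
        "Connections": [
            {"ConnId": r["ConnId"], "Name": r["Name"]}
            for r in connection_rows
            if r["CustomerID"] == customer_id
        ],
    }
    return customer_id, doc


key, doc = assemble_customer("CBL2015")
print(key)
print(json.dumps(doc, indent=2))
```

In the relational model this join runs on every read; in the document model it runs once, when the document is written.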
Demo: Modeling
Modeling your data: Strategies / rules of thumb
If … | Then …
Relationship is one-to-one or one-to-many | Store related data as nested objects
Relationship is many-to-one or many-to-many | Store related data as separate documents
Data reads are mostly parent fields | Store children as separate documents
Data reads are mostly parent + child fields | Store children as nested objects
Data writes are mostly parent or child (not both) | Store children as separate documents
Data writes are mostly parent and child (both) | Store children as nested objects
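The two storage choices in these rules of thumb can be compared side by side. The sketch below uses plain Python dicts as a stand-in for a bucket; the `customer::` and `purchase::` key prefixes are hypothetical and simply follow the `::` convention used later in the deck.

```python
# Option 1: children nested in the parent
# (suits reads and writes that touch parent and children together).
nested = {
    "customer::CBL2015": {
        "Name": "Jane Smith",
        "Purchases": [
            {"id": 12, "item": "mac", "amt": 2823.52},
            {"id": 19, "item": "ipad2", "amt": 623.52},
        ],
    }
}

# Option 2: children as separate documents, referenced by key
# (suits reads and writes that touch the parent or a child, not both).
separate = {
    "customer::CBL2015": {"Name": "Jane Smith", "Purchases": ["purchase::12", "purchase::19"]},
    "purchase::12": {"item": "mac", "amt": 2823.52},
    "purchase::19": {"item": "ipad2", "amt": 623.52},
}


def total_nested(store, customer_key):
    # One lookup returns the parent and all of its children.
    return sum(p["amt"] for p in store[customer_key]["Purchases"])


def total_separate(store, customer_key):
    # One lookup for the parent, then one more per referenced child.
    return sum(store[k]["amt"] for k in store[customer_key]["Purchases"])


print(total_nested(nested, "customer::CBL2015"))
print(total_separate(separate, "customer::CBL2015"))
```

Both functions return the same total; the difference is the number of lookups per read and how many documents a single write has to touch.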
Accessing Data
Accessing your data (Couchbase)
Key-Value (CRUD): Documents
N1QL (Query): Indexes
Views (Query): MapReduce
Full Text (Search): Indexes
Geospatial (Search): MapReduce
Key/Value
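// Read: fetch a shopping cart document by its key.
public ShoppingCart GetCartById(Guid id)
{
    return _bucket.Get<ShoppingCart>(id.ToString()).Value;
}

// Create: insert a new document under a freshly generated GUID key.
public void CreateShoppingCart()
{
    _bucket.Insert(new Document<dynamic>
    {
        Id = Guid.NewGuid().ToString(),
        Content = new { . . . }
    });
}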
Key/Value: Recommendations for keys
• Natural Keys
• Human Readable
• Deterministic
• Semantic
Key/Value: Example keys
• author::matt
• author::matt::blogs
• blog::csharp_7_features
• blog::csharp_7_features::comments
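Keys in this style are deterministic, so an application can build them from values it already has instead of looking them up. A minimal Python sketch follows; the `make_key` helper and its normalization rules (lowercase, underscores for spaces) are assumptions for illustration, not a Couchbase requirement.

```python
def make_key(*parts):
    """Join key segments with '::', lowercasing and replacing spaces with underscores."""
    cleaned = [str(p).strip().lower().replace(" ", "_") for p in parts]
    if not all(cleaned):
        raise ValueError("key segments must be non-empty")
    return "::".join(cleaned)


print(make_key("author", "matt"))                          # author::matt
print(make_key("blog", "CSharp 7 Features", "comments"))   # blog::csharp_7_features::comments
```

Given a blog's key, the key of its comments document follows by appending one segment, so related documents can be fetched by key without a query.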
N1QL
Understanding your Query Plan
Map/Reduce
Accessing your data: Strategies and recommendations
Concept | Strategies & Recommendations
Key-Value operations provide the best possible performance | Create an effective key naming strategy; create an optimized data model
Incremental MapReduce (Views) are well suited to aggregation | Ideal for large data sets; data set can be used to create complex view indexes
N1QL queries provide the most flexibility (everything else) | Query data regardless of how it is modeled; good indexing is vital
Migrating Data
Migration options: Requirements
• ETL / data cleanse / data enrichment
• Duration vs. Resources
• Data governance
Migration options: Pick your strategy
• Batch vs. Incremental
• Single threaded vs. multi-threaded
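The two strategy axes can be sketched together: rows are split into batches, and the batches are processed either one after another or by a pool of worker threads. This is a toy Python illustration under stated assumptions: the target is a plain dict rather than a database, and `batches` and `migrate_batch` are hypothetical helpers.

```python
from concurrent.futures import ThreadPoolExecutor


def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def migrate_batch(batch, target):
    # Stand-in for "transform and write"; a real job would call the database SDK here.
    for row in batch:
        target[f"customer::{row['CustomerID']}"] = {"Name": row["Name"]}
    return len(batch)


rows = [{"CustomerID": f"C{i:04d}", "Name": f"Customer {i}"} for i in range(10)]

# Single-threaded: simple and easy to reason about.
single_target = {}
single_count = sum(migrate_batch(b, single_target) for b in batches(rows, 4))

# Multi-threaded: each worker handles its own batch. Writing distinct keys into a
# dict is safe under CPython; a real target needs its own concurrency guarantees.
multi_target = {}
with ThreadPoolExecutor(max_workers=3) as pool:
    multi_count = sum(pool.map(lambda b: migrate_batch(b, multi_target), batches(rows, 4)))

print(single_count, multi_count)
```

Both runs produce the same documents; the multi-threaded version trades simplicity for throughput, which matters only when duration is the tighter constraint.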
Migration options: Pick your tools
• Data migration tools:
  • Informatica, Looker, Talend
• BYO-tool
  • C# / PowerShell / etc.
  • RhinoETL / DTS / SSIS
• Hadoop, Spark
Migration options: KISS
• CSV:
  • Export to CSV
  • Import as documents into a 'staging' bucket
  • Use N1QL to transform
  • Insert into new bucket
• SQL:
  • Transform
  • Export
  • Insert into document database
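The CSV path above can be sketched end to end. This Python example is a stand-in under stated assumptions: an in-memory string plays the exported file, dicts play the staging and target buckets, and a plain loop takes the place of the N1QL transform step. The `staging::` key prefix and the `type` field are hypothetical.

```python
import csv
import io

# Stand-in for an exported CSV file.
exported = io.StringIO(
    "CustomerID,Name,DOB\n"
    "CBL2015,Jane Smith,1990-01-30\n"
)

# Step 1: import each CSV row as-is into a "staging" collection.
staging = {}
for line_number, row in enumerate(csv.DictReader(exported), start=1):
    staging[f"staging::{line_number}"] = dict(row)

# Step 2: transform staged rows into the target shape and insert them
# under their final keys (the slide does this step with N1QL).
customers = {}
for staged in staging.values():
    customers[staged["CustomerID"]] = {
        "type": "customer",
        "Name": staged["Name"],
        "DOB": staged["DOB"],
    }

print(customers)
```

Keeping the raw rows in staging means the transform can be rerun or corrected without exporting from the source system again.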
Migration options: Recommendations
• Align with your data model
• Plan for failure
  • Bad source data
  • Hardware failure
  • Resource limitations
• Ensure: Interruptible, restartable, logged, predictable
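One way to make a migration interruptible, restartable, and logged is to record a checkpoint after every row and to skip bad rows instead of aborting. The Python sketch below is a minimal illustration under stated assumptions: the checkpoint is a small JSON file, the target is a dict, and all function names are hypothetical.

```python
import json
import logging
import os
import tempfile

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("migration")


def load_checkpoint(path):
    """Return the index of the next row to process (0 if no checkpoint exists)."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_row"]


def save_checkpoint(path, next_row):
    with open(path, "w") as f:
        json.dump({"next_row": next_row}, f)


def migrate(rows, target, checkpoint_path):
    """Copy rows into target, skipping bad rows and resuming after interruptions."""
    skipped = 0
    for index in range(load_checkpoint(checkpoint_path), len(rows)):
        row = rows[index]
        if not row.get("CustomerID"):
            # Bad source data: log it and move on instead of aborting the run.
            log.warning("row %d skipped: missing CustomerID", index)
            skipped += 1
        else:
            target[row["CustomerID"]] = {"Name": row.get("Name")}
        save_checkpoint(checkpoint_path, index + 1)
    return skipped


rows = [{"CustomerID": "CBL2015", "Name": "Jane Smith"}, {"Name": "no id"}]
target = {}
checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
skipped = migrate(rows, target, checkpoint)
rerun_skipped = migrate(rows, target, checkpoint)  # resumes at the checkpoint; nothing left to do
```

Because the second call starts from the saved position, rerunning after a crash does not repeat work that already completed.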
Sync NoSQL and relational? Automatic Replication
Couchbase → DCP Stream → Producer → Kafka Queue → Consumer → RDBMS
How can you sync NoSQL and relational?
RDBMS → SSIS (CData) → Couchbase
https://www.cdata.com/drivers/couchbase
Sync NoSQL and relational? Manual.
Summary
• Pick the right application
• Drive data model from data access patterns
• Match the data access method to requirements
• Proof of Concept
Resources
• https://blog.couchbase.com/moving-from-sql-server-to-couchbase-part-1-data-modeling/
  – http://tinyurl.com/jsonmodel1
• https://blog.couchbase.com/sql-to-json-data-modeling-hackolade/
  – http://tinyurl.com/jsonmodel2
Couchbase, everybody!
Where do you find us?
• blog.couchbase.com
• @couchbasedev
• @mgroves
Frequently Asked Questions
1. How is Couchbase different than Mongo?
2. Is Couchbase the same thing as CouchDB?
3. How did you get to be both incredibly handsome and tremendously intelligent?
4. What is the Couchbase licensing situation?
5. Is Couchbase a managed cloud service?
6. Transactions?


JSON Data Modeling, June 2017, Pittsburgh Tech Fest


Editor's Notes

  1. Spend just a little time on why people are using NoSQL. Talk about how data is modeled differently in JSON. Let’s talk about why SQL is good and why SQL for JSON is needed. Let’s talk about the exciting stuff happening in the database ecosystem, including but not limited to the stuff Couchbase is doing. If we have time, we’ll look at how a .NET developer (or Java developer, etc.) would interact with SQL for JSON.
  2. This session is a WIP. It’s based on my knowledge of Couchbase, my SQL Server experience, and David Segleau’s engagement and lessons learned with customers, all combined into an hour-long presentation. David likes bullet points; I like to break up bullet points and use lots of pictures. David works with customers; I work with the dev community. So you’re going to see a meshing of that. Hopefully it works.
  3. What’s also interesting is that we’re seeing the use of NoSQL expand inside many of these companies. Orbitz, the online travel company, is a great example: they started using Couchbase to store their hotel rate data, and now they use Couchbase in many other ways. Same with eBay: they recently presented at the Couchbase conference with a chart tracking how many instances of various NoSQL databases are in use. We see growth in Cassandra and MongoDB, and Couchbase has actually surpassed them within eBay.
  4. SQL (relational) databases are great. They give you a LOT OF functionality: a great set of abstractions (tables, columns, data types, constraints, triggers, SQL, ACID TRANSACTIONS, stored procedures and more) at a highly reasonable cost. Change is inevitable, and one thing an RDBMS does not handle well is CHANGE: change of schema (both logical and physical), change of hardware, change of capacity. NoSQL databases, ESPECIALLY ONES DESIGNED TO BE DISTRIBUTED, tend to help solve problems with agility, scalability, performance, and availability.
  5. Let’s talk about what NoSQL is, first. NoSQL generally refers to databases which lack SQL or don’t use a relational model. Once the SQL language and transactions became optional, a flurry of databases were created using distinct approaches for common use cases. Key-Value databases simply provide quick access to data for a given KEY. Wide Column databases can store a large number of arbitrary columns in each row. Graph databases store data and relationships as first-class concepts. Document databases aggregate data into a hierarchical structure; JSON is a means to that end. Document databases provide flexible schema, built-in data types, rich structure, and implicit relationships using JSON.
  6. When we look at document databases, they originally came with a minimal set of APIs and features. But as they continue to mature, we’re seeing more features being added, and generally I’m seeing a convergent trend between SQL and NoSQL. But anyway, this set of minimal features, lacking a SQL language and tables, gives us the buzzword “NoSQL”.
  7. Elastic scaling: size your cluster for today, scale out on demand. Cost-effective scaling: commodity hardware, on premise or in the cloud. Scale OUT instead of scale UP. [example: changing the channel to a soccer game or Game of Thrones, everyone makes the same API request in the same 5 minutes] [example: TV show lets watchers vote during some period of the week, so you can scale up during that period of time] [example: Black Friday]
  8. Schema flexibility: easier management of change in the business requirements; easier management of change in the structure of the data. Sometimes you're pulling together data, integrating from different sources (e.g. ELT), and that flexibility helps. A document database means that you have no rigid schema. You can do whatever the heck you want. That being said, you SHOULDN’T. You should still have discipline about your data.
  9. NoSQL systems are optimized for specific access patterns: low response time for web & mobile user experience, millisecond latency, consistently high throughput to handle growth. [perf measures can be subjective – talk about architecture, integrated cache, maybe mention MDS too]
  10. If one machine goes down, customers can still use the other. Or if you need to perform maintenance, upgrades, etc., you don't have to take the whole system down. This is related to scaling. Built-in replication and failover: no application downtime when hardware fails. Online maintenance & upgrade: no application downtime.
  11. Let’s talk about data modeling a bit, because storing data in JSON is different than storing in tables.
  12. So I want to compare the approaches over 4 key areas. I’m going to fill in this table, with traditional SQL on the left and JSON on the right.
  13. Let’s look at modeling Customer data. This is an example of what a customer might look like. Rich structure: attributes, potentially sub-attributes (first name and last name). Relationships: to other data (other customers, to products perhaps). Value evolution: maybe we’d start with one connection, then change to multiple (data is updated). Structure evolution: maybe we start without connections and add those later, or we evolve the name field to be more than first and last name (data is reshaped).
  14. Rich Structure: In a relational database, this customer’s data would be stored in five normalized tables. Each time you want to construct a customer object, you JOIN the data in these tables; each time you persist, you find the appropriate rows in the relevant tables and insert/update. Relationships: Enforcement is via referential constraints. Objects are constructed by JOINs, EACH time. Value Evolution: Additional values of the SAME TYPE (e.g. additional phone, additional address) are managed by additional ROWS in one of the tables. Customer:contacts will have a 1:n relationship. Structure Evolution: Imagine we didn't start with a billing table. This is the most difficult part. Changing the structure is difficult, within a table and across tables. While you can do this via ALTER TABLE, it requires downtime, migration and application versioning. This is one of the problems document databases try to handle by representing data in JSON.
  15. Let’s see how to represent customer data in JSON. The primary key (CustomerID) becomes the DocumentKey. Each column name/column value becomes a KEY-VALUE pair.
  16. We aren’t in normal form anymore. Rich Structure & Relationships: Billing information is stored as a sub-document. There could be more than a single credit card, so use an array.
  17. Value evolution: simply add an additional array element or update a value.
  18. Structure evolution: simply add new key-value pairs. No downtime to add new KV pairs. Applications can validate data. Structure evolves over time. Relations via reference.
  19. So, finally, you have a JSON document that represents a CUSTOMER. In a single JSON document, the relationship between the data is implicit through the use of sub-structures, arrays, and arrays of sub-structures.
  20. Reference slide
  21. What types of relationships are being modeled? How are the relationships accessed?
  22. Let’s talk about data modeling a bit, because storing data in JSON is different than storing in tables.
  23. We’ll focus on N1QL for now.
  24. Notice I’m using Guid. That may not be a good idea.
  25. N1QL is powerful in its flexibility, declarative nature, familiarity to developers, JOINs, etc. Indexing is very important, as it's not as performant as key/value or map/reduce. (Maybe talk about indexing on a SQL table vs indexing on a whole bucket.)
  26. Couchbase 5.0 has introduced some tools for analyzing query performance, so you can see what indexes are being used, where the biggest costs are in the query, and so on. There are a lot of different types of indexes for N1QL.
  27. This is kinda like a materialized view. It's powerful in that it can be run in parallel, can use JavaScript to do filtering/mapping, and is great for aggregation. It's limited in that it can't do anything like a JOIN, can't get input from other views, and more.
  28. Let’s talk about data modeling a bit, because storing data in JSON is different than storing in tables.
  29. Are you going to take the time to clean up the data? Do you need to? Do you need to enrich or restructure the data to take advantage of JSON? Duration vs. resources: how long is it going to take? What tools and resources are available to you? Data governance: what are the rules for moving data, auditing, etc.?
  30. Duration vs. resources: how long is it going to take? What tools and resources are available to you? What’s your biggest constraint: time or resources? Do you need to get the migration done in 1 hour (and have it use as many parallel resources as needed), or do you need to minimize/manage the resource impact on the existing system, and it doesn’t matter how long it takes?
  31. Data governance: what are the rules for moving data, auditing, etc? Do you need to keep track of where the data came from and who is allowed to access it? Many newer systems need to track where sensitive data originated. 
  32. A whole bunch at a time, or one at a time. Single threaded: easier. Multi-threaded: faster, more complicated. Is the migration a one-time event, or does it need to happen incrementally (every day, or over a 2-3 month period where the old system and new system are operating in parallel)? Do you plan to do the data migration as a single thread (read all the data, write all of the data) or using a multi-threaded or multi-process approach where each thread or process reads some percentage of the data?
  33. If you're writing your own, Entity Framework can be helpful, because it can do the mapping of aggregate root C# objects for you, which you can then write to a document database. So if you already have EF mappings created, you're part way there.
  34. KISS: Either export to CSV and use N1QL to do any ETL that’s required (assuming that it’s simple), or use SQL to do simple ETL on export and then just import into Couchbase. Basically keep it as simple as you can and plan for failure. Developers often think of the migration process as “One and Done”, but the reality is that data migration is often an ongoing headache that DevOps needs to monitor and manage in a production environment. Make everyone’s life easier by thinking about the long game as much as possible.
  35. From NoSQL to relational
  36. From relational to NoSQL: GoldenGate is from Oracle; CData is for SSIS and Couchbase. https://github.com/mahurtado/CouchbaseGoldenGateAdapter https://www.cdata.com/drivers/couchbase
  37. Make it part of your application directly. May or may not be reusable. This is a lot of work, so make sure you have a good reason.
  38. Let’s talk about data modeling a bit, because storing data in JSON is different than storing in tables.
  39. Focus on SOA, application/use case specific
  40. Use Document type, Versionid. Create optimized, understandable keys. Weigh nested, referenced or mixed designs. Add indexes: Simple, Compound, Functional, Partial, Array, Covering, Memory Optimized.
  41. N1QL, Key-value, Views.
  42. Focus, Success Criteria, Review Architecture. Consider using a tool like Hackolade to define models rigorously and collaboratively.
  43. Start the animation
  44. Mongo: Couchbase features N1QL, XDCR, Full Text Search, Mobile & Sync, a memory-first architecture, and proven, easy scaling. CouchDB: Couchbase started a long time ago as a whole new piece of software that was basically a combination of memcached and CouchDB, but has grown far beyond that. Couchbase isn’t a fork of CouchDB or vice versa. They share an acronym and they are both NoSQL. Like MySQL and SQL Server, for instance. Licensing: open source Apache license for Community Edition; Enterprise Edition is on a faster release schedule, with some advanced features and a support license. Cloud: Couchbase is software you can run in the cloud on a VM or in your own data center. CosmosDB is a managed cloud service, but there is an emulator you can run locally. Transactions: if you can use nested modeling, you don't need multi-document transactions.