JSON Data Modeling - July 2018 - Tulsa Techfest

JSON Data Modeling
Matthew D. Groves, @mgroves

AGENDA
01/ Why NoSQL?
02/ JSON Data Modeling
03/ Accessing Data
04/ Migrating Data
05/ Summary / Q&A

Where am I?
• Tulsa Tech Fest
• https://grouplings.com/TulsaTechFest
• https://twitter.com/TulsaTechFest

Who am I?
• Matthew D. Groves
• Developer Advocate for Couchbase
• @mgroves on Twitter
• Podcast and blog: https://crosscuttingconcerns.com
• "I am not an expert, but I am an enthusiast." – Alan Stevens
by @natelovett

Major Enterprises Across Industries are Adopting NoSQL
• Communications
• Technology
• Travel & Hospitality
• Media & Entertainment
• E-Commerce & Digital Advertising
• Retail & Apparel
• Games & Gaming
• Finance & Business Services

NoSQL Landscape
Document
• Couchbase
• MongoDB
• DynamoDB
• CosmosDB
Graph
• OrientDB
• Neo4J
• DEX
• GraphBase
Key-Value
• Couchbase
• Riak
• BerkeleyDB
• Redis
Wide Column
• HBase
• Cassandra
• Hypertable

NoSQL Landscape
• Get by key(s)
• Set by key(s)
• Replace by key(s)
• Delete by key(s)
• Map/Reduce
Document
• Couchbase
• MongoDB
• DynamoDB
• CosmosDB
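
The key-based operations listed above can be illustrated with a toy in-memory store. This is a minimal sketch in Python, not any vendor's SDK: the class name KeyValueStore and its methods are hypothetical stand-ins, and a plain dict plays the role of the database.

```python
class KeyValueStore:
    """Toy in-memory stand-in for a key-value database (hypothetical, not a real SDK)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Get by key: raises KeyError when the key is absent.
        return self._data[key]

    def set(self, key, value):
        # Set by key: creates the entry or overwrites an existing one.
        self._data[key] = value

    def replace(self, key, value):
        # Replace by key: only succeeds when the key already exists.
        if key not in self._data:
            raise KeyError(key)
        self._data[key] = value

    def delete(self, key):
        # Delete by key: raises KeyError when the key is absent.
        del self._data[key]


store = KeyValueStore()
store.set("CBL2015", {"Name": "Jane Smith", "DOB": "1990-01-30"})
store.replace("CBL2015", {"Name": "Jane Smith", "DOB": "1990-01-30", "Purchases": []})
customer = store.get("CBL2015")
```

Every operation is addressed by key only, which is why key design matters so much in this model.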

Models for Representing Data
Data Concern Relational Model JSON Document Model
Rich Structure
Relationships
Value Evolution
Structure Evolution

Properties of Real-World Data

Modeling Data in a Relational World
• Billing
• Connections
• Purchases
• Contacts
• Customer

Table: Customer
CustomerID  Name        DOB
CBL2015     Jane Smith  1990-01-30

Customer
DocumentKey: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30"
}
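
The mapping above, where the primary key becomes the document key and the remaining columns become JSON fields, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name row_to_document is hypothetical, and the row is represented as a plain dict rather than a real database cursor result.

```python
import json


def row_to_document(row, key_column="CustomerID"):
    """Split a relational row into (document key, JSON document body)."""
    key = row[key_column]
    # Every column except the primary key becomes a field of the document.
    body = {column: value for column, value in row.items() if column != key_column}
    return key, body


customer_row = {"CustomerID": "CBL2015", "Name": "Jane Smith", "DOB": "1990-01-30"}
key, doc = row_to_document(customer_row)
print(key)
print(json.dumps(doc, indent=2))
```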

©2017 Couchbase Inc.
Table: Customer
CustomerID  Name  DOB

Table: Purchases
CustomerID  Item    Amount   Date
CBL2015     laptop  1499.99  2019-03

{
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    }
  ]
}

Table: Customer
CustomerID  Name  DOB

Table: Purchases
CustomerID  Item    Amount   Date
CBL2015     laptop  1499.99  2019-03
CBL2015     phone   99.99    2018-12

{
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    },
    {
      "item" : "phone",
      "amount" : 99.99,
      "date" : "2018-12"
    }
  ]
}
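
Folding the rows of a child table into an array inside the parent document, as the slides above show for Purchases, might look like the following Python sketch. The function name embed_children is hypothetical, and rows are plain dicts standing in for query results.

```python
from collections import defaultdict


def embed_children(parent_rows, child_rows, key_column, field_name):
    """Build one document per parent row, nesting its child rows as an array."""
    children_by_parent = defaultdict(list)
    for row in child_rows:
        child = dict(row)
        # The foreign key is implied by nesting, so it is dropped from the child.
        parent_key = child.pop(key_column)
        children_by_parent[parent_key].append(child)

    documents = {}
    for row in parent_rows:
        body = dict(row)
        key = body.pop(key_column)
        body[field_name] = children_by_parent.get(key, [])
        documents[key] = body
    return documents


customers = [{"CustomerID": "CBL2015", "Name": "Jane Smith", "DOB": "1990-01-30"}]
purchases = [
    {"CustomerID": "CBL2015", "item": "laptop", "amount": 1499.99, "date": "2019-03"},
    {"CustomerID": "CBL2015", "item": "phone", "amount": 99.99, "date": "2018-12"},
]
documents = embed_children(customers, purchases, "CustomerID", "Purchases")
```

The join that a relational query would perform at read time is done once here, at write time.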

Table: Connections
CustomerID  ConnId  Relation
CBL2015     XYZ987  Brother
CBL2015     SKR007  Father

{
  "DOB" : "1990-01-30",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5827-2842-...",
      "expiry" : "2019-03"
    }, ...
  ],
  "Connections" : [
    {
      "ConnId" : "XYZ987",
      "Relation" : "Brother"
    },
    {
      "ConnId" : "SKR007",
      "Relation" : "Father"
    }
  ]
}

{
  "DOB" : "1990-01-30",
  "cardnum" : "5827-2842…",
  "expiry" : "2019-03",
  "cardType" : "visa",
  "Connections" : [
    {
      "CustId" : "XYZ987"
    },
    {
      "CustId" : "SKR007",
      "Relation" : "Father"
    }
  ],
  "Purchases" : [
    { "id" : 12, "item" : "mac", "amt" : 2823.52 },
    { "id" : 19, "item" : "ipad2", "amt" : 623.52 }
  ]
}
DocumentKey: CBL2015

Table: Customer
CustomerID  Name        DOB         Cardnum     Expiry   CardType
CBL2015     Jane Smith  1990-01-30  5827-2842…  2019-03  visa

Table: Connections
CustomerID  ConnId  Relation

Table: Purchases
CustomerID  item   amt
CBL2015     mac    2823.52
CBL2015     ipad2  623.52

Table: Contacts
CustomerID  ConnId  Name
CBL2015     XYZ987  Joe Smith
CBL2015     SKR007  Sam Smith

{
  "Name" : "Bob Jones",
  "DOB" : "1980-01-29",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5927-2842-2847-3909",
      "expiry" : "2020-03"
    },
    {
      "type" : "master",
      "cardnum" : "6273-2842-2847-3909",
      "expiry" : "2019-11"
    }
  ],
  "Connections" : [
    {
      "CustId" : "XYZ987"
    },
    {
      "CustId" : "PQR823",
      "Relation" : "Father"
    }
  ],
  "Purchases" : [
    { "id" : 12, "item" : "mac", "amt" : 2823.52 },
    { "id" : 19, "item" : "ipad2", "amt" : 623.52 }
  ]
}
DocumentKey: CBL2016

Table: Customer
CustomerID  Name       DOB
CBL2016     Bob Jones  1980-01-29

Table: Billing
CustomerID  Type    Cardnum  Expiry
CBL2016     visa    5927…    2020-03
CBL2016     master  6273…    2019-11

Table: Connections
CustomerID  ConnId  Relation

Table: Purchases
CustomerID  item   amt
CBL2016     mac    2823.52
CBL2016     ipad2  623.52

Table: Contacts
CustomerID  ConnId  Name
CBL2016     XYZ987  Joe Smith
CBL2016     SKR007  Sam Smith

Models for Representing Data
Data Concern | Relational Model | JSON Document Model
Rich Structure
• Relational Model: multiple flat tables; assembly / disassembly
• JSON Document Model: documents; no (or less) assembly required
Relationships
• Relational Model: represented; queries with SQL
• JSON Document Model: represented; queried…with?
Value Evolution
• Relational Model: data can be updated
• JSON Document Model: data can be updated
Structure Evolution
• Relational Model: uniform, rigid, enforced; manual disruptive change
• JSON Document Model: flexible; dynamic change; increased app responsibility
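
"Increased app responsibility" can be made concrete with the two customer documents shown earlier: CBL2015 stores one card as flat fields (cardnum, expiry, cardType), while CBL2016 stores a Billing array. With no schema enforcing one shape, the reading code has to accept both. The Python sketch below shows one way to do that; the function name billing_entries is hypothetical.

```python
def billing_entries(doc):
    """Return billing info as a list, whichever shape the document uses."""
    if "Billing" in doc:
        # Newer shape: an array of card objects.
        return list(doc["Billing"])
    if "cardnum" in doc:
        # Older shape: a single card stored as flat top-level fields.
        return [{
            "type": doc.get("cardType"),
            "cardnum": doc["cardnum"],
            "expiry": doc.get("expiry"),
        }]
    # No billing information at all.
    return []


old_shape = {"DOB": "1990-01-30", "cardnum": "5827-2842…",
             "expiry": "2019-03", "cardType": "visa"}
new_shape = {"Name": "Bob Jones", "Billing": [
    {"type": "visa", "cardnum": "5927-2842-2847-3909", "expiry": "2020-03"},
    {"type": "master", "cardnum": "6273-2842-2847-3909", "expiry": "2019-11"},
]}

cards_old = billing_entries(old_shape)
cards_new = billing_entries(new_shape)
```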

Modeling your data: Strategies / rules of thumb
Relationship is one-to-one or one-to-many
Store related data as nested objects
{
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    },
    {
      "item" : "phone",
      "amount" : 99.99,
      "date" : "2018-12"
    }
  ]
}

Relationship is many-to-one or many-to-many
Store related data as separate documents
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Connections" : [
    "XYZ987",
    "PQR823",
    "PQR828"
  ]
}
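
When related data lives in separate documents, the parent holds only the keys and the reader follows them with additional lookups. A minimal Python sketch, assuming a dict as a stand-in for the database: the function name get_with_connections is hypothetical, and the bodies of PQR823 and PQR828 are placeholder values invented for illustration.

```python
documents = {
    "CBL2015": {
        "Name": "Jane Smith",
        "DOB": "1990-01-30",
        "Connections": ["XYZ987", "PQR823", "PQR828"],
    },
    "XYZ987": {"Name": "Joe Smith"},
    # Placeholder bodies, invented for this sketch only.
    "PQR823": {"Name": "Placeholder A"},
    "PQR828": {"Name": "Placeholder B"},
}


def get_with_connections(store, key):
    """Fetch a document, then fetch each document it references by key."""
    doc = store[key]
    connected = {
        conn_id: store[conn_id]
        for conn_id in doc.get("Connections", [])
        if conn_id in store  # tolerate dangling references
    }
    return doc, connected


jane, connections = get_with_connections(documents, "CBL2015")
```

Each connected customer is stored once, so several customers can reference the same document without duplicating it.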

Modeling tools
• Hackolade
• Erwin DM NoSQL
• Idera ER/Studio

Data reads are mostly parent fields
Store children as separate documents
{
"DOB" : "1990-01-30",
"Connections" : [
"XYZ987",
"PQR823",
"PQR828"
]
}

Data reads are mostly parent + child fields
Store children as nested objects
{
"DOB" : "1990-01-30",
"Purchases" : [
{
"item" : "laptop",
"amount" : 1499.99,
"date" : "2019-03"
},
{
"item" : "phone",
"amount" : 99.99,
"date" : "2018-12"
}
]
}

Data writes are mostly parent or child (not both)
{
"DOB" : "1990-01-30",
"Connections" : [
"XYZ987",
"PQR823",
"PQR828"
]
}

Data writes are mostly parent and child (both)
{
"DOB" : "1990-01-30",
"Purchases" : [
{
"item" : "laptop",
"amount" : 1499.99,
"date" : "2019-03",
},
{
"item" : "phone",
"amount" : 99.99,
"date" : "2018-12"
}
]
}

If …                                               Then …
Relationship is one-to-one or one-to-many          Store related data as nested objects
Relationship is many-to-one or many-to-many        Store related data as separate documents
Data reads are mostly parent fields                Store children as separate documents
Data reads are mostly parent + child fields        Store children as nested objects
Data writes are mostly parent or child (not both)  Store children as separate documents
Data writes are mostly parent and child (both)     Store children as nested objects
Subdocument access
{
"username": "mgroves",
"profile": {
"phoneNumber": "123-456-7890",
"address": {
"street": "123 main st",
"city": "Grove City",
"state": "Ohio"
}
}
}
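
Subdocument access means reading or changing one nested field by its path instead of fetching and rewriting the whole document. The Python sketch below imitates that idea on a plain dict with dotted paths. It is not the Couchbase sub-document API: the function names lookup_path and upsert_path and the "zip" value are hypothetical.

```python
def lookup_path(document, path):
    """Return the value at a dotted path such as 'profile.address.city'."""
    current = document
    for part in path.split("."):
        current = current[part]  # raises KeyError if the path does not exist
    return current


def upsert_path(document, path, value):
    """Set the value at a dotted path, creating intermediate objects as needed."""
    *parents, leaf = path.split(".")
    current = document
    for part in parents:
        current = current.setdefault(part, {})
    current[leaf] = value


user = {
    "username": "mgroves",
    "profile": {
        "phoneNumber": "123-456-7890",
        "address": {"street": "123 main st", "city": "Grove City", "state": "Ohio"},
    },
}

city = lookup_path(user, "profile.address.city")
upsert_path(user, "profile.address.zip", "00000")  # hypothetical new field
```

In a real database the benefit is that only the addressed fragment travels over the network, not the full document.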

Accessing your data (Couchbase)
• Key-Value (CRUD): Documents
• N1QL (Query): Indexes
• Views (Query): MapReduce
• Full Text (Search): Indexes
• Geospatial (Search): MapReduce

Key/Value
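// Read: fetch a document by its key and deserialize it into a ShoppingCart.
public ShoppingCart GetCartById(Guid id)
{
    return _bucket.Get<ShoppingCart>(id.ToString()).Value;
}

// Create: insert a new document under a freshly generated GUID key.
public void CreateShoppingCart()
{
    _bucket.Insert(new Document<dynamic>
    {
        Id = Guid.NewGuid().ToString(),
        Content = new { . . . }
    });
}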

Key/Value: Recommendations for keys
• Natural Keys
• Human Readable
• Deterministic
• Semantic

Key/Value: Example keys
• author::matt
• author::matt::blogs
• blog::csharp_7_features
• blog::csharp_7_features::comments
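
Keys like the ones above are deterministic: given an author or blog identifier, the application can compute the key of a related document and fetch it directly, with no query. A small Python sketch of such a naming helper follows; the function name make_key and the validation rules are assumptions, not part of any SDK.

```python
SEPARATOR = "::"


def make_key(*parts):
    """Join key segments with '::', rejecting segments that would be ambiguous."""
    if not parts:
        raise ValueError("at least one key segment is required")
    for part in parts:
        if not part or SEPARATOR in part:
            raise ValueError(f"invalid key segment: {part!r}")
    return SEPARATOR.join(parts)


author_key = make_key("author", "matt")
author_blogs_key = make_key("author", "matt", "blogs")
blog_key = make_key("blog", "csharp_7_features")
comments_key = make_key(blog_key.split(SEPARATOR)[0], "csharp_7_features", "comments")
```

Because the comments key is derived from the blog key, a page that already knows the blog can load its comments with a single key-value get.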

Accessing your data: Strategies and recommendations
Concept | Strategies & Recommendations
Key-Value operations provide the best possible performance
• Create an effective key naming strategy
• Create an optimized data model
Incremental MapReduce (Views) are well suited to aggregation
• Ideal for large data sets
• Data set can be used to create complex view indexes
N1QL queries provide the most flexibility – everything else
• Query data regardless of how it is modeled
• Good indexing is vital

Migration options: Requirements
• ETL / data cleanse / data enrichment
• Duration vs. Resources
• Data governance

Migration options: Pick your strategy
• Batch vs. Incremental
• Single threaded vs. multi-threaded

Migration options: Pick your tools
Data migration tools:
• Informatica, Looker, Talend, DART, ODBC, CData
BYO-tool:
• C# / bash / PowerShell / curl / REST etc.
• GoldenGate / DTS / SSIS
• Hadoop, Spark, Kafka, NiFi
• CLI: cbimport, mongoimport, etc.

Migration options: KISS
• CSV:
  • Export to CSV
  • Import as documents into a 'staging' bucket
  • Use N1QL to transform
  • Insert into new bucket
• SQL:
  • Transform
  • Export
  • Insert into document database
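
The CSV route above can be sketched end to end in Python. This is an illustration under stated assumptions: dicts stand in for the 'staging' and new buckets, a plain Python function replaces the N1QL transform step, and the "type" field it adds is a hypothetical enrichment, not something the slides prescribe.

```python
import csv
import io

# Stand-in for a file exported from the relational Customer table.
CSV_EXPORT = """CustomerID,Name,DOB
CBL2015,Jane Smith,1990-01-30
CBL2016,Bob Jones,1980-01-29
"""


def load_staging(csv_text, key_column):
    """Import each CSV row as a document keyed by its primary key column."""
    staging = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        row = dict(row)
        staging[row.pop(key_column)] = row
    return staging


def transform(doc):
    """Reshape a staged document for the target model (hypothetical rule)."""
    return {"type": "customer", **doc}


staging_bucket = load_staging(CSV_EXPORT, "CustomerID")
new_bucket = {key: transform(doc) for key, doc in staging_bucket.items()}
```

Keeping the raw import in a staging area means the transform can be rerun or corrected without exporting from the source database again.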

Migration options: Recommendations
• Align with your data model
• Plan for failure
  • Bad source data
  • Hardware failure
  • Resource limitations
• Ensure: Interruptible, restartable, logged, predictable
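
One way to read "interruptible, restartable, logged" is a migration loop that records its position after every row, so a rerun resumes where the last run stopped. The Python sketch below shows the idea under simplifying assumptions: the function name migrate and the state dict are hypothetical, and in practice the state would be persisted to disk or to the target database rather than kept in memory.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("migration")


def migrate(rows, write, state):
    """Copy (key, doc) rows via write(), resuming from state['next'] on rerun."""
    for index in range(state.get("next", 0), len(rows)):
        key, doc = rows[index]
        if not key:
            # Bad source data: log it, remember it, and keep going.
            state.setdefault("rejected", []).append(index)
            log.warning("row %d rejected: missing key", index)
        else:
            # A failure here (e.g. an outage) propagates before the checkpoint
            # advances, so the same row is retried on the next run.
            write(key, doc)
            log.info("row %d written as %s", index, key)
        state["next"] = index + 1
    return state
```

A usage example: call migrate(rows, write, state) inside a try block, and after fixing whatever caused the failure, call it again with the same state. Rows that were already written are skipped and rejected rows stay listed in state["rejected"] for later review.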

Sync NoSQL and relational? Automatic Replication
[Diagram components: Couchbase, DCP Stream, Producer, Kafka Queue, Consumer, RDBMS]

How can you sync NoSQL and relational?
[Diagram components: RDBMS, GoldenGate, Handler, Couchbase]
https://github.com/mahurtado/CouchbaseGoldenGateAdapter

Data Flow with NiFi
https://blog.couchbase.com/nifi-processing-flow-couchbase-server/

Sync NoSQL and relational? Manual.

Summary
• Pick the right application
• Drive data model from data access patterns
• Match the data access method to requirements

Resources
https://blog.couchbase.com/proof-of-concept-move-relational/
https://blog.couchbase.com/json-data-modeling-rdbms-users/

Couchbase Plug
• Go to Couchbase.com to download Couchbase
• Enter to win a $100 gift card here:
https://bit.ly/FEST2018 (use code FEST2018)

Where do you find us?
• blog.couchbase.com
• @mgroves
• @couchbasedev

Frequently Asked Questions
1. How is Couchbase different than Mongo?
2. Is Couchbase the same thing as CouchDB?
3. How tall are you? Do you play basketball?
4. What is the Couchbase licensing situation?
5. Is Couchbase a Managed Cloud Service (DBaaS)?

Managed Cloud Server (DBaaS)

MongoDB vs Couchbase
• Architecture
  • Memory first architecture
  • Master-master architecture
  • Auto-sharding
• Features
  • SQL (N1QL)
  • Full Text Search
  • Mobile & Sync

Licensing
Couchbase Server Community
• Open source (Apache 2)
• Binary release is one release behind Enterprise (except major versions)
• Free to use in dev/test/qa/prod
• Forum support only
Couchbase Server Enterprise
• Mostly open source (Apache 2)
• Some features not available on Community (XDCR TLS, MDS, Rack Zone, etc.)
• Free to use in dev/test/qa
• Need commercial license for prod
• Paid support provided

CouchDB and Couchbase
memcached
