NoSQL document databases offer scaling, flexibility, and performance benefits across a wide variety of use cases. However, many developers from relational backgrounds are understandably nervous, for a variety of reasons, about using NoSQL in their next project. This session addresses one of those reasons: ACID transactions (or the lack thereof). It starts with some background on why NoSQL databases didn’t (initially) have full ACID capabilities. Next, we’ll look at why the lack of ACID may not be a big deal, and at some of the data modeling and querying techniques to use instead. Finally, we’ll look at the more recent trend of document databases adding distributed multi-document ACID capabilities, and show a live demo of a NoSQL transaction. You’ll leave this session with a better understanding of how ACID works and when to use it.
6. Who am I? Where am I?
• Matthew Groves
• (Technical) Product Marketing Manager at Couchbase
• Microsoft MVP
• Pluralsight Author
• Father of 2, Husband
• THAT Conference
• https://that.us
7. AGENDA
01/ Transactions in Relational
02/ Why NoSQL?
03/ What is ACID? And why is it hard?
04/ Demo
05/ Summary / Questions / Resources
9. Third Normal Form
Table: ShoppingCart
ID   DateCreated  Item1          Item2          Item3
100  2020-05-27   smartphone     charger cable  case
101  2020-04-24   case
102  2020-05-25   charger cable  case
10. Third Normal Form
Table: ShoppingCart
ID   DateCreated
100  2020-05-27
101  2020-04-24
102  2020-05-25
Table: ShoppingCartItems
CartID  Name
100     smartphone
100     charger cable
100     case
101     case
102     charger cable
102     case
11. Why Transactions?
Save a new shopping cart:
1. Insert one row into ShoppingCart
2. Insert one row into ShoppingCartItems
3. Insert another row into ShoppingCartItems
4. Done.
Table: ShoppingCart
ID   DateCreated
100  2020-06-09
Table: ShoppingCartItems
CartID  Name
100     case
100     charger cable
12. Why Transactions?
Save a new shopping cart:
1. Insert one row into ShoppingCart
2. Insert one row into ShoppingCartItems
Table: ShoppingCart
ID   DateCreated
100  2020-06-09
Table: ShoppingCartItems
CartID  Name
100     case
100     charger cable
14. Why Transactions?
Save a new shopping cart:
1. Insert one row into ShoppingCart
2. Insert one row into ShoppingCartItems
3. Crash!
4. Rollback! (phew)
Table: ShoppingCart
ID   DateCreated
100  2020-06-09
Table: ShoppingCartItems
CartID  Name
100     case
100     charger cable
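The crash-and-rollback sequence above can be sketched in a few lines. This is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in relational database (an assumption; the slides do not name one). The helpers `save_cart` and `count_rows`, and the `crash_after` test hook, are hypothetical names introduced only for this example.

```python
import sqlite3


def save_cart(conn, cart_id, date_created, items, crash_after=None):
    """Insert a cart and its items as one all-or-nothing unit.

    crash_after is a hypothetical test hook: simulate a crash once that
    many item rows have been inserted.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO ShoppingCart (ID, DateCreated) VALUES (?, ?)",
                (cart_id, date_created),
            )
            for i, name in enumerate(items):
                if crash_after is not None and i == crash_after:
                    raise RuntimeError("Crash!")
                conn.execute(
                    "INSERT INTO ShoppingCartItems (CartID, Name) VALUES (?, ?)",
                    (cart_id, name),
                )
        return True
    except RuntimeError:
        return False


def count_rows(conn, table):
    """Count rows in one of the two tables created below."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ShoppingCart (ID INTEGER PRIMARY KEY, DateCreated TEXT)")
conn.execute("CREATE TABLE ShoppingCartItems (CartID INTEGER, Name TEXT)")
conn.commit()

# Steps 1-2 succeed, then the simulated crash hits before the second item.
crashed = save_cart(conn, 100, "2020-06-09", ["case", "charger cable"], crash_after=1)
print(crashed, count_rows(conn, "ShoppingCart"), count_rows(conn, "ShoppingCartItems"))
```

Because all three inserts run inside one transaction, the simulated crash leaves neither a cart without items nor items without a cart; retrying the save without the crash hook stores the complete cart.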
15. First Normal Form
Table: ShoppingCart
ID   DateCreated  Item1          Item2          Item3
100  2020-05-27   smartphone     charger cable  case
101  2020-04-24   case
102  2020-05-25   charger cable  case
18. 18
• Domain-Driven Design
• By Eric Evans
• https://domainlanguage.com/ddd
21. NoSQL: The Big Four
Scaling
• Designed to be distributed
• Easy clustering
• Sharding is built-in, automatic
Performance
• Fewer operations for complex data
• Memory-first or memory-only
• Distributed systems can handle concurrency
High Availability
• Fault tolerance
• Distributed systems can withstand damage
• Maintenance / upgrades / planned outages don't have to be "outages"
Flexibility
• Data is isolated, accepting of many data models
• JSON
• "implied" schema
• Polyglot Persistence
23. ACID
• A – Atomicity
• C – Consistency
• I – Isolation
• D – Durability
24. A is for Atomicity
• A group of operations either all succeed or all fail
25. C is for Consistency
• Data will never be in an invalid state
• "Dirty reads", "dirty writes", "phantom reads", etc.
• What is "eventual consistency"?
26. C is for Consistency
http://jepsen.io/consistency
27. I is for Isolation
• Ensure that an operation is independent of other concurrent operations
• Optimistic/pessimistic locking
• Timeouts
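Optimistic locking can be sketched as a compare-and-swap (CAS) check: a writer remembers the version it read and the write is rejected if the version has changed in the meantime. The example below is a single-threaded, in-memory illustration only; `TinyStore`, `CasMismatch`, and `add_item` are hypothetical names, not the API of any real database SDK.

```python
class CasMismatch(Exception):
    """Raised when a document changed after the caller read it."""


class TinyStore:
    """Hypothetical in-memory store; every document carries a version (CAS) counter."""

    def __init__(self):
        self._docs = {}  # key -> (cas, value)

    def insert(self, key, value):
        self._docs[key] = (1, value)

    def get(self, key):
        return self._docs[key]  # (cas, value)

    def replace(self, key, value, cas):
        current_cas, _ = self._docs[key]
        if current_cas != cas:
            raise CasMismatch(key)  # someone else wrote first
        self._docs[key] = (current_cas + 1, value)


def add_item(store, key, item, retries=3):
    """Optimistic update: read, modify, write; re-read and retry on conflict."""
    for _ in range(retries):
        cas, cart = store.get(key)
        updated = {**cart, "items": cart["items"] + [item]}
        try:
            store.replace(key, updated, cas)
            return True
        except CasMismatch:
            continue
    return False


store = TinyStore()
store.insert("cart::100", {"items": ["case"]})

stale_cas, _ = store.get("cart::100")          # a reader remembers the version it saw
add_item(store, "cart::100", "charger cable")  # another writer gets in first

try:
    store.replace("cart::100", {"items": ["smartphone"]}, stale_cas)
    stale_write_accepted = True
except CasMismatch:
    stale_write_accepted = False

print(stale_write_accepted, store.get("cart::100"))
```

Pessimistic locking would instead block the second writer until the first one finishes; the optimistic version never blocks, at the cost of occasional retries.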
28. D is for Durability
• Data is safely stored in case of a system failure
• What is "durable enough"?
  • Disk?
  • Memory?
  • Data center?
  • Planet?
30. "Split Brain" (aka network problems)
Challenge:
• What happens if one or more of the machines in the cluster crashes?
• Uncommitted transactions leave behind artifacts?
• Identifying edge cases
Solutions:
• Consensus requirements
• Cooperative model / Paxos
• Mitigation
31. Latency
Challenge:
• Performance: we don't want to just reinvent a relational database
• How does an ACID transaction affect performance and high availability?
Solutions:
• Only apply ACID transactions when necessary
• Use data modeling to solve when possible
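As an illustration of solving with data modeling, the shopping cart from the relational slides can be stored as one document with its items nested inside, so saving it is a single write rather than several coordinated inserts. In document databases such as Couchbase, a write to a single document is atomic. The sketch below uses a plain Python dict as a stand-in for the database; the `bucket` variable, the `save_cart` helper, and the `cart::<id>` key format are assumptions made for the example.

```python
import json

# The cart and all of its items live in one JSON document.
cart_document = {
    "type": "ShoppingCart",
    "id": 100,
    "dateCreated": "2020-06-09",
    "items": [{"name": "case"}, {"name": "charger cable"}],
}

bucket = {}  # hypothetical stand-in for a document store keyed by document ID


def save_cart(bucket, cart):
    """Store the whole cart with one write; there is no second table to keep in sync."""
    key = f"cart::{cart['id']}"
    bucket[key] = json.dumps(cart)
    return key


key = save_cart(bucket, cart_document)
loaded = json.loads(bucket[key])
print(key, [item["name"] for item in loaded["items"]])
```

With this model there is no intermediate state in which the cart exists without its items, so the "crash between inserts" scenario does not arise for this use case.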
32. Correctness
Challenge:
• Testing
• How do we verify all those edge cases?
Solutions:
• "Solve" with Jepsen guidelines
• Jepsen testing
• Jepsen Disputes MongoDB's Data Consistency Claims (InfoQ) - https://bit.ly/jepsenMongo
😀 😐 😞
👍 👌 👎
33. Client-side vs Server-side
Server-side:
• Pros
  • Light SDK work
• Cons
  • Global co-ordinator
  • Global lock manager
  • Global scheduler
Client-side:
• Pros
  • None of those global things
  • Quick iteration
  • Nothing new to configure on the server
• Cons
  • Major SDK work
  • All SDKs must use the same algorithm
35. New to Couchbase 7 (beta)
BEGIN WORK;
UPDATE x1 SET a = a + 1 WHERE b < 10;
UPDATE x1 SET a = a + 15 WHERE b < 10;
SELECT a, b, c FROM x1 WHERE b < 20;
COMMIT WORK;
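To see what a statement sequence like this does, here is a rough equivalent run against SQLite through Python's sqlite3 module. This is a stand-in, not Couchbase: SQLite spells the boundaries `BEGIN` and `COMMIT` rather than `BEGIN WORK` and `COMMIT WORK`, and the column types, sample rows, and `ORDER BY` clause are assumptions added so the result is deterministic.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the ones written out explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE x1 (a INTEGER, b INTEGER, c TEXT)")
conn.executemany(
    "INSERT INTO x1 (a, b, c) VALUES (?, ?, ?)",
    [(0, 5, "x"), (0, 15, "y"), (0, 25, "z")],  # made-up sample rows
)

conn.execute("BEGIN")
conn.execute("UPDATE x1 SET a = a + 1 WHERE b < 10")
conn.execute("UPDATE x1 SET a = a + 15 WHERE b < 10")
# Inside the transaction, the SELECT sees both uncommitted updates.
rows = conn.execute("SELECT a, b, c FROM x1 WHERE b < 20 ORDER BY b").fetchall()
conn.execute("COMMIT")

print(rows)
```

Only the row with `b < 10` is touched by the two updates, and both increments are visible to the query that runs before the commit.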
36. ACID Transactions
Give you the ability to treat multiple operations as a single all-or-nothing operation
What are the tradeoffs?
• Use only when necessary
• Remember the overhead
• Solve with data modeling when possible
• Don't be afraid to use a transaction when you need to