Omid: A Transactional Framework for HBase
Francisco Perez-Sorrosal
Ohad Shacham
Hadoop Summit SJ
June 29th, 2016
Outline
▪ Background
▪ Basic Concepts
▪ Use cases
▪ Architecture
▪ Transaction Management
▪ High Availability
▪ Performance
▪ Summary
Hadoop Summit SJ (June 29th 2016)2
Background
▪ New Big data apps → new requirements:
● Low-latency
● Incremental data processing
● e.g. Percolator
▪ Multiple clients updating same data concurrently
● Problem: Conflicts/Inconsistencies may arise
● Solution: Transactional Access to Data

Background
▪ Transaction → abstract unit of work (UoW) to manage data with certain guarantees
● ACID
● Relational databases
▪ Big data → NoSQL datastores → Transactions in NoSQL
● Relaxed Guarantees:
○ e.g. Atomicity, Consistency
● Hard to Scale
○ Data partition
○ Data replication

Omid is a…
▪ Flexible
▪ Reliable
▪ Highly performant
▪ Scalable
…OLTP framework that allows BigData apps to execute ACID transactions on top of HBase
[Figure: two logos joined by "+" and "=", giving "Consistency in BigData Apps"]

Why use Omid?
▪ Simplifies development of apps requiring consistency
● Multi-row/multi-table transactions on HBase
● Simple & well-known interface
▪ Good performance & reliability
▪ Lock-free
▪ Snapshot Isolation
▪ HBase is treated as a black box
● No HBase code modification
● No changes on table schemas
▪ Used successfully at Yahoo

Snapshot Isolation
▪ Transaction T2 overlaps in time with T1 & T3, but spatially:
● T1 ∩ T2 = ∅
● T2 ∩ T3 = { R4 } → Transactions T2 and T3 conflict
▪ Transaction T4 does not have conflicts
[Figure: transactions T1–T4 (TxId) with their Time Overlap on a timeline and their Spatial Overlap (WriteSet) over rows R1–R4]

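The conflict rule on the Snapshot Isolation slide can be sketched in a few lines of Java. This is only an illustration: the timestamps and write sets below are hypothetical values chosen to satisfy the slide's bullets (T1 ∩ T2 = ∅, T2 ∩ T3 = { R4 }), and the class and method names are made up for this sketch rather than taken from Omid.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SnapshotConflictSketch {

    /** A transaction's lifetime (start/commit timestamps) and the rows it wrote. */
    static final class Tx {
        final String id;
        final long start;
        final long commit;
        final Set<String> writeSet;

        Tx(String id, long start, long commit, String... rows) {
            this.id = id;
            this.start = start;
            this.commit = commit;
            this.writeSet = new HashSet<>(Arrays.asList(rows));
        }
    }

    /** Time overlap: each transaction started before the other one committed. */
    static boolean overlapInTime(Tx a, Tx b) {
        return a.start < b.commit && b.start < a.commit;
    }

    /** Spatial overlap: rows that appear in both write sets. */
    static Set<String> sharedWrites(Tx a, Tx b) {
        Set<String> shared = new HashSet<>(a.writeSet);
        shared.retainAll(b.writeSet);
        return shared;
    }

    /** A write-write conflict needs both a time overlap and a spatial overlap. */
    static boolean conflict(Tx a, Tx b) {
        return overlapInTime(a, b) && !sharedWrites(a, b).isEmpty();
    }

    public static void main(String[] args) {
        // Hypothetical values, not read off the original figure.
        Tx t1 = new Tx("T1", 1, 4, "R1", "R2");
        Tx t2 = new Tx("T2", 2, 6, "R3", "R4");
        Tx t3 = new Tx("T3", 5, 8, "R2", "R4");
        Tx t4 = new Tx("T4", 9, 10, "R1", "R3");
        System.out.println("T1/T2 conflict: " + conflict(t1, t2));
        System.out.println("T2/T3 conflict: " + conflict(t2, t3));
        System.out.println("T1/T4 conflict: " + conflict(t1, t4));
    }
}
```

With these assumed values, T1 and T3 both write R2, and T4 shares rows with T1 and T2, but none of those pairs overlap in time, so only T2 and T3 conflict.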
Use Cases: Sieve @ Yahoo
[Figure: Sieve diagram with components Internet, Crawler, Doc Proc, Aggregation, Feeder and Real-Time Index, on top of HBase and Omid; labels: Notifications, Transactional Data Flow]

Use Cases:
[Figure: Hive Metastore Thrift Server with two stores: ObjectStore over a Relational Database, and HBaseStore over HBase with Omid]

Architectural Components
[Figure: architecture diagram]
● Components: Transactional App, Omid Client, Transaction Status Oracle (TSO), Timestamp Oracle, HBase (Commit Table, Compactor, App Tables with Shadow Cells)
● Arrow labels: Start/Commit TXs, Get Start/Commit Timestamps, Keep track & Validate TXs, Commit data, R/W data, Guarantee SI

Client APIs
▪ Transaction Manager → Create Transactional contexts
Transaction begin();
void commit(Transaction tx);
void rollback(Transaction tx);
▪ Transactional Tables (TTable) → Data access
Result get(Transaction tx, Get g);
void put(Transaction tx, Put p);
ResultScanner getScanner(Transaction tx, Scan s);
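The calling pattern these signatures suggest (begin, read/write through the table with the transaction, then commit or roll back) might look like the following. This is a self-contained toy, not Omid's client: `Tx`, `ToyTable` and `ToyTransactionManager` are hypothetical in-memory stand-ins, rows are plain strings instead of HBase `Get`/`Put` objects, and buffering writes in memory is a simplification used here only to show all-or-nothing visibility. It does no conflict detection.

```java
import java.util.HashMap;
import java.util.Map;

final class ClientApiSketch {

    /** Toy transactional context: holds the writes made inside the transaction. */
    static final class Tx {
        final Map<String, String> pendingWrites = new HashMap<>();
        boolean finished;
    }

    /** Toy stand-in for a transactional table, backed by an in-memory map. */
    static final class ToyTable {
        final Map<String, String> committed = new HashMap<>();

        void put(Tx tx, String row, String value) {
            tx.pendingWrites.put(row, value);
        }

        String get(Tx tx, String row) {
            // A transaction sees its own writes first, then committed data.
            String own = tx.pendingWrites.get(row);
            return own != null ? own : committed.get(row);
        }
    }

    /** Toy stand-in for the Transaction Manager. */
    static final class ToyTransactionManager {
        private final ToyTable table;

        ToyTransactionManager(ToyTable table) {
            this.table = table;
        }

        Tx begin() {
            return new Tx();
        }

        void commit(Tx tx) {
            if (tx.finished) throw new IllegalStateException("transaction already finished");
            table.committed.putAll(tx.pendingWrites); // all writes become visible together
            tx.finished = true;
        }

        void rollback(Tx tx) {
            tx.pendingWrites.clear(); // none of the writes become visible
            tx.finished = true;
        }
    }

    /** Typical pattern: begin, do the work, commit; roll back if anything fails. */
    static boolean writeTwoRows(ToyTransactionManager tm, ToyTable table, boolean failMidway) {
        Tx tx = tm.begin();
        try {
            table.put(tx, "row1", "v1");
            if (failMidway) throw new RuntimeException("simulated application error");
            table.put(tx, "row2", "v2");
            tm.commit(tx);
            return true;
        } catch (RuntimeException e) {
            tm.rollback(tx);
            return false;
        }
    }

    public static void main(String[] args) {
        ToyTable table = new ToyTable();
        ToyTransactionManager tm = new ToyTransactionManager(table);
        System.out.println("failed run committed: " + writeTwoRows(tm, table, true));
        System.out.println("rows after failed run: " + table.committed);
        System.out.println("clean run committed: " + writeTwoRows(tm, table, false));
        System.out.println("rows after clean run: " + table.committed);
    }
}
```

The point of the pattern is that a failure between the two puts leaves neither row visible, which is the multi-row guarantee the deck attributes to Omid.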
TX Management (Begin TX phase)
[Sequence diagram: App, Omid Client, TSO, TO, Table/SC, CommitTable]
● App → Omid Client: Begin TX
● Omid Client → TSO: Begin TX
● TSO → TO: Get ST; ST=1
● TSO → Omid Client: TX(ST=1); Omid Client → App: TX Context
● App → Omid Client: R/W Ops (within TX context)
● Omid Client → Table/SC: R/W Ops for TX (ST=1); R/W Results for TX with ST=1 come back
● Read Ops: Get right results for TX's Snapshot
● Write Ops: Build Writeset for TX

TX Management (Commit TX Phase)
[Sequence diagram: App, Omid Client, TSO, TO, Table/SC, CommitTable]
● App → Omid Client: Commit TX
● Omid Client → TSO: Commit TX (Writeset)
● TSO: Check Conflicts of TX Writeset in Conflict Map
● TSO → TO: Get CT; CT=2
● TSO → CommitTable: Persist commit details (ST/CT) for TX
● TSO → Omid Client: TX(CT=2)

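The commit-phase steps above (check the writeset against a conflict map, obtain a commit timestamp, persist ST/CT) can be sketched as below. This is a hedged illustration, not the TSO's implementation: the class name, the unbounded in-memory maps, and the single counter standing in for the Timestamp Oracle are all assumptions. The rule it encodes is that a transaction aborts if any row in its writeset was committed by someone else after the transaction's start timestamp.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

final class TsoCommitSketch {
    private long clock = 0;                                        // stand-in for the Timestamp Oracle
    private final Map<String, Long> conflictMap = new HashMap<>(); // row -> CT of its last committed writer
    private final Map<Long, Long> commitTable = new HashMap<>();   // ST -> CT

    /** Begin TX: hand out a start timestamp (ST). */
    long begin() {
        return ++clock;
    }

    /** Commit TX: returns the commit timestamp (CT), or empty if the transaction must abort. */
    OptionalLong commit(long startTs, Collection<String> writeSet) {
        for (String row : writeSet) {
            Long lastCommit = conflictMap.get(row);
            if (lastCommit != null && lastCommit > startTs) {
                return OptionalLong.empty(); // row changed after this transaction's snapshot
            }
        }
        long commitTs = ++clock;
        for (String row : writeSet) {
            conflictMap.put(row, commitTs);  // remember the latest writer of each row
        }
        commitTable.put(startTs, commitTs);  // persist commit details (ST/CT)
        return OptionalLong.of(commitTs);
    }

    /** Commit Table lookup, as a reader would do to resolve a cell's writer. */
    Long commitTimestampOf(long startTs) {
        return commitTable.get(startTs);
    }

    public static void main(String[] args) {
        TsoCommitSketch tso = new TsoCommitSketch();
        long st = tso.begin();
        System.out.println("ST=" + st + " CT=" + tso.commit(st, Arrays.asList("k1", "k2")));

        long a = tso.begin();
        long b = tso.begin();
        System.out.println("first writer of k1:  " + tso.commit(a, Arrays.asList("k1")));
        System.out.println("second writer of k1: " + tso.commit(b, Arrays.asList("k1")));
    }
}
```

In the demo, the first transaction gets ST=1 and CT=2 as on the slides; of the two concurrent writers of k1 that follow, the first to commit wins and the second is aborted.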
TX Management (Complete TX Phase)
[Sequence diagram: App, Omid Client, TSO, TO, Table/SC, CommitTable]
● Omid Client → Table/SC: Update SC for TX (ST=1/CT=2)
● Omid Client → CommitTable: Complete commit (Cleanup entry for TX with ST=1)
● Omid Client → App: Result

High Availability
[Figure: the architecture diagram from "Architectural Components" with a "Single point of failure" label]
[Figure: the same diagram with a second Timestamp Oracle / Transaction Status Oracle (Primary / Backup) and Recovery State]

High Availability – Failing Scenario
[Sequence diagram: App, Omid Client, TSO P (primary), TSO B (backup), TO, Table/SC (Data Store), CommitTable (Commit Table)]
● App → Omid Client → TSO P: Begin TX; Get ST from TO, ST=1; TX(ST=1) is returned as TX 1
● TX 1: Write(k1, v1) (ST=1); Data Store holds (k1, v1, 1)
● TX 1: Write(k2, v2) (ST=1); Data Store holds (k1, v1, 1) and (k2, v2, 1)
● App → Omid Client → TSO P: Commit TX 1 {k1, k2}; Get CT from TO, CT=2; Persist commit details for TX 1
● App → Omid Client → TSO B: Begin TX; Get ST from TO, ST=3; TX(ST=3) is returned as TX 3
● TX 3: Read(k1) (ST=3) finds (k1, v1, 1); Return TX 1 CT → ! exist, so the read returns ! exist
● TX 3: Read(k2) (ST=3) finds (k2, v2, 1); the Commit Table now holds (1, 2), so Return TX 1 CT → CT = 2 and the read returns v2
● Result: TX 3 misses k1 but reads v2, so it sees only part of TX 1

High Availability
[Figure: the Primary / Backup architecture diagram again: two Timestamp Oracle / Transaction Status Oracle pairs, Transactional App, Omid Client, HBase (Commit Table, Compactor, App Tables with Shadow Cells), Recovery State]

High Availability – Solution
[Sequence diagram: App, Omid Client, TSO P (primary), TSO B (backup), TO, Table/SC (Data Store), CommitTable (Commit Table)]
● App → Omid Client → TSO P: Begin TX; Get ST from TO, ST=1; TX(ST=1,E=1) is returned as TX 1, 1
● TX 1: Write(k1, v1) (ST=1); Data Store holds (k1, v1, 1)
● TX 1: Write(k2, v2) (ST=1); Data Store holds (k1, v1, 1) and (k2, v2, 1)
● App → Omid Client → TSO P: Commit TX 1 {k1, k2}; Get CT from TO, CT=2; Persist commit details for TX 1
● App → Omid Client → TSO B: Begin TX; Get ST from TO, ST=3; TX(ST=3,E=3) is returned as TX 3,3
● TX 3: Read(k1) (ST=3) finds (k1, v1, 1); Return TX1 CT → ! exist; Try invalidate → Commit Table holds (1, -, invalid) → Invalid; the read returns ! exist
● TX 3: Read(k2) (ST=3) finds (k2, v2, 1); Return TX 1 CT → Commit Table holds (1, 2, invalid); the read returns ! exist

High Availability
▪ No runtime overhead in normal (failure-free) execution
• Minor overhead after failover
▪ TSO uses regular writes
▪ Leases for leader election
• Lease status check before/after writing to Commit Table

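The invalidation step in the "Solution" slides, where the Commit Table entry goes from (1, -, invalid) to (1, 2, invalid), can be modelled roughly as follows. This sketch is inferred from that sequence only and is not Omid's code: the class, its in-memory map, and the method names are hypothetical. It assumes the TSO writes the CT unconditionally (the slide's "regular writes"), while a reader sets the invalid marker atomically and only when no CT is present yet. Leases are not modelled.

```java
import java.util.HashMap;
import java.util.Map;

final class CommitTableInvalidationSketch {

    /** One Commit Table row: the CT (absent until persisted) and an invalid marker. */
    static final class Entry {
        Long commitTs;   // null until a TSO persists the commit
        boolean invalid; // set by a reader that found no CT
    }

    private final Map<Long, Entry> rows = new HashMap<>(); // keyed by start timestamp (ST)

    /** TSO path: a regular, unconditional write of the CT. */
    synchronized void persistCommit(long startTs, long commitTs) {
        rows.computeIfAbsent(startTs, k -> new Entry()).commitTs = commitTs;
    }

    /**
     * Reader path: mark the transaction invalid only if no CT is stored yet.
     * Returns true if the transaction is invalid after the call.
     */
    synchronized boolean tryInvalidate(long startTs) {
        Entry e = rows.computeIfAbsent(startTs, k -> new Entry());
        if (e.commitTs == null) {
            e.invalid = true;
        }
        return e.invalid;
    }

    /** A transaction counts as committed only if it has a CT and was never invalidated. */
    synchronized boolean isCommitted(long startTs) {
        Entry e = rows.get(startTs);
        return e != null && e.commitTs != null && !e.invalid;
    }

    public static void main(String[] args) {
        CommitTableInvalidationSketch ct = new CommitTableInvalidationSketch();
        // TX 3 reads k1, finds no CT for TX 1, and invalidates it: (1, -, invalid).
        System.out.println("TX 1 invalidated: " + ct.tryInvalidate(1L));
        // The old primary's delayed write lands afterwards: (1, 2, invalid).
        ct.persistCommit(1L, 2L);
        // Both k1 and k2 are now treated as uncommitted, so TX 3 sees neither.
        System.out.println("TX 1 committed: " + ct.isCommitted(1L));
    }
}
```

If the CT is written before any reader tries to invalidate, `tryInvalidate` leaves the entry alone and the transaction stays committed, so a reader sees either all of TX 1 or none of it.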
Perf. Improvements: Read-Only Txs
[Sequence diagram: App, Omid Client, TSO/TO, Table/SC]
● App → Omid Client → TSO/TO: Begin TX; TX(ST=1) is returned as TX Context
● App → Omid Client → Table/SC: Read Ops (in TX context) for TX (ST=1); Read Results in Snapshot come back
● App → Omid Client: Commit TX; Writeset is ∅, so no need to contact TSO → Success

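The read-only shortcut can be shown with a toy client that counts round trips to the TSO. Everything here is an assumption made for illustration: `FakeTso` and `Client` are hypothetical names, and the fake TSO accepts every commit without any conflict check.

```java
import java.util.Collections;
import java.util.Set;

final class ReadOnlyCommitSketch {

    /** Fake TSO that only counts how often it is contacted. */
    static final class FakeTso {
        int calls;
        private long clock;

        long begin() {
            calls++;
            return ++clock;
        }

        long commit(long startTs, Set<String> writeSet) {
            calls++;
            return ++clock; // no conflict check in this toy
        }
    }

    /** Toy client that skips the TSO when the writeset is empty. */
    static final class Client {
        private final FakeTso tso;

        Client(FakeTso tso) {
            this.tso = tso;
        }

        long begin() {
            return tso.begin();
        }

        boolean commit(long startTs, Set<String> writeSet) {
            if (writeSet.isEmpty()) {
                return true; // nothing to validate or persist, so succeed locally
            }
            tso.commit(startTs, writeSet);
            return true;
        }
    }

    public static void main(String[] args) {
        FakeTso tso = new FakeTso();
        Client client = new Client(tso);

        long readOnly = client.begin();
        client.commit(readOnly, Collections.<String>emptySet());
        System.out.println("TSO calls after read-only TX: " + tso.calls);

        long writer = client.begin();
        client.commit(writer, Collections.singleton("k1"));
        System.out.println("TSO calls after writing TX: " + tso.calls);
    }
}
```

A read-only transaction costs one TSO round trip (the begin), while a writing transaction costs two.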
Perf. Improvements: Commit Table Writes
[Figure (two slides): Omid Clients, TSO, and the Commit Table in HBase, connected by "Commit Data" arrows]

Omid Throughput with Improvements
[Figure: throughput chart; y-axis: Tps × 10³ (0 to 400); x-axis: Commit Table: # Region servers (1, 2, 4, 6)]

Summary
▪ Transactions in NoSQL
• Use cases in incremental big data processing
• Snapshot Isolation: Scalable consistency model
▪ Omid
• Web-scale TPS for HBase
• Reliable and performant
• Battle-tested
• http://omid.incubator.apache.org/

Questions?
