VoltDB
Fast, the Next Big
June 2014

I fear we delay seeing the big value from data by not thinking ahead.
  
Are we really forced to choose?
Timely:
•  Sampled recommendations
•  Sensor indicates out of band
•  Some fraud detected now
or
Accurate:
•  Suggestion after purchase
•  Trapped miners too late
•  All fraud found days later

The analytics stack is taking shape
[Diagram of the big data analytics stack. Labels:]
•  Enterprise Apps: CRM, ERP, etc.
•  ETL
•  BIG DATA: Data Lake (HDFS, etc.)
•  Batch manipulate, pre-process, etc.
•  SQL on Hadoop: Impala, Hawq, Big SQL, …
•  Map Reduce
•  Exploratory Analytics
•  Netezza / BLU, Redshift, Vertica, Greenplum, …
•  BI Reporting

But what’s the point?
Better Decisions
Better Personalization
Better Detection
…
  
If we think about big data purely as historical analytics… we will miss the opportunity.
  
Applications and analytics are no longer developed separately.

Applications Require Data To
•  Ingest huge amounts of events
•  Make a data-driven decision on each event
•  Analyze in real time for operational visibility
  
The RDBMS is Getting Crushed
Unlimited Data with Real Time Analytics
[Diagram: the RDBMS, labeled "Bottleneck", sits between the incoming data and the analytics expected of it, in real time vs. batch.]
•  Data sources: Mobile, M2M, Market Data, Clickstreams, Web Interactions, Social Media, Internet of Things
•  Outputs: Reports, Analytics, Dashboards, Alerts
Scale with unnatural acts?
•  SSD or Fusion IO cards
•  Sharding
•  Cache/Grid
•  NoSQL, No ACID

Future Corporate Data Architecture
[Architecture diagram. Labels:]
•  Enterprise Apps: CRM, ERP, etc.
•  ETL
•  BIG DATA: Data Lake (HDFS, etc.), SQL on Hadoop, Map Reduce, Exploratory Analytics, BI Reporting
•  FAST DATA: Fast Operational Database, with Ingest / Interactive, Decisioning, Real-Time Analytics, Export, Fast Serve Analytics

Requirements for Fast Data
[Architecture diagram: Enterprise Apps (CRM, ERP, etc.), ETL, a BIG DATA tier (Data Lake (HDFS), SQL on Hadoop, Map Reduce, Exploratory Analytics, BI Reporting) and a FAST DATA tier (Fast Operational Database). The FAST DATA functions Ingest / Interactive, Decisioning, Real-Time Analytics, Export and Fast Serve Analytics carry the numbers 1 to 5, listed below.]
1) Ingest & interact on streams of inbound data
2) Make per-event, data-driven decisions
3) Real-time analytics on fast-moving data
4) Integrated export to data warehouse
5) High-speed serving of warehouse-derived analytics
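
Requirements 1 to 4 can be made concrete with a small sketch. The Python below is not VoltDB code and does not use any VoltDB API; it is a toy in-memory stand-in, and every name in it (FastDataSketch, RunningStats, the tolerance rule, the sensor example) is a hypothetical chosen for illustration. It ingests one event at a time, makes a decision on that event using state built from earlier events, keeps a running aggregate current, and queues each event for export. Requirement 5, serving warehouse-derived analytics, is left out.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RunningStats:
    """Running count and mean for one key (hypothetical helper)."""
    count: int = 0
    mean: float = 0.0

    def update(self, value: float) -> None:
        # Incremental mean: no need to store past readings.
        self.count += 1
        self.mean += (value - self.mean) / self.count


@dataclass
class FastDataSketch:
    """Toy in-memory stand-in for a fast operational database."""
    tolerance: float = 0.5  # assumed rule: allowed relative deviation from the mean
    stats: dict = field(default_factory=lambda: defaultdict(RunningStats))
    export_buffer: list = field(default_factory=list)

    def ingest(self, sensor_id: str, reading: float) -> str:
        """Handle one inbound event (1) and return the decision made on it."""
        s = self.stats[sensor_id]
        # (2) Per-event decision, based on state from earlier events.
        if s.count > 0 and abs(reading - s.mean) > self.tolerance * abs(s.mean):
            decision = "out_of_band"
        else:
            decision = "ok"
        # (3) Keep the real-time aggregate current.
        s.update(reading)
        # (4) Queue the event for export to the warehouse side.
        self.export_buffer.append((sensor_id, reading, decision))
        return decision

    def drain_export(self) -> list:
        """Hand the buffered events to an exporter and clear the buffer."""
        batch, self.export_buffer = self.export_buffer, []
        return batch


db = FastDataSketch()
for value in (10.0, 10.5, 30.0):
    print(value, db.ingest("s1", value))
```

The point of the sketch is that the decision and the aggregate update happen in the same step as ingestion, rather than in a later batch job.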
  
Requirements for Fast Data – Stream Processing
1) Ingest & interact on streams of inbound data
2) Make per-event, data-driven decisions
3) Real-time analytics on fast-moving data
4) Integrated export to data warehouse
5) High-speed serving of warehouse-derived analytics
6) System of Record OLTP (requires different system)
[Architecture diagram: Enterprise Apps (CRM, ERP, etc.), ETL and the BIG DATA tier (Data Lake (HDFS), SQL on Hadoop, Map Reduce, Exploratory Analytics, BI Reporting), with FAST DATA built from Stream Processing plus a SQL database. Requirements are marked in two groups: 1, 3 and 2, 4, 5. Annotations:]
•  Ingest
•  Continuous Computation for RTA
•  Hand coded computations
•  Decisioning
•  Decisions only on Aggregated or predefined
•  Unable to do fast serving of Analytics from warehouse

How fast?
•  Yahoo Cloud Serving Benchmark (YCSB) is a popular industry-standard benchmark for cloud databases
•  Workload “B” is most widely reported
–  95% reads with 5% updates
•  Results: best-in-class cloud performance (run in the cloud)
–  285k TPS for 3 nodes scaling linearly to 724k TPS for a 12 node cluster
[Chart: Latency (ms) vs. Throughput. Linear Scalability, reaching 724k]
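
As a quick worked example, the two reported data points can be turned into per-node figures. The short Python sketch below uses only the numbers on the slide (285k TPS on 3 nodes, 724k TPS on 12 nodes); the function names are hypothetical, and since the intermediate cluster sizes from the chart are not available here, nothing is implied about the points in between.

```python
def per_node_tps(total_tps: float, nodes: int) -> float:
    """Average throughput per node for one cluster size."""
    return total_tps / nodes


def marginal_tps_per_node(tps_a: float, nodes_a: int,
                          tps_b: float, nodes_b: int) -> float:
    """Throughput gained per node added between two cluster sizes."""
    return (tps_b - tps_a) / (nodes_b - nodes_a)


small = (285_000, 3)    # TPS, nodes (from the slide)
large = (724_000, 12)   # TPS, nodes (from the slide)

print(per_node_tps(*small))                           # 95000.0
print(round(per_node_tps(*large)))                    # 60333
print(round(marginal_tps_per_node(*small, *large)))   # 48778
```

Going from 3 to 12 nodes adds roughly 49k TPS per added node on these two figures.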

Example: Log management is a … mess
[Diagram: many web servers, each producing many log files, feeding Hadoop clusters. Labels include FTP.]
Hadoop steps: Import, Aggregate, Clean, Filter, …, Write to Disk, Analysis
  
Log management is a Fast Data problem
[Diagram: Web Servers, Kafka, VoltDB, Hadoop]
Steps: Read Queue, Aggregate, Clean, Filter, Export, Analysis
https://github.com/VoltDB/app-log-ingestion
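
The steps named on this slide can be sketched in a few lines. The Python below is not taken from the app-log-ingestion repository and uses neither Kafka nor VoltDB: an in-memory deque stands in for the queue, and the log line format ("METHOD PATH STATUS"), the health-check filter rule and all function names are assumptions made for illustration. It shows records being cleaned, filtered, aggregated and collected for export as they are read, instead of in a later batch job.

```python
from collections import Counter, deque


def clean(raw: str):
    """Parse 'METHOD PATH STATUS' (an assumed format); None if malformed."""
    parts = raw.strip().split()
    if len(parts) != 3 or not parts[2].isdigit():
        return None
    method, path, status = parts
    return {"method": method.upper(), "path": path, "status": int(status)}


def keep(record: dict) -> bool:
    """Filter step: drop health-check noise (hypothetical rule)."""
    return record["path"] != "/health"


def run_pipeline(queue: deque):
    """Read the queue until empty; return (status counts, records to export)."""
    counts = Counter()
    exported = []
    while queue:
        record = clean(queue.popleft())    # read queue, then clean
        if record is None or not keep(record):
            continue                       # filter
        counts[record["status"]] += 1      # aggregate
        exported.append(record)            # export
    return counts, exported


queue = deque([
    "get /index.html 200",
    "GET /health 200",
    "garbage",
    "POST /login 500",
    "GET /index.html 200",
])
counts, exported = run_pipeline(queue)
print(dict(counts), len(exported))
```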
  
There are lots of Fast Data Problems
•  IoT, Energy, Sensor: Smart grid/meters, asset tracking & management
•  Personalized Targeting: Ad optimization, audience segmenting
•  Telco: Billing and rights management, subscriber data, etc.
•  Capital Markets: Risk, market data management, customer mgt
•  Infrastructure: Data pipeline, system performance, streaming ETL
UK Smart Meter
  
Smart Data Fast
sjarr@voltdb.com


VoltDB Big Data Camp LA 2014 - Scott Jar
