Go Against the Flow: Databases and Stream Processing
Neha Narkhede
Businesses are streams of events
Old World
[Diagram: apps, operational databases (DB), and a relational data warehouse (DWH) used for reporting and analytics]
New World
Streaming First
• DB/DWH + many more distributed data systems
• Monolith -> Microservices
• Batch -> Real-time
Kafka Summit SF 2017 - Keynote - Go Against the Flow: Databases and Stream Processing
Databases: A Swiss-army Knife
[Diagram: App 01 and databases]
Challenge 01
Shared state is unsuitable for microservices
[Diagram: App 01, App 02, and App 03 sharing databases]
Challenge 02
Mutable state hurts forward compatibility
[Diagram: App 01, App 02, App 03, and additional apps on the same databases]
Challenge 03
Databases are inefficient for streaming data
[Diagram: a query issued against databases]
Turning the database inside out for a streaming-first world
• Storage: What would the core storage abstraction for streaming data look like?
• Processing: What would queries on streaming data look like?
• Materialized Views: How can materialized views be constructed on streaming data?
Turning the Database Inside Out
Storage in Databases
The log is an implementation detail
Storage for Streams
The log as a first-class citizen
• Suitable for streaming data
• Built around immutability as a core construct
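
To make the idea of a log as the primary storage abstraction concrete, here is a minimal sketch in Python. It is not Kafka's API; the `Log` class and its `append` and `read` methods are hypothetical names chosen for illustration. It only shows the two properties named above: records are appended and never updated in place, and any reader can replay them from an offset.

```python
from typing import Any, List, Tuple


class Log:
    """Hypothetical append-only log: records are added at the end and never modified."""

    def __init__(self) -> None:
        self._records: List[Tuple[Any, Any]] = []

    def append(self, key: Any, value: Any) -> int:
        """Append a record and return its offset (its position in the log)."""
        self._records.append((key, value))
        return len(self._records) - 1

    def read(self, offset: int = 0) -> List[Tuple[Any, Any]]:
        """Return a copy of every record from `offset` onward, in append order."""
        return list(self._records[offset:])


log = Log()
first = log.append("alice", "/home")
second = log.append("bob", "/cart")

# A reader that joins later can still replay the full history from offset 0.
replayed = log.read(0)
```

Because nothing is overwritten, two readers at different offsets see a consistent history, which is the property the slide contrasts with mutable database state.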
Processing in Databases
One-time, short-lived queries
Processing on Streams
Continuous queries
[Diagram: Stream 01 and Stream 02 feed a continuous query, which produces a stream or a table]
Continuous queries: core abstractions
Streams and Tables
[Diagram: Stream 01 and Stream 02 feed a continuous query, which produces a stream or a table. The stream is the source of truth; the table is derived.]
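
The relationship between a stream (source of truth) and a derived table can be sketched as a fold over events. The following Python is an illustration only, not KSQL or Kafka Streams code; `continuous_count` and the `(key, value)` event shape are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple

Event = Tuple[str, int]  # hypothetical event shape: (key, value)


def continuous_count(stream: Iterable[Event]) -> Iterator[Dict[str, int]]:
    """Fold a stream into a table of per-key counts.

    The table is updated once per event, and a snapshot is emitted after
    every update, so the output is itself a stream of table states.
    """
    table: Dict[str, int] = {}
    for key, _value in stream:
        table[key] = table.get(key, 0) + 1
        yield dict(table)  # copy, so earlier snapshots are not mutated


clicks = [("alice", 1), ("bob", 1), ("alice", 1)]
snapshots = list(continuous_count(clicks))
```

The last snapshot is the current derived table; replaying the same stream from the start rebuilds it, which is why the stream, not the table, can act as the source of truth.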
Materialized Views
In relational databases
[Diagram: data is inserted into source tables; a materialized view is created via a query and is then queried]
SELECT * FROM ORDERS
WHERE Region = 'USA'
Streaming Materialized Views
In Kafka
[Diagram: Stream 01 and Stream 02 feed a continuous query, which produces a stream or a table; the result is a streaming materialized view]
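
As a rough sketch of what a streaming materialized view does, the Python below keeps the result of the ORDERS filter on Region from the relational example up to date as order events arrive, instead of recomputing it on demand. The `StreamingView` class, the field names, and the convention that `None` means a deleted order are assumptions for illustration, not Kafka or KSQL APIs.

```python
from typing import Callable, Dict, Optional

Order = Dict[str, object]  # hypothetical order record, e.g. {"region": "USA", "amount": 30}


class StreamingView:
    """Hypothetical incrementally maintained view over a stream of order updates."""

    def __init__(self, predicate: Callable[[Order], bool]) -> None:
        self.predicate = predicate
        self.rows: Dict[str, Order] = {}

    def apply(self, order_id: str, order: Optional[Order]) -> None:
        """Apply one event: upsert the row if it matches, otherwise drop it.

        Assumption: `None` marks a deleted order.
        """
        if order is not None and self.predicate(order):
            self.rows[order_id] = order
        else:
            self.rows.pop(order_id, None)


usa_orders = StreamingView(lambda o: o["region"] == "USA")
usa_orders.apply("o1", {"region": "USA", "amount": 30})
usa_orders.apply("o2", {"region": "EU", "amount": 12})   # filtered out
usa_orders.apply("o1", {"region": "EU", "amount": 30})   # o1 no longer matches, so it leaves the view
usa_orders.apply("o3", {"region": "USA", "amount": 7})
```

Reading `usa_orders.rows` at any point is a cheap lookup, because the work of evaluating the query was done when each event arrived.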
What is Stream Processing?
Processing streams of data to create more streams or tables
[Diagram: Stream 01 and Stream 02 feed a continuous query, which produces a stream or a table; queries read the results]
Stream Processing is approachable
only to those of us who can write code
Introducing KSQL
Open source Streaming SQL for Apache Kafka
The first completely interactive SQL interface for Kafka
KSQL supports a variety of powerful stream processing operations:
• Continuous window aggregations
• Stream-table joins
• Filters, projections
• Sessionization
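
Of the operations listed above, sessionization is probably the least self-explanatory. The sketch below shows the underlying idea in plain Python rather than KSQL syntax: a user's events belong to the same session until a gap of inactivity longer than a chosen threshold. The `sessionize` function, the `(user, timestamp)` event shape, and the 30-unit gap are assumptions for illustration, and events are assumed to arrive in timestamp order.

```python
from typing import Dict, Iterable, List, Tuple


def sessionize(events: Iterable[Tuple[str, int]], gap: int) -> Dict[str, List[List[int]]]:
    """Group each user's event timestamps into sessions.

    A new session starts when more than `gap` time units have passed since
    the user's previous event. Assumes events are in timestamp order.
    """
    sessions: Dict[str, List[List[int]]] = {}
    for user, ts in events:
        user_sessions = sessions.setdefault(user, [])
        if user_sessions and ts - user_sessions[-1][-1] <= gap:
            user_sessions[-1].append(ts)  # still within the current session
        else:
            user_sessions.append([ts])    # inactivity gap exceeded: open a new session
    return sessions


events = [("alice", 0), ("alice", 20), ("bob", 25), ("alice", 100)]
result = sessionize(events, gap=30)
```

In KSQL the same grouping is expressed declaratively; the loop here only spells out what the engine has to track per key.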
No coding required
KSQL: A look inside
• You can submit queries using an interactive SQL command line client
• Several continuous queries run in parallel on a KSQL cluster
• Adding more server processes scales a KSQL cluster
KSQL Demo
Real-time Anomaly Detection: Malicious Web Users
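
The slides do not include the demo's queries, so the following is only a guess at the kind of logic such a demo might express: count events per user in fixed (tumbling) time windows and flag any user who exceeds a threshold. Everything here is hypothetical, including the `flag_suspicious` name, the `(user, timestamp)` event shape, the 60-unit window, and the threshold. For simplicity it processes a finite list; a continuous query would emit alerts as events arrive.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def flag_suspicious(
    events: Iterable[Tuple[str, int]], window: int, threshold: int
) -> List[Tuple[int, str, int]]:
    """Return (window_start, user, count) for users with more than `threshold`
    events inside a single tumbling window of length `window`."""
    counts: Counter = Counter()
    for user, ts in events:
        window_start = ts // window * window  # align the timestamp to its window
        counts[(window_start, user)] += 1
    return sorted(
        (start, user, n) for (start, user), n in counts.items() if n > threshold
    )


attempts = [("mallory", 1), ("mallory", 2), ("mallory", 3), ("alice", 4), ("mallory", 61)]
alerts = flag_suspicious(attempts, window=60, threshold=2)
```

Here `mallory` has three events in the window starting at 0 and is flagged; the single event at timestamp 61 falls into the next window and does not count toward the first.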
KSQL in practice
Use Cases
A big step towards a streaming-first world:
• Real-time monitoring and analytics
• Streaming ETL, not Batch ETL
• Application development
KSQL
Streaming SQL for Apache Kafka™
github.com/confluentinc/ksql
slackpass.io/confluentcommunity -- #ksql
confluent.io/ksql
