Redpanda and ClickHouse

What is Redpanda?
A new kind of streaming platform
■ High-throughput, low-latency pub/sub messaging system with strong durability guarantees
■ Kafka API compatible
■ Project started in 2017. Core devs have low-latency, distributed systems, and storage backgrounds
■ Source Available (BSL). All development and issue tracking done on GitHub
■ Focused on performance, safety, and operational simplicity

Similarities with Apache Kafka
Same high level concepts, same protocol
■ Producers, Consumers
■ Namespaces, Topics, and Partitions
■ Brokers: Leaders and Followers
■ Transactions
■ Schema Registry
■ HTTP Proxy
Redpanda integrates with the existing Kafka ecosystem: clients, streaming frameworks, Kafka Connect, etc.
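Because the wire protocol is the same, an unmodified Kafka client should be able to talk to a Redpanda broker. The sketch below assumes the third-party confluent-kafka Python client is installed and that a broker listens on localhost:9092; the topic name, key, payload, and the helper names producer_config and send_demo are hypothetical.

```python
from typing import Dict


def producer_config(bootstrap_servers: str) -> Dict[str, str]:
    """Build a librdkafka-style producer config.

    Nothing in here is Redpanda-specific: the same settings would be
    used against an Apache Kafka cluster.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "acks": "all",  # ask for the strongest acknowledgement level
    }


def send_demo(bootstrap_servers: str = "localhost:9092", topic: str = "demo") -> None:
    """Send one record and wait for delivery (needs a running broker)."""
    # Imported lazily so this module still loads when the client library
    # is not installed.
    from confluent_kafka import Producer

    producer = Producer(producer_config(bootstrap_servers))
    producer.produce(topic, key=b"user-1", value=b'{"event": "click"}')
    producer.flush()  # block until the record is delivered or fails
```

Calling send_demo() against a local broker would publish a single record; only the bootstrap address decides whether that broker is Kafka or Redpanda.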

Differences from Apache Kafka
Modernized distributed log implementation
■ Operational simplicity
■ Faster, safer, more reliable
○ Raft protocol
○ Direct IO Management (No Pagecache)
○ C++ / Seastar
○ Transactions
■ Enhancements
○ Shadow Indexing
○ WASM / Data Policies

Operational Simplicity
Easy to operate out of the box; no need for enterprise tooling
■ 100% CLI driven via rpk
■ No reliance on external systems, no JVM
■ Single binary includes broker, HTTP proxy, schema registry
■ Automatic leader and partition balancing
■ Auto-tune kernel parameters, auto-detect underlying hardware
■ Native Prometheus + Grafana integration
■ Docker image, Kubernetes controller, Terraform + Ansible templates available

Raft – modern consensus protocol
Widely used, mathematically proven distributed consensus protocol
● Requires odd number of replicas
● Each partition is a Raft group with r members (where r = replication factor)
● No reliance on external systems (no ZooKeeper)
● Single fault domain: just one distributed system protocol
● Able to ride out slowness in individual replicas
○ Leader can ack to producer once majority of replicas (including the leader) have responded
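The majority rule can be worked through with a little arithmetic. This is a minimal sketch of quorum sizing, not Redpanda's implementation; the function names are hypothetical, and the latency model assumes the leader's own write must be part of the majority.

```python
from typing import Sequence


def majority(r: int) -> int:
    """Smallest number of replicas that forms a majority of r."""
    return r // 2 + 1


def tolerated_failures(r: int) -> int:
    """Replicas that can be down while a majority is still reachable."""
    return r - majority(r)


def ack_latency_ms(leader_ms: float, follower_ms: Sequence[float]) -> float:
    """Time until the leader can ack a write to the producer.

    Assumption: the leader counts itself, so it only has to wait for the
    fastest (majority - 1) followers; slower followers are not waited on.
    """
    r = 1 + len(follower_ms)
    needed_followers = majority(r) - 1
    if needed_followers == 0:
        return leader_ms
    kth_fastest = sorted(follower_ms)[needed_followers - 1]
    return max(leader_ms, kth_fastest)
```

With r = 3, a leader that persists in 1 ms and followers that respond in 2 ms and 50 ms, the ack is possible after 2 ms: the slow replica is ridden out. The same helpers show why odd replica counts are preferred: r = 4 tolerates one failure, the same as r = 3.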

Seastar framework
“An open source C++ framework for high performance server applications on modern hardware.”
● Async programming model (via futures & promises). Requires no locks, minimizes I/O blocking.
● Thread-per-core architecture reduces context switching costs, preserves cache lines

Benchmark vs Kafka
500 MB/s workload on 3 brokers
~2ms average latency, ~100ms at p99.999

Shadow Indexing
Unify historical and real-time streaming
Shadow Indexing provides infinite data retention by archiving log segments to a cloud object store
● Provides access to archived log entries via the same consumer API
● 99.999999999% (11 9’s) durability within seconds
● Global availability of read-replicas (cross region replication under 15m)
● Archived data can serve as a backup for disaster recovery
[Diagram: Producers / Consumers]
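The idea of serving archived entries through the same consumer API can be illustrated with a toy two-tier log. This is a conceptual sketch only: the TieredLog class and its methods are hypothetical, a dict stands in for the object store, and records are moved one by one rather than uploaded as segments.

```python
from typing import Dict


class TieredLog:
    """Toy log whose older offsets live in an 'archive' tier."""

    def __init__(self) -> None:
        self.local: Dict[int, bytes] = {}     # stands in for local disk
        self.archived: Dict[int, bytes] = {}  # stands in for the object store
        self.local_start = 0                  # first offset still held locally
        self.next_offset = 0

    def append(self, value: bytes) -> int:
        """Write a record locally and return its offset."""
        offset = self.next_offset
        self.local[offset] = value
        self.next_offset += 1
        return offset

    def archive_before(self, offset: int) -> None:
        """Move every local record below `offset` to the archive tier."""
        limit = min(offset, self.next_offset)
        for o in range(self.local_start, limit):
            self.archived[o] = self.local.pop(o)
        self.local_start = max(self.local_start, limit)

    def fetch(self, offset: int) -> bytes:
        """One read path for both tiers: the caller never picks a tier."""
        if not 0 <= offset < self.next_offset:
            raise KeyError(offset)
        tier = self.local if offset >= self.local_start else self.archived
        return tier[offset]
```

After appending five records and calling archive_before(3), fetch(1) is answered from the archive and fetch(4) from local storage, while the consumer issues the same call in both cases.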

Shadow Indexing
Workload isolation with analytical clusters
One or more analytical clusters may be provisioned to serve data from the object store without impacting operational SLAs
[Diagram: OPERATIONAL CLUSTER (producers / consumers) and ANALYTICAL CLUSTERS (read-only consumers)]

Redpanda Transforms
Custom Server-Side Functions
● Coprocessors allow for custom logic adjacent (core-local) to the data
● Run WASM bytecode as a sidecar process; embedded V8 under active development
● Can potentially benefit 60% of streaming workloads
● Sample use cases: data validation, data transformation, data masking, message routing, fine-grained access control, projection & filter pushdown, ...
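Two of the sample use cases, data validation and data masking, can be shown as a per-record function. This is a plain Python sketch of the logic such a transform might carry, not the Redpanda transform API; the function names and the JSON "email" field are hypothetical.

```python
import json
from typing import Optional


def mask_email(value: str) -> str:
    """Keep the first character and the domain, hide the rest."""
    local, sep, domain = value.partition("@")
    if not sep:
        return "***"  # not an address at all: mask everything
    return local[:1] + "***@" + domain


def transform(record_value: bytes) -> Optional[bytes]:
    """Validate, mask, and re-encode one record; None means drop it."""
    try:
        event = json.loads(record_value)
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return None
    if not isinstance(event, dict) or "email" not in event:
        return None
    event["email"] = mask_email(str(event["email"]))
    return json.dumps(event, sort_keys=True).encode("utf-8")
```

Running this next to the data means consumers of the output topic never see the unmasked field, and malformed records are filtered out before they travel over the network.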

ClickHouse and Kafka
■ Uses librdkafka as the Kafka client
○ Most of the configs from librdkafka can be placed in config.xml in the <kafka> element
■ Settings for the demo:
○ kafka_max_wait_ms - set in users.xml
  ■ 0 to wait always
○ auto_offset_reset - set in config.xml
  ■ smallest - when no consumer group offset information is present, start from the smallest offset available
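To make the two settings concrete, the sketch below generates the XML fragments with Python's standard library. It assumes a <clickhouse> root element (older releases used <yandex>) and a user profile named default; the helper names are hypothetical, and the kafka_max_wait_ms value is left to the caller. ClickHouse expects librdkafka option names with underscores in place of dots, so auto.offset.reset becomes auto_offset_reset.

```python
import xml.etree.ElementTree as ET


def kafka_server_config(auto_offset_reset: str = "smallest") -> str:
    """Fragment for config.xml: librdkafka options under <kafka>."""
    root = ET.Element("clickhouse")  # assumption: <yandex> on older releases
    kafka = ET.SubElement(root, "kafka")
    ET.SubElement(kafka, "auto_offset_reset").text = auto_offset_reset
    return ET.tostring(root, encoding="unicode")


def user_profile_config(kafka_max_wait_ms: int) -> str:
    """Fragment for users.xml: a user-level setting in the default profile."""
    root = ET.Element("clickhouse")
    profiles = ET.SubElement(root, "profiles")
    default = ET.SubElement(profiles, "default")  # assumed profile name
    ET.SubElement(default, "kafka_max_wait_ms").text = str(kafka_max_wait_ms)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(kafka_server_config())
    print(user_profile_config(0))
```

The printed fragments can be merged into the existing server and user configuration files rather than replacing them.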

Demo
https://altinity.com/blog/2020/5/21/clickhouse-kafka-engine-tutorial

Try Redpanda
Code
Check out the source:
https://github.com/vectorizedio/redpanda
Blog
Read about Redpanda from our blogs:
https://vectorized.io/blog
Slack
Join the community Slack channel:
https://vectorized.io/slack
Meet
Set a 1:1 meeting to discuss your use case
https://vectorized.io/contact
This is the way
