TL;DR Kafka Metrics
Gantigmaa Selenge
Overview
[Diagram: a Kafka cluster (controller, brokers) and a client application (consumer, producer)]
Broker metrics
Alert metrics
UnderMinIsrPartitionCount
kafka.server:type=ReplicaManager,name=UnderMinIsrPartitionCount
UnderReplicatedPartitions
kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
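As a rough illustration of how alert gauges like the ones above could be checked, the sketch below flags every non-zero reading. It assumes the values have already been collected into a dictionary keyed by broker; the function name `nonzero_gauges`, the broker names, and the sample numbers are hypothetical and not from the talk.

```python
from typing import Dict, List


def nonzero_gauges(samples: Dict[str, Dict[str, int]]) -> List[str]:
    """Return one 'broker metric=value' string per gauge that is above zero."""
    alerts = []
    for broker in sorted(samples):
        for name in sorted(samples[broker]):
            value = samples[broker][name]
            if value > 0:  # assumption: a healthy cluster reports 0 for these gauges
                alerts.append(f"{broker} {name}={value}")
    return alerts


# Hypothetical readings, one dictionary of gauges per broker.
readings = {
    "broker-0": {"UnderMinIsrPartitionCount": 0},
    "broker-1": {"UnderMinIsrPartitionCount": 3},
}
alerts = nonzero_gauges(readings)
print(alerts)
```

In practice the same rule would more likely live in the alerting system than in application code; the sketch only shows the condition.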
Cluster Performance
Metrics to monitor
BytesInPerSec | BytesOutPerSec
kafka.server:type=BrokerTopicMetrics,name={BytesInPerSec|BytesOutPerSec}
ReplicationBytesInPerSec | ReplicationBytesOutPerSec
kafka.server:type=BrokerTopicMetrics,name={ReplicationBytesInPerSec|ReplicationBytesOutPerSec}
RequestHandlerAvgIdlePercent
kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent
Unbalanced cluster
Metrics to monitor
PartitionCount | LeaderCount
kafka.server:type=ReplicaManager,name={PartitionCount|LeaderCount}
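One way to turn per-broker LeaderCount readings into a balance indicator is to compare each broker with the cluster average. This is only a sketch: the helper `leader_skew`, the broker names, and the counts are made up for illustration, and the same calculation could be applied to PartitionCount.

```python
from statistics import mean
from typing import Dict


def leader_skew(leader_counts: Dict[str, int]) -> Dict[str, float]:
    """Ratio of each broker's LeaderCount to the cluster average (1.0 means average)."""
    avg = mean(leader_counts.values())
    if avg == 0:
        # No leaders anywhere, so there is nothing to compare.
        return {broker: 0.0 for broker in leader_counts}
    return {broker: count / avg for broker, count in leader_counts.items()}


# Hypothetical LeaderCount readings from three brokers.
counts = {"broker-0": 30, "broker-1": 10, "broker-2": 20}
skew = leader_skew(counts)
print(skew)
```

A ratio well above 1.0 suggests a broker is leading more than its share of partitions; what counts as "well above" is a judgment call for each cluster.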
Slow network
Metrics to monitor
RequestsPerSec
kafka.network:type=RequestMetrics,name=RequestsPerSec,request={Produce|FetchConsumer|FetchFollower}
NetworkProcessorAvgIdlePercent
kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent
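The idle-percent gauges (NetworkProcessorAvgIdlePercent, and likewise RequestHandlerAvgIdlePercent) report a fraction between 0 and 1, where lower values mean busier threads. A minimal sketch of a check follows; the function name `saturated`, the broker names, and the 0.3 default threshold are assumptions to tune, not values from the talk.

```python
from typing import Dict, List


def saturated(idle_ratios: Dict[str, float], threshold: float = 0.3) -> List[str]:
    """Return the brokers whose average idle ratio is below the threshold."""
    # threshold is an illustrative assumption; pick a value that suits the cluster.
    return sorted(broker for broker, ratio in idle_ratios.items() if ratio < threshold)


# Hypothetical NetworkProcessorAvgIdlePercent readings per broker.
idle = {"broker-0": 0.85, "broker-1": 0.12, "broker-2": 0.30}
busy = saturated(idle)
print(busy)
```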
RequestQueueTimeMs | RequestQueueSize
kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request={Produce|FetchConsumer|FetchFollower}
kafka.network:type=RequestChannel,name=RequestQueueSize
Controller Metrics
Alert metrics
ActiveControllerCount
kafka.controller:type=KafkaController,name=ActiveControllerCount
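ActiveControllerCount is usually alerted on by summing it across the cluster, since exactly one node should report 1 at any time. The sketch below shows that check under the assumption that the per-node values have already been scraped; `controller_status`, the status strings, and the node names are hypothetical.

```python
from typing import Dict


def controller_status(active_counts: Dict[str, int]) -> str:
    """Classify the cluster by the sum of ActiveControllerCount over all nodes."""
    total = sum(active_counts.values())
    if total == 1:
        return "ok"
    # Zero means nobody is acting as controller; more than one means a conflict.
    return "no-controller" if total == 0 else "multiple-controllers"


# Hypothetical readings: only broker-1 reports itself as the active controller.
status = controller_status({"broker-0": 0, "broker-1": 1, "broker-2": 0})
print(status)
```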
OfflinePartitionsCount
kafka.controller:type=KafkaController,name=OfflinePartitionsCount
LeaderElectionRateAndTimeMs
kafka.controller:type=ControllerStats,name=LeaderElectionRateAndTimeMs
Client Metrics
External monitoring for clusters
Alert metrics
connection-count
kafka.[producer|consumer]:type=[producer|consumer]-metrics,client-id=([-.\w]+)
incoming-byte-rate | outgoing-byte-rate
kafka.[producer|consumer]:type=[producer|consumer]-metrics,client-id=([-.\w]+)
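The client MBean name pattern can be matched with an ordinary regular expression, reading the client-id part as `[-.\w]+` (letters, digits, underscore, dot, and hyphen). The following is a small sketch; `parse_client_mbean` and the example client id are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches names such as kafka.producer:type=producer-metrics,client-id=my-app.1
CLIENT_MBEAN = re.compile(
    r"kafka\.(producer|consumer):type=(producer|consumer)-metrics,client-id=([-.\w]+)"
)


def parse_client_mbean(name: str) -> Optional[Tuple[str, str]]:
    """Return (client kind, client id), or None if the name does not fit the pattern."""
    match = CLIENT_MBEAN.fullmatch(name)
    # The domain and the type prefix must agree (producer with producer, and so on).
    if match is None or match.group(1) != match.group(2):
        return None
    return match.group(1), match.group(3)


parsed = parse_client_mbean("kafka.producer:type=producer-metrics,client-id=orders-app.1")
print(parsed)
```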
Monitoring tools
How does it all fit together?
[Diagram: the Kafka cluster (broker, controller) and the client application (producer, consumer) run in JVMs; the other components shown are jmx_exporter servers, Prometheus, Alert Manager, PagerDuty and Grafana]
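To make the scraping step more concrete, here is a simplified sketch of reading samples in the Prometheus text exposition format, which is what an exporter endpoint serves. The metric names in the example are hypothetical, since the names jmx_exporter produces depend on its rule configuration. The parser is deliberately minimal: it skips comment lines and any line with a trailing timestamp, and it does not handle escaped quotes in label values.

```python
import re
from typing import Dict, Tuple

SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')

SampleKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def parse_samples(text: str) -> Dict[SampleKey, float]:
    """Map (metric name, sorted label pairs) to the sample value."""
    samples: Dict[SampleKey, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # anything this simplified pattern does not cover
        name, raw_labels, value = match.groups()
        labels = tuple(sorted(LABEL.findall(raw_labels or "")))
        samples[(name, labels)] = float(value)
    return samples


# Hypothetical scrape output; real metric names depend on the exporter rules.
scrape = """\
# TYPE kafka_server_underminisrpartitioncount gauge
kafka_server_underminisrpartitioncount 0.0
kafka_controller_activecontrollercount{pod="kafka-0"} 1.0
"""
samples = parse_samples(scrape)
print(samples)
```

Prometheus does this parsing itself when it scrapes; the sketch is only meant to show what the data between the exporter and Prometheus looks like.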
https://github.com/tinaselenge
https://www.linkedin.com/in/gselenge
https://developers.redhat.com/topics/kafka-kubernetes
https://kafka.apache.org/documentation/#monitoring
https://strimzi.io/docs/operators/0.36.1/full/overview#metrics-overview_str
https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability
Thank you
Red Hat Sessions
Tuesday
2:00 PM - 2:45 PM Breakout Room 4
Getting the Balance Right with Kafka Connect
Kate Stanley
5:30 PM - 6:15 PM Breakout Room 7
Safeguarding Your Kafka Data with Encryption-at-rest
Tom Bentley
Wednesday
2:00 PM - 2:45 PM Breakout Room 3
Meet the Apache Kafka Committers
Tom Bentley
2:15 PM - 3:00 PM Meetup Hub
Running Kafka on Kube the Native Way with Operators (plus Kafka Connect book signing)
Kate Stanley

TL;DR Kafka Metrics | Kafka Summit London