WHAT YOU SEE IS WHAT YOU GET
Kafka Connect implementation at GumGum
08.15.2017
About GumGum
• Artificial Intelligence company
• 9 years old, 225 employees
• Offices in New York, Chicago, London, Sydney
• Thousands of publishers and advertisers
• Processes billions of impressions every day
Advertising
GumGum Sports
GumGum's Architecture
Previous Architecture: Pipeline A
[Diagram: real-time and primary paths, file upload to Amazon S3, load into AWS Redshift]
Previous Architecture: Pipeline B
[Diagram: events written to Amazon S3, loaded into AWS Redshift]
Problems with that architecture
• Stateful ad servers
• Data loss
• Reducing network transfer
Migration to Kafka Connect
Our Constraints
• No duplicate events
• Consume all the messages from Kafka
• Kafka Connect must integrate with the current storage format
Overriding Kafka Connect classes
• Overriding the S3 sink destination
○ From the default bucket/topics/topicName/ layout to one that matches our constraints
{
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string"
    },
    "lastName": {
      "type": "string"
    },
    "age": {
      "type": "integer",
      "minimum": 0
    }
  },
  "required": ["firstName", "lastName"]
}
public class TopicPartitionWriter {
    ...
    private String fileKey(String keyPrefix, String name) {
        // Stock behavior prefixed every key with the topics directory:
        // return topicsPrefix + dirDelim + keyPrefix + dirDelim + name;
        return keyPrefix + dirDelim + name;
    }

    private String fileKeyToCommit(String dirPrefix, long startOffset) {
        // File name: <topic>+<partition>+<zero-padded start offset><extension>
        String name = tp.topic()
            + fileDelim
            + tp.partition()
            + fileDelim
            + String.format(zeroPadOffsetFormat, startOffset)
            + extension;
        // Stock behavior:
        // return fileKey(topicsDir, dirPrefix, name);
        return fileKey(dirPrefix, name);
    }
    ...
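In effect (reading off the commented-out defaults above), the override drops the leading topics/ directory, so committed objects land at <partitionerPrefix>/<topic>+<partition>+<offset>.<ext> instead of topics/<partitionerPrefix>/<topic>+<partition>+<offset>.<ext>.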
Overriding Kafka Connect classes
• Need to compress our events
○ Compressing the data reduces S3 costs
○ Custom implementation of the Avro RecordWriterProvider using Snappy compression (available in Confluent Platform 3.3.0)
○ Gzip compression for some of our other events
public class RTBTimestampExtractor implements TimestampExtractor {

    // Digs the event time out of the nested eventMetadata struct so the
    // partitioner can organize files by event time
    @Override
    public Long extract(ConnectRecord<?> record) {
        Object value = record.value();
        if (value instanceof Struct) {
            Struct struct = (Struct) value;
            value = struct.get("eventMetadata");
            if (value instanceof Struct) {
                Struct eventMetadataStruct = (Struct) value;
                Object timestamp = eventMetadataStruct.get("timestamp");
                if (timestamp instanceof Long) {
                    return (Long) timestamp;
                }
            }
        }
        ...
public class GumGumAvroRecordWriterProvider extends AvroRecordWriterProvider {

    @Override
    public RecordWriter getRecordWriter(final S3SinkConnectorConfig conf,
                                        final String filename) {
        // This is not meant to be a thread-safe writer!
        return new RecordWriter() {
            // Same as the stock Avro writer, with Snappy compression enabled
            final DataFileWriter<Object> writer =
                new DataFileWriter<>(new GenericDatumWriter<>())
                    .setCodec(CodecFactory.snappyCodec());
            ...
Overriding Kafka Connect classes
• Creating a String format. A sample record:
Tue Jul 04 01:00:00 -0700 2017, {"id":"32237763-4c55-4d35-84df-23f8be320449","t":
1499155200608,"cl":"js","ua":"Mozilla/5.0 (iPhone; CPU iPhone OS 10_3_1 like Mac OS X)
AppleWebKit/603.1.30 (KHTML, like Gecko) Mobile/14E304 [FBAN/FBIOS;FBAV/99.0.0.57.70;FBBV/
63577032;FBDV/iPhone5,3;FBMD/iPhone;FBSN/iOS;FBSV/10.3.1;FBSS/2;FBCR/Verizon;FBID/phone;FBLC/
en_US;FBOP/5;FBRV/0]","bty":2,"bfa":"Facebook App","bn":"Facebook","bof":"iOS","bon":"iPhone
OS","ip":"141.239.172.162","cc":"US","rg":"HI","ct":"Kailua","pc":"96734","mc":
744,"isp":"Hawaiian Telcom","bf":"704a0c01a4995359fc8c336d5751d0ad17f1c301","lt":"Mon Jul 03
22:00:00 -1000 2017","sip":"10.11.152.18","awsr":"us-west-1"},
{"v":"1.1","pv":"0e27633e-025b-43fd-a971-9ebf854188c0","r":"release-1211-15-
gfa55c30","t":"5e6e2525","a":[{"i":11,"u":"http://wishesndishes.com/images/adthrive/2017/06/
Weekly-Meal-Plan-Week-100-480x480.jpg","w":300,"h":300,"x":10,"y":
10367,"lt":"in","af":false,"lu":"http://wishesndishes.com/weekly-meal-plan-week-100/?
m&m","ia":"Weekly Meal Plan {Week 100} - 10 great bloggers bringing you a full week of summer
recipes including dinner, sides dishes, and desserts!"}],"rf":"http://wishesndishes.com/
creamy-pecan-crunch-grape-salad/","p":"http://wishesndishes.com/creamy-pecan-crunch-grape-
salad/?m","fs":false,"ce":true,"ac":{"25855":5},"vp":{"ii":false,"w":320,"h":546},"sc":{"w":
320,"h":568,"d":2},"tr":0.6,"pid":11685,"pn":"Ad Thrive","vid":16,"ths":["GGT0"],"aevt":
["GGE24-3","GGE24-4","GGE26-1"],"pcat":["IAB8","IAB8-1"],"ss":"0.75","hk":
["pecan","bloggers","bringing","dishes","crunch","desserts","dinner","creamy","salad","dishes
and desserts"],"ut":[1,2,34,3,4,20,6,9,10]}
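The slides do not show the String-format writer itself. Below is a minimal sketch, assuming a line-delimited layout like the sample above and the gzip compression mentioned on the previous slide; the class name and the way the S3 output stream is obtained are assumptions, not GumGum's actual code.

import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

import org.apache.kafka.connect.errors.ConnectException;
import org.apache.kafka.connect.sink.SinkRecord;

// Hypothetical line-oriented writer: each record value becomes one gzipped line
public class GzipStringRecordWriter {
    private final GZIPOutputStream out;

    // s3Stream stands in for however the sink opens its S3 object stream
    public GzipStringRecordWriter(OutputStream s3Stream) throws IOException {
        this.out = new GZIPOutputStream(s3Stream);
    }

    public void write(SinkRecord record) {
        try {
            out.write((String.valueOf(record.value()) + "\n")
                    .getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new ConnectException(e);
        }
    }

    public void close() {
        try {
            out.close(); // flushes the gzip trailer before the upload completes
        } catch (IOException e) {
            throw new ConnectException(e);
        }
    }
}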
Previous Architecture: Pipeline A
Now with Kafka Connect: Pipeline A
Previous Architecture: Pipeline B
Now with Kafka Connect: Pipeline B
Production Issues
Schema evolution
• Schema: defines the possible fields of the message
• Use the Maven plugin when generating your schema classes
• Make sure you use the schema-evolution compatibility properties properly (an example evolution follows the schema below)
• Kafka Connect performance can decrease drastically because of a schema evolution
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
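For illustration (not from the slides): a BACKWARD-compatible evolution of the User schema above adds a field with a default value, so the new schema can still read data written with the old one. The favorite_food field is a made-up example.

{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]},
     {"name": "favorite_food", "type": "string", "default": "unknown"}
 ]
}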
Schema evolution: NONE
[Diagram: a stream of E1/E2 events; output files switch schemas S1, S2, then S1 again]
Schema evolution: FORWARD
[Diagram: the same E1/E2 stream; output stays on schema S1]
Schema evolution: BACKWARD & FULL
[Diagram: the same E1/E2 stream; output moves from S1 to S2]
Monitoring Kafka Connect
• Monitoring the health of the Kafka Connect cluster
○ Ganglia monitoring
○ Log ingestion through Sumo Logic / Splunk
• Use Zookeeper and Kafka monitoring tools to carefully monitor consumer lag (a minimal lag-check sketch follows this list)
○ AWS CloudWatch alerts
• Monitoring of the connectors with the Kafka Connect REST API
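A minimal lag-check sketch: it compares the sink's committed offsets against the log end offsets. Sink connectors commit under the group connect-<connector-name>; the broker address, group, and topic names here are assumptions.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class LagCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-broker:9092");   // assumed address
        props.put("group.id", "connect-s3-sink");              // connect-<connector-name>
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> partitions = new ArrayList<>();
            consumer.partitionsFor("rtb-events")               // hypothetical topic
                    .forEach(p -> partitions.add(new TopicPartition(p.topic(), p.partition())));

            Map<TopicPartition, Long> endOffsets = consumer.endOffsets(partitions);
            for (TopicPartition tp : partitions) {
                OffsetAndMetadata committed = consumer.committed(tp);
                long lag = endOffsets.get(tp) - (committed == null ? 0 : committed.offset());
                System.out.printf("%s lag=%d%n", tp, lag);     // alert (e.g. CloudWatch) when lag grows
            }
        }
    }
}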
Auto remediation
• Monitoring of the connectors with the Kafka Connect REST API (a minimal watchdog sketch follows this list)
○ What happens when something fails?
○ Only 8 hours of data in Kafka, so we need to recover quickly
○ Notification on connector failure
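A minimal watchdog sketch against the standard Connect REST API endpoints (GET /connectors/{name}/status and POST /connectors/{name}/restart). The host name and the naive string check are assumptions; a JSON parser and a real alerting hook would be used in practice.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConnectorWatchdog {
    private static final String BASE = "http://connect-host:8083"; // assumed host
    private static final HttpClient HTTP = HttpClient.newHttpClient();

    // Polls the connector's status and restarts it when a failure is reported
    public static void checkAndRestart(String connector) throws Exception {
        HttpRequest status = HttpRequest.newBuilder(
                URI.create(BASE + "/connectors/" + connector + "/status")).GET().build();
        String body = HTTP.send(status, HttpResponse.BodyHandlers.ofString()).body();

        if (body.contains("\"FAILED\"")) {
            HttpRequest restart = HttpRequest.newBuilder(
                    URI.create(BASE + "/connectors/" + connector + "/restart"))
                    .POST(HttpRequest.BodyPublishers.noBody()).build();
            HTTP.send(restart, HttpResponse.BodyHandlers.ofString());
            // send the failure notification here (email, Slack, PagerDuty, ...)
        }
    }
}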
Auto remediation
• In case of a massive Kafka Connect outage, what do we do with invalid offsets?
○ The auto.offset.reset property (a hedged worker-config excerpt follows)
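For reference, a worker-config excerpt: Connect forwards consumer.-prefixed settings to its sink consumers, and choosing earliest matches the "consume all the messages" constraint. The file name is an assumption, and the slides do not say which value GumGum chose.

# connect-worker.properties (excerpt)
# When committed offsets have expired from the 8-hour log, restart from the
# earliest available message rather than silently skipping to the latest.
consumer.auto.offset.reset=earliest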
THANK YOU!
Karim Lamouri
karim@gumgum.com