Improvements to Flink & Its Applications in Alibaba Search

Blink
Improvements to Flink & Its Applications in Alibaba Search
Xiaowei Jiang, Feng Wang
{xiaowei.jxw, jason.wang}@alibaba-inc.com

Who Are We?
• Xiaowei Jiang
  - 2014 to now: Alibaba
  - 2010 to 2014: Facebook
  - 2002 to 2010: Microsoft
  - 2000 to 2002: Stratify
• Feng Wang
  - 2006 to now: Alibaba

About Alibaba
• Alibaba Group
  - Operates the world's largest online marketplace
  - Annual GMV of $394 billion in 2015
• Alibaba Search
  - Personalized search and recommendation platform
  - Major driver of online traffic

Agenda
• Background
• What is Blink?
• Improvements in Blink
• Challenges & Future

Scenario – Realtime A/B Test
[Diagram: three log inputs (Transaction, Click, Impression), each with its own Parser and Filter stages. Other labels shown: Join, Agg, UDF, Druid.]

Scenario – Search Index Build & Update
[Diagram: index build and update pipeline. Labels shown: DataSource, HBase, Filter, Sync, IC (IC1, IC2), UIC (UIC1, UIC2), Join, Result, Export, Search Engine.]

Flink: Unified Compute Engine
• Streaming Topologies → low latency
• Long Batch Pipelines → resource utilization
• Machine Learning at Scale → iterative algorithms
• Graph Analysis → mutable state

What is Blink?
• Blink – Improvements to Flink from Alibaba
  - Comprehensive improvements to the Flink Table API
  - Improved runtime, compatible with the Flink API and ecosystem
• Status
  - Runs on thousands of nodes in Alibaba production
  - Supports mission-critical products

Table API Improvements
• Principle – Unified SQL layer for batch and streaming
• Functionality
  - UDF/UDTF/UDAGG
  - Stream-Stream Join
  - Aggregation (min, max, avg, sum, count, distinct_count)
  - Windowing (time_window, count_window)
  - Retraction
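The slide names retraction without showing it, so the following is only a rough sketch of the general idea in plain Python, not Flink or Blink code. All class and function names are hypothetical. It assumes an upstream per-key count feeding a downstream aggregate that tracks how many keys currently have each count. When a key's count changes, the upstream operator first withdraws the result it emitted earlier, so the downstream aggregate does not count the same key twice.

```python
from collections import Counter


class RetractingCount:
    """Per-key count that emits (is_add, key, count) messages.

    Hypothetical illustration only; not a Flink or Blink API.
    """

    def __init__(self):
        self.counts = Counter()

    def on_event(self, key):
        messages = []
        old = self.counts[key]
        if old > 0:
            # Withdraw the previously emitted result before sending the new one.
            messages.append((False, key, old))
        self.counts[key] = old + 1
        messages.append((True, key, old + 1))
        return messages


class Histogram:
    """Downstream aggregate: how many keys currently have each count."""

    def __init__(self):
        self.keys_per_count = Counter()

    def on_message(self, message):
        is_add, _key, count = message
        self.keys_per_count[count] += 1 if is_add else -1
        if self.keys_per_count[count] == 0:
            del self.keys_per_count[count]


def run(events):
    """Feed a sequence of keys through both operators and return the histogram."""
    upstream, downstream = RetractingCount(), Histogram()
    for key in events:
        for message in upstream.on_event(key):
            downstream.on_message(message)
    return dict(downstream.keys_per_count)


result = run(["a", "b", "a", "c", "a"])
```

Here `result` maps each count to the number of keys that currently have it. Without the retract messages, key "a" would stay counted under 1 and 2 as well as 3.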

Runtime Improvements
• New Runtime Architecture on YARN
• Optimized State, Checkpoint & Failover
• Reliable & Production Quality
• Much More

Flink on YARN
[Diagram: a Client Node running the Flink YARN Client, HDFS, the YARN ResourceManager, and YARN Nodes whose NodeManagers host containers: one with the Flink JobManager and YARN AppMaster, two with a Flink TaskManager each.]
1. Store user jar and configuration
2. Register resource and request app master
3. Allocate app master
4. Allocate worker
Note: always bootstrap containers with user jar and config

Blink on YARN
[Diagram: a Client Node running the Blink Client, HDFS, the YARN ResourceManager, and YARN Nodes whose NodeManagers host containers running JobMaster and TaskExecutor processes. Two JobMasters are shown.]
1. Store user jar and configuration
2. Register resource and request app master
3. Allocate app master
4. Allocate worker (shown once per JobMaster)
Note: always bootstrap containers with user jar and config

Blink Job Architecture
[Diagram: YARN Nodes, each with a NodeManager; two also run a Shuffle Service.]
• Job Master container: task scheduler, checkpoint coordinator
• Task Executor containers (four shown): task in/out queues; RocksDB spilled file
• External services: HDFS, ZooKeeper
• Connections: control channel, local data channel, network data channel, state backup/recover, completed checkpoint, schedule events

Blink Checkpoint & State
[Diagram: checkpoint flow between the Job Master, a TaskExecutor, and HDFS.]
1. Trigger
2. Hard link snapshot
3. Ack
4. Complete
Other labels:
• Job Master: clean up
• TaskExecutor: Task, operator state, Local CPn, Local CPn-1, Incremental Backup, OnComplete, clean up
• In queue: i1, i2, i3, Bn; out queue: o1, o2, Bn-1, o3
• HDFS: CPn, CPn-1, diff, reference, async
• State Files: 1.sst, 2.sst, n.sst
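The "diff" between CPn and CPn-1 on this slide suggests that only state files new to the latest checkpoint are copied to HDFS, while unchanged files are referenced. The sketch below is one plausible reading of that idea in plain Python, not the Blink implementation. It assumes state files are immutable once written, so an identical file name means identical content. The function name and the example file lists are hypothetical.

```python
def plan_incremental_backup(previous_files, current_files):
    """Split the current checkpoint's files into upload and reference sets.

    Assumes immutable files: a name present in both checkpoints is
    treated as unchanged and does not need to be copied again.
    """
    previous, current = set(previous_files), set(current_files)
    return {
        # New since the previous checkpoint: must be copied.
        "upload": sorted(current - previous),
        # Already backed up: the new checkpoint can point at the old copy.
        "reference": sorted(current & previous),
        # Only the previous checkpoint uses these files.
        "unreferenced": sorted(previous - current),
    }


cp_prev = ["1.sst", "2.sst", "3.sst"]           # hypothetical files in CPn-1
cp_next = ["2.sst", "3.sst", "4.sst", "5.sst"]  # hypothetical files in CPn
plan = plan_incremental_backup(cp_prev, cp_next)
```

In this example only `4.sst` and `5.sst` would be uploaded. Under the same assumption, `1.sst` becomes removable once CPn-1 itself is discarded.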

Blink Failover
[Diagram: two panels.]
• At Least Once: four Source tasks; labels: fail, restart, restart, failover
• Exactly Once: four Source tasks and four Sink tasks; labels: fail, restart, failover

Blink Metrics
[Diagram: job graph view with per-vertex and per-task metrics.]
• Job Vertex Number: [CPU, Memory] * Parallelism
• Task Metrics: In Queue, TPS, Out Queue, Latency, Delay, CPU, Memory
• Running Tasks
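The label "[CPU, Memory] * Parallelism" can be read as a per-vertex resource footprint: the resources of one task multiplied by the number of parallel tasks. That reading is an assumption, and the sketch below only works through the arithmetic in plain Python. The `VertexSpec` type, the vertex names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VertexSpec:
    """Hypothetical per-task resource request for one job vertex."""
    name: str
    cpu_cores: float
    memory_mb: int
    parallelism: int


def vertex_footprint(vertex):
    """Return (total CPU cores, total memory in MB) for one vertex."""
    return (vertex.cpu_cores * vertex.parallelism,
            vertex.memory_mb * vertex.parallelism)


def job_footprint(vertices):
    """Sum the footprints of all vertices in a job."""
    total_cpu, total_mem = 0.0, 0
    for vertex in vertices:
        cpu, mem = vertex_footprint(vertex)
        total_cpu += cpu
        total_mem += mem
    return (total_cpu, total_mem)


job = [
    VertexSpec("parser", cpu_cores=0.5, memory_mb=1024, parallelism=8),
    VertexSpec("join", cpu_cores=1.0, memory_mb=4096, parallelism=4),
]
total = job_footprint(job)
```

With these made-up numbers, each vertex needs 4 cores in total, and the job needs 8 cores and 24576 MB of memory.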

Challenges & Future
• Continued Optimization in Streaming
• Batch in Production
• Machine Learning in Production
• Larger Cluster Scale
• Contribute back to the Flink community

Q & A
Thank You!
Xiaowei Jiang: xiaowei.jxw@alibaba-inc.com
Twitter: @xiaoweij
Feng Wang: jason.wang@alibaba-inc.com
Twitter: @ifengwang
