Streaming & Parallel Decision Tree in Flink
anwar.rizal @anrizal
Motivation
Architecture
Decision Trees
Implementation
Conclusion
Motivation
Need a classifier system on streaming data
The data used for learning arrive as a stream
So do the data to be classified
$90 $90 $120 $90 $90 $150 $200
$90 $75 $90 $90 $90 $90 $90
$120 $90 Sold out Sold out $75 $90 $90
$120 $90 $90 $90 $100 $90 $120
(predicted) to increase in zero to two days
(predicted) to increase this week
(predicted) to increase next week
FRA - NYC: need attention, revenue decrease
FRA - LON: need attention, passenger decrease
FRA - MEX: need attention, revenue decrease, cost increase
Must have:
The classifier is kept fresh
No need for separate batch learning/evaluation
Feedback is taken into account regularly, in real time
The classifier can be introspected
Transparent model structure (e.g. know the tree and the information gain for each split point)
Known expected performance (accuracy, precision, recall, AUC)
Seamless support for the machine learning workflow
Data preprocessing: up/down sampling, imputation, …
Feature selection
Model evaluation, cross validation
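To make the "information gain for each split point" item concrete, here is a minimal Python sketch. It assumes the usual entropy-based definition of information gain; the slides do not say which impurity measure the implementation uses. The function names and the fare data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting a numeric feature at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

# Hypothetical data: observed fares and whether the fare increased afterwards.
fares = [75, 90, 90, 100, 120, 150]
increased = [False, False, False, True, True, True]
gain = information_gain(fares, increased, threshold=90)
```

Reporting this value next to each split is one way to keep the tree introspectable: a reader can see how much each split point reduces label uncertainty.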
Nice to have:
The classifier is immediately available
The classifier can already predict during learning
When a learning phase terminates, another cycle of learning starts
The classifier has a meta-learning capability
The classifier holds several models with different parameters
It is possible to learn about the learning capability of the models
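One way to picture the meta-learning item is a test-then-train loop: every labeled point is first used to score each model, then to train it, so the models predict while still learning and their accuracies can be compared. The sketch below is plain Python, not Flink code. The model (a 1-nearest-neighbour over a sliding window), the `window` parameter, and the data are stand-ins chosen for illustration, not what the deck implements.

```python
from collections import deque

class WindowedNearestNeighbor:
    """Stand-in incremental model: 1-nearest-neighbour over the last `window` points."""

    def __init__(self, window):
        self.memory = deque(maxlen=window)

    def predict(self, x):
        if not self.memory:
            return None  # nothing learned yet
        return min(self.memory, key=lambda point: abs(point[0] - x))[1]

    def learn(self, x, label):
        self.memory.append((x, label))

def run_models(models, labeled_stream):
    """Test-then-train: score each model on a point before it learns from it."""
    hits = {name: 0 for name in models}
    seen = 0
    for x, label in labeled_stream:
        for name, model in models.items():
            if model.predict(x) == label:
                hits[name] += 1
            model.learn(x, label)
        seen += 1
    return {name: hits[name] / seen for name in models} if seen else {}

# Hypothetical labeled stream: (fare, trend).
stream = [(90, "stable"), (150, "up"), (95, "stable"),
          (140, "up"), (92, "stable"), (160, "up")]
models = {"window=1": WindowedNearestNeighbor(1),
          "window=4": WindowedNearestNeighbor(4)}
accuracy = run_models(models, stream)
```

Comparing the entries of `accuracy` shows which parameter setting learns better on this stream, which is the kind of information a meta-learning layer could act on.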
Learning -> Learning & Classifying -> End of learning -> New cycle of learning
Cycle of: Learning, Classifying during Learning, End of Learning, Classifying, New Learning
Labeled points -> Stream Learner -> Classifier
Unlabeled points -> Classifying Application (uses the Classifier) -> Predicted points
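The data flow above can be sketched in plain Python as a single loop over a mixed stream: labeled points update the model, unlabeled points are classified with whatever has been learned so far. This is an illustrative sketch, not the Flink implementation. `Point`, `NearestMeanClassifier`, and `process` are hypothetical names, and the nearest-mean model stands in for the decision tree.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Point:
    value: float
    label: Optional[str] = None  # None marks an unlabeled point

class NearestMeanClassifier:
    """Stand-in model: keeps a running mean per class and predicts the closest mean."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def learn(self, point):
        self.sums[point.label] = self.sums.get(point.label, 0.0) + point.value
        self.counts[point.label] = self.counts.get(point.label, 0) + 1

    def predict(self, point):
        if not self.counts:
            return None  # no labeled point seen yet
        return min(self.counts,
                   key=lambda c: abs(self.sums[c] / self.counts[c] - point.value))

def process(stream):
    """Route labeled points to learning and unlabeled points to classification."""
    model = NearestMeanClassifier()
    for point in stream:
        if point.label is not None:
            model.learn(point)                       # Stream Learner path
        else:
            yield point.value, model.predict(point)  # Classifying Application path

# Hypothetical mixed stream of labeled and unlabeled fares.
events = [Point(90, "stable"), Point(150, "up"), Point(100),
          Point(95, "stable"), Point(140)]
predictions = list(process(events))
```

In a streaming engine the two paths would typically be separate inputs joined on shared model state; a single loop is used here only to keep the sketch self-contained.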
Decision Tree Algorithms Outline
Implementation using Flink Streaming
Further Discussions
Thanks!
Credit to: Yiqing Yan (Eurecom) & Tianshu Yang (Telecom Bretagne), Amadeus Interns