CRUBY+JRUBY
FLUENTD
CEP
NORIKRA
MSGPACK-RPC-OVER-HTTP
LOGGING
STREAM PROCESSING
xQL
ESPER
Saturday, June 1, 2013
Complex Event Processing
on Ruby, Fluentd and Norikra
RubyKaigi 2013 (2013/06/01)
TAGOMORI Satoshi (@tagomoris)
TAGOMORI Satoshi (@tagomoris)
LINE corp.
Ruby, Perl, Node.js, Hadoop, ...
Please call me 'MORIS'!
2013/04- LINE Corporation (+NHN Japan)
2012/01- NHN Japan
-2011/12 livedoor (+NHN Japan +Naver Japan)
My mission: logging
Store access logs / application logs
Calculate & visualize service activities
Build data warehouse for application
engineers' operations
Notify on anomalous service statuses
for system status (HTTP status, response
time, ...)
for application metrics
Our log traffic
Daily
1.5+ TB (non compressed)
5.6+ billion lines / day
Peak time
140,000+ lines / sec
300Mbps
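As a rough sanity check, the daily totals above can be converted into per-second and per-line averages. This is a back-of-the-envelope sketch that assumes decimal units (1 TB = 10^12 bytes); the variable names are illustrative only.

```ruby
# Back-of-the-envelope check of the traffic figures (decimal units assumed).
seconds_per_day = 24 * 60 * 60

daily_bytes        = 1.5e12    # 1.5 TB per day, uncompressed
daily_lines        = 5.6e9     # 5.6 billion lines per day
peak_lines_per_sec = 140_000   # peak rate from the slide

avg_lines_per_sec  = daily_lines / seconds_per_day
avg_bytes_per_line = daily_bytes / daily_lines
peak_to_avg_ratio  = peak_lines_per_sec / avg_lines_per_sec

# Bandwidth at peak if every line had the average size.
peak_mbps = peak_lines_per_sec * avg_bytes_per_line * 8 / 1e6

puts "average lines/sec:  #{avg_lines_per_sec.round}"
puts "average bytes/line: #{avg_bytes_per_line.round}"
puts "peak / average:     #{peak_to_avg_ratio.round(2)}"
puts "peak bandwidth:     #{peak_mbps.round} Mbps"
```

Under these assumptions the peak line rate at the average line size works out to roughly the 300 Mbps quoted for peak time, so the daily and peak figures are consistent with each other.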
What we want to do
COUNT PV, UU and others (daily/realtime)
COUNT Service metrics (daily/hourly)
FIND Surprising Errors [4xx, 5xx] (immediately)
CHECK Response Times (immediately)
SEARCH Logs when in trouble (hourly/immediately)
VISUALIZE/NOTIFY App Status (realtime)
BATCHES
AND
STREAMS
Batches and Streams
Hadoop is for batches
High performance batch is important
HDFS has good performance
Stream log writing and calculations
are also VERY VERY IMPORTANT
Hybrid System:
Stream processing + Batch
System Overview
[Diagram components: Web Servers; Fluentd Cluster; Archive Storage (scribed); Fluentd Watchers; Graph Tools; Notifications (IRC); Norikra; Hadoop Cluster (HDFS, YARN) via webhdfs; Huahin Manager; hive server; Shib / ShibUI. Flows are labeled STREAM, BATCH, and SCHEDULED BATCH.]
Stream processing
Parsing logs
Appending flags for analysis
Counting rate/bytes
Calculating system metrics
Calculating application metrics
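The first few steps in this list (parsing, appending flags, counting bytes) can be sketched in plain Ruby. The log layout, the field names, and the `error` / `slow` flags below are hypothetical and chosen only for illustration; durations are taken to be in microseconds, as the thresholds in the Fluentd configuration later in the talk suggest.

```ruby
# Hypothetical log layout: "<path> <status> <bytes> <duration_in_microseconds>"
def parse_line(line)
  pattern = /\A(?<path>\S+) (?<status>\d{3}) (?<bytes>\d+) (?<duration>\d+)\z/
  m = pattern.match(line.chomp)
  return nil unless m # drop lines that do not match the layout

  {
    'path'     => m[:path],
    'status'   => m[:status].to_i,
    'bytes'    => m[:bytes].to_i,
    'duration' => m[:duration].to_i
  }
end

# Append flags that later aggregation steps can filter on.
def append_flags(record)
  record.merge(
    'error' => record['status'] >= 400,
    'slow'  => record['duration'] >= 1_000_000 # 1 second
  )
end

lines = [
  '/index 200 5120 84000',
  '/api/items 500 312 1200000',
  'not a log line'
]

records     = lines.map { |l| parse_line(l) }.compact.map { |r| append_flags(r) }
total_bytes = records.sum { |r| r['bytes'] }
error_count = records.count { |r| r['error'] }

puts "records: #{records.size}, bytes: #{total_bytes}, errors: #{error_count}"
```

In the system described here this kind of work runs inside Fluentd plugins rather than in a standalone script; the sketch only shows the shape of the per-record transformation.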
Fluentd
"Fluentd" is a lightweight and flexible log collector.
Fluentd receives logs as JSON streams, buffers
them, and sends them to other systems like
Amazon S3, MongoDB, Hadoop, or other
Fluentds.
http://fluentd.org
Fluentd on CRuby
easy to install/setup (from rubygems.org)
plugins
easy to install (from rubygems.org)
easy to write (with ruby!)
stability (no crashes in the past year)
throughput (17500 msgs/sec)
td-agent (rpm/deb: ruby, fluentd and some plugins)
Fluentd users
Fluentd: stream aggregation
System metrics: status / response time
Fluentd: stream aggregation
### response time aggregation
<match responsetime.monitor.*>
  type numeric_monitor
  tag monitor.responsetime
  aggregate tag
  unit minute
  monitor_key duration
  percentiles 50,90,95,98,99
</match>
### response time counting
<match responsetime.counter.*>
  type numeric_counter
  tag numcount.responsetime
  aggregate tag
  unit minute
  count_key duration
  pattern1 u100ms 0 100000
  pattern2 u500ms 100000 500000
  pattern3 u1s 500000 1000000
  pattern4 u3s 1000000 3000000
  pattern5 long 3000000
</match>
### HTTP status counting
<match httpstatus.counter.*>
  type datacounter
  tag_prefix datacount.httpstatus
  output_per_tag yes
  aggregate tag
  output_messages yes
  unit minute
  count_key status
  pattern1 2xx ^2\d\d
  pattern2 3xx ^3\d\d
  pattern3 429 ^429
  pattern4 4xx ^4\d\d
  pattern5 5xx ^5\d\d
</match>
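To make the intent of these blocks concrete, here is a plain-Ruby sketch of the two kinds of aggregation they configure: counting values against an ordered list of patterns, and taking percentiles of a numeric field. It is not the plugins' code. The first-match-wins rule, the `unmatched` bucket name, and the nearest-rank percentile method are assumptions of this sketch.

```ruby
# Ordered patterns, mirroring the HTTP status block. With first-match-wins
# (assumed here), 429 has to come before the broader 4xx pattern.
def status_patterns
  [
    ['2xx', /^2\d\d/],
    ['3xx', /^3\d\d/],
    ['429', /^429/],
    ['4xx', /^4\d\d/],
    ['5xx', /^5\d\d/]
  ]
end

def count_statuses(records, key = 'status')
  counts = Hash.new(0)
  records.each do |record|
    value = record[key].to_s
    name, _regex = status_patterns.find { |_, regex| regex.match?(value) }
    counts[name || 'unmatched'] += 1
  end
  counts
end

# Nearest-rank percentile (an assumption; the plugin may compute it differently).
def percentile(values, pct)
  sorted = values.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank - 1, 0].max]
end

records = [200, 200, 302, 404, 429, 500, 'abc'].map { |s| { 'status' => s } }
durations = [120, 80, 450, 95, 3000, 200, 150, 110, 90, 100]

p count_statuses(records)
p [50, 90, 99].map { |pct| [pct, percentile(durations, pct)] }
```

In the real setup these counts and percentiles are emitted once per `unit` (a minute in the configuration above) for each aggregated tag.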
break    
And more: stream query
Custom plugins: not casual enough
xQL: declarative language
stream processing
for optional data fields
no more schema management
connectivity with Fluentd
Stream query:
vs stored data query
No more query wait time
Immediate result for time batch
No more storage
No more query execution management
Once registered, a query runs forever
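The difference from a stored-data query can be illustrated with a toy engine: a query is registered once, events are held only for the current time window, and a result is produced as soon as each window closes. This is a minimal sketch and not how Esper or Norikra are implemented. The class name `ToyStreamEngine` and its methods are hypothetical, and events are assumed to arrive in timestamp order.

```ruby
# A toy stream engine: registered queries run for every time window,
# and events are discarded once their window has been evaluated.
class ToyStreamEngine
  attr_reader :results

  def initialize(window_seconds)
    @window_seconds = window_seconds
    @queries = {}                             # name => block taking an array of events
    @buffer = []                              # events of the current window only
    @window_start = nil
    @results = Hash.new { |h, k| h[k] = [] }  # name => one result per closed window
  end

  # Register once; the query then runs for every window from here on.
  def register(name, &query)
    @queries[name] = query
  end

  # Events must arrive in timestamp order (an assumption of this sketch).
  def send_event(time, event)
    @window_start ||= time
    close_window while time >= @window_start + @window_seconds
    @buffer << event
  end

  private

  def close_window
    @queries.each { |name, query| @results[name] << query.call(@buffer) }
    @buffer = []
    @window_start += @window_seconds
  end
end

engine = ToyStreamEngine.new(60)
engine.register('error_count') { |events| events.count { |e| e['status'] >= 500 } }

engine.send_event(0,  { 'status' => 200 })
engine.send_event(10, { 'status' => 503 })
engine.send_event(59, { 'status' => 500 })
engine.send_event(61, { 'status' => 200 }) # closes the first 60-second window

p engine.results['error_count']
```

Nothing is written to storage and nothing has to be scheduled: the result for a window is available as soon as the first event of the next window arrives.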
Norikra
Full features of Esper on JRuby
Simple RPC: msgpack-rpc-over-http
Simple RPC Server: mizuno (jetty + rack)
Simple Client Library: norikra-client
Exactly the same code for CRuby/JRuby
Norikra
[Diagram components: Norikra Server (on JVM); Norikra Engine; Esper Instance (Query Engine); Type Definition Manager; Output Event Pool; RPC Server; mizuno (Jetty + Rack); Rack RPC Handler; Norikra Clients (JRuby, CRuby) connected via msgpack-rpc-over-http.]
Esper
"Esper and Event Processing Language (EPL) provide a highly scalable, memory-efficient, in-memory computing, SQL-standard, minimal latency, real-time streaming Big Data processing engine for medium to high-velocity and high-variety data."
http://esper.codehaus.org/
Norikra Query: target "sales"
goods_id:5 price:49.8 num:1 shop:"LINE"
goods_id:2 price:12.5 num:3 shop:"Cookpad"
goods_id:4 price:36.6 num:10 shop:"Cookpad"
SELECT shop, sum(price*num) AS amount
FROM sales.win:time_batch(10 minutes)
GROUP BY shop
goods_id:5 price:49.8 num:1 shop:"LINE"
goods_id:2 price:12.5 num:3 shop:"Cookpad" affiliate:"BiS"
SELECT affiliate, count(*) AS cnt
FROM sales.win:time_batch(1 hour)
GROUP BY affiliate
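What the two queries compute for one window can be reproduced in plain Ruby. The sketch below uses the sample events from this slide and assumes they all fall into the same time window. Skipping events that lack a field the query refers to is also an assumption of this sketch about how optional fields are handled, and the helper names are hypothetical.

```ruby
# True when the event carries every field the query refers to.
def has_fields?(event, fields)
  fields.all? { |f| event.key?(f) }
end

# SELECT shop, sum(price*num) AS amount ... GROUP BY shop
def amount_by_shop(events)
  totals = Hash.new(0.0)
  events.each do |e|
    next unless has_fields?(e, %w[shop price num])
    totals[e['shop']] += e['price'] * e['num']
  end
  totals
end

# SELECT affiliate, count(*) AS cnt ... GROUP BY affiliate
def count_by_affiliate(events)
  counts = Hash.new(0)
  events.each do |e|
    next unless has_fields?(e, %w[affiliate])
    counts[e['affiliate']] += 1
  end
  counts
end

# Sample events from the slide; only one of them has an "affiliate" field.
sales = [
  { 'goods_id' => 5, 'price' => 49.8, 'num' => 1,  'shop' => 'LINE' },
  { 'goods_id' => 2, 'price' => 12.5, 'num' => 3,  'shop' => 'Cookpad', 'affiliate' => 'BiS' },
  { 'goods_id' => 4, 'price' => 36.6, 'num' => 10, 'shop' => 'Cookpad' }
]

amount_by_shop(sales).each { |shop, amount| puts "#{shop}: #{amount.round(1)}" }
count_by_affiliate(sales).each { |affiliate, cnt| puts "#{affiliate}: #{cnt}" }
```

The point of the second query is that the `affiliate` field needs no schema change up front: events simply start carrying it, and a query that refers to it can be registered later.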
Norikra query:
vs Fluentd custom plugin
SQL!!!
No more restart for new queries
register queries whenever we want
No more private plugins
No more fat Fluentd configurations
fluent-plugin-norikra
Fluentd plugin to use Norikra
Norikra server autostart
Automatically defined targets (a target is like a table)
Pre-defined queries for each target
fluent-plugin-norikra
installation
`gem install fluent-plugin-norikra`
configuration
see DEMO
Demo: bootstrap
rbenv shell jruby-1.7.4
gem install norikra
which norikra
rbenv shell 2.0.0-pxxx
gem install fluent-plugin-norikra
vi demo.conf
fluentd -c demo.conf
Demo: query streams
some messages over fluent-cat
register queries with norikra-client
more messages over fluent-cat & norikra-client
Roadmap of Norikra
Norikra is still UNDER DEVELOPMENT
Norikra feature updates (JOINs, etc)
Web GUI
query & target list management
save & restore
Distributed & orchestrated nodes
Ruby without Rails
Unbelievable
to stop GC!!!!!!!!!!
JRuby
great partner for Java & Rubyists
and for JVM middleware, like Hadoop
Norikra uses Esper's internal API to
parse queries
gems across platforms?
CRuby
long-running daemons on CRuby
memory usage is a big problem
SHUT THE FUCK UP
AND WRITE SOME QUERY
See also:
http://fluentd.org/
http://fluentd.org/plugin/
https://github.com/tagomoris/norikra
https://github.com/tagomoris/norikra-client
https://github.com/tagomoris/fluent-plugin-norikra
http://esper.codehaus.org/
"Fluentd: The ruby based middleware across the world"
http://www.slideshare.net/tagomoris/fluentd-in-tkrk10
"Log analysis system with Hadoop in livedoor 2013 Winter"
http://www.slideshare.net/tagomoris/log-analysis-with-hadoop-in-livedoor-2013