Complex Event Processing on Ruby, Fluentd and Norikra #rubykaigi
- 9. My mission: logging
Store access logs / application logs
Calculate & visualize service activities
Build data warehouse for application
engineers' operations
Notify on anomalous service statuses
for system status (HTTP status, response
time, ...)
for application metrics
Saturday, June 1, 2013
- 10. Our log traffic
Daily
1.5+ TB (uncompressed)
5.6+ billion lines / day
Peak time
140,000+ lines / sec
300Mbps
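As a rough sanity check, the daily totals can be turned into average rates and compared with the peak figures. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes) and treats the "+" lower bounds as exact values, neither of which is stated on the slide.

```ruby
# Back-of-the-envelope check of the traffic figures on this slide.
# Assumptions (not from the slides): 1 TB = 10**12 bytes, and the "+" lower
# bounds are treated as exact values.
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = 1.5e12          # 1.5 TB / day, uncompressed
daily_lines = 5.6e9           # 5.6 billion lines / day
peak_lines_per_sec = 140_000  # peak line rate from the slide

avg_lines_per_sec  = daily_lines / SECONDS_PER_DAY
avg_bytes_per_line = daily_bytes / daily_lines
avg_mbps           = daily_bytes * 8 / SECONDS_PER_DAY / 1e6

# Estimated peak bandwidth if peak-time lines have the average size.
peak_mbps_estimate = peak_lines_per_sec * avg_bytes_per_line * 8 / 1e6

puts format("average: %.0f lines/sec, %.0f bytes/line, %.0f Mbps",
            avg_lines_per_sec, avg_bytes_per_line, avg_mbps)
puts format("estimated peak: %.0f Mbps", peak_mbps_estimate)
```

Under these assumptions the average is roughly 65,000 lines/sec at about 268 bytes per line, and the estimated peak bandwidth comes out at about 300 Mbps, which agrees with the peak figure on the slide.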
- 11. What we want to do
COUNT PV, UU, and others (daily/realtime)
COUNT Service metrics (daily/hourly)
FIND Surprising Errors [4xx,5xx] (immediately)
CHECK Response Times (immediately)
SEARCH logs when trouble occurs (hourly/immediately)
VISUALIZE/NOTIFY App Status (realtime)
- 13. Batches and Streams
Hadoop is for batches
High performance batch is important
HDFS has good performance
Stream log writing and calculations
are also VERY VERY IMPORTANT
Hybrid System:
Stream processing + Batch
- 16. Fluentd
"Fluentd" is a lightweight and flexible log collector.
Fluentd receives logs as JSON streams, buffers
them, and sends them to other systems like
Amazon S3, MongoDB, Hadoop, or other
Fluentds.
http://fluentd.org
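The receive, buffer, send flow can be pictured with a small plain-Ruby sketch. A Fluentd event is a tag, a time, and a record; everything else here (`TinyBuffer`, `chunk_limit`, the output block, the sample tags and records) is hypothetical and is not Fluentd's API or implementation.

```ruby
require "json"

# Hypothetical illustration only: this is NOT Fluentd's code or API.
# A Fluentd event is a (tag, time, record) triple; the record is a JSON-like hash.
Event = Struct.new(:tag, :time, :record)

# Collects events into a chunk and hands full chunks to an output block,
# mimicking the receive -> buffer -> send flow in a very simplified way.
class TinyBuffer
  def initialize(chunk_limit:, &output)
    @chunk_limit = chunk_limit
    @output = output
    @chunk = []
  end

  # Receive one event and flush once the chunk is full.
  def emit(tag, time, record)
    @chunk << Event.new(tag, time, record)
    flush if @chunk.size >= @chunk_limit
  end

  # Send the current chunk to the output, if it holds anything.
  def flush
    return if @chunk.empty?

    @output.call(@chunk)
    @chunk = []
  end
end

sent = []
buffer = TinyBuffer.new(chunk_limit: 2) do |chunk|
  # A real output would write to S3, MongoDB, Hadoop, or another Fluentd.
  sent << chunk.map { |e| JSON.generate({ tag: e.tag, time: e.time, record: e.record }) }
end

buffer.emit("access.web", 1_370_044_800, { "status" => 200, "duration" => 12_000 })
buffer.emit("access.web", 1_370_044_801, { "status" => 404, "duration" => 3_500 })
buffer.emit("access.web", 1_370_044_802, { "status" => 500, "duration" => 950_000 })
buffer.flush # push out the last, partially filled chunk

puts "chunks sent: #{sent.length}"
```

Buffering in chunks is what lets a collector absorb bursts and retry a failed destination without blocking the sender; the sketch leaves retries out entirely.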
- 17. Fluentd on CRuby
easy to install/setup (from rubygems.org)
plugins
easy to install (from rubygems.org)
easy to write (with ruby!)
stability (no crashes in the past year)
throughput (17500 msgs/sec)
td-agent (rpm/deb: ruby and fluentd and some
plugins)
- 20. Fluentd: stream aggregation
### response time aggregation
<match responsetime.monitor.*>
type numeric_monitor
tag monitor.responsetime
aggregate tag
unit minute
monitor_key duration
percentiles 50,90,95,98,99
</match>
### response time counting
<match responsetime.counter.*>
type numeric_counter
tag numcount.responsetime
aggregate tag
unit minute
count_key duration
pattern1 u100ms 0 100000
pattern2 u500ms 100000 500000
pattern3 u1s 500000 1000000
pattern4 u3s 1000000 3000000
pattern5 long 3000000
</match>
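The `numeric_counter` patterns above read as named ranges over `duration` (presumably in microseconds, since `u100ms` ends at 100000). A plain-Ruby sketch of that bucketing follows. The boundary semantics are assumptions, not taken from the slides: lower bound inclusive, upper bound exclusive, and no upper bound when only one number is given.

```ruby
# Plain-Ruby sketch of range bucketing, NOT the fluent-plugin-numeric-counter code.
# Assumptions: each pattern is "name low [high]", low is inclusive, high is
# exclusive, and a missing high means "no upper limit".
BUCKETS = [
  ["u100ms", 0,         100_000],
  ["u500ms", 100_000,   500_000],
  ["u1s",    500_000,   1_000_000],
  ["u3s",    1_000_000, 3_000_000],
  ["long",   3_000_000, nil]
].freeze

# Returns the name of the first bucket containing the value, or nil.
def bucket_for(duration)
  BUCKETS.each do |name, low, high|
    return name if duration >= low && (high.nil? || duration < high)
  end
  nil
end

# Counts one "unit" (for example one minute) worth of duration values.
def count_durations(durations)
  counts = BUCKETS.to_h { |name, _low, _high| [name, 0] }
  durations.each do |d|
    name = bucket_for(d)
    counts[name] += 1 if name
  end
  counts
end

duration_counts = count_durations([12_000, 99_999, 100_000, 750_000, 2_500_000, 4_000_000])
p duration_counts
```

Counting into a handful of buckets per minute keeps the output small and constant in size, regardless of how many log lines arrive in that minute.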
### HTTP status counting
<match httpstatus.counter.*>
type datacounter
tag_prefix datacount.httpstatus
output_per_tag yes
aggregate tag
output_messages yes
unit minute
count_key status
pattern1 2xx ^2\d\d
pattern2 3xx ^3\d\d
pattern3 429 ^429
pattern4 4xx ^4\d\d
pattern5 5xx ^5\d\d
</match>
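In the HTTP status block, the order of the patterns matters: `429` is listed before the broader `4xx`. The sketch below shows why, in plain Ruby. It assumes that patterns are tried in order with the first match winning, and that values matching nothing are counted as `unmatched`; both are assumptions about the plugin's behavior, written out here only for illustration.

```ruby
# Plain-Ruby sketch of ordered regexp counting, NOT the fluent-plugin-datacounter code.
# Assumption: patterns are tried in order and the first match wins, which is
# why "429" has to be listed before the broader "4xx".
STATUS_PATTERNS = [
  ["2xx", /^2\d\d/],
  ["3xx", /^3\d\d/],
  ["429", /^429/],
  ["4xx", /^4\d\d/],
  ["5xx", /^5\d\d/]
].freeze

# Counts one "unit" (for example one minute) worth of status values.
def count_statuses(statuses)
  counts = Hash.new(0)
  statuses.each do |status|
    name, = STATUS_PATTERNS.find { |_name, re| re.match?(status.to_s) }
    counts[name || "unmatched"] += 1
  end
  counts
end

status_counts = count_statuses([200, 200, 301, 404, 429, 429, 500, 503, 102])
p status_counts
```

If `4xx` came first in the list, every 429 response would be absorbed into it and the dedicated 429 counter would stay at zero.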
- 22. And more: stream query
Custom plugins: not casual enough
xQL: declarative language
streams processing
for optional data fields
no more schema management
connectivity with Fluentd
- 23. Stream query:
vs stored data query
No more query wait time
Immediate result for time batch
No more storages
No more query execution management
Once registered, a query runs forever
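The difference from a stored-data query can be sketched in plain Ruby: the query is registered once, events are pushed into it, and a result appears as soon as each time batch closes. `StandingQuery` and `send_event` are hypothetical names for this illustration; this is not how Esper or Norikra are implemented. The sketch assumes events carry their own timestamp in seconds and arrive in time order.

```ruby
# Plain-Ruby sketch of a standing query over time batches. Illustration only,
# NOT how Esper or Norikra are implemented.
# Assumptions: events carry their own timestamp (seconds), arrive in time
# order, and a batch is closed when the first event of a later batch arrives.
class StandingQuery
  attr_reader :results

  def initialize(batch_seconds, &aggregate)
    @batch_seconds = batch_seconds
    @aggregate = aggregate # called with the events of one finished batch
    @batch_start = nil
    @events = []
    @results = []
  end

  # Push one event; closes every batch that ended before this event's time.
  def send_event(time, record)
    @batch_start ||= time - (time % @batch_seconds)
    close_batch while time >= @batch_start + @batch_seconds
    @events << record
  end

  private

  # Emit the aggregate for the current batch and start the next one.
  def close_batch
    @results << [@batch_start, @aggregate.call(@events)]
    @events = []
    @batch_start += @batch_seconds
  end
end

# Register once: count 5xx responses per 60-second batch.
query = StandingQuery.new(60) { |events| events.count { |e| e["status"] >= 500 } }

query.send_event(0,   { "status" => 200 })
query.send_event(30,  { "status" => 503 })
query.send_event(61,  { "status" => 500 }) # closes the batch starting at 0
query.send_event(125, { "status" => 200 }) # closes the batch starting at 60

p query.results
```

No events are stored beyond the current batch, which is the point of the "no more storage" and "no more query wait time" items above.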
- 25. Norikra
Full features of Esper on JRuby
Simple RPC: msgpack-rpc-over-http
Simple RPC Server: mizuno (jetty + rack)
Simple Client Library: norikra-client
Exactly the same code for CRuby/JRuby
- 26. Norikra
[Architecture diagram]
Norikra Server (on JVM)
Norikra Engine: Esper Instance (Query Engine), Type Definition Manager, Output Event Pool
RPC Server: mizuno (Jetty + Rack), Rack RPC Handler
Norikra Clients (JRuby and CRuby) connect via msgpack-rpc-over-http
- 27. Esper
"Esper and Event Processing Language (EPL) provide a highly scalable,
memory-efficient, in-memory computing, SQL-standard, minimal latency,
real-time streaming Big Data processing engine for medium to high-velocity
and high-variety data."
http://esper.codehaus.org/
- 28. Norikra Query: target "sales"
goods_id:5 price:49.8 num:1 shop:"LINE"
goods_id:2 price:12.5 num:3 shop:"Cookpad"
goods_id:4 price:36.6 num:10 shop:"Cookpad"
SELECT shop, sum(price*num) AS amount
FROM sales.win:time_batch(10 minutes)
GROUP BY shop
goods_id:5 price:49.8 num:1 shop:"LINE"
goods_id:2 price:12.5 num:3 shop:"Cookpad" affiliate:"BiS"
SELECT affiliate, count(*) AS cnt
FROM sales.win:time_batch(1 hour)
GROUP BY affiliate
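To make the two queries concrete, here is what they compute over one finished batch, written as plain Ruby over the sample events from the slide. This only illustrates the semantics of `GROUP BY` with `sum` and `count`; Norikra and Esper evaluate the queries inside the engine. Skipping events that lack the `affiliate` field is an assumption of this sketch, since the slides do not say how such events are treated.

```ruby
# Plain-Ruby equivalent of the two queries, each evaluated over ONE finished
# batch of the sample events. Illustration only, not Norikra or Esper code.
batch1 = [
  { "goods_id" => 5, "price" => 49.8, "num" => 1,  "shop" => "LINE" },
  { "goods_id" => 2, "price" => 12.5, "num" => 3,  "shop" => "Cookpad" },
  { "goods_id" => 4, "price" => 36.6, "num" => 10, "shop" => "Cookpad" }
]

# SELECT shop, sum(price*num) AS amount ... GROUP BY shop
amounts = batch1
          .group_by { |e| e["shop"] }
          .transform_values { |events| events.sum { |e| e["price"] * e["num"] } }

batch2 = [
  { "goods_id" => 5, "price" => 49.8, "num" => 1, "shop" => "LINE" },
  { "goods_id" => 2, "price" => 12.5, "num" => 3, "shop" => "Cookpad", "affiliate" => "BiS" }
]

# SELECT affiliate, count(*) AS cnt ... GROUP BY affiliate
# Assumption: events without an "affiliate" field are skipped here; the slides
# do not say how Norikra treats them.
affiliate_counts = batch2
                   .select { |e| e.key?("affiliate") }
                   .group_by { |e| e["affiliate"] }
                   .transform_values(&:size)

p amounts
p affiliate_counts
```

The second batch shows the "optional data fields" idea from the earlier slide: the `affiliate` field appears on only some events, and no schema change is needed before querying it.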
- 29. Norikra query:
vs Fluentd custom plugin
SQL!!!
No more restarts for new queries
register queries whenever we want
No more private plugins
No more fat Fluentd configurations
- 32. Demo: bootstrap
rbenv shell jruby-1.7.4
gem install norikra
which norikra
rbenv shell 2.0.0-pxxx
gem install fluent-plugin-norikra
vi demo.conf
fluentd -c demo.conf
- 33. Demo: query streams
some messages over fluent-cat
register queries with norikra-client
more messages over fluent-cat & norikra-client
- 35. roadmap of norikra
Norikra is still UNDER DEVELOPMENT
Norikra feature updates (JOINs, etc)
Web GUI
query & target list management
save & restore
Distributed & orchestrated nodes
- 38. JRuby and CRuby
JRuby
great partner for Java developers & Rubyists
and for JVM middleware, like Hadoop
Norikra uses Esper's internal API to
parse queries
gems across platforms?
CRuby
long-running daemons on CRuby
memory usage is a big problem