Tuning Elasticsearch
Indexing Pipeline
for Logs
Radu Gheorghe
Rafał Kuć
Who are we?
Radu Rafał
Logsene
The next hour
[Slide visual: logs, logs everywhere]
The tools
Logsene
Elasticsearch 2.0 SNAPSHOT · rsyslog 8.9.0 · Logstash 1.5 RC2
Let the games begin
Logstash
Multiple inputs
Lots of filters
Several outputs
Lots of plugins
How Logstash works
input (thread per input): file, tcp, redis, ...
filter (multiple workers): grok, geoip, ...
output (multiple workers): elasticsearch, solr, ...
Scaling Logstash
Logstash basic
input {
  syslog {
    port => 13514
  }
}

output {
  elasticsearch {
    protocol => "http"
    manage_template => false
    index => "test-index"
    index_type => "test-type"
  }
}
Logstash basic
4K events per second
~130% CPU utilization
299MB RAM used
Logstash with mutate
filter {
  mutate {
    remove_field => [ "severity", "facility", "priority", "@version", "timestamp", "host" ]
  }
}

output {
  elasticsearch {
    protocol => "http"
    manage_template => false
    index => "test-index"
    index_type => "test-type"
    flush_size => 1000
    workers => 5
  }
}
3 filter threads! (-w 3)
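For reference, a minimal sketch of how this could be launched (config file name is illustrative); in Logstash 1.5 the -w flag sets the number of filter workers:

bin/logstash -f logstash.conf -w 3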
Logstash with mutate
5K events per second
~250% CPU utilization
289MB RAM used
Logstash with grok and tcp
input {
  tcp {
    port => 13514
  }
}

filter {
  grok {
    match => [ "message", "<%{NUMBER:priority}>%{SYSLOGTIMESTAMP:date} %{DATA:hostname} %{DATA:tag} %{DATA:what}:%{DATA:number}:" ]
  }
  mutate {
    remove_field => [ "message", "@version", "@timestamp", "host" ]
  }
}
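For illustration, a hypothetical syslog-style line this grok pattern would match (all values made up):

<13>Apr 13 10:00:00 web01 app startup:42:

yielding priority=13, date=Apr 13 10:00:00, hostname=web01, tag=app, what=startup, number=42.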
Logstash with grok and tcp
8K events per second
~310% CPU utilization
327MB RAM used
Logstash with JSON lines
input {
  tcp {
    port => 13514
    codec => "json_lines"
  }
}
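A sketch of what a single event might look like on the wire for the json_lines codec (field names are illustrative) – one self-contained JSON object per line, newline-terminated, so no grok parsing is needed:

{"priority":"13","date":"Apr 13 10:00:00","hostname":"web01","tag":"app","what":"startup","number":"42"}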
Logstash with JSON lines
8K events per second
~260% CPU utilization
322MB RAM used
Rsyslog
Very fast
Very light
How rsyslog works
im* (inputs): imfile, imtcp, imjournal, ...
mm* (message modifiers): mmnormalize, mmjsonparse, ...
om* (outputs): omelasticsearch, omredis, ...
Using rsyslog
Rsyslog basic
module(load="impstats"
interval="10"
resetCounters="on"
log.file="/tmp/stats")
module(load="imtcp")
module(load="omelasticsearch")
input(type="imtcp" port="13514")
action(type="omelasticsearch"
template="plain-syslog"
searchIndex="test-index"
searchType="test-type"
bulkmode="on"
action.resumeretrycount="-1"
)
template(name="plain-syslog"
type="list") {
constant(value="{")
constant(value=""@timestamp":"") property(name="timereported" dateFormat="rfc3339")
constant(value="","host":"") property(name="hostname")
constant(value="","severity":"") property(name="syslogseverity-text")
constant(value="","facility":"") property(name="syslogfacility-text")
constant(value="","syslogtag":"") property(name="syslogtag" format="json")
constant(value="","message":"") property(name="msg" format="json")
constant(value=""}")
}
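For illustration, the kind of document the plain-syslog template would produce (values are made up):

{"@timestamp":"2015-04-13T10:00:00+02:00","host":"web01","severity":"info","facility":"user","syslogtag":"app:","message":" startup:42:"}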
*http://blog.sematext.com/2015/04/13/monitoring-rsyslogs-performance-with-impstats-and-elasticsearch
Rsyslog basic
6K events per second
~20% CPU utilization
50MB RAM used
Rsyslog queue and workers
main_queue(
  queue.size="100000"             # capacity of the main queue
  queue.dequeuebatchsize="5000"   # process messages in batches of 5K
  queue.workerthreads="4"         # 4 threads for the main queue
)

action(name="send-to-es"
       type="omelasticsearch"
       template="plain-syslog"         # use the template defined earlier
       searchIndex="test-index"
       searchType="test-type"
       bulkmode="on"                   # use the bulk API
       action.resumeretrycount="-1")   # retry indefinitely if ES is unreachable
Rsyslog queue and workers
25K events per second
~100% CPU utilization (1 core)
75MB RAM used (queue dependent)
Rsyslog + mmnormalize
module(load="mmnormalize")
action(type="mmnormalize"
ruleBase="/opt/rsyslog_rulebase.rb"
useRawMsg="on"
)
template(name="lumberjack" type="list") {
property(name="$!all-json")
}
$ cat /opt/rsyslog_rulebase.rb
rule=:<%priority:number%>%date:date-rfc3164% %host:word% %syslogtag:word% %what:char-
to:x3a%:%number:char-to:x3a%:
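The rulebase is roughly equivalent to the Logstash grok pattern shown earlier. A hypothetical line it would parse (values made up):

<13>Apr 13 10:00:00 web01 app startup:42:

after which $!all-json would serialize to something like {"priority":"13","date":"Apr 13 10:00:00","host":"web01","syslogtag":"app","what":"startup","number":"42"}.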
Rsyslog + mmnormalize
16K events per second
~200% CPU utilization
100MB RAM used (queue dependent)
Rsyslog with JSON parsing
module(load="mmjsonparse")
action(type="mmjsonparse")
Rsyslog with JSON parsing
20K events per second
~130% CPU utilization
70MB RAM used (queue dependent)
Disk-assisted queues
main_queue(
  queue.filename="main_queue"    # write to disk if needed
  queue.maxdiskspace="5g"        # when to stop writing to disk
  queue.highwatermark="200000"   # start spilling to disk at this size
  queue.lowwatermark="100000"    # stop spilling when the queue drains back to this size
  queue.saveonshutdown="on"      # write queue contents to disk on shutdown
  queue.dequeueBatchSize="5000"
  queue.workerthreads="4"
  queue.size="10000000"          # absolute maximum queue size
)
Elasticsearch
How Elasticsearch works
A document (a single doc or a JSON bulk) first goes to the transaction log, then through analysis into the inverted index on the primary shard. Elasticsearch replicates the operation to the replica at the transaction-log level, where it goes through the same analysis and indexing.
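A minimal sketch of what a bulk request to ES looks like (index and type names match the tests in this talk; the document body is made up):

curl -XPOST localhost:9200/_bulk -d '
{"index":{"_index":"test-index","_type":"test-type"}}
{"@timestamp":"2015-04-13T10:00:00Z","message":"startup"}
'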
ES horizontal scaling
[Diagram slides: an index is split into shards; as nodes are added, shards spread across them to balance the load, and each shard gets replicas hosted on other nodes]
Elasticsearch for tools tests
nothing is indexed, nothing is stored
_source disabled, _all disabled
no JVM tuning
refresh_interval: -1
translog: size 2g, interval 30m, sync 30m
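A hedged sketch of index settings and mappings that would implement most of this profile (setting names follow ES 2.x and may differ in other versions; per-field indexing tweaks are omitted):

curl -XPUT localhost:9200/test-index -d '{
  "settings": {
    "index.refresh_interval": "-1",
    "index.translog.flush_threshold_size": "2g",
    "index.translog.sync_interval": "30m"
  },
  "mappings": {
    "test-type": {
      "_source": { "enabled": false },
      "_all": { "enabled": false }
    }
  }
}'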
Tuning Elasticsearch
refresh_interval: 5s*
doc_values: true
store.throttle.max_bytes_per_sec: 200mb
*http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/
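A sketch of how these could be applied (index name is illustrative; store throttling is set cluster-wide here):

curl -XPUT localhost:9200/test-index/_settings -d '{
  "index.refresh_interval": "5s"
}'

curl -XPUT localhost:9200/_cluster/settings -d '{
  "persistent": { "indices.store.throttle.max_bytes_per_sec": "200mb" }
}'

doc_values is enabled per field in the mapping, e.g. "response_code": { "type": "string", "index": "not_analyzed", "doc_values": true }.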
Tests: hardware and data
2 x EC2 c3.large instances (2 vCPU, 3.5GB RAM, 2x16GB SSD in RAID0)
vs
[Slide visual: a flood of logs]
Apache logs
Test requests
Filters                         Aggregations
filter by client IP             date histogram
filter by word in user agent    top 10 response codes
wildcard filter on domain       # of unique IPs
                                top IPs per response, per time
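As one example from the aggregations column, a date histogram over the logs might look like this (field name and interval are assumptions):

curl -XPOST localhost:9200/test-index/_search -d '{
  "size": 0,
  "aggs": {
    "over_time": {
      "date_histogram": { "field": "@timestamp", "interval": "hour" }
    }
  }
}'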
Test runs
1. Write throughput
2. Capacity of a single index
3. Capacity with time-based indices on a hot/cold setup
Write throughput (one index)
Capacity of one index (3200 EPS)
most expensive query: ~20 seconds @ 40-50M documents
Capacity of one index (400 EPS)
most expensive query: ~15 seconds @ 40-50M documents
Time-based indices: ideal shard size
smaller indices:
  lighter indexing
  easier to isolate hot data from cold data
  easier to relocate

bigger indices:
  less RAM
  less management overhead
  smaller cluster state

without indexing, search latency was equal when dividing 32M documents into indices of 1/2/4/8/16/32M
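Time-based indices are typically driven by an index template, so each day's index picks up the same settings automatically. A minimal sketch (pattern and shard count are illustrative):

curl -XPUT localhost:9200/_template/logs -d '{
  "template": "logs-*",
  "settings": { "number_of_shards": 4 }
}'

Writers then index into logs-2015.04.13, logs-2015.04.14, and so on; old indices are dropped whole.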
Time-based. 2 hot and 2 cold nodes
Before: 3200 EPS, after: 4800 EPS
Time-based. 2 hot and 2 cold nodes
Query latency before: 15s, after: 5s
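The hot/cold split relies on shard allocation filtering: tag nodes with an attribute, pin today's index to hot nodes, then retag the index as it ages. A sketch (the attribute name "tag" is illustrative):

# elasticsearch.yml on hot nodes
node.tag: hot

# create today's index on the hot nodes
curl -XPUT localhost:9200/logs-2015.04.13 -d '{
  "settings": { "index.routing.allocation.require.tag": "hot" }
}'

# later, relocate it to the cold nodes
curl -XPUT localhost:9200/logs-2015.04.13/_settings -d '{
  "index.routing.allocation.require.tag": "cold"
}'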
That's all folks!
What to remember?
log in JSON
parallelize when possible
use time-based indices
use a hot/cold nodes policy
We are hiring
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig Logging?
Dig working with and in open source?
We’re hiring worldwide!
http://sematext.com/about/jobs.html
Thank you!
Radu Gheorghe
@radu0gheorghe
radu.gheorghe@sematext.com
Rafał Kuć
@kucrafal
rafal.kuc@sematext.com
Sematext
@sematext
http://sematext.com


Editor's Notes

  1. Rafał starts and passes mic to Radu
  2. Rafal slide – describe the talk briefly!!! Ask how many people in the audience have used the tools
  3. Radu slide – we did some tests, we’ll share configs and benchmarks. Here are the versions: Logstash 1.5 – the final version will be up soon. Rsyslog 8.9 – the current stable (note: most distros come with 5.x or 7.x). ES is a search engine based on Apache Lucene; the current version is 1.5, the next major is 2.0 with lots of changes, many related to Lucene 5.0. These are not the only tools for logging – there are many other tools, both open source and commercial, that can receive logs, parse them, buffer them and index them
  4. Rafal slide
  5. Rafal slide * Ask how many people know about Logstash
  6. Rafal slide
  7. Rafal slide
  8. Radu Assume we want to centralize syslog. Forward syslog via TCP/UDP on a port to Logstash. On the Logstash side, you can use the TCP input to listen on that port and parse syslog messages. You’d use the ES output to forward to ES – you can use a Java binary, but HTTP is better. Logstash comes with a template for the ES index, but for perf tests we’ll use our own. Specify where (index, type – like a DB and a table)
  9. Radu - 1.3 CPUs
  10. Radu – segue to tuning, pass the mic
  11. Rafal Flush size – 1000 lowered from default 5000
  12. Rafal
  13. Rafal
  14. Rafal Syslog is just TCP + Grok. We changed that and we are not parsing the syslog format exactly – we wanted to parse additional things and wanted to show how to parse unstructured data
  15. The bound was: - hardware (high CPU usage) - JSON lines codec is not parallelized, while GROK is - But if you want to do your homework you can do another run with JSON filter instead of codec and that will give the possibility of parallelization
  16. Radu Many people hate it, maybe because of docs I like it because it’s light and fast and has surprisingly rich functionality
  17. Like Logstash, it’s modular: you can use inputs to get data in, message modifiers to parse data and outputs to pass it on. The flow of data is a bit different. Inputs may have multiple threads, and they write to a main queue. On the main queue, worker threads can do filtering, format messages using templates (will talk later) and run actions (parsing/output). You can have action queues as well, with their own threads => async. You can have rulesets, which let you separate flows of input – parse – output (e.g. one ruleset for local logs, one for remote logs)
  18. Typical setup is to have it on each server, push to ES directly, buffer if necessary
  19. Load modules: impstats is for monitoring, then tcp and ES. Start the tcp listener. Template – how the JSON that we send to ES will look. Action – send to ES, using the template, specify index/type, use bulks, retry on failure
  20. Bigger memory buffer Increase bulk size Moar worker threads
  21. Not using more because ES is using the rest – Rafal will talk about that in a bit. RAM has increased because of the queue size
  22. Clear win
  23. But not really apples to apples, because rsyslog has dedicated syslog parsers. Still, it’s not only for syslog – it can parse unstructured data via mmnormalize. Refer to a rulebase, which looks much like grok patterns, with two differences: normally, patterns like number or date aren’t regexes but specific parsers – faster but less flexible (the one above is equivalent to the Logstash grok seen earlier); and it builds a parse tree on startup, which helps with speed if you have many rules
  24. Radu
  25. More throughput with less CPU usage
  26. Before moving on, one more thing: in production you probably want to use disk-assisted queues instead of in-memory queues like the ones we had here. A DA queue is an in-memory queue that can spill to disk. Specify that via a file name and give it a threshold. Spilling is smart: normally in memory; when it reaches the high watermark it starts writing to disk, but it does so in batches, and resumes to memory at the low watermark. Side benefit: it can save and reload memory queue contents when restarting rsyslog
  27. Rafal
  28. Rafał Index a document: it goes first to the transaction log, next to the inverted index. It is replicated at the transaction log level
  29. Rafał
  30. Rafał
  31. Rafał
  32. Rafał
  33. Rafał
  34. Rafał
  35. Rafał Throttling – the default is 20mb, we are using 200mb, so we are actually going for 10 times more (we are using SSD drives here)
  36. Rafał
  37. Rafał Cheaper filters and aggregations are at the top; the more expensive ones are at the bottom
  38. Radu Index as fast as we can. How much data can we put in a single index at a decent indexing rate before searches take too long? A good practice is to have time-based indices (e.g. keep logs for a week, have one per day). We want to benchmark that + separating indexing load from search load by putting today’s index on different nodes than the „old” ones
  39. Rafal Rate slowly goes down, because merges happen and because the index is slowly getting bigger
  40. Rafał 40-50M @ 20 seconds. The most expensive query takes 20 sec on average; filters (quick ones) take subseconds; some aggs take up to 5 seconds on average
  41. Rafał Spikes because of merges; the big spike is because a merge happened, and after the merge the queries are actually faster. The most expensive queries take 15 seconds
  42. Radu Want to benchmark TB indices. Because: indexing is better because of merging; searching recent data is better because the index is smaller; deleting entire indices is better. But what granularity? Use-cases for small (high indexing, small retention, CPU constraint) vs big (low indexing, high retention, memory constraint). Granularity doesn’t affect cold search perf
  43. Rafal Tell about the hot and cold setup. The drop is because the cold nodes were full
  44. Rafal
  45. Radu