OCTOBER 11-14, 2016 • BOSTON, MA
Tuning Solr and its Pipeline for Logs
Rafał Kuć and Radu Gheorghe
Software Engineers, Sematext Group, Inc.
Agenda
Designing a Solr(Cloud) cluster for time-series data
Solr and operating system knobs to tune
Pipeline patterns and shipping options
Time-based collections, the single best improvement
[diagram: one collection per day (14.10, 15.10, ... 21.10); indexing always goes to the newest one]
Less merging ⇒ faster indexing
Quick deletes (of whole collections)
Search only some collections
Better use of caches
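Here is a minimal sketch of what time-based collections look like from the indexing side, using the Collections API over HTTP; the host, shard counts, configset name ("logs") and collection naming scheme are all assumptions for the example:

```python
import datetime
import requests

SOLR = "http://localhost:8983/solr"   # assumed SolrCloud node

def todays_collection():
    """Name collections after the day they cover, e.g. logs_14.10."""
    return "logs_" + datetime.date.today().strftime("%d.%m")

def create_collection(name):
    """Create a collection via the Collections API (configset name is assumed)."""
    requests.get(SOLR + "/admin/collections", params={
        "action": "CREATE",
        "name": name,
        "numShards": 2,
        "replicationFactor": 1,
        "collection.configName": "logs",
    }).raise_for_status()

def index_logs(docs):
    """Index a batch of log events; writes only ever touch the newest collection."""
    requests.post(SOLR + "/" + todays_collection() + "/update",
                  json=docs, params={"commitWithin": 10000}).raise_for_status()

# e.g. a nightly job pre-creates tomorrow's collection, while the indexer keeps writing to today's:
# create_collection("logs_15.10")
# index_logs([{"timestamp": "2016-10-14T12:00:00Z", "message": "hello"}])
```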
Load is uneven
You need to “rotate” collections fast enough to cope with spikes like Black Friday (otherwise indexing and queries will be slow), but then the Saturday/Sunday collections will be tiny
[chart: indexing volume spiking on Black Friday, much lower on Saturday and Sunday]
If load is uneven, daily/monthly/etc. indices are suboptimal: you either have poor performance or too many collections
Octi* is worried:
* this is Octi →
Solution: rotate by size
[diagram: indexing goes to logs01 until it hits the size limit, then to logs02, and so on (logs08...)]
Predictable indexing and search performance
Fewer shards
Dealing with size-based collections
[diagram: the app keeps stats on which dates each collection ends up covering (2016-10-11, 2016-10-12, 2016-10-13, 2016-10-14, ... 2016-10-18) and caches the results; for logs08..., the latest collection, the end date doesn’t matter]
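A rough sketch of the rotation logic, assuming a single-shard collection per logsNN, a 5GB limit, an alias called logs_latest, and that index size is read from the Core Admin STATUS response; all of those are illustrative choices, not part of the talk:

```python
import requests

SOLR = "http://localhost:8983/solr"   # assumed node; single shard/replica to keep the sketch short
SIZE_LIMIT = 5 * 1024 ** 3             # illustrative rotation threshold (5GB)

def collection_size(collection):
    """Sum sizeInBytes over this node's cores belonging to the collection (Core Admin STATUS)."""
    status = requests.get(SOLR + "/admin/cores",
                          params={"action": "STATUS", "wt": "json"}).json()["status"]
    return sum(core["index"]["sizeInBytes"]
               for name, core in status.items() if name.startswith(collection))

def rotate(current):
    """Create the next logsNN and repoint the logs_latest alias, so writers don't track names."""
    nxt = "logs%02d" % (int(current[len("logs"):]) + 1)
    requests.get(SOLR + "/admin/collections", params={
        "action": "CREATE", "name": nxt, "numShards": 1,
        "replicationFactor": 1, "collection.configName": "logs"})
    requests.get(SOLR + "/admin/collections", params={
        "action": "CREATEALIAS", "name": "logs_latest", "collections": nxt})
    return nxt

# checked periodically by the indexer:
# if collection_size("logs01") > SIZE_LIMIT:
#     rotate("logs01")
```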
Octi concludes: size-based collections handle spiky load better
Tiered cluster (a.k.a. hot-cold)
[diagram: hot01 holds the current collection (14 Oct) and handles indexing and most searches ⇒ needs good CPU and IO*; cold01 and cold02 hold the older collections (10-13 Oct) and handle longer-running (+cached) searches ⇒ need heap, plus decent IO for replication & backup]
* Ideally local SSD; avoid network storage unless it’s really good
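One way to drive the hot/cold split through the Collections API: pin new collections to the hot node with createNodeSet, then add a replica on a cold node and drop the hot one when the collection ages out. Node names, the replica name and the single-shard layout are assumptions; on a real cluster you'd look the replica name up via CLUSTERSTATUS first:

```python
import requests

SOLR = "http://localhost:8983/solr"        # assumed
HOT_NODES = "hot01:8983_solr"              # assumed node names
COLD_NODE = "cold01:8983_solr"

def create_hot_collection(name):
    """Today's collection lives only on the hot tier."""
    requests.get(SOLR + "/admin/collections", params={
        "action": "CREATE", "name": name, "numShards": 1,
        "replicationFactor": 1, "collection.configName": "logs",
        "createNodeSet": HOT_NODES})

def move_to_cold(name, shard="shard1", hot_replica="core_node1"):
    """Copy the shard to a cold node, then delete the replica on the hot node."""
    requests.get(SOLR + "/admin/collections", params={
        "action": "ADDREPLICA", "collection": name, "shard": shard, "node": COLD_NODE})
    requests.get(SOLR + "/admin/collections", params={
        "action": "DELETEREPLICA", "collection": name, "shard": shard, "replica": hot_replica})

# at rollover: create_hot_collection("logs_15.10"); move_to_cold("logs_14.10")
```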
Octi likes tiered clusters
Costs: you can use different hardware for different workloads
Performance (see costs): fewer shards, less overhead
Isolation: long-running searches don’t slow down indexing
AWS specifics
Hot tier:
c3 (compute optimized) + EBS and use local SSD as cache*
c4 (EBS only)
Cold tier:
d2 (big local HDDs + lots of RAM)
m4 (general purpose) + EBS
i2 (big local SSDs + lots of RAM)
General stuff:
EBS optimized
Enhanced Networking
VPC (to get access to c4&m4 instances)
* Use --cachemode writeback for async writing:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/lvm_cache_volume_creation.html
EBS
PIOPS is best but expensive; HDD is too slow (unless cold=icy) ⇒ general purpose SSDs
Stay under 3TB. More, smaller (<1TB) drives in RAID0 give better but shorter IOPS bursts
Performance isn’t guaranteed ⇒ RAID0 will wait for the slowest disk
Check limits (e.g. 160MB/s per drive, instance-dependent IOPS and network):
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
General purpose SSDs give 3 IOPS/GB up to 10K IOPS (~3TB), up to 256KB per IO, and merge up to 4 consecutive IOs
Octi’s AWS top picks
c4s for the hot tier: cheaper than c3s with similar performance
m4s for the cold tier: well balanced, scale up to 10xl, flexible storage via EBS
EBS drives under 3TB; otherwise avoid RAID0: higher chance of a performance drop
Scratching the surface of OS options
Say no to swap
Disk scheduler: CFQ for HDD, deadline for SSD
Mount options: noatime, nodiratime, data=writeback, nobarrier (because strict ordering is for the weak)
For bare metal: check CPU governor and THP*
* THP is often enabled, but /proc/sys/vm/nr_hugepages is 0
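A small diagnostic sketch for checking these knobs on a Linux box; it only reads the usual procfs/sysfs locations, and the sd* device glob is an assumption (adjust for NVMe etc.):

```python
import glob

def read(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return "n/a"

# swap: ideally no devices beyond the header line in /proc/swaps
print("swap devices:", max(0, len(read("/proc/swaps").splitlines()) - 1))

# I/O scheduler per disk (CFQ for HDD, deadline for SSD); the active one is in [brackets]
for path in glob.glob("/sys/block/sd*/queue/scheduler"):
    print(path, "->", read(path))

# mount options: look for noatime, nodiratime, data=writeback, nobarrier
for line in read("/proc/mounts").splitlines():
    parts = line.split()
    if len(parts) >= 4 and parts[2] in ("ext4", "xfs"):
        print(parts[1], parts[3])   # mount point and its options

# CPU governor and transparent huge pages (mostly relevant on bare metal)
print("governor:", read("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"))
print("THP:", read("/sys/kernel/mm/transparent_hugepage/enabled"))
print("nr_hugepages:", read("/proc/sys/vm/nr_hugepages"))
```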
Schema and solrconfig
Auto soft commit (5s?)
Auto commit (few minutes?)
RAM buffer size + Max buffered docs
Doc values for faceting + retrieving those fields (stored=false)
Omit norms, frequencies and positions
Don’t store catch-all field(s)
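These all live in solrconfig.xml and the schema. As a hedged sketch, the update-handler ones can also be set at runtime through the Config API, assuming they are among the editable properties on your Solr version; the values are starting points, not recommendations:

```python
import requests

SOLR = "http://localhost:8983/solr"   # assumed
COLLECTION = "logs01"                  # assumed

# soft commits control visibility, hard commits truncate the transaction log
requests.post(SOLR + "/" + COLLECTION + "/config", json={
    "set-property": {
        "updateHandler.autoSoftCommit.maxTime": 5000,       # new docs searchable within ~5s
        "updateHandler.autoCommit.maxTime": 180000,         # hard commit every few minutes
        "updateHandler.autoCommit.openSearcher": False,
        "updateHandler.indexWriter.ramBufferSizeMB": 256,   # larger RAM buffer, fewer flushes
    }
}).raise_for_status()

# the schema side (docValues on facet fields with stored=false, omitting norms/frequencies/
# positions, not storing the catch-all field) is edited in the schema, e.g. via the Schema API
```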
Relaxing the merge policy*
Merges are the heaviest part of indexing
Facets are the heaviest part of searching
Facets (except method=enum) depend on data size more than # of segments
Knobs:
segmentsPerTier: more segments ⇒ less merging
maxMergeAtOnce < segmentsPerTier, to smooth out those IO spikes
maxMergedSegmentMB: lower it to merge more small segments and fewer big ones ⇒ fewer open files
* unless you only do “grep”. YMMV, of course. Keep an eye on open files, though
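On recent Solr versions these knobs sit under indexConfig in solrconfig.xml, typically on the TieredMergePolicy factory; here is a sketch of the fragment with illustrative values only (the talk's tests used 50/50 and a 500MB max segment):

```python
# Paste under <indexConfig> in solrconfig.xml; values here are illustrative, not recommendations.
MERGE_POLICY_FRAGMENT = """
<mergePolicyFactory class="solr.TieredMergePolicyFactory">
  <int name="segmentsPerTier">50</int>
  <int name="maxMergeAtOnce">50</int>
  <double name="maxMergedSegmentMB">512</double>
</mergePolicyFactory>
"""
print(MERGE_POLICY_FRAGMENT)
```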
Some numbers: more segments, more throughput (10-15% here)
[chart: indexing throughput with 10 segmentsPerTier / 10 maxMergeAtOnce vs. 50 segmentsPerTier / 50 maxMergeAtOnce; throughput eventually drops, so you need to rotate the collection before that drop]
Lower max segment size (500MB instead of the 5GB default)
[charts: less CPU and fewer segments]
There’s more...
SPM screenshots from all tests + JMeter test plan here:
https://github.com/sematext/lucene-revolution-samples/tree/master/2016/solr_logging
We’d love to hear about your own results!
correct spelling: sematext.com/spm
Octi’s conclusions so far
Increasing segments per tier while decreasing max merged segment (by an order of magnitude) makes indexing better and search latency less spiky
Optimize I/O and CPU by not optimizing (i.e. forceMerge)
Unless you have spare CPU & IO (why would you?)
And unless you run out of open files
Only do this on “old” indices!
Optimizing the pipeline*
[diagram: logs → log shipper(s) → Solr; per shipper you decide which protocol(s) to ship with, whether to buffer, whether to route to other destinations, and whether to parse]
Or log to Solr directly from the app (i.e. implement a new, embedded log shipper)
* for availability and performance/costs
A case for buffers
Performance: buffers allow batches and threads
Availability: buffers absorb load when Solr is down or can’t keep up
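A toy sketch of the idea (not any particular shipper): the app appends to an in-memory queue and a background thread ships batches to Solr over HTTP, retrying while Solr is down or can't keep up; the endpoint, batch size and queue size are made up:

```python
import queue
import threading
import time
import requests

SOLR_UPDATE = "http://localhost:8983/solr/logs01/update"   # assumed
buf = queue.Queue(maxsize=100000)   # the buffer: absorbs spikes and short outages

def log_event(doc):
    """Called by the app; never waits on Solr."""
    try:
        buf.put_nowait(doc)
    except queue.Full:
        pass   # "decide what gives": here we drop; we could block instead

def shipper():
    """Single background consumer: drains the buffer in batches, retries until Solr accepts."""
    while True:
        batch = [buf.get()]
        while len(batch) < 1000 and not buf.empty():
            batch.append(buf.get_nowait())
        while True:
            try:
                requests.post(SOLR_UPDATE, json=batch,
                              params={"commitWithin": 10000}, timeout=10).raise_for_status()
                break
            except requests.RequestException:
                time.sleep(5)   # Solr down or slow: back off, the batch stays in memory

threading.Thread(target=shipper, daemon=True).start()
```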
Types of buffers
Disk*, memory or a combination
On the logging host (file or local log shipper): easy scaling, fewer moving parts; often requires a lightweight shipper
Centralized (Kafka/Redis/etc. or a central log shipper): extra features (e.g. TTL); one place for changes
* doesn’t mean it fsync()s for every message
Multiple destinations
[diagram: input → buffer → processing → outputs (Solr*); outputs need to be in sync, and processing may cause backpressure]
* or flat files, or S3 or...
Multiple destinations
[diagram: inputs write to a central buffer; each destination (Solr, HDFS) has its own processing and tracks its own offset]
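A sketch of that decoupling with Kafka as the central buffer, assuming the kafka-python client and a topic named logs: each destination reads with its own group id, so it keeps its own committed offset and a slow output doesn't hold the other back:

```python
from kafka import KafkaConsumer   # assumed client: kafka-python

def run_output(group_id, handle):
    """Each destination is its own consumer group ⇒ its own committed offset."""
    consumer = KafkaConsumer("logs",
                             bootstrap_servers="kafka01:9092",   # assumed broker
                             group_id=group_id,
                             enable_auto_commit=True)
    for record in consumer:
        handle(record.value)   # e.g. batch and post to Solr, or write to HDFS

# run_output("solr-output", send_to_solr)    # hypothetical handlers, run in separate processes
# run_output("hdfs-output", write_to_hdfs)
```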
Octi’s pipeline preferences
Just Solr and maybe flat files? Go simple with a local shipper
Custom, fast-changing processing & multiple destinations? Kafka as a central buffer
Parsing unstructured data
Ideally, log in JSON* (for performance and maintenance, i.e. no need to update parsing rules)
Otherwise, parse:
Regex-based (e.g. grok): easy to build rules, and rules are flexible; typically slow & O(n) on # of rules, but:
move matching patterns to the top of the list
move broad patterns to the bottom
skip patterns that include others that didn’t match
Grammar-based (e.g. liblognorm, PatternDB): faster, O(1) on # of rules
* or another serialization format
Numbers in our 2015 session:
sematext.com/blog/2015/10/16/large-scale-log-analytics-with-solr/
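For illustration, a minimal regex-based parse (the grok-style approach) that turns a syslog-ish line into the JSON you would rather have logged in the first place; the pattern and field names are made up for the example:

```python
import json
import re

# ordered list of patterns: frequently-matching ones first, broad catch-alls last
PATTERNS = [
    re.compile(r"(?P<timestamp>\S+) (?P<host>\S+) (?P<severity>\w+) (?P<message>.*)"),
    re.compile(r"(?P<message>.*)"),
]

def parse(line):
    for pattern in PATTERNS:
        match = pattern.match(line)
        if match:
            return match.groupdict()
    return {"message": line}

print(json.dumps(parse("2016-10-14T12:00:00Z web01 ERROR something broke")))
# {"timestamp": "2016-10-14T12:00:00Z", "host": "web01", "severity": "ERROR", "message": "something broke"}
```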
Decide what gives when buffers fill up
[diagram: input → shipper; the shipper can drop data here, but it’s better to buffer; when the shipper’s buffer is full, the app can block or drop data]
Check:
Local files: what happens when files are rotated/archived/deleted?
UDP: network buffers should handle spiky load*
TCP: what happens when connection breaks/times out?
UNIX sockets: what happens when socket blocks writes**?
* you’d normally increase net.core.rmem_max and rmem_default
** both DGRAM and STREAM local sockets are reliable (vs Internet ones, UDP and TCP)
Octi’s flow chart of where to log
[flow chart:
critical? no ⇒ UDP. Increase network buffers on the destination, so it can handle spiky traffic
critical? yes ⇒ paying with RAM or IO?
RAM ⇒ UNIX socket. Local shipper with memory buffers, that can drop data if needed
IO ⇒ local files. Make sure rotation is in place or you’ll run out of disk!]
Protocols
UDP: cool for the app (no failure/backpressure handling needed), but not reliable
TCP: more reliable, but not completely: the app gets an ACK when the OS buffer gets the data ⇒ no retransmit if that buffer is lost*
Application-level ACKs may be needed:
[diagram: sender ⇄ receiver, with ACKs flowing back]
Protocol  Example shippers
HTTP      Logstash, rsyslog, Fluentd
RELP      rsyslog, Logstash
Beats     Filebeat, Logstash
Kafka     Fluentd, Filebeat, rsyslog, Logstash
* more at blog.gerhards.net/2008/05/why-you-cant-build-reliable-tcp.html
Octi’s top pipeline+shipper combos
[diagram 1: app → UNIX socket → rsyslog (memory+disk buffer, can drop) → HTTP]
[diagram 2: app → file → Filebeat → Kafka → consumer → HTTP; simple & reliable]
Conclusions, questions, we’re hiring, thank you
The whole talk was pretty much only conclusions :)
Feels like there’s much more to discover. Please test & share your own nuggets
http://www.relatably.com/m/img/oh-my-god-memes/oh-my-god.jpg
Scary word, ha? Poke us:
@kucrafal @radu0gheorghe @sematext
...or @ our booth here
