Elasticsearch & Docker
Rafał Kuć – Sematext Group, Inc.
@kucrafal @sematext sematext.com
Running High Performance & Fault-Tolerant Elasticsearch Clusters On Docker
Next 20 minutes
You Are Probably Familiar With That
Development Test QA
Production environment
And Problems That Come With It
Resources not utilized
Overprovisioned servers
Development ≠ Test ≠ Production
The solution
Development Test QA Production
Why Docker?
Lightweight
Based on open standards
Secure
Containers vs Virtual Machines
Traditional Virtual Machine (stack, bottom to top):
Hardware
Host Operating System
Hypervisor
Guest OS | Guest OS
Libraries | Libraries
Application 1 | Application 2
Container (stack, bottom to top):
Hardware
Host Operating System
Docker Engine
Libraries | Libraries
Application 1 | Application 2
Why Elasticsearch?
Distributed by design
Indices, Aggs, Admin, Monitor, Search, Index
Running Official Elasticsearch Container
$ docker run -d elasticsearch:latest
$ docker run -d --name es1 elasticsearch
$ docker run -d --name es1 -e ES_HEAP_SIZE=1g elasticsearch
$ docker run -d --name es1 elasticsearch -Dnode.name=bbuzz
Container Constraints
$ docker run -d -m 2G elasticsearch
$ docker run -d -m 2G --memory-swappiness=0 elasticsearch
$ docker run -d --cpuset-cpus="1,3" elasticsearch
$ docker run -d --cpu-period=50000 --cpu-quota=25000 elasticsearch
http://docs.docker.com/engine/reference/run/
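The --cpu-period and --cpu-quota pair limits the container to quota/period of one CPU's time. The sketch below only does that arithmetic for the values on the slide; cpu_share_percent is a hypothetical helper name, not a Docker command.

```shell
#!/bin/sh
# Hypothetical helper: percentage of one CPU allowed by a quota/period pair.
cpu_share_percent() {
    period="$1"   # value given to --cpu-period (microseconds)
    quota="$2"    # value given to --cpu-quota (microseconds)
    echo $(( quota * 100 / period ))
}

# Values from the slide: quota 25000 out of period 50000.
cpu_share_percent 50000 25000
```

With the slide's values the container may use at most half of one CPU in each scheduling period.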
Constraints - Good Practices
Limit container memory
Account for I/O cache when giving memory
Limit the number of CPU cores
Keep JVM garbage collection in mind
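One way to account for the I/O cache is to size the heap well below the container memory limit. The sketch below assumes a half-and-half split, which is a common rule of thumb and not something the slides state; heap_for_limit_mb is a hypothetical helper, and the command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: suggest a heap size (MB) for a container limit (MB).
# Assumption: half of the limit goes to the JVM heap, the rest is left
# for the operating system's I/O cache.
heap_for_limit_mb() {
    echo $(( $1 / 2 ))
}

limit_mb=2048
heap_mb=$(heap_for_limit_mb "$limit_mb")

# Print the resulting command so it can be reviewed before it is run.
echo "docker run -d -m ${limit_mb}m -e ES_HEAP_SIZE=${heap_mb}m elasticsearch"
```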
Creating Optimized Image
Dockerfile:
FROM elasticsearch
ADD ./elasticsearch.yml /usr/share/elasticsearch/config/
$ docker build -t bbuzz/example .
Sending build context to Docker daemon 5.12 kB
Step 1 : FROM elasticsearch
---> 1e23f30a3667
Step 2 : ADD ./elasticsearch.yml /usr/share/elasticsearch/config/
---> 015f12adfd2a
Removing intermediate container de560c6ae0d1
Successfully built 015f12adfd2a
Dealing With Network
$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch
Add network.publish_host when building your own container
Add discovery.zen.ping.unicast.hosts when building your own container
$ docker run -d --link es1 elasticsearch \
    -Ddiscovery.zen.ping.unicast.hosts=es1
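When building your own image, both settings can live in the elasticsearch.yml that gets added to the image. A minimal sketch follows; the address and the host names es1 and es2 are placeholders chosen for illustration, and the file is written to a temporary directory so the snippet is safe to run anywhere.

```shell
#!/bin/sh
# Write a minimal elasticsearch.yml for a custom image.
# The address and host names below are placeholders, not values from the talk.
conf_dir=$(mktemp -d)
cat > "$conf_dir/elasticsearch.yml" <<'EOF'
network.publish_host: 192.0.2.10
discovery.zen.ping.unicast.hosts: ["es1", "es2"]
EOF

# Show the generated file.
cat "$conf_dir/elasticsearch.yml"
```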
Network - Good Practices
Separate network for Elasticsearch cluster
Common host names for containers
$ docker run -d -h esnode1 elasticsearch
Expose only needed ports
Elasticsearch data & client nodes point to masters only
Dealing With Storage
By default in /usr/share/elasticsearch/data
By default not persisted
$ docker run -d \
    -v /opt/elasticsearch/data:/usr/share/elasticsearch/data \
    elasticsearch
Use data only Docker volumes
Permissions
Data Only Docker Volumes
Bypasses Union File System
Can be shared between containers
Data volumes persist if the container itself is deleted
$ docker create -v /mnt/es/data:/usr/share/elasticsearch/data \
    --name esdata elasticsearch
$ docker run --volumes-from esdata elasticsearch
Highly Available Cluster
Master only × 3
Data only × 6
Client only × 2
discovery.zen.minimum_master_nodes = N/2 + 1 (N = number of master-eligible nodes, using integer division)
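The N/2 + 1 rule uses integer division, which shell arithmetic does natively. The sketch below computes the quorum for the three master-only nodes in the diagram; min_master_nodes is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: quorum for N master-eligible nodes.
# $(( )) performs integer division, so 3 / 2 evaluates to 1.
min_master_nodes() {
    echo $(( $1 / 2 + 1 ))
}

# Three master-only nodes, as in the cluster diagram.
min_master_nodes 3
```

With three masters the quorum is two, so the cluster tolerates the loss of one master node.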
Master Nodes & Docker
$ docker run -d elasticsearch \
    -Dnode.master=true \
    -Dnode.data=false \
    -Dnode.client=false
Client Nodes & Docker
$ docker run -d elasticsearch \
    -Dnode.master=false \
    -Dnode.data=false \
    -Dnode.client=true
Data Nodes & Docker
$ docker run -d elasticsearch \
    -Dnode.master=false \
    -Dnode.data=true \
    -Dnode.client=false
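The three role combinations differ only in their flag values, so they can be generated from a role name. This is a sketch only: node_role_flags is a hypothetical helper, the flags are copied from the slides, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: map a role name to the flags shown on the slides.
node_role_flags() {
    case "$1" in
        master) echo "-Dnode.master=true -Dnode.data=false -Dnode.client=false" ;;
        data)   echo "-Dnode.master=false -Dnode.data=true -Dnode.client=false" ;;
        client) echo "-Dnode.master=false -Dnode.data=false -Dnode.client=true" ;;
        *)      echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Print one command per role for review.
for role in master data client; do
    echo "docker run -d elasticsearch $(node_role_flags "$role")"
done
```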
Multiple Tiers
node.tag=hot node.tag=cold node.tag=cold
Multiple Tiers
$ docker run -d elasticsearch -Dnode.tag=hot
Multiple Tiers
curl -XPUT 'localhost:9200/data_2016-06-05' -d '{
  "settings": {
    "index.routing.allocation.include.tag": "hot"
  }
}'
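An index created on the hot tier can later be moved by updating the same allocation setting through the index settings endpoint. The sketch below builds the request body and prints the curl command instead of sending it; allocation_settings is a hypothetical helper, and the host and index name are the ones used on the slide.

```shell
#!/bin/sh
# Hypothetical helper: JSON body that pins an index to nodes with a given tag.
allocation_settings() {
    printf '{"index.routing.allocation.include.tag":"%s"}' "$1"
}

# Print the request that would move the index to the cold tier.
echo "curl -XPUT 'localhost:9200/data_2016-06-05/_settings' -d '$(allocation_settings cold)'"
```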
Multiple Tiers
node.tag=hot node.tag=cold node.tag=cold
data_2016-06-05
data_2016-06-05
Top Metrics – Health & Shards
https://sematext.com/spm/
Top Metrics – CPU
Top Metrics – Memory Usage
Top Metrics – I/O Usage
Top Metrics – JVM Heap
Top Metrics – Garbage Collector
Top Metrics – Request Rate & Latency
Top Metrics – Caches
Top Metrics – Indexing
Top Metrics – Refresh Time
Top Metrics – Merge Time
Short summary
We Are Hiring!
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig Logging?
Dig working with and in open source?
We’re hiring worldwide!
http://sematext.com/about/jobs.html
Rafał Kuć
@kucrafal
rafal.kuc@sematext.com
Sematext
@sematext
http://sematext.com
http://blog.sematext.com
Thank You!

Editor's Notes

  1. Problems with standard deployment: resources are not utilized; machines need to be provisioned before deployment; development, test, QA and production environments differ; automatic scaling is hard.