Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
- 1. Elasticsearch & Docker
Rafał Kuć – Sematext Group, Inc.
@kucrafal @sematext sematext.com
- 6. You Are Probably Familiar With That
Development, Test, QA
Production environment
- 20. Containers vs Virtual Machines
Traditional Virtual Machine:
Hardware
Host Operating System
Hypervisor
Guest OS | Guest OS
Libraries | Libraries
Application 1 | Application 2
Container:
Hardware
Host Operating System
Docker Engine
Libraries | Libraries
Application 1 | Application 2
- 26. Running the Official Elasticsearch Container
$ docker run -d elasticsearch:latest
$ docker run -d --name es1 elasticsearch
$ docker run -d --name es1 -e ES_HEAP_SIZE=1g elasticsearch
$ docker run -d --name es1 elasticsearch -Dnode.name=bbuzz
- 29. Container Constraints
$ docker run -d -m 2G elasticsearch
$ docker run -d -m 2G --memory-swappiness=0 elasticsearch
$ docker run -d --cpuset-cpus="1,3" elasticsearch
$ docker run -d --cpu-period=50000 --cpu-quota=25000 elasticsearch
http://docs.docker.com/engine/reference/run/
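The --cpu-period / --cpu-quota pair expresses a share of CPU time: the quota divided by the period is the fraction of one CPU the container may use in each period, so 25000 out of 50000 is half a CPU. A minimal sketch of that arithmetic follows. The helper name cpu_quota is hypothetical, and the command is only printed, not executed.

```shell
# cpu_quota PERIOD_US PERCENT -> quota in microseconds (hypothetical helper).
cpu_quota() {
  period_us=$1
  percent=$2
  echo $(( period_us * percent / 100 ))
}

period=50000
quota=$(cpu_quota "$period" 50)

# Print the command instead of running it, so the sketch works without Docker.
echo "docker run -d --cpu-period=$period --cpu-quota=$quota elasticsearch"
```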
- 33. Constraints - Good Practices
Limit container memory
Leave room for the I/O (file system) cache when assigning memory
Limit the number of CPU cores
Keep JVM garbage collection in mind when setting limits
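One possible reading of the memory points is to give the JVM heap only part of the container limit, so the rest stays available for the file system cache. The sketch below assumes a 50% split, which is a common rule of thumb and not something the slides state. The helper name heap_mb_for_limit is hypothetical, and the command is printed rather than run.

```shell
# heap_mb_for_limit LIMIT_MB -> heap size in MB (hypothetical helper).
# Assumption: half of the container memory limit goes to the JVM heap,
# the other half is left for the OS file system cache.
heap_mb_for_limit() {
  echo $(( $1 / 2 ))
}

limit_mb=2048
heap_mb=$(heap_mb_for_limit "$limit_mb")

# Print the command instead of running it, so the sketch works without Docker.
echo "docker run -d -m ${limit_mb}m -e ES_HEAP_SIZE=${heap_mb}m elasticsearch"
```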
- 36. Creating Optimized Image
Dockerfile:
FROM elasticsearch
ADD ./elasticsearch.yml /usr/share/elasticsearch/config/
$ docker build -t bbuzz/example .
Sending build context to Docker daemon 5.12 kB
Step 1 : FROM elasticsearch
---> 1e23f30a3667
Step 2 : ADD ./elasticsearch.yml /usr/share/elasticsearch/config/
---> 015f12adfd2a
Removing intermediate container de560c6ae0d1
Successfully built 015f12adfd2a
- 40. Dealing With Network
$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch
Set network.publish_host when building your own image
Set discovery.zen.ping.unicast.hosts when building your own image
$ docker run -d --link es1 elasticsearch \
    -Ddiscovery.zen.ping.unicast.hosts=es1
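Baking the two network settings into your own image could look roughly like the sketch below. It writes an elasticsearch.yml and a Dockerfile into a scratch directory and prints the build command. The address 192.168.1.10 and the host es1 are placeholder values chosen for illustration, and the image tag bbuzz/example is reused from the image-building slide.

```shell
# Create the build context in a scratch directory, not the current one.
ctx=$(mktemp -d)

# Placeholder values: 192.168.1.10 and es1 are assumptions for this sketch.
cat > "$ctx/elasticsearch.yml" <<'EOF'
network.publish_host: 192.168.1.10
discovery.zen.ping.unicast.hosts: ["es1"]
EOF

# Same two-line Dockerfile pattern: copy the config into the image.
cat > "$ctx/Dockerfile" <<'EOF'
FROM elasticsearch
ADD ./elasticsearch.yml /usr/share/elasticsearch/config/
EOF

# Print the build command instead of running it.
echo "docker build -t bbuzz/example $ctx"
```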
- 44. Network - Good Practices
Use a separate network for the Elasticsearch cluster
Use common host names for containers
$ docker run -d -h esnode1 elasticsearch
Expose only the ports you need
Point Elasticsearch data and client nodes at the master nodes only
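Pointing data and client nodes at the masters only can be sketched as building the unicast host list from the master host names. The names esmaster1 to esmaster3 and esdata1 are hypothetical, and the sketch assumes those names resolve between containers (for example through links or a shared network). The helper join_hosts is also hypothetical, and the command is printed rather than run.

```shell
# join_hosts HOST... -> comma-separated host list (hypothetical helper).
join_hosts() {
  list=""
  for h in "$@"; do
    # Prepend a comma only when the list already has an entry.
    list="${list:+$list,}$h"
  done
  echo "$list"
}

masters=$(join_hosts esmaster1 esmaster2 esmaster3)

# A data node that lists only the masters as discovery targets.
echo "docker run -d -h esdata1 elasticsearch -Ddiscovery.zen.ping.unicast.hosts=$masters"
```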
- 48. Dealing With Storage
Data is stored in /usr/share/elasticsearch/data by default
Data is not persisted by default
$ docker run -d \
    -v /opt/elasticsearch/data:/usr/share/elasticsearch/data \
    elasticsearch
Use data-only Docker volumes
Check permissions on the host directory
- 53. Data-Only Docker Volumes
Bypass the Union File System
Can be shared between containers
Persist even if the container itself is deleted
$ docker create -v /mnt/es/data:/usr/share/elasticsearch/data \
    --name esdata elasticsearch
$ docker run --volumes-from esdata elasticsearch
Mind the permissions on the mounted directory
- 55. Highly Available Cluster
3 x Master only
6 x Data only
2 x Client only
minimum_master_nodes = N/2 + 1 (N = number of master-eligible nodes, integer division)
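The quorum formula can be checked with a few lines of shell arithmetic. This is only a sketch: the helper name min_master_nodes is hypothetical, and the result is printed as the discovery.zen.minimum_master_nodes line you would put in elasticsearch.yml.

```shell
# min_master_nodes N -> N/2 + 1 using integer division (hypothetical helper).
# N is the number of master-eligible nodes.
min_master_nodes() {
  echo $(( $1 / 2 + 1 ))
}

masters=3
echo "discovery.zen.minimum_master_nodes: $(min_master_nodes "$masters")"
```

With the three master-only nodes from the diagram, the quorum is two, so the cluster tolerates the loss of one master.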
- 56. Master Nodes & Docker
$ docker run -d elasticsearch \
    -Dnode.master=true \
    -Dnode.data=false \
    -Dnode.client=false
- 57. Client Nodes & Docker
$ docker run -d elasticsearch \
    -Dnode.master=false \
    -Dnode.data=false \
    -Dnode.client=true
- 58. Data Nodes & Docker
$ docker run -d elasticsearch \
    -Dnode.master=false \
    -Dnode.data=true \
    -Dnode.client=false
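The three role variants differ only in their flags, so a launcher script could map a role name to the flag set. The sketch below does that with a hypothetical helper called role_flags and prints the resulting commands rather than running them.

```shell
# role_flags ROLE -> the -Dnode.* flags for that role (hypothetical helper).
role_flags() {
  case "$1" in
    master) echo "-Dnode.master=true -Dnode.data=false -Dnode.client=false" ;;
    data)   echo "-Dnode.master=false -Dnode.data=true -Dnode.client=false" ;;
    client) echo "-Dnode.master=false -Dnode.data=false -Dnode.client=true" ;;
    *)      echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Print one command per role; nothing is executed.
for role in master data client; do
  echo "docker run -d elasticsearch $(role_flags "$role")"
done
```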
- 61. Multiple Tiers
curl -XPUT 'localhost:9200/data_2016-06-05' -d '{
"settings": {
"index.routing.allocation.include.tag" : "hot"
}
}'
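The include.tag setting only has an effect if nodes carry a matching tag attribute. The sketch below assumes that attribute is set with -Dnode.tag, following the -Dnode.* flag style used elsewhere in the deck, and that an older index is later moved by updating the same setting through the _settings endpoint. The tier name cold, the index name data_2016-06-04 and the helper tier_settings are hypothetical. Commands are printed, not executed.

```shell
# tier_settings TIER -> JSON body that pins an index to nodes tagged TIER
# (hypothetical helper).
tier_settings() {
  printf '{"index.routing.allocation.include.tag":"%s"}' "$1"
}

# Assumption: a node is tagged as "hot" through a custom node attribute.
echo "docker run -d elasticsearch -Dnode.tag=hot"

# Assumption: yesterday's index is moved to nodes tagged "cold".
echo "curl -XPUT 'localhost:9200/data_2016-06-04/_settings' -d '$(tier_settings cold)'"
```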
- 70. Top Metrics – Request Rate & Latency
https://sematext.com/spm/
- 76. We Are Hiring!
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig Logging?
Dig working with and in open source?
We’re hiring worldwide!
http://sematext.com/about/jobs.html
Editor's Notes
- Problems with standard deployments include:
Resources not fully utilized
Machines must be provisioned before deployment
Differences between development, test, QA and production environments
Hard to scale automatically