SlideShare a Scribd company logo
Introduction to Elastic
Search, Logstash, Kibana
and Beats
ELK Stack
Agenda
ELK Stack
STORY
Demo
STORY
Logs, Logs And Logs
Why LogsAre Important?!
● Application Monitoring
● Security Analysis
● Troubleshooting
Different Formats
Problems
● Non-consistent Log Format
● Expert Knowledge Needed
● Number Of Resources
● Running Analysis On Different Formats.
“ Can You Get POST Requests To Login
Endpoint Last Night !! “
Elastic stack Presentation
ELK Solution
The ELK Stack is a collection of four main components:
Elasticsearch, Logstash, Kibana and beats- From Elastic.
Beats
The Beats are open source data shippers that you install as agents on your servers to
send different types of operational data to Elasticsearch. Beats can send data directly to
Elasticsearch or send it to Elasticsearch via Logstash, which you can use to parse and
transform the data.
Beats
Logstash
Logstash is an open source, server-side data processing pipeline that ingests
data from a multitude of sources simultaneously, transforms it, and then sends it
to your favorite "stash."
127.0.0.1 - -
[05/Feb/2014:17:11:55 +0000]
"GET /css/main.css HTTP/1.1"
200 140
"http://www.onet.pl"
"Mozilla/5.0 (Windows NT 6.0;
WOW64; rv:2.0.1)
Gecko/20100101 Firefox/4.0.1"
{
"host" : "127.0.0.1",
"@timestamp" : "2014-02-
05T17:11:55+0000",
...
"verb" : "GET"
}
Logstash
02-beats-input.conf
input {
twitter {
consumer_key => "***********"
consumer_secret => "***********"
oauth_token => "***********"
oauth_token_secret => "***********"
keywords => ["Big Data", "Blockchain","AI", "ML"]
full_tweet => true
ignore_retweets => true
languages => ["en", "ar"]
}
}
Logstash
10-sys-filter.conf
filter {
grok {
match => {
"message" => [“..GROK..”]
}
}
if "_grokparsefailure" in [tags] { drop {} }
#Predefined date_time
date {
match => [ "date_time", "ISO8601","YYYY-MM-dd HH:mm:ss" ]
target => "date_time"
locale => "en"
}
mutate {
remove_field => [ "message","host","@version","path","tags","@timestamp" ]
}
}
Logstash
30-elasticsearch-output.conf
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "twitter-feed"
document_type => "tweets"
template => "/etc/elasticsearch/twitter_example/twitter.json"
template_name => "twitter_elastic_example"
template_overwrite => true
}
}
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine
capable of addressing a growing number of use cases. As the heart
of the Elastic Stack, it centrally stores your data so you can discover
the expected and uncover the unexpected.
NoSQL(Elastic) Index Type Field Document
MYSQL Database Table Column Row
Elasticsearch
Inverted Index
Elasticsearch
Node Types
Master Data Coordinator Ingest ML
Elasticsearch
Master Node
Master
The master node is responsible for lightweight cluster-wide
actions such as creating or deleting an index, tracking which
nodes are part of the cluster, and deciding which shards to
allocate to which nodes. It is important for cluster health to have a
stable master node.
Elasticsearch
Data Node
Data
Data nodes hold the shards that contain the documents you
have indexed. Data nodes handle data related operations like
CRUD, search, and aggregations. These operations are I/O-,
memory-, and CPU-intensive. It is important to monitor
these resources and to add more data nodes if they are
overloaded.
Elasticsearch
Coordinating Only Node
Coordinator
If you take away the ability to be able to handle master
duties, to hold data, and pre-process documents, then
you are left with a coordinating node that can only
route requests, handle the search reduce phase, and
distribute bulk indexing. Essentially, coordinating only
nodes behave as smart load balancers.
Elasticsearch
Ingest Node
Ingest
Ingest nodes can execute pre-processing pipelines,
composed of one or more ingest processors. Depending
on the type of operations performed by the ingest
processors and the required resources, it may make
sense to have dedicated ingest nodes, that will only
perform this specific task.
Elasticsearch
elasticsearch.yml
# ======================== Elasticsearch Configuration =========================
# ---------------------------------- Cluster -----------------------------------
#cluster.name: my-application
# ------------------------------------ Node ------------------------------------
#node.name: node-1
#node.attr.rack: r1
# ----------------------------------- Paths ------------------------------------
# Path to directory where to store the data (separate multiple locations by comma):
path.data: /var/lib/elasticsearch
# Path to log files:
path.logs: /var/log/elasticsearch
#---------------------------------- Network -----------------------------------
# Set the bind address to a specific IP (IPv4 or IPv6):
network.host: localhost
http.port: 9200
# --------------------------------- Discovery ----------------------------------
# The default list of hosts is ["127.0.0.1", "[::1]"]
#discovery.seed_hosts: ["host1", "host2"]
#cluster.initial_master_nodes: ["node-1", "node-2"]
-------- another --------
node.master: true
node.data: false
node.ingest: false
node.ml: false
xpack.ml.enabled: true
Elasticsearch
Data Access Via REST
Create curl –XPOST “http://localhost:9200/<index>/<type>/<id>”
Read curl –XGET “http://localhost:9200/<index>/<type>/<id>”
Update curl –XPUT “http://localhost:9200/<index>/<type>/<id>”
Delete curl –XDELETE “http://localhost:9200/<index>/<type>/<id>”
Kibana
kibana lets you visualize your Elasticsearch data and navigate the
Elastic Stack so you can do anything from tracking query load to
understanding the way requests flow through your apps.
Overall Architecture
Overall Architecture
Overall Architecture
VS
● SPL
● Schema On Read
● Memory efficient
● Standalone
Implementation
● Pay As You Go
● Enterprise
● KPL
● Schema On Write
● Memory Inefficient
● 3 or 4 Independent
Components
● Infrastructure Cost
● Medium Application
VS
DEMO
Journey
Any Question !?
Contacts
Email: Amr.Alaa@barqsystems.com
Linkedin: https://www.linkedin.com/in/amr-alaa-yassen-609785108/
Github: https://github.com/AmrAlaaYassen

More Related Content

Elastic stack Presentation

  • 1. Introduction to Elastic Search, Logstash, Kibana and Beats ELK Stack
  • 4. Why LogsAre Important?! ● Application Monitoring ● Security Analysis ● Troubleshooting
  • 6. Problems ● Non-consistent Log Format ● Expert Knowledge Needed ● Number Of Resources ● Running Analysis On Different Formats. “ Can You Get POST Requests To Login Endpoint Last Night !! “
  • 8. ELK Solution The ELK Stack is a collection of four main components: Elasticsearch, Logstash, Kibana and beats- From Elastic.
  • 9. Beats The Beats are open source data shippers that you install as agents on your servers to send different types of operational data to Elasticsearch. Beats can send data directly to Elasticsearch or send it to Elasticsearch via Logstash, which you can use to parse and transform the data.
  • 10. Beats
  • 11. Logstash Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite "stash." 127.0.0.1 - - [05/Feb/2014:17:11:55 +0000] "GET /css/main.css HTTP/1.1" 200 140 "http://www.onet.pl" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1" { "host" : "127.0.0.1", "@timestamp" : "2014-02- 05T17:11:55+0000", ... "verb" : "GET" }
  • 12. Logstash 02-beats-input.conf input { twitter { consumer_key => "***********" consumer_secret => "***********" oauth_token => "***********" oauth_token_secret => "***********" keywords => ["Big Data", "Blockchain","AI", "ML"] full_tweet => true ignore_retweets => true languages => ["en", "ar"] } }
  • 13. Logstash 10-sys-filter.conf filter { grok { match => { "message" => [“..GROK..”] } } if "_grokparsefailure" in [tags] { drop {} } #Predefined date_time date { match => [ "date_time", "ISO8601","YYYY-MM-dd HH:mm:ss" ] target => "date_time" locale => "en" } mutate { remove_field => [ "message","host","@version","path","tags","@timestamp" ] } }
  • 14. Logstash 30-elasticsearch-output.conf output { elasticsearch { hosts => ["localhost:9200"] index => "twitter-feed" document_type => "tweets" template => "/etc/elasticsearch/twitter_example/twitter.json" template_name => "twitter_elastic_example" template_overwrite => true } }
  • 15. Elasticsearch Elasticsearch is a distributed, RESTful search and analytics engine capable of addressing a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected. NoSQL(Elastic) Index Type Field Document MYSQL Database Table Column Row
  • 17. Elasticsearch Node Types Master Data Coordinator Ingest ML
  • 18. Elasticsearch Master Node Master The master node is responsible for lightweight cluster-wide actions such as creating or deleting an index, tracking which nodes are part of the cluster, and deciding which shards to allocate to which nodes. It is important for cluster health to have a stable master node.
  • 19. Elasticsearch Data Node Data Data nodes hold the shards that contain the documents you have indexed. Data nodes handle data related operations like CRUD, search, and aggregations. These operations are I/O-, memory-, and CPU-intensive. It is important to monitor these resources and to add more data nodes if they are overloaded.
  • 20. Elasticsearch Coordinating Only Node Coordinator If you take away the ability to be able to handle master duties, to hold data, and pre-process documents, then you are left with a coordinating node that can only route requests, handle the search reduce phase, and distribute bulk indexing. Essentially, coordinating only nodes behave as smart load balancers.
  • 21. Elasticsearch Ingest Node Ingest Ingest nodes can execute pre-processing pipelines, composed of one or more ingest processors. Depending on the type of operations performed by the ingest processors and the required resources, it may make sense to have dedicated ingest nodes, that will only perform this specific task.
  • 22. Elasticsearch elasticsearch.yml # ======================== Elasticsearch Configuration ========================= # ---------------------------------- Cluster ----------------------------------- #cluster.name: my-application # ------------------------------------ Node ------------------------------------ #node.name: node-1 #node.attr.rack: r1 # ----------------------------------- Paths ------------------------------------ # Path to directory where to store the data (separate multiple locations by comma): path.data: /var/lib/elasticsearch # Path to log files: path.logs: /var/log/elasticsearch #---------------------------------- Network ----------------------------------- # Set the bind address to a specific IP (IPv4 or IPv6): network.host: localhost http.port: 9200 # --------------------------------- Discovery ---------------------------------- # The default list of hosts is ["127.0.0.1", "[::1]"] #discovery.seed_hosts: ["host1", "host2"] #cluster.initial_master_nodes: ["node-1", "node-2"] -------- another -------- node.master: true node.data: false node.ingest: false node.ml: false xpack.ml.enabled: true
  • 23. Elasticsearch Data Access Via REST Create curl –XPOST “http://localhost:9200/<index>/<type>/<id>” Read curl –XGET “http://localhost:9200/<index>/<type>/<id>” Update curl –XPUT “http://localhost:9200/<index>/<type>/<id>” Delete curl –XDELETE “http://localhost:9200/<index>/<type>/<id>”
  • 24. Kibana kibana lets you visualize your Elasticsearch data and navigate the Elastic Stack so you can do anything from tracking query load to understanding the way requests flow through your apps.
  • 28. VS ● SPL ● Schema On Read ● Memory efficient ● Standalone Implementation ● Pay As You Go ● Enterprise ● KPL ● Schema On Write ● Memory Inefficient ● 3 or 4 Independent Components ● Infrastructure Cost ● Medium Application
  • 29. VS
  • 30. DEMO