Docker Logging
and analysing with
Elastic Stack
JAKUB HAJEK,
jakub.hajek@cometari.com, @_jakubhajek November 2019, Warsaw
www.devopsdays.pl
Introduction
• I am the owner of and a technical consultant at Cometari.
• I have been a system administrator since 1998.
• Cometari is a solutions company implementing DevOps culture and providing
consultancy, workshops and software services.
• Our areas of expertise are DevOps, Elastic Stack (log analysis) and Cloud
Computing.
• We are deeply involved in the travel tech industry; however, our solutions go
much further than just integrating travel APIs.
—
“I strongly believe that implementing DevOps
culture, across the entire organisation, should
provide measurable value and solve the real
issue rather than generate a new one.”
Agenda
• A little bit of theory about logs.
• The major differences between the old-fashioned approach and the container world.
• Distributed logging with Elasticsearch and Fluentd.
• Live demos of logging:
• A simple example sending logs from a container to Fluentd
• A fully fledged environment running on Docker Swarm with an
Elasticsearch cluster, Kibana and Fluentd deployed
• The deployed application is a multi-tier stack, including a Traefik
frontend and a backend application
What do we need to collect logs for?
What are logs?
What are logs?
• Logs are a stream of aggregated, time-ordered events collected from the
output streams
• The output streams can be generated by processes and backing services
• Raw logs are typically text, with one event per line
• Backtraces from exceptions usually span multiple lines
• Logs have no beginning or end; they flow continuously for as long as the app
is running.
Logging considerations
• Logging is not cheap. It requires lots of resources: storage, CPU, memory.
• Logging gets even more expensive if you want to search logs and correlate data.
• Having "live" data accessible immediately can be more expensive still.
• Don't log everything; consider which data you are actually interested in (it's not free)
• Log retention time has to be considered (e.g. Curator, if you store logs in
Elasticsearch; see the sketch below)
• I recommend Elasticsearch for keeping logs as time-based data. It takes some
experience with Elasticsearch to provide a reliable environment for logs.
• Logging is a mess; logging is not fun, but we have to deal with it and build a
logging solution
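As an aside, a minimal sketch of the Curator approach to retention, assuming daily time-based indices; the "logs-" prefix and the 30-day window are illustrative, not taken from the talk:

actions:
  1:
    action: delete_indices
    # retention window is illustrative
    description: Delete indices older than 30 days
    options:
      ignore_empty_list: True
    filters:
      # hypothetical index prefix
      - filtertype: pattern
        kind: prefix
        value: logs-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y.%m.%d'
        unit: days
        unit_count: 30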
Logging in production
• Service logs
• Web access logs
• Transaction logs
• Distributed tracing
• System logs
• Syslog and other OS-level logs
• Audit logs
• Basic operating system metrics (CPU, memory, load …)
Logs for Business
• KPIs
• Machine learning
• Predictive analytics
• …
Logs for Service
• System monitoring
• Bottleneck detection
• Troubleshooting
• …
Logging is not the same as Monitoring
• Logging is recording in order to diagnose a system
• Monitoring is observation: checking and then recording
• Notifications (usually called alerts) can be sent out to any notification
channel for both logging and monitoring
• A notification can be triggered when specific criteria are met, e.g.
http_requests_response_code is 500 within the last 60 seconds
A plugin had an unrecoverable error. Will restart this plugin.
Pipeline_id:main_dlq
Plugin: <LogStash::Inputs::DeadLetterQueue pipeline_id=>"main", path=>"/usr/share/logstash/data/dead_letter_queue", id=>"830027210528f50ad1234fe96f0ccc5f8a6989bb0b2d944881373ec56e555357", commit_offsets=>true,
enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_32044710-aeb5-4303-ba0e-2feb2dd851e9", enable_metric=>true, charset=>"UTF-8">>
Error:
Exception: Java::JavaNio::BufferOverflowException
Stack: java.nio.HeapByteBuffer.put(java/nio/HeapByteBuffer.java:189)
eas_errors{errorType="CONTENT",provider="HRS",requestName="HotelAvailability",
errorId="1234",errorSeverity="2",startDate="2019-11-20T22:00:00",endDate="2019-11-21T21:59:59",} 10.0
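For illustration, a criterion like the one above could be checked with a count query against time-based indices in Elasticsearch. A minimal sketch, assuming hypothetical field names (response_code, @timestamp) and an illustrative logs- index prefix:

GET /logs-*/_count
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "response_code": 500 } },
        { "range": { "@timestamp": { "gte": "now-60s" } } }
      ]
    }
  }
}

An alerting tool (e.g. ElastAlert or Watcher) would run such a query periodically and notify a channel when the count crosses a threshold.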
The standard approach: logging to the file system
Writing messages to a log file

[Diagram] Application -> log files in multiple places (usually /var/log)
Standard I/O Streams

[Diagram] Application with its three standard streams: STDIN (0), STDOUT (1), STDERR (2)

{ echo "stdout"; echo "stderr" 1>&2; } | grep -v std

Only STDOUT (1) flows through the pipe, so grep filters out "stdout" while "stderr" (2) still reaches the terminal.
Logging in the container world!
The container world

                       Bare metal              Container world
Service architecture   Monolithic              Microservices
System image           Mutable                 Immutable
Local data             Persistent              Ephemeral
Network                Physical address        No fixed address
Environment            Manual / automation     Orchestration tools
Logging                syslogd/rsyslog         ?

*There is nothing wrong with a monolithic system, as long as you can
distinguish boundaries in the system and move a domain to a service on demand!
What are the challenges with logs in the container world?
Logging challenges with Containers
• No permanent storage (containers are stateless and storage is ephemeral)
• No fixed physical address
• No fixed mapping between servers and roles
• Lots of different application types
• Logs must be transferred immediately to a distributed logging infrastructure
• Push logs from containers
• Label logs with the service name, or use tags
• Various log formats have to be handled with regexps / grok patterns (see the
sketch below)
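For illustration, a minimal sketch of grok-based parsing in Fluentd; it assumes the fluent-plugin-grok-parser plugin is installed, and the tag and the pattern (a fragment of a web access log) are illustrative:

# requires fluent-plugin-grok-parser; tag and pattern are illustrative
<filter docker.**>
  @type parser
  # parse the "log" field produced by the Docker logging driver
  key_name log
  <parse>
    @type grok
    grok_pattern %{IPORHOST:client} %{WORD:method} %{URIPATHPARAM:path} %{NUMBER:status}
  </parse>
</filter>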
Logging and Docker containers
Logging and Docker container strategy
• The application should write its messages to STDOUT

[Diagram] APPLICATION running in a Docker container -> STDOUT: Hello World!
Logging and Docker container strategy
• Docker encapsulates the message in a JSON map structure (with the json-file driver):

Hello World! ->
{
  "log": "Hello World!",
  "stream": "stdout",
  "time": "timestamp"
}
Logging and Docker container strategy
/var/lib/docker/containers/00fae94d9a721bec312dba411…
f55f37e37/00fae94d9a721bec312dba41168231…6303f274f55f37e37-json.log

$ docker run -d busybox echo -n "Hello World!"
00fae94d9a721bec312dba411682313a4ab8846f01f7b406303f274f55f37e37

$ cat 00fae94d9a721bec312dba411682313a4ab8846f01f7b406303f274f55f37e37-json.log
{
  "log": "Hello World!",
  "stream": "stdout",
  "time": "2019-11-21T14:12:01.599413578Z"
}
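For comparison, the same message can be read back through the Docker CLI, which reads that json-file under the hood (container ID shortened for readability):

$ docker logs 00fae94d9a72
Hello World!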
Application running in a cluster

[Diagram] Node 1, Node 2, Node 3 … Node 20 … Node n, spread across Region A, Region B, Region C and Region D
What is the approach to logging in the container world?
Treat logs as an event stream
Treat logs as an event stream
• Applications should be stateless and should not store data or logs locally
• Applications should not attempt to write logs to local storage
• Logs should not be managed locally, e.g. with logrotate
• All logs should be treated as event streams
• Each running process writes its events to STDOUT and STDERR
• In a container-based environment, logs should be sent to STDOUT; the platform
then captures them, as shown below
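For example, whatever a Swarm service's replicas write to STDOUT/STDERR can be streamed with the Docker CLI (the service name is illustrative):

$ docker service logs -f my-backend   # service name is illustrative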
Logging in the context of a distributed cluster
[Diagram] Microservices (USERS, PROFILES, SEARCH, FLIGHTS, HOTELS, CARS, RAILS, INSURANCE, PAYMENTS, INVOICES) behind an API GATEWAY all emit APP LOGS, which feed:
• Live aggregated logs
• KPIs, dashboards
• Analytics
• grep / awk / Perl :-)
Log collectors for central logging
• Logstash from the Elastic Stack, Fluentd, Apache Flume and many more…

[Diagram] LOGS -> LOG COLLECTOR -> STORAGE

• Example storage options:
• S3, MongoDB, Hadoop, Elasticsearch
• file, forward, copy, stdout (useful for debugging)
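A minimal sketch of the debugging case with Fluentd: a forward input on the default port, and an stdout output that simply echoes every event to Fluentd's own log:

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

# print every incoming event to Fluentd's stdout; for debugging only
<match **>
  @type stdout
</match>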
Fluentd data collector
• An extensible and reliable data collector
• Unified logging layer - treats logs as JSON
• Pluggable architecture
• Supports memory- and file-based buffering to prevent inter-node data loss
• Built-in HA and load balancing
CORE
• Divide and conquer
• Buffering and retries
• Error Handling
• Message routing
• Parallelism
PLUGINS
• Read data
• Parse data
• Buffer data
• Write data
• Format data
Unifying logging layer

[Diagram] Services -> Fluentd collector nodes -> Fluentd aggregator nodes -> Elasticsearch: the application generates logs, collectors convert the raw log data into structured data, aggregators aggregate the structured data, and Elasticsearch holds structured data ready for analysis.
An event in Fluentd
TAG: myapp.access
TIME: (current time)
RECORD: {"event": "data"}
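Such an event can be produced by hand with the fluent-cat utility that ships with the fluentd gem, assuming a forward input on the default local port 24224; the record and tag mirror the example above:

$ echo '{"event": "data"}' | fluent-cat myapp.access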
Internal architecture of plugins

[Diagram] INPUT -> PARSER -> FILTER -> BUFFER -> OUTPUT -> FORMATTER, grouped into "input-ish" and "output-ish" plugins
[Diagram] Buffered output flow: the router emits events (TAG, TIME, RECORD) from input and filter plugins to the output; the buffer groups events into chunks by metadata, chunks are enqueued, and the output stage processes, formats and writes (try_write) each chunk.
source: https://docs.fluentd.org/output
Brief overview of configuration
• <source> - where all the data comes from; feeds the routing engine
• <match> - tells Fluentd what to do with matching events!
• <filter> - the event processing pipeline:
• INPUT -> filter 1 -> …. -> filter N -> OUTPUT
• <system> - system-wide directives
• <label> - used for grouping filters and outputs for internal routing
• @include - splits the config into multiple files and re-uses configuration
Source: https://docs.fluentd.org/configuration/config-file
<source>
  @type forward
  port 24223
  bind 0.0.0.0
  tag backend.invoice
</source>

<filter **>
  @type parser
  key_name log
  reserve_data true
  hash_value_field log
  ….
  <parse>
    @type multi_format
    <pattern>
      format json
    </pattern>
    …
  </parse>
</filter>

<match **>
  @type elasticsearch
  host "#{ENV['ES_HOST']}"
  port 9200
  id_key hash
  remove_keys hash
  type_name doc
  logstash_format true
  logstash_dateformat %Y.%m
  logstash_prefix logs
  include_tag_key true
  tag_key serviceTagName
  …
  <buffer tag>
    @type memory
    flush_thread_count 2
  </buffer>
</match>

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match backend.*>
  @type mongo
  database fluent
  collection test
</match>
Docker fluentd driver
• The logging driver sends container logs to Fluentd as structured log data
• Metadata: container_id, container_name, source, log
• --log-driver fluentd --log-opt tag=docker.{{.ID}} --log-opt
fluentd-address=tcp://fluenthost
• Messages are buffered until the connection is established
• The data can be buffered before flushing
• fluentd-retry-wait, fluentd-max-retries, fluentd-sub-second-precision…
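Putting the options together, a sketch of running a container with the fluentd driver, assuming a Fluentd forward input is listening on fluenthost:24224:

$ docker run -d \
    --log-driver fluentd \
    --log-opt fluentd-address=tcp://fluenthost:24224 \
    --log-opt tag=docker.{{.ID}} \
    busybox echo "Hello World!"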
Architecture of the demo environment

[Diagram] TRAEFIK routes traffic to the frontend and backend applications; container logs flow to FLUENTD, are stored in ELASTICSEARCH and visualised in KIBANA
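A minimal sketch of the logging part of such a stack as a Swarm compose file; the image tags, ports and network name are illustrative, Traefik and the application services are omitted for brevity, and the real demo stack may differ:

version: "3.7"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.4.2
    environment:
      - discovery.type=single-node   # demo only; run a real cluster in production
    networks: [logging]
  kibana:
    image: docker.elastic.co/kibana/kibana:7.4.2
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports: ["5601:5601"]
    networks: [logging]
  fluentd:
    image: fluent/fluentd:v1.7-1     # needs fluent-plugin-elasticsearch added
    ports: ["24224:24224"]
    networks: [logging]
networks:
  logging:
    driver: overlay

Deployed with: docker stack deploy -c stack.yml logging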
Live demo & Examples of code
JAKUB HAJEK,
JAKUB.HAJEK@COMETARI.COM, @_jakubhajek
I'm looking forward to your feedback!
You can rate speakers and lectures
using our official conference app