Docker Logging
and analysing with
Elastic Stack
JAKUB HAJEK,
jakub.hajek@cometari.com, @_jakubhajek November 2019, Warsaw
www.devopsdays.pl
Introduction
• I am the owner of and a technical consultant at Cometari.
• I have been a system administrator since 1998.
• Cometari is a solutions company implementing DevOps culture and providing
consultancy, workshops and software services.
• Our areas of expertise are DevOps, Elastic Stack (log analysis) and Cloud
Computing.
• We are deeply involved in the travel tech industry; however, our solutions go
much further than just integrating travel APIs.
—
“I strongly believe that implementing DevOps
culture, across the entire organisation, should
provide measurable value and solve the real
issue rather than generate a new one.”
Agenda
• A little bit of theory about logs.
• The major differences between the old-fashioned approach and the container world.
• Distributed logging with Elasticsearch and Fluentd.
• Live demos of logging:
• A simple example sending logs from a container to Fluentd
• A fully fledged environment running on Docker Swarm with an
Elasticsearch cluster, Kibana and Fluentd deployed
• The deployed application is a multi-tier stack, including a Traefik
frontend and a backend application
What do we need to collect logs for?
What are logs?
What are logs?
• Logs are a stream of aggregated, time-ordered events collected from the
output streams
• The output streams can be generated by processes and backing services
• Raw logs are typically text, with one event per line
• Backtraces from exceptions usually span multiple lines
• Logs have no beginning or end; they flow continuously for as long as the app
is running.
Logging considerations
• Logging is not cheap. It requires lots of resources: storage, CPU, memory.
• Logging gets even more expensive if you want to search logs and correlate data.
• Having "live" data accessible immediately can be more expensive still.
• Don't log everything; consider which data you are actually interested in (it's not free)
• Log retention time has to be considered (e.g. Curator, if you store logs in
Elasticsearch; see the sketch below)
• I recommend Elasticsearch for keeping logs as time-based data. It takes some
experience with Elasticsearch to provide a reliable environment for logs.
• Logging is a mess; logging is not fun, but we have to deal with it and build a
logging solution
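As an aside, a minimal sketch of the Curator approach to retention, assuming daily time-based indices; the "logs-" prefix and the 30-day window are illustrative, not taken from the talk:

actions:
  1:
    action: delete_indices
    # retention window is illustrative
    description: Delete indices older than 30 days
    options:
      ignore_empty_list: True
    filters:
      # hypothetical index prefix
      - filtertype: pattern
        kind: prefix
        value: logs-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y.%m.%d'
        unit: days
        unit_count: 30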
Logging in production
• Service logs
• Web access logs
• Transaction logs
• Distributed tracing
• System logs
• Syslog and other OS-level logs
• Audit logs
• Basic operating system metrics (CPU, memory, load …)
Logs for Business
• KPIs
• Machine learning
• Predictive analytics
• …
Logs for Service
• System monitoring
• Bottleneck detection
• Troubleshooting
• …
Logging is not the same as Monitoring
• Logging is recording in order to diagnose a system
• Monitoring is observation: checking and then recording
• Notifications (usually called alerts) can be sent out to any notification
channel for both logging and monitoring
• A notification can be triggered when specific criteria are met, e.g.
http_requests_response_code is 500 within the last 60 seconds
A plugin had an unrecoverable error. Will restart this plugin.
Pipeline_id:main_dlq
Plugin: <LogStash::Inputs::DeadLetterQueue pipeline_id=>"main", path=>"/usr/share/logstash/data/dead_letter_queue", id=>"830027210528f50ad1234fe96f0ccc5f8a6989bb0b2d944881373ec56e555357", commit_offsets=>true,
enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_32044710-aeb5-4303-ba0e-2feb2dd851e9", enable_metric=>true, charset=>"UTF-8">>
Error:
Exception: Java::JavaNio::BufferOverflowException
Stack: java.nio.HeapByteBuffer.put(java/nio/HeapByteBuffer.java:189)
eas_errors{errorType="CONTENT",provider="HRS",requestName="HotelAvailability",
errorId="1234",errorSeverity="2",startDate="2019-11-20T22:00:00",endDate="2019-11-21T21:59:59",} 10.0
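For illustration, a criterion like the one above could be checked with a count query against time-based indices in Elasticsearch. A minimal sketch, assuming hypothetical field names (response_code, @timestamp) and an illustrative logs- index prefix:

GET /logs-*/_count
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "response_code": 500 } },
        { "range": { "@timestamp": { "gte": "now-60s" } } }
      ]
    }
  }
}

An alerting tool (e.g. ElastAlert or Watcher) would run such a query periodically and notify a channel when the count crosses a threshold.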
The standard approach: logging to the file system
Writing messages to a log file

[Diagram] Application -> log files in multiple places (usually /var/log)
Standard I/O Streams

[Diagram] Application with its three standard streams: STDIN (0), STDOUT (1), STDERR (2)

{ echo "stdout"; echo "stderr" 1>&2; } | grep -v std

Only STDOUT (1) flows through the pipe, so grep filters out "stdout" while "stderr" (2) still reaches the terminal.
Logging in the container world!
The container world

                       Bare metal              Container world
Service architecture   Monolithic              Microservices
System image           Mutable                 Immutable
Local data             Persistent              Ephemeral
Network                Physical address        No fixed address
Environment            Manual / automation     Orchestration tools
Logging                syslogd/rsyslog         ?

*There is nothing wrong with a monolithic system, as long as you can
distinguish boundaries in the system and move a domain to a service on demand!
What are the challenges with logs in the container world?
Logging challenges with Containers
• No permanent storage (containers are stateless and storage is ephemeral)
• No fixed physical address
• No fixed mapping between servers and roles
• Lots of different application types
• Logs must be transferred immediately to a distributed logging infrastructure
• Push logs from containers
• Label logs with the service name, or use tags
• Various log formats have to be handled with regexps / grok patterns (see the
sketch below)
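For illustration, a minimal sketch of grok-based parsing in Fluentd; it assumes the fluent-plugin-grok-parser plugin is installed, and the tag and the pattern (a fragment of a web access log) are illustrative:

# requires fluent-plugin-grok-parser; tag and pattern are illustrative
<filter docker.**>
  @type parser
  # parse the "log" field produced by the Docker logging driver
  key_name log
  <parse>
    @type grok
    grok_pattern %{IPORHOST:client} %{WORD:method} %{URIPATHPARAM:path} %{NUMBER:status}
  </parse>
</filter>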
Logging and Docker containers
Logging and Docker container strategy
• The application should write its messages to STDOUT

[Diagram] APPLICATION running in a Docker container -> STDOUT: Hello World!
Logging and Docker container strategy
• Docker encapsulates the message in a JSON map structure (with the json-file driver):

Hello World! ->
{
  "log": "Hello World!",
  "stream": "stdout",
  "time": "timestamp"
}
Logging and Docker container strategy
/var/lib/docker/containers/00fae94d9a721bec312dba411…
f55f37e37/00fae94d9a721bec312dba41168231…6303f274f55f37e37-json.log

$ docker run -d busybox echo -n "Hello World!"
00fae94d9a721bec312dba411682313a4ab8846f01f7b406303f274f55f37e37

$ cat 00fae94d9a721bec312dba411682313a4ab8846f01f7b406303f274f55f37e37-json.log
{
  "log": "Hello World!",
  "stream": "stdout",
  "time": "2019-11-21T14:12:01.599413578Z"
}
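For comparison, the same message can be read back through the Docker CLI, which reads that json-file under the hood (container ID shortened for readability):

$ docker logs 00fae94d9a72
Hello World!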
Application running in a cluster

[Diagram] Node 1, Node 2, Node 3 … Node 20 … Node n, spread across Region A, Region B, Region C and Region D
What is the approach to logging in the container world?
Treat logs as an event stream
Treat logs as an event stream
• Applications should be stateless and should not store data or logs locally
• Applications should not attempt to write logs to local storage
• Logs should not be managed locally, e.g. with logrotate
• All logs should be treated as event streams
• Each running process writes its events to STDOUT and STDERR
• In a container-based environment, logs should be sent to STDOUT; the platform
then captures them, as shown below
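For example, whatever a Swarm service's replicas write to STDOUT/STDERR can be streamed with the Docker CLI (the service name is illustrative):

$ docker service logs -f my-backend   # service name is illustrative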
Logging in the context of a distributed cluster
[Diagram] Microservices (USERS, PROFILES, SEARCH, FLIGHTS, HOTELS, CARS, RAILS, INSURANCE, PAYMENTS, INVOICES) behind an API GATEWAY all emit APP LOGS, which feed:
• Live aggregated logs
• KPIs, dashboards
• Analytics
• grep / awk / Perl :-)
Log collectors for central logging
• Logstash from the Elastic Stack, Fluentd, Apache Flume and many more…

[Diagram] LOGS -> LOG COLLECTOR -> STORAGE

• Example storage options:
• S3, MongoDB, Hadoop, Elasticsearch
• file, forward, copy, stdout (useful for debugging)
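A minimal sketch of the debugging case with Fluentd: a forward input on the default port, and an stdout output that simply echoes every event to Fluentd's own log:

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

# print every incoming event to Fluentd's stdout; for debugging only
<match **>
  @type stdout
</match>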
Fluentd data collector
• An extensible and reliable data collector
• Unified logging layer - treats logs as JSON
• Pluggable architecture
• Supports memory- and file-based buffering to prevent inter-node data loss
• Built-in HA and load balancing
CORE
• Divide and conquer
• Buffering and retries
• Error Handling
• Message routing
• Parallelism
PLUGINS
• Read data
• Parse data
• Buffer data
• Write data
• Format data
Unifying logging layer

[Diagram] Services -> Fluentd collector nodes -> Fluentd aggregator nodes -> Elasticsearch: the application generates logs, collectors convert the raw log data into structured data, aggregators aggregate the structured data, and Elasticsearch holds structured data ready for analysis.
An event in Fluentd
TAG: myapp.access
TIME: (current time)
RECORD: {"event": "data"}
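Such an event can be produced by hand with the fluent-cat utility that ships with the fluentd gem, assuming a forward input on the default local port 24224; the record and tag mirror the example above:

$ echo '{"event": "data"}' | fluent-cat myapp.access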
Internal architecture of plugins

[Diagram] INPUT -> PARSER -> FILTER -> BUFFER -> OUTPUT -> FORMATTER, grouped into "input-ish" and "output-ish" plugins
[Diagram] Buffered output flow: the router emits events (TAG, TIME, RECORD) from input and filter plugins to the output; the buffer groups events into chunks by metadata, chunks are enqueued, and the output stage processes, formats and writes (try_write) each chunk.
source: https://docs.fluentd.org/output
Brief overview of configuration
• <source> - where all the data comes from; feeds the routing engine
• <match> - tells Fluentd what to do with matching events!
• <filter> - the event processing pipeline:
• INPUT -> filter 1 -> …. -> filter N -> OUTPUT
• <system> - system-wide directives
• <label> - used for grouping filters and outputs for internal routing
• @include - splits the config into multiple files and re-uses configuration
Source: https://docs.fluentd.org/configuration/config-file
<source>
  @type forward
  port 24223
  bind 0.0.0.0
  tag backend.invoice
</source>

<filter **>
  @type parser
  key_name log
  reserve_data true
  hash_value_field log
  ….
  <parse>
    @type multi_format
    <pattern>
      format json
    </pattern>
    …
  </parse>
</filter>

<match **>
  @type elasticsearch
  host "#{ENV['ES_HOST']}"
  port 9200
  id_key hash
  remove_keys hash
  type_name doc
  logstash_format true
  logstash_dateformat %Y.%m
  logstash_prefix logs
  include_tag_key true
  tag_key serviceTagName
  …
  <buffer tag>
    @type memory
    flush_thread_count 2
  </buffer>
</match>

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match backend.*>
  @type mongo
  database fluent
  collection test
</match>
Docker fluentd driver
• The logging driver sends container logs to Fluentd as structured log data
• Metadata: container_id, container_name, source, log
• --log-driver fluentd --log-opt tag=docker.{{.ID}} --log-opt
fluentd-address=tcp://fluenthost
• Messages are buffered until the connection is established
• The data can be buffered before flushing
• fluentd-retry-wait, fluentd-max-retries, fluentd-sub-second-precision…
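Putting the options together, a sketch of running a container with the fluentd driver, assuming a Fluentd forward input is listening on fluenthost:24224:

$ docker run -d \
    --log-driver fluentd \
    --log-opt fluentd-address=tcp://fluenthost:24224 \
    --log-opt tag=docker.{{.ID}} \
    busybox echo "Hello World!"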
Architecture of the demo environment

[Diagram] TRAEFIK routes traffic to the frontend and backend applications; container logs flow to FLUENTD, are stored in ELASTICSEARCH and visualised in KIBANA
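A minimal sketch of the logging part of such a stack as a Swarm compose file; the image tags, ports and network name are illustrative, Traefik and the application services are omitted for brevity, and the real demo stack may differ:

version: "3.7"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.4.2
    environment:
      - discovery.type=single-node   # demo only; run a real cluster in production
    networks: [logging]
  kibana:
    image: docker.elastic.co/kibana/kibana:7.4.2
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports: ["5601:5601"]
    networks: [logging]
  fluentd:
    image: fluent/fluentd:v1.7-1     # needs fluent-plugin-elasticsearch added
    ports: ["24224:24224"]
    networks: [logging]
networks:
  logging:
    driver: overlay

Deployed with: docker stack deploy -c stack.yml logging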
Live demo & Examples of code
JAKUB HAJEK,
JAKUB.HAJEK@COMETARI.COM, @_jakubhajek
I'm looking forward to your feedback!
You can rate speakers and lectures
using our official conference app