Mining Human-scale Insights from Log Data with Machine Learning
---
David Andrzejewski - @davidandrzej
Data Sciences Engineering, Sumo Logic
OC Big Data Monthly Meetup #5 - Session 2 - Sumo Logic
1. Mining human-scale insights from log data with machine learning
David Andrzejewski - @davidandrzej
Data Sciences Engineering, Sumo Logic
OC Big Data Meetup, September 17, 2014
5. The Problem We Solve
“More Logs Are Created In A Single Day Now Than in All of FY 2003,” Gartner
Machine Generated
Clickstream
Web Servers, Email
Applications, Mobile
Security Devices, Desktops
Human Generated
Orders, Blogs, Social Networks,
HR, Inventory, Manufacturing
Networks, Servers, Hypervisors
Machine Data is the largest, fastest growing, most
complex segment of Big Data.
[Chart: time axis from 2003 to 2015]
6. Sumo Logic
“Turning Machine Data Into IT and Business Insights”
Search, monitor, visualize
Learn, classify, predict
11. Anatomy of a log message: Five W’s
• When? Timestamp with time zone
12. Anatomy of a log message: Five W’s
• When? Timestamp with time zone
• Where? Host, module, code location
13. Anatomy of a log message: Five W’s
• When? Timestamp with time zone
• Where? Host, module, code location
• Who? Authentication context
14. Anatomy of a log message: Five W’s
• When? Timestamp with time zone
• Where? Host, module, code location
• Who? Authentication context
• What? Log level and key-value pairs
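As an illustration of the fields listed above, here is a minimal sketch of a record type that holds them. The line layout (`timestamp level key=value ...`), the key names `host`, `module`, and `user`, and the sample values are all hypothetical; the slides do not specify a concrete log format.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class LogEvent:
    when: datetime                 # When? Timestamp with time zone
    where: dict                    # Where? Host, module, code location
    who: Optional[str]             # Who? Authentication context
    level: str                     # What? Log level ...
    fields: dict = field(default_factory=dict)  # ... and key-value pairs


def parse_line(line):
    """Parse a line in the hypothetical 'timestamp level key=value ...' layout."""
    timestamp, level, *pairs = line.split()
    kv = dict(pair.split("=", 1) for pair in pairs)
    # Pull the assumed "where" and "who" keys out of the generic key-value pairs.
    where = {key: kv.pop(key) for key in ("host", "module") if key in kv}
    who = kv.pop("user", None)
    return LogEvent(datetime.fromisoformat(timestamp), where, who, level, kv)


# Hypothetical sample line, not taken from the slides.
sample = "2014-09-17T10:15:30-07:00 INFO host=zim-5 module=healthcheck user=admin status=OK"
event = parse_line(sample)
print(event)
```

Keeping the time zone offset on the parsed timestamp matters once logs from hosts in different zones are merged into one timeline.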
15. Inhuman scale
• Logs: like “computer tweets”
• Twitter 2013*
  • Peak @ ~144k TPS
  • Avg ~6k tweets / second
• Log data
  • Example: 1 TB / day
  • Avg ~25k logs / second
* https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
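The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes a decimal terabyte (10^12 bytes) and a steady rate over the day; the average message size it derives is implied by the slide's numbers, not stated on the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60

DAILY_VOLUME_BYTES = 1e12        # 1 TB / day, assuming a decimal terabyte
LOGS_PER_SECOND = 25_000         # average log rate from the slide
TWEETS_PER_SECOND_AVG = 6_000    # average tweet rate from the slide

bytes_per_second = DAILY_VOLUME_BYTES / SECONDS_PER_DAY
implied_bytes_per_log = bytes_per_second / LOGS_PER_SECOND
logs_per_day = LOGS_PER_SECOND * SECONDS_PER_DAY
rate_ratio = LOGS_PER_SECOND / TWEETS_PER_SECOND_AVG

print(f"Implied average message size: {implied_bytes_per_log:.0f} bytes")
print(f"Log messages per day: {logs_per_day:,}")
print(f"Log rate relative to average tweet rate: {rate_ratio:.1f}x")
```

Under these assumptions a single 1 TB / day deployment produces roughly four times the average tweet rate quoted for 2013.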
16. Inhuman complexity
“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” - Leslie Lamport
[Figure: London Underground map]
25. In the beginning, there was the printf()
printf("Health status check: %s is %s",
hostid, hoststatus)
Log generation
Health status check: zim-5 is OK
Health status check: gir-3 is OK
Health status check: gir-2 is TIMED OUT
Health status check: dib-1 is OK
26. Reverse engineering printf()
printf("Health status check: %s is %s",
hostid, hoststatus)
Log generation
Health status check: zim-5 is OK
Health status check: gir-3 is OK
Health status check: gir-2 is TIMED OUT
Health status check: dib-1 is OK
“magic”
Health status check: *** is ***
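One way to take some of the “magic” out of this step, for messages already known to come from the same format string, is to align them token by token and replace whatever varies with a wildcard. The sketch below uses `difflib.SequenceMatcher` from the Python standard library for the alignment. It is an illustrative assumption, not the method behind the slide, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

WILDCARD = "***"


def merge_signature(sig_tokens, msg_tokens):
    """Keep tokens common to both sequences; collapse each differing run into one wildcard."""
    matcher = SequenceMatcher(a=sig_tokens, b=msg_tokens, autojunk=False)
    merged = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(sig_tokens[i1:i2])
        elif not merged or merged[-1] != WILDCARD:
            merged.append(WILDCARD)
    return merged


def signature(messages):
    """Fold a group of related messages into a single wildcard signature."""
    tokens = messages[0].split()
    for message in messages[1:]:
        tokens = merge_signature(tokens, message.split())
    return " ".join(tokens)


messages = [
    "Health status check: zim-5 is OK",
    "Health status check: gir-3 is OK",
    "Health status check: gir-2 is TIMED OUT",
    "Health status check: dib-1 is OK",
]
print(signature(messages))  # prints: Health status check: *** is ***
```

Collapsing a whole differing run into one wildcard is what lets “TIMED OUT” (two tokens) and “OK” (one token) land in the same slot.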
27. Unsupervised clustering
• Given: log messages
• Do: group by “signature”
1. Define string distance function
2. Do distance-based clustering
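The two steps can be sketched as follows. Both choices here are assumptions made for illustration: a token-level Levenshtein distance normalized by the longer message, and a greedy "leader" clustering with a hypothetical threshold of 0.5. The slide does not say which distance function or clustering algorithm is used in practice.

```python
def token_edit_distance(a, b):
    """Step 1: Levenshtein distance over whitespace-separated tokens."""
    s, t = a.split(), b.split()
    prev = list(range(len(t) + 1))
    for i, tok_s in enumerate(s, start=1):
        curr = [i]
        for j, tok_t in enumerate(t, start=1):
            cost = 0 if tok_s == tok_t else 1
            curr.append(min(prev[j] + 1,          # delete tok_s
                            curr[j - 1] + 1,      # insert tok_t
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Scale the distance to [0, 1] by the longer message's token count."""
    longest = max(len(a.split()), len(b.split()))
    return token_edit_distance(a, b) / longest if longest else 0.0


def cluster(messages, threshold=0.5):
    """Step 2: join the first cluster whose leader is within threshold, else start a new one."""
    clusters = []
    for message in messages:
        for group in clusters:
            if normalized_distance(message, group[0]) <= threshold:
                group.append(message)
                break
        else:
            clusters.append([message])
    return clusters


logs = [
    "Health status check: zim-5 is OK",
    "Initiating requestID=082A for webID=7F92",
    "Health status check: gir-3 is OK",
    "Health status check: gir-2 is TIMED OUT",
    "Retrieving userID=11D2 for requestID=082A",
    "Health status check: dib-1 is OK",
]
for group in cluster(logs):
    print(len(group), group[0])
```

Greedy leader clustering makes a single pass and depends on message order; it is used here only because it is short, not because it is the best fit for log data.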
50. User action webID=7F92
Initiating requestID=082A for webID=7F92
…
…
orderID=34C8 received for requestID=082A
…
51. User action webID=7F92
Initiating requestID=082A for webID=7F92
…
…
orderID=34C8 received for requestID=082A
…
Retrieving userID=11D2 for requestID=082A
…
52. User action webID=7F92
Initiating requestID=082A for webID=7F92
…
…
orderID=34C8 received for requestID=082A
…
Retrieving userID=11D2 for requestID=082A
…
…
accountID=1234 access, userID=11D2
…
53. User action webID=7F92
Initiating requestID=082A for webID=7F92
…
…
orderID=34C8 received for requestID=082A
…
Retrieving userID=11D2 for requestID=082A
…
…
accountID=1234 access, userID=11D2
…
ERROR accountID=1234 not found!
PROCESSING FAILED: webID=7F92
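The chain of identifiers in these slides (webID to requestID to userID to accountID) can be followed mechanically by treating each `key=value` identifier as a link between log lines. The sketch below is a minimal illustration, assuming identifiers always look like `<name>ID=<value>`. The function name is hypothetical, and the last line of the sample is an invented unrelated event added to show that it is filtered out.

```python
import re
from collections import defaultdict

# Assumption: identifiers look like <name>ID=<value>, as in the slide examples.
ID_PATTERN = re.compile(r"\b(\w+ID)=(\w+)")


def linked_events(lines, start_id):
    """Return the lines reachable from start_id by following shared identifiers."""
    ids_per_line = [set(ID_PATTERN.findall(line)) for line in lines]
    lines_per_id = defaultdict(set)
    for idx, ids in enumerate(ids_per_line):
        for ident in ids:
            lines_per_id[ident].add(idx)

    seen_ids, seen_lines, frontier = {start_id}, set(), [start_id]
    while frontier:
        ident = frontier.pop()
        for idx in lines_per_id[ident]:
            if idx in seen_lines:
                continue
            seen_lines.add(idx)
            # Every new identifier on this line becomes another link to follow.
            for other in ids_per_line[idx] - seen_ids:
                seen_ids.add(other)
                frontier.append(other)
    return [lines[i] for i in sorted(seen_lines)]


lines = [
    "User action webID=7F92",
    "Initiating requestID=082A for webID=7F92",
    "orderID=34C8 received for requestID=082A",
    "Retrieving userID=11D2 for requestID=082A",
    "accountID=1234 access, userID=11D2",
    "ERROR accountID=1234 not found!",
    "Initiating requestID=FFFF for webID=0000",  # hypothetical unrelated event
]
for line in linked_events(lines, ("accountID", "1234")):
    print(line)
```

Starting from the identifier on the ERROR line, the traversal walks back to the original user action even though no single identifier appears on every line.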