Large scale near real-time log indexing with Flume and SolrCloud
- 2. C97-717209-00 © 2012 Cisco and/or its affiliates. All rights reserved. 2
1. Intro
2. Problem to solve?
3. How does Flume/Solr help?
4. Syslog indexing example
5. HA, DR & scalability
- 3.
Ops Architect at Cisco CCATG (WebEx)
Ensure operational readiness for complex distributed services
HA, DR, monitoring, config, deployment
Previously eBay, Excite@Home, IBM, VISA
Operations architecture, monitoring, event correlation
- 5.
Cisco WebEx Meetings
• Voice, video, desktop sharing
• Meeting/Event/Support/Training
• Centers
• Integration with TelePresence
Cisco WebEx Social
• Social networking
• Content creation
• Integrated IM
Cisco WebEx Messenger
• IM, presence
• Integrate with voice, video
• XMPP
- 6.
Participants from over 231 countries, 52% market share
2.2 billion meeting minutes per month
40.5 million meeting attendees per month
9.4 million registered hosts worldwide
4 million mobile downloads
- 7.
[Map legend: datacenter / PoP; leased network link]
Global scale: 13 datacenters and iPoPs around the globe
Dedicated network: dual-path 10G circuits between DCs
Multi-tenant: 95k sites
Real-time collaboration: voice, desktop sharing, video, chat
- 8.
[Map legend: datacenter / PoP; leased network link]
People make mistakes
Hardware fails
Software fails
Even failovers sometimes fail
- 9.
“If a problem has no solution, it may not be a problem,
but a fact, not to be solved, but to be coped with over time”
— Shimon Peres (“Peres’s Law”)
People/HW/SW failures are facts, not problems
The main goal of operations is to maintain high service availability
• Recovery/repair is how we cope with the above facts
• Improving recovery/repair improves availability
Unavailability = MTTR / MTBF
Cutting MTTR to 1/10th is just as valuable as increasing MTBF 10x
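The arithmetic behind this claim can be checked with a short sketch. It uses the slide's ratio MTTR / MTBF, which is a good approximation when repairs are much shorter than the time between failures. The baseline figures (one failure per 1000 hours, 2 hours to recover) are made-up illustration values, not WebEx numbers.

```python
def unavailability(mttr_hours: float, mtbf_hours: float) -> float:
    """Approximate fraction of time the service is down (MTTR / MTBF)."""
    return mttr_hours / mtbf_hours


# Hypothetical baseline: one failure every 1000 hours, 2 hours to recover.
baseline = unavailability(mttr_hours=2.0, mtbf_hours=1000.0)
faster_repair = unavailability(mttr_hours=0.2, mtbf_hours=1000.0)    # MTTR cut to 1/10th
fewer_failures = unavailability(mttr_hours=2.0, mtbf_hours=10000.0)  # MTBF raised 10x

for name, value in [
    ("baseline", baseline),
    ("1/10th MTTR", faster_repair),
    ("10x MTBF", fewer_failures),
]:
    print(f"{name:12s} {value:.4%}")
```

Both improvements reduce unavailability by the same factor of ten, which is why investing in faster recovery is a legitimate alternative to chasing fewer failures.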
- 10.
Even better: proactive
Good: reactive
Your search – What is the root cause of the outage? – did not match any documents.
- 12.
[Diagram: data ingestion overview]
Inputs shown: Log4j, File, Avro, Syslog, Thrift, AMQP, HTTP/REST, application state & APIs, RDBMS (MySQL) via Sqoop
Input groups: unstructured/semi-structured data; structured data
Flume sinks: HDFS sink (raw data in HDFS), Solr sink (Solr index in SolrCloud), other sinks
Hardware: Cisco UCS C240 M3 servers, 12 x 3 TB = 36 TB per server
- 13.
[Diagram: two-tier Flume architecture]
Collector tier: Flume agents in DC 1 … DC N, receiving syslog, log4j and file input
Storage tier: Flume, HDFS and SolrCloud in DC 1 and in DC 2
- 14.
[Diagram: Flume Collector server]
Agents (failover & load balancing) send to an Avro source
Replicating fan-out flow: all events are replicated to both File Channel 1 and File Channel 2
Each channel is drained by an Avro sink to the Flume Storage tier, one for DC1 and one for DC2
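The replicating fan-out can be sketched in a few lines of Python. This is a toy model of the idea only, not Flume code: the class name and the in-memory deques standing in for the two file channels are assumptions made for illustration.

```python
import copy
from collections import deque


class ReplicatingSelector:
    """Toy model: every event is copied to every configured channel."""

    def __init__(self, channels):
        self.channels = channels

    def put(self, event):
        for channel in self.channels:
            # Each channel gets its own copy so sinks cannot affect each other.
            channel.append(copy.deepcopy(event))


# Two in-memory queues stand in for File Channel 1 (DC1) and File Channel 2 (DC2).
dc1_channel, dc2_channel = deque(), deque()
selector = ReplicatingSelector([dc1_channel, dc2_channel])

for n in range(3):
    selector.put({"headers": {"host": f"host{n}"}, "body": f"log line {n}"})

print(len(dc1_channel), len(dc2_channel))
```

Because each datacenter has its own channel, a slow or unreachable storage tier in one DC only backs up its own channel while the other keeps draining.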
- 15.
[Diagram repeated: two-tier Flume architecture]
Collector tier: Flume agents in DC 1 … DC N, receiving syslog, log4j and file input
Storage tier: Flume, HDFS and SolrCloud in DC 1 and in DC 2
- 16.
[Diagram: Flume Storage tier server]
Flume Collectors (failover & load balancing agents) send to an Avro source
Multiplexing fan-out flow into File Channel 1 and File Channel 2
HDFS sink: all events go to HDFS
Solr sink: events are routed to SolrCloud by Flume event header
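The header-based routing can be sketched as follows. This is a toy model rather than Flume code; the header name `index`, its value `syslog` and the channel names are hypothetical and chosen only to illustrate "all events to HDFS, some events to Solr".

```python
from collections import defaultdict


class MultiplexingSelector:
    """Toy model: pick target channels from the value of one event header."""

    def __init__(self, header, mapping, default):
        self.header = header
        self.mapping = mapping
        self.default = default

    def channels_for(self, event):
        value = event["headers"].get(self.header)
        return self.mapping.get(value, self.default)


selector = MultiplexingSelector(
    header="index",                        # hypothetical header name
    mapping={"syslog": ["hdfs", "solr"]},  # indexed events go to both channels
    default=["hdfs"],                      # everything else goes to HDFS only
)

channels = defaultdict(list)
events = [
    {"headers": {"index": "syslog"}, "body": "%ACE-3-251008: Health probe failed"},
    {"headers": {}, "body": "application debug line"},
]
for event in events:
    for name in selector.channels_for(event):
        channels[name].append(event)

print({name: len(queue) for name, queue in channels.items()})
```

Keeping HDFS as the default target means raw data is never lost just because an event carries no routing header.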
- 17.
Isn’t Big Data “schema on read”?
• Why does Solr require a schema on write?
• Dirty little secret: there’s always a schema
• Performance & functionality vs flexibility
• Optimize operations and storage based on field type: that's how you get sub-second response times
There’s always a schema
• Application code vs. central location
- 18.
Cloudera Morphlines
• Framework to simplify event transformation
• Compatible with existing grok patterns
• Reusable across multiple index workloads: Flume & M/R
[Diagram: SolrSink morphline. A Flume event (headers + body) becomes a record that passes through a chain of commands such as readLine, grok, tryRules, addValues, … and loadSolr; the result is a document matching schema.xml, loaded into Solr.]
- 19.
Convert syslog message ...
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
... into Solr schema fields
Severity=[3]
Facility=[22]
host=[colo01-wxp00-ace01b-connect.webex.com]
timestamp=[2013-06-16T04:36:49.000Z]
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
severity_label=[error]
access_token=[54asdf654]
id=[b2f839c3-dece-404f-a535-e0141ad549bf]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]
cisco_code=[%ACE-3-251008]
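The Severity and Facility values come from the `<179>` prefix. In the syslog format the priority value is facility * 8 + severity, so it can be decoded with integer division. A minimal sketch (the function name is an assumption, and in this pipeline the decoding is already done before the morphline sees the event):

```python
import re

PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_priority(line: str) -> tuple[int, int]:
    """Return (facility, severity) decoded from a leading syslog <PRI> field."""
    match = PRI_RE.match(line)
    if match is None:
        raise ValueError("line does not start with a <PRI> field")
    # PRI = facility * 8 + severity
    return divmod(int(match.group(1)), 8)


line = "<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : %ACE-3-251008: Health probe failed"
facility, severity = decode_priority(line)
print(facility, severity)
```

For the sample message this gives facility 22 and severity 3, matching the fields shown above.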
- 20.
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 1: readLine reads in Flume event headers and body
Headers:
timestamp=[1371357409000]
host=[colo01-wxp00-ace01b-connect.webex.com]
category=[545f5sfsd5sf]
Severity=[3]
Facility=[22]
Body:
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
- 21.
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 2: convertTimestamp converts epoch to ISO 8601 format
timestamp=[2013-06-16T04:36:49.000Z]
host=[colo01-wxp00-ace01b-connect.webex.com]
category=[545f5sfsd5sf]
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
Severity=[3]
Facility=[22]
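The conversion in this step is plain epoch-milliseconds to UTC ISO 8601. A Python sketch of the same transformation (the function name is an assumption; the morphline does this with its own convertTimestamp command):

```python
from datetime import datetime, timezone


def epoch_millis_to_iso8601(millis: int) -> str:
    """Format epoch milliseconds as a UTC ISO 8601 string with millisecond precision."""
    seconds, ms = divmod(millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{moment:%Y-%m-%dT%H:%M:%S}.{ms:03d}Z"


print(epoch_millis_to_iso8601(1371357409000))
```

Applied to the header value from Step 1, this yields the timestamp shown in Step 2.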
- 22.
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 3: addValues creates new field access_token
timestamp=[2013-06-16T04:36:49.000Z]
category=[545f5sfsd5sf]
access_token=[545f5sfsd5sf]
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
host=[colo01-wxp00-ace01b-connect.webex.com]
Severity=[3]
Facility=[22]
- 23.
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 4: tryRules creates field severity_label for severity
timestamp=[2013-06-16T04:36:49.000Z]
severity_label=[error]
access_token=[545f5sfsd5sf]
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
host=[colo01-wxp00-ace01b-connect.webex.com]
category=[545f5sfsd5sf]
Severity=[3]
Facility=[22]
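The idea behind tryRules is "try each rule in order until one succeeds". The sketch below imitates that behaviour in Python. Only the mapping 3 -> error comes from the slide; the other labels, the fallback rule and all function names are assumptions for illustration.

```python
# Only 3 -> "error" appears on the slide; the remaining labels are assumed.
SEVERITY_LABELS = {
    "0": "emergency", "1": "alert", "2": "critical", "3": "error",
    "4": "warning", "5": "notice", "6": "info", "7": "debug",
}


def label_known_severity(record: dict) -> bool:
    """Rule 1: succeed only if the Severity value has a known label."""
    label = SEVERITY_LABELS.get(record.get("Severity"))
    if label is None:
        return False
    record["severity_label"] = label
    return True


def label_unknown_severity(record: dict) -> bool:
    """Rule 2 (hypothetical fallback): always succeeds."""
    record["severity_label"] = "unknown"
    return True


def try_rules(record: dict, rules) -> dict:
    """Apply rules in order to a copy of the record; return the first success."""
    for rule in rules:
        candidate = dict(record)
        if rule(candidate):
            return candidate
    raise ValueError("no rule matched the record")


rules = [label_known_severity, label_unknown_severity]
print(try_rules({"Severity": "3", "Facility": "22"}, rules))
```

Working on a copy per rule means a rule that fails halfway leaves no partial changes behind for the next rule.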
- 24.
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 5: tryRules creates new fields
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]
cisco_code=[%ACE-3-251008]
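The field extraction in this step is done with grok patterns in the morphline. The equivalent idea with a Python regular expression is sketched below; the pattern is an assumption written to fit this one sample message, not the pattern used in the talk.

```python
import re

# Assumed pattern: %<PRODUCT>-<LEVEL>-<ID>: free text
CISCO_RE = re.compile(
    r"(?P<syslog_message>"
    r"(?P<cisco_code>%(?P<cisco_product>[A-Z0-9_]+)-(?P<cisco_level>[0-7])-(?P<cisco_id>\d+))"
    r":\s.*)$"
)


def extract_cisco_fields(message: str) -> dict:
    """Return the Cisco-specific fields, or an empty dict if the pattern does not match."""
    match = CISCO_RE.search(message)
    return match.groupdict() if match else {}


message = (
    "<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : "
    "%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234"
)
for name, value in extract_cisco_fields(message).items():
    print(f"{name}=[{value}]")
```

Returning an empty dict on a miss is what lets this sit inside a tryRules-style chain, where a non-matching message falls through to the next rule.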
- 25.
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 6: sanitizeUnknownSolrFields drops non-schema fields
timestamp=[2013-06-16T04:36:49.000Z]
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
severity_label=[error]
access_token=[545f5sfsd5sf]
host=[colo01-wxp00-ace01b-connect.webex.com]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]
cisco_code=[%ACE-3-251008]
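Dropping non-schema fields amounts to filtering the record against the set of field names the Solr schema knows. A minimal sketch, assuming the schema contains exactly the fields that survive on this slide plus `id` (the real list lives in schema.xml and is not shown in the talk):

```python
# Assumed schema field names, inferred from the fields that remain after this step.
SCHEMA_FIELDS = {
    "id", "timestamp", "host", "access_token", "severity_label",
    "syslog_message", "cisco_product", "cisco_level", "cisco_id", "cisco_code",
}


def sanitize_unknown_fields(record: dict, schema_fields: set) -> dict:
    """Return a copy of the record containing only fields defined in the schema."""
    return {name: value for name, value in record.items() if name in schema_fields}


record = {
    "timestamp": "2013-06-16T04:36:49.000Z",
    "host": "colo01-wxp00-ace01b-connect.webex.com",
    "category": "545f5sfsd5sf",
    "access_token": "545f5sfsd5sf",
    "Severity": "3",
    "Facility": "22",
    "severity_label": "error",
    "message": "<179>Jun 16 04:36:49 ...",
    "cisco_code": "%ACE-3-251008",
}
clean = sanitize_unknown_fields(record, SCHEMA_FIELDS)
print(sorted(clean))
```

Without this step, a field that schema.xml does not define would typically cause the document to be rejected at index time.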
- 26.
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 7: generateUUID creates a unique id for the document
timestamp=[2013-06-16T04:36:49.000Z]
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
severity_label=[error]
access_token=[545f5sfsd5sf]
id=[b2f839c3-dece-404f-a535-e0141ad549bf]
host=[colo01-wxp00-ace01b-connect.webex.com]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]
cisco_code=[%ACE-3-251008]
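Log lines have no natural primary key, so the pipeline assigns a random one. The same step in Python, as a sketch (the function name and the choice of a version 4 UUID are assumptions; the slide only shows that the id has UUID form):

```python
import uuid


def generate_uuid(record: dict, field: str = "id") -> dict:
    """Return a copy of the record with a random UUID stored in `field`."""
    out = dict(record)
    out[field] = str(uuid.uuid4())
    return out


doc = generate_uuid({"host": "colo01-wxp00-ace01b-connect.webex.com"})
print(doc["id"])
```

A random id means re-sending the same log line creates a second document instead of overwriting the first, which fits the "high indexing rate, no updates" nature of log data.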
- 27.
Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 8: loadSolr loads a record into a Solr server
- 28.
[Diagram: SolrSink morphline. A Flume syslog event (headers + body) becomes a record that passes through a chain of commands such as readLine, grok, tryRules, addValues, … and loadSolr; the result is a document matching schema.xml, loaded into SolrCloud.]
- 29.
[Diagram: SolrCloud cluster. ZooKeeper ensemble (zk1, zk2, zk3); a collection with Shard1, Shard2 and Shard3, each with a leader and a replica (leader1/replica1, leader2/replica2, leader3/replica3); pluggable filesystem (local, HDFS). Callouts: "Add doc to syslog index", "Where can I index data?", "leader3".]
• Collections, shards & replicas
• Pluggable file system
• Central config & coordination with ZK
• Full HA, automatic fail-over
• NRT indexing
• Automatic routing
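Automatic routing means the client does not choose a shard: the document id is hashed and each shard owns a slice of the hash range. The sketch below illustrates the principle only. It uses CRC32 as a stand-in hash and equal-sized ranges, both assumptions; SolrCloud uses its own hash function and stores the actual ranges in ZooKeeper.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard index by splitting a 32-bit hash space evenly."""
    hashed = zlib.crc32(doc_id.encode("utf-8"))  # stand-in hash, unsigned 32-bit
    range_size = 2**32 // num_shards
    return min(hashed // range_size, num_shards - 1)


doc_id = "b2f839c3-dece-404f-a535-e0141ad549bf"
print(f"document {doc_id} -> Shard{shard_for(doc_id, 3) + 1}")
```

The same id always lands on the same shard, so any node can forward a document to the right shard leader without a lookup table.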
- 30.
Collection “syslog” with three shards
- 31.
Special case of search
• Logs are time series data: timestamp + data
• High indexing rate, no updates
• New data is more frequently searched than old
Collection aliases
• Time partitioned collections – e.g. one collection per day
• Reduces the workload to near-real-time data only
• One-to-many collection mapping: queries go to a logical name (the alias) mapped to multiple collections with the same schema
• Simplifies hot-warm-cold migration of data
Index expiration
• Old data is aged out by Collection Aliases
• Remap only the latest collection to an alias
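A sketch of how daily collections and an alias could be managed. The collection naming scheme (`syslog_YYYYMMDD`), the alias name, the retention of three days and the Solr base URL are all hypothetical; `CREATEALIAS` is the Solr Collections API action for creating or re-pointing an alias. The code only builds the request URL and does not contact a server.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def daily_collections(prefix: str, today: date, days: int) -> list:
    """Names of the most recent `days` daily collections, newest first."""
    return [f"{prefix}_{today - timedelta(days=n):%Y%m%d}" for n in range(days)]


def create_alias_url(base_url: str, alias: str, collections: list) -> str:
    """Build a Collections API request that points `alias` at the given collections."""
    query = urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    })
    return f"{base_url}/admin/collections?{query}"


recent = daily_collections("syslog", date(2013, 6, 16), days=3)
print(recent)
print(create_alias_url("http://solr.example.com:8983/solr", "syslog_recent", recent))
```

Re-issuing the request each day with a shifted window ages old collections out of the alias without deleting or rewriting any index data.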
- 32.
Solr
• No multi-datacenter cluster support
HDFS
• No multi-datacenter cluster support
Options?
• All our services must survive DC outage
• ... so should logging and indexing
- 33.
[Diagram: collector tier (Flume agents in DC 1, DC 2 … DC N receiving syslog, log4j, file) and storage tier (Flume, HDFS, SolrCloud in DC 1 and DC 2) during an outage]
Callouts:
• Planned or unplanned outage
• Flume Collector disk channel buffering DC1 events
• DC1 Hadoop cluster back online after outage
• Replicate aggregate data
- 34.
[Diagram: collector tier (Flume agents in DC 1, DC 2 … DC N receiving syslog, log4j, file) and storage tier (HDFS and SolrCloud in DC 1 and DC 2, Flume storage agents) with data sent to one DC at a time]
Callouts:
• Data sent only to a single DC
• distcp
• Manual CNAME change to DC2
• Flume buffering events at collector tier
• DC1 back online, sync data from DC2
• DNS CNAME change back to DC1
• Flip distcp the other way
• Create indexes with M/R
- 35.
Tiers to scale
• Flume Collector tier
• Flume Storage tier
• SolrCloud
- 36.
100 to 5000 servers per datacenter
[Diagram: Flume Collector. Agents send to an Avro source; a replicating fan-out flow feeds File Channel 1 and File Channel 2, each drained by an Avro sink (DC1, DC2). Callout "More agents and data" next to a second collector with the same layout.]
FileChannel: 14 MB/sec
NIC: 100 MB/sec
Max per server: 14 MB/s, 1.2 TB/day, 70k events/s
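The per-server figures are consistent with each other, as the arithmetic below shows. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes). The average event size is derived here from the two stated rates and does not appear on the slide.

```python
BYTES_PER_MB = 1_000_000
SECONDS_PER_DAY = 24 * 60 * 60

file_channel_bytes_per_sec = 14 * BYTES_PER_MB  # stated FileChannel limit
nic_bytes_per_sec = 100 * BYTES_PER_MB          # stated NIC limit
events_per_sec = 70_000                         # stated event rate

tb_per_day = file_channel_bytes_per_sec * SECONDS_PER_DAY / 1e12
avg_event_bytes = file_channel_bytes_per_sec / events_per_sec
bottleneck = min(file_channel_bytes_per_sec, nic_bytes_per_sec)

print(f"daily volume:    {tb_per_day:.2f} TB/day")
print(f"avg event size:  {avg_event_bytes:.0f} bytes (derived)")
print(f"bottleneck:      {bottleneck / BYTES_PER_MB:.0f} MB/s (file channel, not the NIC)")
```

Since the file channel saturates long before the NIC does, adding collector servers is the way to scale this tier.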
- 37.
[Diagram: scaling the Flume Storage tier. Collectors in DC 1, DC 2 … DC N each run Avro sinks 1 … N that send to Flume storage-tier servers in DC 1 and DC 2. Each storage-tier server has an Avro source, a multiplexing fan-out flow into File Chan1 and File Chan2, an HDFS sink and a Solr sink.]
Max per server: 14 MB/s, 1.2 TB/day, 70k events/s
- 38.
[Diagram: SolrCloud cluster. ZooKeeper ensemble (zk1, zk2, zk3); Shard1, Shard2 and Shard3, each with a leader and a replica; pluggable filesystem (local, HDFS); inputs are new logs to index and search queries.]
1000 tx/sec/core
2 x 8 cores: 16k tx/sec
3 shards: 3 x 16k = 48k tx/sec
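The sizing arithmetic can be turned into a small calculator. It assumes throughput scales linearly with cores and shards, as the slide's own multiplication does. The `shards_needed` helper and the 70k events/s target (borrowed from the per-server Flume figure) are illustrative additions.

```python
import math

TX_PER_SEC_PER_CORE = 1000
CORES_PER_SERVER = 2 * 8  # two sockets, eight cores each

tx_per_shard = TX_PER_SEC_PER_CORE * CORES_PER_SERVER
tx_three_shards = 3 * tx_per_shard


def shards_needed(target_tx_per_sec: int) -> int:
    """Shards required for a target rate, assuming linear scaling."""
    return math.ceil(target_tx_per_sec / tx_per_shard)


print(f"per shard:  {tx_per_shard} tx/sec")
print(f"3 shards:   {tx_three_shards} tx/sec")
print(f"shards for 70k events/s: {shards_needed(70_000)}")
```

Real clusters rarely scale perfectly linearly, so the result is better read as a lower bound on the shard count than as a guarantee.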
- 39.
Central syslog servers
• Network and OS system messages forwarded to several central syslog servers
Forward syslog to Solr using Flume Morphline SolrSink
• Parse messages with Morphline and grok patterns
SolrCloud
• Index log lines as documents into a Collection (i.e. index)
HUE Solr search
• Simple UI to build a customized search page layout with faceting and sorting
• Easy drill-down with multiple facets: severity, datacenter, hostname, etc.
- 41.
Search by time
Sort by selected field
Facets by selected fields
- 42.
Wildcard query by field
Highlight the query keywords
- 43.
Data sources: REST/JSON, log4j, syslog, Avro, Thrift
Parsing: Cloudera Morphlines
NRT Indexing: SolrCloud embedded in CDH
Batch indexing: MapReduce
Analytics: Use your favorite tool, raw detailed data stored in HDFS
- 44.
email: ari.flink@webex.com
twitter: @raaka
Editor's Notes
- As of Feb 2013
- CEP: Complex Event Processing