MongoDB World 2018: MongoDB for High Volume Time Series Data Streams

© 2018 Cisco Systems, Inc. All rights reserved.
Gabriel Ng, Tom Monk, Kollivakkam Raghavan
Cisco Systems, Inc.

The Problem – Network Assurance
Network assurance is the guarantee that the network is
doing what the operator(s) intended it to do.

The Ecosystem - IT Demands on the Modern Datacenter
Agility
Months -> Hours
Security
Mobility
Scale
10K -> 100K -> 1B
• Network Engineers
• Application Developers
• Security Architects
• Server Engineers
• Network Operations
• Data Scientists
• Dev Ops

Policy-Based Data Center
• Controller with end-to-end application awareness
• IP fabric connecting all physical and virtual workloads and services
• Application Network Profile (ANP) pushed to all components
[Figure labels: Database Tier, Application Tier, Web Tier, Profiles, Controller]

The Constraints
• Must work with multiple form factors
  • Personal laptop – single VM (limited memory, SSD disk)
  • Lab environment – multiple VMs (HDD disks – limited storage)
  • Production – multiple VMs (HDD/SSD disks)
• Solution must work with limited memory
  • Constrained WiredTiger cache
  • Need to maximize write throughput without compromising reads
• Querying flexibility
  • Extensive use of Aggregation pipeline

Cisco Network Assurance Engine
Cisco NAE

Cisco Network Assurance Engine: How It Works
Data Collection
Captures all non-packet data: intent, policy, state across data center network
Comprehensive Network Modeling
Using formal methods (an area of computer science) to mathematically compute consistency
Intelligent Analysis
Analyze the results and recommend remediation steps for problems

User Interface: Search and Visualization

User Interface: Change Management Events
Events: What, Where, Why, and How

User Interface: Incident and Problem Management

High Volume Time Series Data Stream App
• ~12 million time series data points per hour for the largest fabrics
• Proper analysis of data stream requires keeping several hours of recent context data on hand
• Streaming data platform with 3-tier web stack
[Figure labels: WT cache (relative size of WiredTiger cache); random access index; event collection]
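As a rough sizing check, the "~12 million points per hour" figure can be turned into a sustained per-second rate. This is back-of-the-envelope arithmetic only: it assumes the load is spread evenly across the hour, which the slides do not state, and the four-hour window is an illustrative stand-in for "several hours".

```python
# Back-of-the-envelope sizing from the "~12 million points per hour" figure.
POINTS_PER_HOUR = 12_000_000  # from the slide (largest fabrics)
CONTEXT_HOURS = 4             # assumption: stand-in for "several hours"

# Sustained insert rate if the load were perfectly even over the hour.
points_per_second = POINTS_PER_HOUR / 3600

# Number of recent data points that would need to stay queryable.
points_in_context = POINTS_PER_HOUR * CONTEXT_HOURS

print(f"{points_per_second:,.0f} points/sec sustained")
print(f"{points_in_context:,} points in a {CONTEXT_HOURS}-hour window")
```

Real traffic is likely burstier than this average suggests, so peak write rates may be noticeably higher.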

Agenda
• Incremental Optimizations
  • B-tree Indexes
  • Date Interval Partitioning
  • System Configs
  • Pre-aggregation
  • Replication Factor
• Final Design

Sample Time Series Data
• The time at which the data was produced is a critical property
• Other examples:
  • Stock ticker data
  • HTTP logs
  • Twitter firehose
{
"timestamp": ISODate("2018-05-14T08:20:27.433Z"),
"rule": "sys/actrl/scope-16777200/rule-16777200-s-any-d-any-f-implarp",
"leaf": "topology/pod-1/node-1023",
"hitcount": NumberDecimal("934852839479")
}

B-tree Indexes: Refresher
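• MongoDB indexes use a B-tree data structure
• A B-tree is a self-balancing search tree that generalizes the binary search tree (a node can hold many keys and have more than two children), so index keys are kept in sorted order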
Sample index: {_id: 1}
Document data:
{ _id: 1, … } { _id: 4, … } { _id: 9, … } { _id: 10, … } { _id: 11, … } { _id: 12, … }
{ _id: 13, … } { _id: 15, … } { _id: 16, … } { _id: 20, … } { _id: 25, … }

B-tree Indexes: Typical Access Pattern
• Many workloads require supporting random access patterns in indexes
• This is why it is a “law” in database design to make sure your indexes fit in memory
[Figure: B-tree index pages touched in scattered order, accesses numbered 1 to 11]

B-tree Indexes: Time Series Writes Pattern
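• For time series data, including the timestamp as the first field in a compound index yields beautiful properties: on an insert-only workload, new keys arrive in roughly increasing order and land at the right-hand edge of the index
[Figure: data access pattern for compound index { timestamp: 1, rule: 1 } on an insert-only time series workload, accesses numbered 1 to 11 in order]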
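The contrast between the random and the time-ordered access pattern can be sketched with a plain sorted list standing in for the index's key order. This is a toy model, not how WiredTiger stores pages: the `insert_positions` helper, the three rule names, and the rule-first ordering used for comparison are all made up for illustration.

```python
import bisect
import random


def insert_positions(keys):
    """Insert keys one by one into a sorted list (a stand-in for the
    index's sorted key order) and record where each insert lands:
    0.0 is the far left of the key space, 1.0 is the far right."""
    index = []
    positions = []
    for key in keys:
        slot = bisect.bisect_right(index, key)
        positions.append(slot / len(index) if index else 1.0)
        index.insert(slot, key)
    return positions


rng = random.Random(0)
rules = ["rule-a", "rule-b", "rule-c"]  # hypothetical rule names

# Timestamp-first keys arriving in time order (insert-only workload).
time_first = [(ts, rng.choice(rules)) for ts in range(1000)]
# The same data keyed rule-first: the leading field is not monotonic.
rule_first = [(rule, ts) for ts, rule in time_first]

time_first_pos = insert_positions(time_first)
rule_first_pos = insert_positions(rule_first)

# Timestamp-first inserts never land anywhere but the right edge.
print("timestamp-first, leftmost insert:", min(time_first_pos))
# Rule-first inserts are spread across the key space.
print("rule-first, leftmost insert:", min(rule_first_pos))
```

In this model every timestamp-first insert touches only the right edge, which is the property that lets the "hot" part of the index stay small even when the whole index does not fit in RAM.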

B-tree Indexes: Breaking the law!
Reference: MongoDB Manual. Ensure Indexes Fit in RAM. Section “Indexes that Hold Only Recent Values in RAM.”
https://docs.mongodb.com/manual/tutorial/ensure-indexes-fit-ram/
[Figure legend: page on disk; page in RAM]

B-tree Indexes: Prefix Compression
• Putting timestamp first in a compound index also allows WiredTiger to do prefix compression on the key values
• Timestamp values in hex:
  • 0x5ab92260
  • 0x5ab93070
  • 0x5ab93e80
  • 0x5ab94c90
  • 0x5ab95aa0
• Does it look familiar?
  • "_id" : ObjectId("5ab95aa0f32f9359485f8bb3")
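The shared prefix in the hex values above can be checked directly. The sketch below works on the hex strings from the slide and is only an illustration of why adjacent keys share leading bytes; it does not reproduce WiredTiger's actual key encoding or compression format.

```python
import os
from datetime import datetime, timezone

# Timestamp values from the slide, as hex strings (seconds since the epoch).
hex_timestamps = ["5ab92260", "5ab93070", "5ab93e80", "5ab94c90", "5ab95aa0"]
object_id = "5ab95aa0f32f9359485f8bb3"

seconds = [int(h, 16) for h in hex_timestamps]

# Gap between consecutive timestamps, in seconds.
deltas = [later - earlier for earlier, later in zip(seconds, seconds[1:])]

# Leading hex digits that all five values have in common.
shared_prefix = os.path.commonprefix(hex_timestamps)

# The first 4 bytes (8 hex characters) of an ObjectId are a Unix timestamp.
oid_time = datetime.fromtimestamp(int(object_id[:8], 16), tz=timezone.utc)

print(deltas)                # the values are one hour apart
print(shared_prefix)         # digits a prefix-compressed key would not repeat
print(oid_time.isoformat())  # creation time embedded in the ObjectId
```

The ObjectId's leading eight hex digits match the last timestamp in the list, which is why an index on `_id` behaves much like a timestamp-first index for insert-only data.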

Before Date Interval Partitioning
[Figure labels: “right-sized” index; event collection; WT cache (relative size of WiredTiger cache)]

After Date Interval Partitioning
One “logical” event collection, many “physical” event collections:
…
event_may_1_2018
event_may_2_2018
event_may_3_2018
event_may_4_2018
event_may_5_2018
event_may_6_2018
…
event_june_1_2018
event_june_2_2018
event_june_3_2018
event_june_4_2018
event_june_5_2018
event_june_6_2018
[Figure labels: WT cache (relative size of WiredTiger cache)]
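One way the mapping from a "logical" collection to "physical" collections might look is sketched below. The slides only show the collection names, so the daily interval is inferred from those names, and the helpers `collection_for` and `collections_for_range` are hypothetical; the real product may route reads and writes differently.

```python
from datetime import date, timedelta

MONTHS = [
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
]


def collection_for(day):
    """Name of the physical collection holding one day's events,
    following the event_<month>_<day>_<year> pattern on the slide."""
    return f"event_{MONTHS[day.month - 1]}_{day.day}_{day.year}"


def collections_for_range(start, end):
    """Physical collections a query over [start, end] would need to read."""
    days = (end - start).days
    return [collection_for(start + timedelta(days=i)) for i in range(days + 1)]


# Writes for "now" go to a single small collection with a small index.
print(collection_for(date(2018, 5, 1)))
# A read spanning a month boundary fans out over three collections.
print(collections_for_range(date(2018, 5, 31), date(2018, 6, 2)))
```

Because each physical collection covers a bounded interval, its indexes stay bounded too, which is what keeps them "right-sized" relative to the WiredTiger cache.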

MongoDB Performance Configs
• Engage with MongoDB support
• Read: https://docs.mongodb.com/manual/administration/production-notes
• Run mdiag:
  • https://github.com/mongodb/support-tools/blob/master/mdiag/mdiag.sh
• XFS gives better performance than EXT4
• TCP keepalive setting recommendations
• Enable swap
• Configs specifically for our product:
  • WiredTiger cache size

Pre-aggregation
• For trend queries that must span a large range of time, use pre-aggregation to create a collection of summarized trend stats
Raw data point:
{
"rule": "sys/actrl/scope-16777200/rule-16777200-s-any-d-any-f-implarp",
"hitcount": NumberDecimal("934852839479")
}
Daily aggregation:
{
"rule": "sys/actrl/scope-16777200/rule-16777200-s-any-d-any-f-implarp",
"hitcounts": {
"270" : NumberDecimal("934852348595"),
...
"28947" : NumberDecimal("934857839479")
}
}
Reference: Sandeep Parikh & Kelly Stirman, Schema Design for Time Series Data in MongoDB.
https://www.mongodb.com/blog/post/schema-design-for-time-series-data-in-mongodb
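The folding step from raw points to a daily document can be sketched in plain Python. Several things here are assumptions rather than facts from the slides: the keys inside `hitcounts` are treated as seconds since midnight UTC, the `day` field and the `pre_aggregate` helper are invented for the example, and the two sample timestamps were chosen only so that they produce the keys "270" and "28947".

```python
from collections import defaultdict
from datetime import datetime, timezone
from decimal import Decimal

RULE = "sys/actrl/scope-16777200/rule-16777200-s-any-d-any-f-implarp"


def pre_aggregate(points):
    """Fold raw data points into one summary document per (rule, day).
    Assumption: each hitcount is keyed by seconds since midnight."""
    daily = defaultdict(dict)
    for point in points:
        ts = point["timestamp"]
        offset = ts.hour * 3600 + ts.minute * 60 + ts.second
        daily[(point["rule"], ts.date())][str(offset)] = point["hitcount"]
    return [
        {"rule": rule, "day": day.isoformat(), "hitcounts": counts}
        for (rule, day), counts in sorted(daily.items())
    ]


raw_points = [
    {"timestamp": datetime(2018, 5, 14, 0, 4, 30, tzinfo=timezone.utc),
     "rule": RULE, "hitcount": Decimal("934852348595")},
    {"timestamp": datetime(2018, 5, 14, 8, 2, 27, tzinfo=timezone.utc),
     "rule": RULE, "hitcount": Decimal("934857839479")},
]

daily_docs = pre_aggregate(raw_points)
print(daily_docs)
```

A trend query over many days would then read one document per rule per day instead of every raw point, at the cost of maintaining the summary collection.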

Replication Factor 2 or 3?
Reference: MongoDB Manual. Three Member Replica Sets.
https://docs.mongodb.com/manual/core/replica-set-architecture-three-members/
• Write concern 1 (w1) writes block until acknowledged by primary.
• Write concern 2 (w2) writes block until acknowledged by primary and at least one secondary.
• w1 writes are not 100% durable in the case of primary failure – durability vs. performance
Primary with Two Secondary Members
“PSS,” replication factor 3
Primary with a Secondary and an Arbiter
“PSA,” replication factor 2

[Chart: Write throughput on single shard (docs/sec); y-axis from 0 to 15,000; bars for PSS with w2 writes, PSS with w1 writes*, and PSA with w1 writes]
* PSS, w1 writes resulted in the replication state of the secondary nodes drifting minutes apart under heavy write loads
Reference: Mike LaSpina. Benchmarks on three node replica sets. August 10, 2017.

• CPU and disk load high during heavy periods of writes for Primary and Secondary
• Secondary lags behind during these periods
• Experiment with PSA (Primary, Secondary and Arbiter)
• Higher write throughput because Primary does not need to service two Secondary nodes
• Tradeoff/judgement call: 100% durable writes or write throughput?

Final Design
• Date interval partitioning in favor of right-sized B-tree indexes
• Pre-aggregate expensive queries
• Replication factor 2 with write concern 1 for max write throughput

Questions?
• More information on Cisco Network Assurance Engine: http://cs.co/9007D3dWL
• Cisco is hiring! http://jobs.cisco.com/
