Scaling Security
Workflows in
Government Agencies
September 28, 2017 | 11:00 AM ET
WEBINAR
Housekeeping
• Recording
• Today’s Slides
• Attachments
• Questions
• Feedback
Our Presenters
Keith Ober
Systems Engineer
Avere Systems
Bernie Behn
Principal Product Engineer
Avere Systems
What’s the problem?
Data, Data Everywhere
Security Analysis Workflow
• Acquire and Aggregate Inputs
• Normalize Data
• Archive Raw / Archive Normalized Data
• Analyze for Patterns
• Alert or Remediate
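The five stages above can be sketched as a minimal pipeline. Every function and field name here is illustrative, not from any specific product:

```python
# Toy sketch of the five workflow stages; all names are illustrative.

def acquire(sources):
    """Acquire and aggregate raw log lines from all input sources."""
    return [line for source in sources for line in source]

def normalize(raw_lines):
    """Normalize each raw line into a structured record."""
    return [{"raw": line, "fields": line.split()} for line in raw_lines]

def archive(raw_lines, records, store):
    """Archive both the raw and the normalized data."""
    store["raw"].extend(raw_lines)
    store["normalized"].extend(records)

def analyze(records):
    """Analyze normalized records for suspicious patterns."""
    return [r for r in records if "DENY" in r["raw"]]

def alert(findings):
    """Alert (or remediate) on each finding."""
    return [f"ALERT: {f['raw']}" for f in findings]

store = {"raw": [], "normalized": []}
sources = [["10.0.0.1 ALLOW tcp/443", "10.0.0.9 DENY tcp/22"]]
raw = acquire(sources)
records = normalize(raw)
archive(raw, records, store)
alerts = alert(analyze(records))
```

The point of the sketch is the shape: each stage feeds the next, and the archive step captures both forms of the data so later tools can re-analyze it.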
Types of Inputs
• Network Equipment: routers, switches, firewalls, VPN appliances, etc.
• IT: physical servers, VM infrastructure, virtual machines, directory services,
end user desktops/laptops
• Application Layer: log files, access logs for applications, web servers
• Miscellaneous: sensor data
Typical Ingest Workflow
Ingest Node(s)
Normalize/Filter
Storage
Analyze data
Report/Alert
Logs, streamed to
Ingest Nodes
Typical Ingest Workflow
Ingest Node(s)
Normalize/Filter
Storage
I/O at scale can slow
down storage, backing up
entire workflow
Typical Ingest Workflow
Ingest Node(s)
Normalize/Filter
Storage
Analyze data
Report/Alert
If analysis is not co-located,
latency can impede
analysis
Typical Analysis Workflow
Ingest Node(s)
Normalize/Filter
Storage
Analyze data
Report/Alert
If analysis is not co-located,
latency can impede
analysis
The meta-problem: Log File Ingest and Processing
All router1 log files from the beginning
of time... well, from when we started
gathering them...
All log files from firewall 1. All log files from server 1.
(Chart axes: Time and Volume)
NET: A lot of historical data accumulates over time, applying pressure
both in terms of storage and processing
Log Ingest Writ Large
The True Scale of
Enterprise Ingest!
Five Big Challenges
Let’s Break It Down.
5 challenges when engineering security workflows
1. Ingest Latency and Throughput
2. Vendor Lock-In
3. Life Cycle Management
4. Data Availability and Redundancy
5. Cloud Integration
Challenge 1: Ingest Latency and Throughput
Ingest Node
Normalize/Filter
Storage
I/O at scale can slow down
storage, backing up entire
workflow
Ingest Latency Scales Too
Storage
Scale forces multiple storage sites and,
on some products, requires a
replication mechanism, introducing more
cost, overhead, and latency.
Ingest Node(s)
Normalize/Filter
Storage
Storage
Storage
Storage
Volume of inputs will
drive the number of
ingest nodes required.
IoT devices are increasing the amount of
log data being generated and ingested.
Challenge 2: Vendor Lock-In
Storage
Ingest Node(s)
Normalize/Filter
Storage
Storage
Storage
Storage
As the solution scales:
1. Additional demand for storage increases costs and
lowers performance.
2. Deficiencies in the current storage solution amplify as the
deployment grows, causing longer upgrade outages.
3. Vendors often limit interoperability with other products
when it comes to replication and tiering.
Vendor Lock-In
Transferring data to a new solution is difficult:
• Business Continuity -- How do you stop ingesting and processing logs?
• Interoperability -- How do you ensure that your new/proposed storage solution will
work well in a high-performance environment?
: Ingest performance
: Read performance
: Scale
Challenge 3: Data Lifecycle Management
Storage
Ingest Node(s)
Normalize/Filter
Storage
Storage
Storage
More Expensive Storage
Value of the data
1. Lowering the cost to store means you can store
more and derive greater value.
2. New analytic methods/tools bring a fresh round of
analysis and burst workloads.
3. How can we begin to build AI based workloads?
Cheaper
Storage
Warm Data
Cold Data
Lifecycle Management
Storage Performance
• Ingest & analysis require higher-performance storage = more expensive
storage
• Over time there is simply too much data to store in the performance tier
• Is deletion of older data possible?
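The tiering trade-off above can be expressed as a simple age-based policy. A minimal sketch, with hypothetical day thresholds:

```python
# Hypothetical tiering policy: hot flash for recent data, cheaper bulk
# or cloud storage for warm data, archive (or deletion candidates)
# beyond the retention window. Thresholds are illustrative.
HOT_DAYS = 30
WARM_DAYS = 365

def tier_for(age_days):
    """Return the storage tier a log file of the given age belongs in."""
    if age_days <= HOT_DAYS:
        return "hot"    # high-performance (expensive) storage for ingest/analysis
    if age_days <= WARM_DAYS:
        return "warm"   # cheaper bulk or cloud storage
    return "cold"       # archive tier, or candidate for deletion
```

A lifecycle job would periodically run a policy like this over the data set and move (or expire) files whose tier assignment has changed.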
Challenge 4: Data Availability and Redundancy
Ingest Node
Normalize/Filter
Storage
Ingest Node
Normalize/Filter
Ingest Node
Normalize/Filter
Storage
Storage
Reporting and
Processing
Node
Reporting and
Processing
Node
Reporting and
Processing
Node
Data Availability and Redundancy
• Performance at scale requires distributing the reporting/analysis
• Geographical location of ingest may also be distributed
• Critical data: can’t lose it so avoid single point of failure
• Large data sets with streaming data are extremely difficult to back up with
traditional methods, and doing so is cost prohibitive
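Avoiding a single point of failure for in-flight data typically means committing each write to more than one node before acknowledging it. A minimal sketch, with hypothetical names:

```python
def mirrored_write(record, nodes, copies=2):
    """Write `record` to `copies` distinct nodes before acknowledging,
    so the loss of any single node does not lose in-flight data.
    `nodes` is a stand-in for cluster members; lists model their buffers."""
    if len(nodes) < copies:
        raise ValueError("not enough nodes for the requested redundancy")
    for node in nodes[:copies]:
        node.append(record)
    return "ack"
```

Real systems add failure detection and re-replication on node loss, but the invariant is the same: never acknowledge a write that exists in only one place.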
Challenge 5: Cloud Integration: Storage
Storage
Ingest Node(s)
Normalize/Filter
Storage
Storage
Storage
Storage
1. How do we start archiving data to cloud storage in
order to lower cost?
2. How can businesses leverage cloud-based AI
workloads against the same data they have
today?
Cloud Integration: Storage
Cloud Storage Pros
• Cloud Storage can be very
inexpensive
• Reduces the need to own and
maintain additional IT assets
• Public clouds have built-in
redundancy
Cloud Storage Cons
• Cloud storage is eventually
consistent... a query immediately
after a write may not succeed
• Lowest-cost storage is object
storage, which requires S3-compatible
application access
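The read-after-write caveat in the cons column is usually handled with a bounded retry. In this sketch, `fetch` is a stand-in for any object-store GET call, not a specific SDK:

```python
import time

def read_with_retry(fetch, key, attempts=5, delay=0.2):
    """Retry a read that may fail briefly after a write on an
    eventually consistent object store. `fetch` is any callable that
    raises KeyError while the object is not yet visible."""
    for i in range(attempts):
        try:
            return fetch(key)
        except KeyError:
            if i == attempts - 1:
                raise  # still not visible after all attempts
            time.sleep(delay * (2 ** i))  # exponential backoff
```

Analysis tools that read their own recent writes need this kind of guard (or a caching layer in front of the object store, as discussed next).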
And, so, what do we do?
Cache It If You Can
How can Avere address those challenges?
Challenges → Avere Solution
• Ingest Latency and Throughput → Avere write caching, Avere FXT NVRAM, 10GB+ bandwidth
• Vendor Lock-In / Life Cycle Management → Global Namespace, Flash Move & Mirror
• Data Availability and Redundancy → HA, Clustering, Flash Move & Mirror
• Cloud Integration → Avere vFXT compute-based appliance, Avere FlashCloud for AWS S3 and GCS
Speed ingest via write-behind caching
• Gather writes (acknowledging clients immediately) and flush them in parallel
• Hardware: NVRAM for write protection and caching
• Clustered caching solution distributes writes across multiple nodes
Accelerate read performance with distributed, read-ahead caching
• Read ahead on each request (fetch a bit more than what was requested)
• Cache requests for other readers (typical in analytic workloads)
• Writes are cached as written, speeding analysis workloads
The Power of High-Performance Caching
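The ack-then-flush behaviour described above can be sketched as a toy write-behind buffer. This is a simplified model, not Avere's implementation; all names are illustrative:

```python
class WriteBehindCache:
    """Toy write-behind cache: writes are acknowledged immediately and
    flushed to backing storage later in batches."""

    def __init__(self, backing_store, flush_threshold=4):
        self.backing_store = backing_store   # e.g. a slow NAS filer
        self.flush_threshold = flush_threshold
        self.dirty = []   # in real hardware, protected by NVRAM

    def write(self, record):
        self.dirty.append(record)            # ack the client immediately
        if len(self.dirty) >= self.flush_threshold:
            self.flush()
        return "ack"

    def flush(self):
        # One large sequential write instead of many small ones.
        self.backing_store.extend(self.dirty)
        self.dirty.clear()

store = []
cache = WriteBehindCache(store, flush_threshold=3)
acks = [cache.write(i) for i in range(5)]
# After five writes with a threshold of 3, the first batch has been
# flushed to backing storage and the last two records are still cached.
```

Clients see latency bounded by the cache, not by the filer, which is what smooths the ingest spikes described earlier.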
Caching Basic Architecture
Ingest Node
Normalize/Filter
Storage
● Ingest nodes write to NFS
mount points distributed across
a caching layer
Ingest Node
Normalize/Filter
Ingest Node
Normalize/Filter
● Writes are flushed to storage over time,
smoothing the ingest
● Writes are protected within the cluster via
HA mirror
Reporting /
Analysis
Node(s)
● Reporting / Analysis nodes access
data via the cluster.
● Reads are cached, eventually aged
● Written data in the near term is cached
and available
Avere FXT Cluster
Data Placement
Ingest Node
Normalize/Filter
Storage
Ingest Node
Normalize/Filter
Ingest Node
Normalize/Filter
Avere can mirror data to cloud
storage for longer-term archiving
Data is accessible
through the cluster, as
though it were on the
primary storage
Reporting /
Analysis
Node(s)
Does this Really Work?
From the Field Use Case
Avere Security Workflow
FXT Cluster
DMZ Network
Central Control
Container-based
applications to normalize
data
DATA
Syslog/NetFlow/…
DATA
Streaming data to Avere
with no direct access
Core Filers
Splunk
Configure Splunk to consume data from a
separate vServer (isolating traffic)
Splunk Data Consumers
Web access to visualize data ingested
and analyzed by Splunk
Mirror/Migrate/Cloud
Core Filers
SC17
November 12-17, 2017
Denver, Colorado
AIRI 2017
October 1-4, 2017
Washington DC
AWS re:Invent
Nov. 27- Dec. 1, 2017
Las Vegas, Nevada
Contact Us!
Keith Ober
Systems Engineer
Avere Systems
kober@averesystems.com
Bernie Behn
Principal Product Engineer
Avere Systems
bbehn@averesystems.com
AvereSystems.com
888.88 AVERE
askavere@averesystems.com
Twitter: @AvereSystems
