Sizing The Elastic Stack for Security Use Cases
James Spiteri (special thanks to Dave Moore!)
17th March 2021
What we’ll be covering today
- Elasticsearch Internals and Computing Resources - Quick Overview
- Preparation: How much data can I expect?
- Performance: How much can I get out of my hardware?
- Speed: How can I get optimal search performance?
- Using Cross Cluster Search and Data tiers effectively
- Transforms
- Kibana Considerations - The Detection Engine
- Example sizing exercises
Elastic Security: Endpoint + SIEM
Computing Resources
Elasticsearch Internals
What’s happening behind the scenes?
Cluster: A group of nodes that work together to operate Elasticsearch.
Node: A Java process that runs the Elasticsearch software.
Index: A group of shards that form a logical data store.
Shard: A Lucene index that stores and processes a portion of an Elasticsearch index.
Segment: A Lucene segment that immutably stores a portion of a Lucene index.
Document: A record that is submitted to and retrieved from an Elasticsearch index.
Nodes
Role (description), with the resource profile given as storage / memory / compute / network:
● Data (indexes, stores, and searches data): Extreme / High / High / Medium
● Master (manages cluster state): Low / Low / Low / Low
● Ingest (transforms inbound data): Low / Medium / High / Medium
● Machine Learning (processes machine learning models): Low / Extreme / Extreme / Medium
● Coordinator (delegates requests and merges search results): Low / Medium / Medium / Medium
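To see which roles each node in a running cluster actually holds, and how its resources are being used, you can query the _cat/nodes API. A minimal sketch in Python, assuming a local unsecured cluster at localhost:9200:

```python
import requests

ES = "http://localhost:9200"  # assumption: local cluster, no authentication

# _cat/nodes reports each node's roles as a compact string (e.g. "dim" for
# data + ingest + master) along with basic resource usage.
nodes = requests.get(f"{ES}/_cat/nodes", params={
    "format": "json",
    "h": "name,node.role,heap.percent,disk.used_percent,cpu",
}).json()

for node in nodes:
    print(f"{node['name']:<20} roles={node['node.role']:<12} "
          f"heap={node['heap.percent']}% disk={node['disk.used_percent']}% cpu={node['cpu']}%")
```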
Preparation
Calculating data storage requirements
- Ingest a sample
- Monitor size + ingest rates (see the measurement sketch below)
- Calculate going forward
https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-size.html
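A minimal sketch of that approach, assuming a local unsecured cluster and a hypothetical index (logs-endpoint-sample) holding one day of sample data; the index stats API returns the on-disk size and document count used for the projection:

```python
import requests

ES = "http://localhost:9200"           # assumption: local cluster, no authentication
SAMPLE_INDEX = "logs-endpoint-sample"  # hypothetical index holding one day of sample data

# Pull the primary store size and document count for the sample index.
stats = requests.get(f"{ES}/{SAMPLE_INDEX}/_stats/store,docs").json()
primaries = stats["_all"]["primaries"]
size_gb = primaries["store"]["size_in_bytes"] / 1024 ** 3
docs = primaries["docs"]["count"]
print(f"Sample day: {docs} docs, {size_gb:.2f} GB on disk")

# Project forward from the one-day sample: retention days and replica count
# are assumptions to adjust for your own environment.
retention_days = 90
replicas = 1
projected_gb = size_gb * retention_days * (replicas + 1)
print(f"Projected primary + replica storage for {retention_days} days: {projected_gb:.0f} GB")
```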
Summarizing Considerations:
● How much raw data (GB) will we index per day?
● How many days will we retain the data for?
● How many days in the hot zone?
● How many days in the warm zone?
● How many replica shards will you enforce?
In general, we add 5% or 10% as a margin of error and 15% to stay under the disk watermarks.
Performance
What is my hardware capable of?
- Run performance benchmarks using Rally
- Understand what throughput you’ll achieve
https://www.elastic.co/blog/rally-1-0-0-released-benchmark-elasticsearch-like-we-do
Search Speed
It’s all about balance: speed vs. cluster size/cost.
Keeping in mind:
● Searches run on a single thread per shard
● Shards have overhead
● Shards are balanced across nodes by Elasticsearch
● Use data streams and index lifecycle management (ILM) policies
● Aim for shard sizes between 10GB and 50GB
● Aim for 20 shards or fewer per GB of heap memory (a quick audit sketch follows this list)
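A quick audit along those lines, assuming a local unsecured cluster and a 30GB heap per data node; it lists every shard via _cat/shards, flags shards outside the 10GB-50GB range, and flags nodes above the shards-per-heap guideline:

```python
import requests
from collections import Counter

ES = "http://localhost:9200"   # assumption: local cluster, no authentication
HEAP_GB = 30                   # assumption: heap size per data node

# List every shard with its on-disk size (in bytes) and the node it lives on.
shards = requests.get(f"{ES}/_cat/shards", params={
    "format": "json", "bytes": "b", "h": "index,shard,prirep,store,node",
}).json()

# Flag shards outside the 10GB-50GB sweet spot (small system indices show up here too).
for s in shards:
    size_gb = int(s["store"] or 0) / 1024 ** 3
    if size_gb > 50 or (s["prirep"] == "p" and size_gb < 10):
        print(f"review {s['index']} shard {s['shard']} ({s['prirep']}): {size_gb:.1f} GB")

# Flag nodes holding more than ~20 shards per GB of heap.
per_node = Counter(s["node"] for s in shards if s["node"])
for node, count in per_node.items():
    if count > 20 * HEAP_GB:
        print(f"{node} holds {count} shards, above the guideline")
```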
Optimise With CCS
Optimize using CCS (Cross Cluster Search)
It makes sense to have smaller clusters for different users/customers/datasets. CCS makes this easy.
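As an illustration, a single search can fan out across the local cluster and remote clusters by prefixing index patterns with the remote cluster alias. A minimal sketch, assuming hypothetical remote clusters named soc_europe and soc_us have already been configured under cluster.remote.*:

```python
import requests

ES = "http://localhost:9200"   # assumption: the coordinating cluster, no authentication

# "soc_europe" and "soc_us" are hypothetical remote cluster aliases; prefixing an
# index pattern with the alias sends that part of the search to the remote cluster.
query = {
    "size": 0,
    "query": {"range": {"@timestamp": {"gte": "now-24h"}}},
    "aggs": {"per_index": {"terms": {"field": "_index", "size": 50}}},
}
resp = requests.post(
    f"{ES}/logs-*,soc_europe:logs-*,soc_us:logs-*/_search",
    params={"ccs_minimize_roundtrips": "true"},
    json=query,
)
print(resp.json()["hits"]["total"])
```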
Transforms
Streamline your logs and events, save time and money.
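For example, a continuous pivot transform can roll raw events up into a much smaller summary index that is cheap to retain and fast to query. A sketch, assuming hypothetical index names (logs-auth-* as the source, auth-hourly-summary as the destination) and a local unsecured cluster:

```python
import requests

ES = "http://localhost:9200"   # assumption: local cluster, no authentication

# A continuous pivot transform that summarises raw authentication events into
# one document per user per hour.
transform = {
    "source": {"index": "logs-auth-*"},
    "dest": {"index": "auth-hourly-summary"},
    "pivot": {
        "group_by": {
            "user": {"terms": {"field": "user.name"}},
            "hour": {"date_histogram": {"field": "@timestamp", "calendar_interval": "1h"}},
        },
        "aggregations": {
            "events": {"value_count": {"field": "event.id"}},
            "distinct_hosts": {"cardinality": {"field": "host.name"}},
        },
    },
    "frequency": "5m",
    "sync": {"time": {"field": "@timestamp", "delay": "60s"}},
}

requests.put(f"{ES}/_transform/auth-hourly-summary", json=transform)
requests.post(f"{ES}/_transform/auth-hourly-summary/_start")
```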
Kibana and The Detection Engine
The Detection Engine
● Detections should be treated like a search
● Detection performance should be monitored regularly
● The Kibana alerting engine can be scaled vertically
and/or horizontally
Kibana task manager workers can be increased in number to take advantage of vertical scaling, or Kibana itself can be replicated across separate instances and scaled horizontally. When multiple Kibana instances are running, their task managers coordinate over the wire to balance tasks across the instances. By changing the task manager max_workers setting (xpack.task_manager.max_workers in kibana.yml) from its default of 10, you can scale vertically up or down and allocate resources more efficiently per Kibana node.
Examples
● Total Data (GB) = Raw data (GB) per day x Number of days retained x (Number of replicas + 1)
● Total Storage (GB) = Total Data (GB) x (1 + 0.15 disk watermark threshold + 0.1 margin of error)
● Total Data Nodes = ROUNDUP(Total Storage (GB) / Memory per data node / Memory:Data ratio)
In the case of a large deployment, it's safer to add a node for failover capacity.
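These formulas translate directly into a small helper, shown here as a sketch (the 15% watermark headroom, 10% margin of error, and optional failover node mirror the assumptions above):

```python
import math

def total_data_gb(raw_gb_per_day, retention_days, replicas=1):
    """Raw data per day x days retained x (number of replicas + 1)."""
    return raw_gb_per_day * retention_days * (replicas + 1)

def total_storage_gb(data_gb, watermark=0.15, margin=0.10):
    """Add headroom for the disk watermark and a margin of error."""
    return data_gb * (1 + watermark + margin)

def total_data_nodes(storage_gb, memory_per_node_gb, memory_data_ratio, failover_node=False):
    """ROUNDUP(storage / (memory per node x memory:data ratio)), plus an
    optional extra node for failover capacity on larger deployments."""
    nodes = math.ceil(storage_gb / (memory_per_node_gb * memory_data_ratio))
    return nodes + (1 if failover_node else 0)
```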
Sizing a small cluster:
You might be pulling logs and metrics from some applications, databases, web servers, the network, and other supporting services. Let's assume this pulls in 1GB per day and you need to keep the data for 9 months.
You can use 8GB of memory per node for this small deployment. Let’s do the math:
● Total Data (GB) = 1GB x (9 x 30 days) x 2 = 540GB
● Total Storage (GB) = 540GB x (1 + 0.15 + 0.1) = 675GB
● Total Data Nodes = ROUNDUP(675GB disk / 8GB RAM / 30 ratio) = 3 nodes
Sizing a large(r) deployment
Let’s do the math with the following inputs:
● You receive 100GB per day and need to keep this data for 30 days in the hot zone and 12 months in the warm zone.
● You have 64GB of memory per node, with 30GB allocated to the heap and the remainder left for the OS cache.
● The typical memory:data ratio used for the hot zone is 1:30, and for the warm zone 1:160.
If we receive 100GB per day and have to keep this data for 30 days, this gives us:
● Total Data (GB) in the hot zone = 100GB x 30 days x 2 = 6,000GB
● Total Storage (GB) in the hot zone = 6,000GB x (1 + 0.15 + 0.1) = 7,500GB
● Total Data Nodes in the hot zone = ROUNDUP(7,500 / 64 / 30) + 1 = 5 nodes
● Total Data (GB) in the warm zone = 100GB x 365 days x 2 = 73,000GB
● Total Storage (GB) in the warm zone = 73,000GB x (1 + 0.15 + 0.1) = 91,250GB
● Total Data Nodes in the warm zone = ROUNDUP(91,250 / 64 / 160) + 1 = 10 nodes
Join the Elastic community
1. Try free on Cloud: elastic.co/cloud
2. Take a quick spin: demo.elastic.co
3. Connect on Slack: ela.st/slack
