MongoDB Performance
Manosh Malai
CTO, Mydbops
3rd September 2020
7th Mydbops Database Meetup
Interested in Open Source technologies
Interested in MongoDB, DevOps & DevOpSec Practices
Tech Speaker/Blogger
CTO, Mydbops IT Solution
Manosh Malai
About Me
Consulting
Services
Managed
Services
Focuses on MySQL, MongoDB and PostgreSQL
Mydbops Services
250 + Clients In 4 Yrs. of Operations
Our Clients
MongoDB Performance Best Practices
MongoDB Performance Analysis Tool
Introduction
Agenda
INTRODUCTION
Why MongoDB
Ad Hoc Queries
Schema-less Database
Indexing
Aggregation
GridFS
Sharding
Replication
Document Oriented
PERFORMANCE BEST PRACTICES
1. SCHEMA DESIGN
2. INDEXING
3. LINUX TUNING
{
{
Modelling Approach RDBMS
[Cycle diagram: Define/Re-Define Data Model → Develop Application and Queries → Production → Denormalize/Poor Performance]
Design a normalized data model/schema
Develop the application
The data model dictates how queries are written for application operations
The application evolves and the data becomes denormalized
The data model is restructured and re-normalized
This causes poor performance and requires downtime
Modelling Approach MongoDB
[Cycle diagram: Define Data Model → Develop Application and Queries (MQL) → Production → New Requirement]
Define the data model
Develop the application
The application evolves
Improve the data model
Application evolution and data-model improvement happen recursively, without downtime or complication
Design is part of each phase of the application lifetime
Strategy of Modelling
Goal 1: Gain deep knowledge of the application's behaviour
Goal 2: Predict the C, U, R, D operations performed on the database, and their priorities
Goal 3: Based on that prediction, map the relationships between entities and the C, U, R, D operations
Goal 4: Finalize the data model that best suits the application
RDBMS MongoDB
Employee table:
Eid | FName | LName | Email | Mobile | JobName
101 | Manosh | Malai | abc@mydboxxxxxxxxxx | xxx
102 | Kabilesh | P.R | def@mydboxxxxxxxxxx | xxx
Skill table:
id | Eid | Skill
1 | 101 | Linux
2 | 101 | MongoDB
Certification table:
id | Eid | CertName | CertNo
1 | 101 | RHCSS | xxx
2 | 101 | AWS | xxx
3 | 101 | MongoDB | xxx
{
Eid: "101",
FName: "Manosh",
LName: "Malai",
Email: "abc@mydbops.com",
Mobile: xxxxxxxxxx,
JobName: "xxx",
Skills: ["Linux", "MongoDB"],
Certifications: [
{
CertName: "RHCSS",
CertNo: "xxx"
},
{
CertName: "AWS",
CertNo: "xxx"
},
{
CertName: "MongoDB",
CertNo: "xxx"
}
]
}
Data Model Type
Embedded Model Link/Reference/Normalized Model
Emp_Collection:
{
Eid: "101",
FName: "Manosh",
LName: "Malai",
Email: "abc@mydbops.com",
Mobile: xxxxxxxxxx,
JobName: "xxx",
Skills: ["Linux", "MongoDB"],
Certifications: [
{ CertName: "RHCSS", CertNo: "xxx" },
{ CertName: "AWS", CertNo: "xxx" },
{ CertName: "MongoDB", CertNo: "xxx" }
]
}
Emp_Collection:
{
Eid: "101",
FName: "Manosh",
LName: "Malai",
Email: "abc@mydbops.com",
Mobile: xxxxxxxxxx,
JobName: "xxx",
Skills: ["Linux", "MongoDB"]
}
Emp_Certification_Collection:
{
CertName: "RHCSS",
CertNo: "xxx",
Eid: "101",
}
Choose Embedded VS Reference
How frequently does the embedded data get
accessed?
Does the embedded information change/update
often?
Is the data queried using the embedded
information?
Design Pattern
Understand your application’s query patterns, Design your data model, Select the appropriate
indexes.
MongoDB's flexible schema does not mean you can ignore schema design.
Prioritize embedding, unless there is an unavoidable reason.
Don't be afraid of application-level joins: If the index is built correctly and the returned results
are limited by projection conditions, then application-level joins will not be much more
expensive than joins in relational databases.
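The point above can be sketched server-free: the snippet below simulates an application-level join in Python, with plain lists of dicts standing in for the Emp_Collection and Emp_Certification_Collection documents shown later; the employee_with_certs helper is purely illustrative, not a driver API.

```python
# Illustrative application-level join: two "queries" (here, list scans that
# would be indexed find() calls against a real server) merged in the app.

employees = [
    {"Eid": "101", "FName": "Manosh", "Skills": ["Linux", "MongoDB"]},
]
certifications = [
    {"Eid": "101", "CertName": "RHCSS", "CertNo": "xxx"},
    {"Eid": "101", "CertName": "AWS", "CertNo": "xxx"},
]

def employee_with_certs(eid):
    # First query: fetch the employee document (would hit an index on Eid).
    emp = dict(next(e for e in employees if e["Eid"] == eid))
    # Second query: fetch only the projected fields from the referenced
    # collection (would hit an index on Eid there as well).
    emp["Certifications"] = [
        {"CertName": c["CertName"], "CertNo": c["CertNo"]}
        for c in certifications if c["Eid"] == eid
    ]
    return emp

doc = employee_with_certs("101")
print(len(doc["Certifications"]))  # → 2
```

With an index on Eid in both collections and a projection limiting the returned fields, the second lookup stays cheap, which is the slide's argument.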
Key Consideration(RECAP TOO)
Arrays should not grow without bound
When an array grows unbounded, index performance on that array degrades
Avoid lookups ($lookup) if they can be avoided
Avoid a huge number of collections
Avoid the default _id field where possible: 12 bytes is large and carries some computational cost
Optimize key names: every document stores its own key names, so long key names consume more space.
Key Consideration(RECAP TOO)
In the database world, indexes play a vital role in performance, and MongoDB is no exception
Indexing
Single Field Indexes
Compound Indexes
Multikey Indexes
Text Indexes
Wildcard Indexes
2dsphere Indexes
2d Indexes
geoHaystack Indexes
Hashed Indexes
Index Type Index Properties
TTL Indexes
Unique Indexes
Partial Indexes
Case Insensitive Indexes
Hidden Indexes
Sparse Indexes
Follow ESR Rule in Compound Indexes
Use Covered Queries as much as possible
How Prefix Compression improves query performance and Disk usage
Indexing Strategies
and so on.
Follow ESR Rule in Compound Indexes
EQUAL SORT
RANGE
With a single-field index, the index can satisfy either an ascending or a descending sort regardless of the physical
ordering of the index keys
ESR is not a strict rule; it is a guideline that helps produce better query performance
Putting the equality key first limits the amount of data we have to look at
Avoid blocking/in-memory sorting
Failing to follow the ESR guideline in index creation drives up totalKeysExamined and totalDocsExamined,
puts stress on memory and CPU resources, and pushes the query's executionTimeMillis well beyond
the advised value.
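The ESR guideline can be expressed as a tiny ordering rule. The helper below is hypothetical (not a MongoDB API); it only illustrates how to derive a compound-index key order from the predicate types in a query.

```python
# Illustrative ESR helper: equality fields first, then sort fields,
# then range fields, yielding the compound-index key order.

def esr_key_order(equality, sort, range_):
    # ESR: Equality → Sort → Range.
    return list(equality) + list(sort) + list(range_)

# The slide's query: find {role: "mongodb-dba", exp: {$gt: 5}},
# sorted by location.
keys = esr_key_order(equality=["role"], sort=["location"], range_=["exp"])
print(keys)  # → ['role', 'location', 'exp']
```

That ordering corresponds to createIndex({role: 1, location: 1, exp: 1}), the index shown on the second ESR slide.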
ESR db.emp.find({role: "mongodb-dba", exp: {$gt: 5}}).sort({location: 1})
ROOT db.emp.createIndex({role: 1, exp: 1, location: 1})
[Index tree: ROLE (MongoDB-DBA, MySQL-DBA) → EXP (10, 7, 5, 5) → LOCATION (Bangalore, Chennai, Hyderabad)]
Key order is E, R, S: the range key (exp) comes before the sort key (location), so matching entries come back
in exp order (Chennai, Bangalore, Hyderabad) and a BLOCKING SORT on location is required to produce the RESULT.
ESR db.emp.find({role: "mongodb-dba", exp: {$gt: 5}}).sort({location: 1})
ROOT db.emp.createIndex({role: 1, location: 1, exp: 1})
[Index tree: ROLE (MongoDB-DBA, MySQL-DBA) → LOCATION (Bangalore, Chennai, Hyderabad) → EXP (5, 7, 5, 10)]
Key order is E, S, R: equality (role), sort (location), range (exp), so entries are read already in location
order (Bangalore, Chennai, Hyderabad) and no blocking sort is needed.
Key Consideration
Foreground index creation takes a collection-level lock.
Background index creation overcomes the locking bottleneck but decreases the efficiency of index
traversal.
In MongoDB 4.2 the index build system was reworked; the new hybrid build method overcomes
both of the above inefficiencies.
Recommend that developers write covered queries. Such a query is satisfied entirely by an index,
so zero documents need to be inspected, which makes the query run a lot faster. All the
projected keys need to be part of the index.
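A toy Python sketch of why covered queries are fast: with a hypothetical index on (role, exp), a query filtering on role and projecting only role and exp can be answered from the index entries alone, touching zero documents. The tuple list below is hand-made, not real WiredTiger data.

```python
# Hypothetical sorted index entries for a compound index on (role, exp).
index_role_exp = [
    ("mongodb-dba", 5),
    ("mongodb-dba", 10),
    ("mysql-dba", 7),
]

def covered_find(role):
    # docs_examined stays 0: the index alone covers the projection,
    # so no document fetch is simulated.
    docs_examined = 0
    results = [{"role": r, "exp": e} for r, e in index_role_exp if r == role]
    return results, docs_examined

rows, examined = covered_find("mongodb-dba")
print(len(rows), examined)  # → 2 0
```

In a real explain() output a covered query shows totalDocsExamined: 0 with a PROJECTION_COVERED / IXSCAN plan.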
Cont...
Combining operators can possibly produce a range
Use an index to sort the result and avoid blocking sorts.
Remove duplicate and unused indexes; this also improves disk throughput and memory usage.
Operator names can mislead between equality and range; check the index bounds to confirm whether the operator
you are using is a range or an equality
Treated as ranges: $ne, $nin, regex
$in alone: equality match
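The operator classification above can be sketched as a small lookup. This classifier is illustrative only (it is not a MongoDB API, and real index-bound analysis is more nuanced); it just mirrors the slide's point that some equality-looking operators produce range bounds.

```python
# Operators that produce range (possibly multi-interval) index bounds,
# per the slide: $ne, $nin and regex, alongside the ordinary comparisons.
RANGE_OPERATORS = {"$ne", "$nin", "$regex", "$gt", "$gte", "$lt", "$lte"}

def bound_type(operator):
    # "$in" used alone is treated as an equality match.
    if operator == "$in":
        return "equality"
    return "range" if operator in RANGE_OPERATORS else "equality"

print(bound_type("$in"), bound_type("$ne"), bound_type("$regex"))
# → equality range range
```

When in doubt, run the query with explain() and read the indexBounds section rather than guessing from the operator name.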
B-Tree & Prefix Compression: Query performance & Disk usage
In B-tree indexes, low-cardinality values actually harm performance
For low-cardinality values, prefer a partial index
Prefix index compression: a repeated prefix value is not written again
WITHOUT PREFIX COMPRESSION
MongoDB-DBA,Bangalore,10
MongoDB-DBA,Chennai,5
MongoDB-DBA,Hyderabad,5
MySQL-DBA,Bangalore,7
WITH PREFIX COMPRESSION
MongoDB-DBA,Bangalore,10
,Chennai,5
,Hyderabad,5
MySQL-DBA,Bangalore,7
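The saving can be approximated by counting only the suffix bytes that differ from the previous key, using the slide's own index entries. This is a simplification of what WiredTiger actually does inside a page, kept only to illustrate the counting.

```python
# Rough sketch of prefix compression: store only the bytes of each key
# that differ from the shared prefix with the previous key.

entries = [
    "MongoDB-DBA,Bangalore,10",
    "MongoDB-DBA,Chennai,5",
    "MongoDB-DBA,Hyderabad,5",
    "MySQL-DBA,Bangalore,7",
]

def stored_bytes(keys):
    total, prev = 0, ""
    for key in keys:
        # Length of the common prefix with the previous key.
        common = 0
        while common < min(len(prev), len(key)) and prev[common] == key[common]:
            common += 1
        total += len(key) - common  # only the differing suffix is stored
        prev = key
    return total

raw = sum(len(k) for k in entries)
compressed = stored_bytes(entries)
print(raw, compressed)  # → 89 64
```

The repeated "MongoDB-DBA," prefix is written once, which is exactly the effect shown in the with/without comparison above.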
Linux Tuning
Swappiness: sysctl -w vm.swappiness=1
Dirty Ratio:
sysctl -w vm.dirty_ratio=15
sysctl -w vm.dirty_background_ratio=5
zone_reclaim_mode: sysctl -w vm.zone_reclaim_mode=0
Linux Tuning
# Edit the file
/etc/systemd/system/multi-user.target.wants/mongod.service
ExecStart=/usr/bin/mongod --config /etc/mongod.conf
To
ExecStart=/usr/bin/numactl --interleave=all /usr/bin/mongod --config /etc/mongod.conf
systemctl daemon-reload
systemctl stop mongod
systemctl start mongod
NUMA
Linux Tuning
# Verifying
$ cat /sys/block/xvda/queue/scheduler
noop [deadline] cfq
# Adjusting the value dynamically
$ echo "noop" > /sys/block/xvda/queue/scheduler
$ vim /etc/sysconfig/grub
GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200 elevator=noop"
$ grub2-mkconfig -o /boot/grub2/grub.cfg
IO Scheduler
Linux Tuning
$ echo "never" > /sys/kernel/mm/transparent_hugepage/enabled
$ echo "never" > /sys/kernel/mm/transparent_hugepage/defrag
$ vim /etc/sysconfig/grub
GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200 elevator=noop
transparent_hugepage=never"
$ grub2-mkconfig -o /boot/grub2/grub.cfg
Transparent Huge Pages
Linux Tuning
$ vi /etc/systemd/system/multi-user.target.wants/mongod.service
# (file size)
LimitFSIZE=infinity
# (cpu time)
LimitCPU=infinity
# (virtual memory size)
LimitAS=infinity
# (locked-in-memory size)
LimitMEMLOCK=infinity
# (open files)
LimitNOFILE=64000
# (processes/threads)
LimitNPROC=64000
ulimit Settings
Linux Tuning
vim /etc/security/limits.conf
mongo hard cpu unlimited
mongo soft cpu unlimited
mongo hard memlock unlimited
mongo soft memlock unlimited
mongo hard nofile 64000
mongo soft nofile 64000
mongo hard nproc 192276
mongo soft nproc 192276
mongo hard fsize unlimited
mongo soft fsize unlimited
mongo hard as unlimited
mongo soft as unlimited
ulimit Settings
Linux Tuning
$ vi /etc/sysctl.conf
net.core.somaxconn = 4096
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_time = 120
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_keepalive_probes = 6
Network Stack
MONGODB PERFORMANCE ANALYSIS TOOL
MongoDB Explain
queryPlanner
executionStats
allPlansExecution
db.<collection name>.find({}).explain()
Important Parameter
queryPlanner.winningPlan.inputStage.stage
executionStats.nReturned
executionStats.totalKeysExamined
executionStats.totalDocsExamined
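The counters above are most useful when compared to each other; a healthy index scan keeps totalKeysExamined and totalDocsExamined close to nReturned. The sketch below uses a hand-written dict standing in for explain("executionStats") output, and the 10x cutoff in is_selective is an arbitrary illustrative threshold, not a MongoDB rule.

```python
# Hypothetical fragment of an explain("executionStats") result.
explain_stats = {
    "nReturned": 2,
    "totalKeysExamined": 2,
    "totalDocsExamined": 2,
}

def is_selective(stats):
    # Flag queries that examine far more documents than they return;
    # the 10x ratio here is an illustrative cutoff.
    n = max(stats["nReturned"], 1)
    return stats["totalDocsExamined"] / n <= 10

print(is_selective(explain_stats))  # → True
print(is_selective({"nReturned": 2, "totalKeysExamined": 5000,
                    "totalDocsExamined": 5000}))  # → False
```

A COLLSCAN in queryPlanner.winningPlan.inputStage.stage together with a high examined/returned ratio usually means a missing index.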
MongoDB Mtools
mtools is a collection of helper scripts to parse, filter, and visualize MongoDB log files. For every DBA it is a
Swiss-army-knife tool.
mlogfilter
mloginfo
mlaunch
mlogfilter
mlogfilter mongod.log --slow --json | mongoimport -d test -c mycoll
mlogfilter mongod.log --namespace admin.$cmd --slow 1000
mlogfilter mongod.log --operation <query, insert, update, delete, command, getmore>
mlogfilter mongod.log --pattern '{"_id": 1, "host": 1, "ns": 1}'
mlogfilter mongod.log --from FROM [FROM ...], --to TO [TO ...]
mlogfilter mongod.log --from Aug --to Sep
mloginfo
mloginfo mongod.log --queries
mloginfo mongod.log --restarts
mloginfo mongod.log --connections
mloginfo mongod.log --rsstate
Keyhole
Keyhole helps produce performance analytics summaries. The information includes MongoDB
configurations, cluster statistics, database schema, indexes, and index usage.
It also analyzes mongod logs and Full Time Diagnostic Data Capture (FTDC).
Cluster Info:
keyhole --allinfo "mongodb://user:secret@host.local/test?replicaSet=rs"
FTDC Data and Grafana Integration:
keyhole --web --diag /data/db/diagnostic.data
Logs Analytics:
keyhole --loginfo -v /var/log/mongodb/mongod.log.2018-06-07T11-08-32.gz
QUESTIONS ?
Thank You
Reference
https://medium.com/swlh/mongodb-indexes-deep-dive-understanding-indexes-9bcec6ed7aa6
https://www.mongodb.com/blog/post/performance-best-practices-hardware-and-os-configuration
https://www.slideshare.net/mongodb/mongodb-local-toronto-2019-tips-and-tricks-for-effective-indexing
https://www.mongodb.com/blog/post/performance-best-practices-indexing
https://github.com/rueckstiess/mtools
https://github.com/simagix/keyhole
https://www.youtube.com/watch?v=Mj2YM8t2G2w
https://mydbops.wordpress.com/category/mongodb/