This document discusses data visualization and analytics using MongoDB data. It covers the importance of data visualization, different architectures for analytics, and tooling options for visualizing MongoDB data, including building custom solutions, MongoDB Compass, the MongoDB BI Connector, and the new MongoDB Charts tool. The goal is to help users understand which visualization methods and tools are best suited to their specific needs and data.
Data Analytics: Understanding Your MongoDB Data
1. THE PATH TO TRULY UNDERSTANDING YOUR MONGODB DATA
20 MARCH, 2018
#MDBlocal
6. TERMINOLOGY
“Business Intelligence”
“Business Analytics”
ANALYTICS
DATA VISUALISATION
7. DATA GROWTH IS EXPLOSIVE
• More data has been created in the last 2 years than in the entire previous history of the human race
• By 2020:
  • 1.7MB per person every second
8. THE STATE OF ANALYTICS
• Analytics is big $!
  • $150B in 2017
  • $210B+ in 2020
• Less than 0.5% of data is analysed and used – imagine the potential!
Source: IDC. https://www.idc.com/getdoc.jsp?containerId=prUS42371417
9. EVOLUTION OF ANALYTICS
Timeline: 2012, 2014, 2016, 2018
2012:
• Dedicated reporting team
• Desktop access
• Hadoop
• Batch analytics
• On prem only
• Monthly reports
2018:
• Self service
• Mobile access
• Spark
• Real time analytics
• On-prem and cloud
• On demand reporting
10. IMPORTANCE OF DATA VISUALISATION
13. EARLY DATA VISUALISATIONS
• Charles Minard (1869)
• Napoleon's march and retreat on Moscow in 1812.
14. Dataset I
                  X      Y
                 10   8.04
                  8   6.95
                 13   7.58
                  9   8.81
                 11   8.33
                 14   9.96
                  6   7.24
                  4   4.26
                 12  10.84
                  7   4.82
                  5   5.68
Mean           9.00   7.50
Variance      10.00   3.75
Correlation       0.816
16. Datasets I to IV
                         I            II           III            IV
                  X      Y      X      Y      X      Y      X      Y
                 10   8.04     10   9.14     10   7.46      8   6.58
                  8   6.95      8   8.14      8   6.77      8   5.76
                 13   7.58     13   8.74     13  12.74      8   7.71
                  9   8.81      9   8.77      9   7.11      8   8.84
                 11   8.33     11   9.26     11   7.81      8   8.47
                 14   9.96     14   8.10     14   8.84      8   7.04
                  6   7.24      6   6.13      6   6.08      8   5.25
                  4   4.26      4   3.10      4   5.39     19  12.50
                 12  10.84     12   9.13     12   8.15      8   5.56
                  7   4.82      7   7.26      7   6.42      8   7.91
                  5   5.68      5   4.74      5   5.73      8   6.89
Mean           9.00   7.50   9.00   7.50   9.00   7.50   9.00   7.50
Variance      10.00   3.75  10.00   3.75  10.00   3.75  10.00   3.75
Correlation       0.816         0.816         0.816         0.817
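The four datasets above are the ones known as Anscombe's quartet. A minimal sketch of the arithmetic behind the summary rows follows, assuming the slide uses population variance (dividing by n), which is what reproduces the 10.00 and 3.75 shown. The function and variable names are illustrative and not from the talk.

```python
from math import sqrt

# X and Y values copied from the slide's four datasets.
X_COMMON = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_COMMON, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_COMMON, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_COMMON, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def pvariance(values):
    """Population variance: mean squared deviation from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / sqrt(pvariance(xs) * pvariance(ys))


def summarise(xs, ys):
    """Return the summary statistics shown at the bottom of the slide."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": pvariance(xs),
        "var_y": pvariance(ys),
        "corr": correlation(xs, ys),
    }


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        s = summarise(xs, ys)
        print(f"{name:>3}: mean=({s['mean_x']:.2f}, {s['mean_y']:.2f}) "
              f"var=({s['var_x']:.2f}, {s['var_y']:.2f}) corr={s['corr']:.3f}")
```

The summaries come out nearly identical for all four datasets, even though the raw values differ considerably. That is why the numbers alone are not enough and the data needs to be plotted.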
21. THINGS TO THINK ABOUT
• Use the correct architecture
• Determine what your needs are
  • Multiple data sources?
  • Huge amounts of complex data?
  • Quick self service?
• Choose the right solution for you
22. ARCHITECTURE: SHARED DEPLOYMENT
• Run analytics against the main deployment used by your Online Transaction Processing (OLTP) apps
• May be OK in some cases, but watch out for:
  • Poorly performing analytics queries
  • Analytics impacting OLTP workloads
[Diagram labels: OLTP Client, DB, Analytics]
23. ARCHITECTURE: HIDDEN REPLICAS
• Hidden secondaries maintain a copy of the primary’s data set
• Hidden secondaries are used for workloads with different access patterns
• They contain identical data, but can have different indexes
• A hidden secondary cannot become primary
[Diagram labels: OLTP Client, Analytics, Primary, Secondary (x3), P=0, Hidden=true]
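The "P=0, Hidden=true" label in the diagram corresponds to the `priority` and `hidden` fields of a replica set member configuration. A minimal sketch of preparing such a configuration follows. It only builds the configuration document in plain Python; the replica set name, host names and the helper `hide_member` are hypothetical.

```python
import copy


def hide_member(config, member_id):
    """Return a copy of a replica set config with one member hidden.

    A hidden member must also have priority 0, which is why it can
    never be elected primary.
    """
    new_config = copy.deepcopy(config)
    for member in new_config["members"]:
        if member["_id"] == member_id:
            member["priority"] = 0
            member["hidden"] = True
            break
    else:
        raise ValueError(f"no member with _id {member_id}")
    # A reconfiguration needs a higher version than the current one.
    new_config["version"] = config["version"] + 1
    return new_config


# Hypothetical four-member replica set; hosts are placeholders.
config = {
    "_id": "rs0",
    "version": 1,
    "members": [
        {"_id": 0, "host": "db0.example.net:27017"},
        {"_id": 1, "host": "db1.example.net:27017"},
        {"_id": 2, "host": "db2.example.net:27017"},
        {"_id": 3, "host": "analytics.example.net:27017"},
    ],
}

analytics_config = hide_member(config, 3)

# Applying it would need a live deployment, for example with PyMongo
# (not executed here):
#   client.admin.command("replSetReconfig", analytics_config)
```

Because hidden members are not advertised to ordinary clients, the analytics workload would typically connect to that member directly.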
24. ARCHITECTURE: ETL TO DATA WAREHOUSE
• An Extract-Transform-Load (ETL) tool retrieves data from one or more databases, transforms the data and loads it into a data warehouse
• Minimal impact on OLTP systems; data can be highly optimised for analysis
• Expensive to set up and maintain
• Data can be stale
[Diagram labels: OLTP Clients, DB1, DB2, DB3, ETL, Data Warehouse, Analytics]
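The extract, transform and load steps described on this slide can be sketched as below. This is a toy illustration under stated assumptions: the order documents are invented, an in-memory SQLite database stands in for the data warehouse, and a real ETL tool would read from the live databases instead of a hard-coded list.

```python
import sqlite3


def extract():
    """Stand-in for reading from the OLTP databases (hypothetical documents)."""
    return [
        {"_id": 1,
         "customer": {"name": "Ada", "city": "Sydney"},
         "items": [{"sku": "A1", "qty": 2, "price": 5.0},
                   {"sku": "B2", "qty": 1, "price": 12.5}]},
        {"_id": 2,
         "customer": {"name": "Lin", "city": "Perth"},
         "items": [{"sku": "A1", "qty": 4, "price": 5.0}]},
    ]


def transform(orders):
    """Flatten each nested order into one tabular row per line item."""
    for order in orders:
        for item in order["items"]:
            yield (order["_id"],
                   order["customer"]["name"],
                   order["customer"]["city"],
                   item["sku"],
                   item["qty"],
                   item["qty"] * item["price"])


def load(rows, conn):
    """Write the flattened rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_lines ("
        "order_id INTEGER, customer TEXT, city TEXT, "
        "sku TEXT, qty INTEGER, line_total REAL)"
    )
    conn.executemany("INSERT INTO order_lines VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)

# Analytics now runs against the warehouse copy, not the OLTP data.
revenue_by_city = dict(
    conn.execute("SELECT city, SUM(line_total) FROM order_lines GROUP BY city")
)
```

The warehouse only reflects the data as of the last run, which is the staleness the slide warns about.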
26. BUILD YOUR OWN
• Pros
  • Custom tailored solution: fits exactly as required!
• Cons
  • High investment
  • Maintenance
  • Deep understanding of the underlying tech and its language(s)
28. MONGODB COMPASS
• Day-to-day development/operations
• Data management and manipulation
• Adding indexes
• Viewing server stats
• Schema analysis with visualisations
30. MONGODB COMPASS: WHEN TO USE
• When you want to understand the range of types and values in your documents
• When you want zero-effort visualisations and don’t need the ability to customise
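To make "the range of types and values in your documents" concrete, here is a rough sketch of the idea behind schema analysis: sample some documents and count which types appear under each field. It is not how Compass is implemented. The sample documents and the `analyse_schema` helper are invented for illustration, and only top-level fields are inspected.

```python
from collections import Counter, defaultdict


def analyse_schema(documents):
    """Count, per top-level field, how often each Python type appears."""
    types = defaultdict(Counter)
    for doc in documents:
        for field, value in doc.items():
            types[field][type(value).__name__] += 1
    return {field: dict(counts) for field, counts in types.items()}


# Hypothetical sample with a polymorphic "age" field and optional fields.
sample = [
    {"name": "Ada", "age": 36, "tags": ["a", "b"]},
    {"name": "Lin", "age": "unknown"},
    {"name": "Sam", "age": 41, "address": {"city": "Perth"}},
]

schema = analyse_schema(sample)
```

Here `age` shows up as both `int` and `str`, which is the kind of inconsistency a schema view makes visible at a glance.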
31. MONGODB BI CONNECTOR
• Visualise and explore MongoDB data in SQL-based BI tools:
  • Automatically discovers the schema
  • Translates complex SQL statements issued by the BI tool into MongoDB aggregation queries
  • Converts the results into a tabular format for rendering inside the BI tool
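As a rough illustration of what translating SQL into an aggregation query means, the sketch below pairs a SQL statement with a hand-written pipeline of the same meaning. The pipeline is an assumption about an equivalent query, not the connector's actual output. The collection, its fields and the toy evaluator `run_toy_pipeline` are hypothetical, and the evaluator handles only this one pipeline shape so that the example runs without a server.

```python
from collections import Counter

# What a BI tool might send over the SQL interface.
SQL = "SELECT city, COUNT(*) AS n FROM customers WHERE active = TRUE GROUP BY city"

# A hand-written aggregation pipeline with the same meaning.
PIPELINE = [
    {"$match": {"active": True}},
    {"$group": {"_id": "$city", "n": {"$sum": 1}}},
]


def run_toy_pipeline(documents, pipeline):
    """Toy evaluator: supports only an equality $match followed by a counting $group."""
    match = pipeline[0]["$match"]
    group_field = pipeline[1]["$group"]["_id"].lstrip("$")
    matched = [d for d in documents
               if all(d.get(key) == value for key, value in match.items())]
    counts = Counter(d[group_field] for d in matched)
    # Each output document becomes one row of the tabular result.
    return [{"_id": key, "n": n} for key, n in sorted(counts.items())]


customers = [
    {"city": "Sydney", "active": True},
    {"city": "Perth", "active": True},
    {"city": "Sydney", "active": False},
    {"city": "Sydney", "active": True},
]

rows = run_toy_pipeline(customers, PIPELINE)
```

Against a real deployment, the same `PIPELINE` could be passed to a collection's `aggregate` method.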
32. MONGODB BI CONNECTOR
[Diagram labels: MySQL protocol, MongoDB, mongosqld, etc., DRDL]
34. BI CONNECTOR: WHEN TO USE
• You have an existing investment in BI tools (Tableau, Power BI, Qlik etc.)
• You are analysing data from multiple data sources (not just MongoDB)
• Your MongoDB datasets are highly structured
  • Consistent, minimal nesting, no polymorphism
• You have the time and patience for schema mapping
• Extremely powerful, but with a high ramp-up cost
35. MONGODB CHARTS
• Lightweight and intuitive
• Build visualisations on MongoDB data (nested, polymorphic)
• Share content in a dashboard
• Beta available soon!
37. MONGODB CHARTS: WHEN TO USE
• Your data is in MongoDB collections
• You don’t want to flatten / ETL your MongoDB data
• You want quick answers from simple but customisable visualisations
• Self service for a semi-technical audience
38. DATA VISUALISATION LIFE CYCLE
  1. Acquire
  2. Prep
     - Calcs
     - Groups
     - Data types
  3. Visualise
     - Bar
     - Pie
     - Line
  4. Explore
     - Dashboards
  5. Share
     - Export
     - Collaborate
     - Embed
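The first three stages of this life cycle can be sketched in a few lines. This is a toy example with invented documents and helper names: it acquires data from a hard-coded list, preps it by fixing data types, grouping and summing, and visualises the result as a text bar chart. The explore and share stages are not covered.

```python
from collections import defaultdict


def acquire():
    """1. Acquire: stand-in for reading documents from a collection."""
    return [
        {"category": "books", "amount": "12.50"},
        {"category": "games", "amount": "30.00"},
        {"category": "books", "amount": "7.50"},
    ]


def prep(documents):
    """2. Prep: fix data types (str to float), group by category, sum amounts."""
    totals = defaultdict(float)
    for doc in documents:
        totals[doc["category"]] += float(doc["amount"])
    return dict(totals)


def visualise(totals, width=20):
    """3. Visualise: render one text bar per group, scaled to the largest value."""
    scale = width / max(totals.values())
    return [
        f"{label:<8}{'#' * round(value * scale)} {value:.2f}"
        for label, value in sorted(totals.items())
    ]


totals = prep(acquire())
chart = visualise(totals)

if __name__ == "__main__":
    print("\n".join(chart))
```

In practice a charting tool replaces `visualise`, but the prep step is where most of the decisions about calculations, groupings and data types are made.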
39. SUMMARY
• Visualisations are incredibly powerful for understanding your data
• Use them to derive insight
• There are multiple options for visualising your MongoDB data
• Combine the tools for the most power!
40. Q&A
tom.hollander@mongodb.com
@tomhollander
41. THANK YOU!
Editor's Notes
96 DVDs per person per day.
The eye can process 10 million bits per second, roughly the same as Ethernet.
One of the best statistical drawings ever made.
Tells of an army of 400,000 marching on Moscow and returning with 10,000.
Shows time and loss of life, routes, river crossings etc.