This document provides an overview of building a modern cloud analytics solution on Microsoft Azure. It discusses the role of analytics, a brief history of cloud computing, and a data warehouse modernization project. Key challenges covered include the lack of notifications and logging, limited self-service BI, and the difficulty of integrating streaming data. The document proposes solutions to these challenges using Azure services like Data Factory, Kafka, Databricks, and SQL Data Warehouse. It also discusses alternative implementations using tools like Matillion ETL and Snowflake.
Past, present, and future of data mesh at Intuit. This deck describes a vision and strategy for improving data worker productivity through a Data Mesh approach to organizing data and holding data producers accountable. Delivered at the inaugural Data Mesh Learning meetup on 5/13/2021.
Many have dubbed the 2020s the decade of data; this is indeed an era of data zeitgeist. From code-centric software development 1.0, we are entering software development 2.0, a data-centric and data-driven approach where data plays a central theme in our everyday lives. As the volume and variety of data garnered from myriad sources continue to grow at an astronomical scale, and as cloud computing offers cheap compute and storage at scale, data platforms have to match in their ability to process, analyze, and visualize data at scale, with speed and ease. This involves paradigm shifts in how data is processed and stored and in the programming frameworks offered to developers who work with these platforms. In this talk, we will survey some emerging technologies that address the challenges of data at scale, how these tools help data scientists and machine learning developers with their data tasks, why they scale, and how they help future data scientists get started quickly. In particular, we will examine in detail two open-source tools: MLflow (for machine learning lifecycle development) and Delta Lake (for reliable storage of structured and unstructured data). Other emerging tools such as Koalas help data scientists do exploratory data analysis at scale in a language and framework they are familiar with, and we will also touch on emerging data + AI trends in 2021. You will understand the challenges of machine learning model development at scale, why you need reliable and scalable storage, and what other open-source tools are at your disposal for doing data science and machine learning at scale.
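Since the abstract highlights MLflow's role in the machine learning lifecycle, here is a minimal sketch of its tracking API with scikit-learn; the synthetic dataset, hyperparameter values, and metric are illustrative assumptions, not material from the talk.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real training set.
X, y = make_regression(n_samples=1_000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    params = {"n_estimators": 100, "max_depth": 5}  # illustrative values
    mlflow.log_params(params)

    model = RandomForestRegressor(**params).fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))

    # Metrics and the serialized model are recorded against this run,
    # keeping experiments comparable and reproducible.
    mlflow.log_metric("mse", mse)
    mlflow.sklearn.log_model(model, "model")
```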
The document discusses migrating a data warehouse to the Databricks Lakehouse Platform. It outlines why legacy data warehouses are struggling, how the Databricks Platform addresses these issues, and key considerations for modern analytics and data warehousing. The document then provides an overview of the migration methodology, approach, strategies, and key takeaways for moving to a lakehouse on Databricks.
The document provides an overview of big data architectures and the data lake concept. It discusses why organizations are adopting data lakes to handle increasing data volumes and varieties. The key aspects covered include:
- Defining top-down and bottom-up approaches to data management
- Explaining what a data lake is and how Hadoop can function as the data lake
- Describing how a modern data warehouse combines features of a traditional data warehouse and data lake
- Discussing how federated querying allows data to be accessed across multiple sources
- Highlighting benefits of implementing big data solutions in the cloud
- Comparing shared-nothing, massively parallel processing (MPP) architectures to symmetric multi-processing (SMP) architectures
The document discusses streaming data pipelines and covers:
- The FLaNK stack, which is composed of Apache NiFi, Apache Kafka, Apache Flink, and Java.
- SQL Stream Builder, which allows developers, analysts, and data scientists to write streaming applications using standard SQL without writing Java or Scala code (see the sketch after this list).
- Apache Kafka, a distributed, partitioned, and replicated publish-subscribe messaging system.
- Apache Flink, a framework for distributed stream and batch data processing.
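SQL Stream Builder is Cloudera's product, but the underlying idea, continuous SQL over Kafka topics executed by Flink, can be sketched with Flink's own SQL API via PyFlink. The topic name, broker address, and schema below are assumptions for illustration only.

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# Streaming TableEnvironment: Flink plans and runs the SQL continuously.
# (Running this requires the Flink Kafka connector jar on the classpath.)
t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Source: a Kafka topic of JSON sensor readings (topic/broker/schema assumed).
t_env.execute_sql("""
    CREATE TABLE sensor_readings (
        sensor_id STRING,
        temperature DOUBLE,
        event_time TIMESTAMP(3),
        WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'sensor-readings',
        'properties.bootstrap.servers' = 'localhost:9092',
        'format' = 'json',
        'scan.startup.mode' = 'latest-offset'
    )
""")

# Continuous query: per-sensor average temperature over 1-minute windows.
result = t_env.sql_query("""
    SELECT sensor_id,
           TUMBLE_START(event_time, INTERVAL '1' MINUTE) AS window_start,
           AVG(temperature) AS avg_temp
    FROM sensor_readings
    GROUP BY sensor_id, TUMBLE(event_time, INTERVAL '1' MINUTE)
""")
result.execute().print()
```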
Delta has been powering many production pipelines at scale in the Data and AI space since it was introduced a few years ago. Built on open standards, Delta provides data reliability, improves storage and query performance for big data use cases (both batch and streaming), supports fast interactive queries for BI, and enables machine learning. Delta has matured over the past couple of years on both AWS and Azure and has become the de-facto standard for organizations building their Data and AI pipelines. In today's talk, we will explore building end-to-end pipelines on the Google Cloud Platform (GCP). Through presentation, code examples, and notebooks, we will build a Delta pipeline from ingest to consumption using our Delta Bronze-Silver-Gold architecture pattern and show examples of consuming the Delta files using the BigQuery connector.
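A minimal PySpark sketch of the Bronze-Silver-Gold flow described above; the GCS bucket, column names, and Spark configuration are illustrative assumptions, not the talk's actual notebooks.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Assumes the Delta Lake package and GCS connector are configured.
spark = (
    SparkSession.builder.appName("delta-medallion")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Bronze: land raw JSON events as-is.
raw = spark.read.json("gs://example-bucket/raw/events/")
raw.write.format("delta").mode("append").save("gs://example-bucket/bronze/events")

# Silver: cleanse and deduplicate.
bronze = spark.read.format("delta").load("gs://example-bucket/bronze/events")
silver = bronze.dropDuplicates(["event_id"]).filter(F.col("event_ts").isNotNull())
silver.write.format("delta").mode("overwrite").save("gs://example-bucket/silver/events")

# Gold: business-level aggregate ready for consumption (e.g., via BigQuery).
gold = silver.groupBy("customer_id").agg(F.count("*").alias("event_count"))
gold.write.format("delta").mode("overwrite").save("gs://example-bucket/gold/customer_events")
```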
Embarking on building a modern data warehouse in the cloud can be an overwhelming experience due to the sheer number of products that can be used, especially when the use cases for many products overlap with one another. In this talk I will cover the use cases of many of the Microsoft products that you can use when building a modern data warehouse, broken down into four areas: ingest, store, prep, and model & serve. It's a complicated story that I will try to simplify, giving blunt opinions on when to use which products and the pros and cons of each.
This document discusses architecting a data lake. It begins by introducing the speaker and topic. It then defines a data lake as a repository that stores enterprise data in its raw format including structured, semi-structured, and unstructured data. The document outlines some key aspects to consider when architecting a data lake such as design, security, data movement, processing, and discovery. It provides an example design and discusses solutions from vendors like AWS, Azure, and GCP. Finally, it includes an example implementation using Azure services for an IoT project that predicts parts failures in trucks.
The document provides an overview of the Databricks platform, which offers a unified environment for data engineering, analytics, and AI. It describes how Databricks addresses the complexity of managing data across siloed systems by providing a single "data lakehouse" platform where all data and analytics workloads can be run. Key features highlighted include Delta Lake for ACID transactions on data lakes, Auto Loader for streaming data ingestion, notebooks for interactive coding, and governance tools to securely share and catalog data and models.
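For illustration, a minimal Auto Loader sketch as it might appear in a Databricks notebook (where `spark` is predefined); all paths are placeholders, not from the document.

```python
# Auto Loader incrementally discovers new files in a landing path.
stream = (
    spark.readStream.format("cloudFiles")            # Auto Loader source
    .option("cloudFiles.format", "json")             # format of landing files
    .option("cloudFiles.schemaLocation", "/mnt/example/_schemas/events")
    .load("/mnt/example/landing/events")
)

# Append newly arrived records into a Bronze Delta table, with checkpointing
# so the stream resumes exactly where it left off.
(
    stream.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/example/_checkpoints/events")
    .outputMode("append")
    .start("/mnt/example/bronze/events")
)
```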
The document discusses AWS Glue Data Catalog and Amazon Athena. It provides an overview of AWS Glue Data Catalog as a unified metadata repository across data sources. It then describes Amazon Athena as an interactive query service that allows users to analyze data stored in Amazon S3 using standard SQL. Various use cases are presented that demonstrate how customers can use AWS Glue Data Catalog and Amazon Athena together to build data lakes on AWS.
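As a hedged illustration of that combination, the sketch below submits a standard-SQL Athena query over a Glue Data Catalog table using boto3; the region, database, table, and S3 output location are assumptions.

```python
import time

import boto3

# Athena executes SQL over data in S3, using table definitions from the
# Glue Data Catalog. All names below are illustrative placeholders.
athena = boto3.client("athena", region_name="us-east-1")

query = athena.start_query_execution(
    QueryString="SELECT region, COUNT(*) AS orders FROM sales GROUP BY region",
    QueryExecutionContext={"Database": "example_lake"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = query["QueryExecutionId"]

# Poll until the query finishes, then print the result rows.
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    results = athena.get_query_results(QueryExecutionId=query_id)
    for row in results["ResultSet"]["Rows"]:  # first row is the header
        print([col.get("VarCharValue") for col in row["Data"]])
```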
This is Part 4 of the GoldenGate series on Data Mesh, a series of webinars helping customers understand how to move off of old-fashioned monolithic data integration architecture and get ready for more agile, cost-effective, event-driven solutions. The Data Mesh is a kind of Data Fabric that emphasizes business-led data products running on event-driven streaming architectures and serverless, microservices-based platforms. These emerging solutions are essential for enterprises that run data-driven services on multi-cloud, multi-vendor ecosystems. Join this session to get a fresh look at Data Mesh; we'll start with core architecture principles (vendor agnostic) and transition into detailed examples of how Oracle's GoldenGate platform provides these capabilities today. We will discuss the essential technical characteristics of a Data Mesh solution and the benefits that business owners can expect by moving IT in this direction. For more background on Data Mesh, Parts 1, 2, and 3 are on the GoldenGate YouTube channel: https://www.youtube.com/playlist?list=PLbqmhpwYrlZJ-583p3KQGDAd6038i1ywe

Webinar Speaker: Jeff Pollock, VP Product (https://www.linkedin.com/in/jtpollock/)

Mr. Pollock is an expert technology leader for data platforms, big data, data integration, and governance. Jeff has been CTO at California startups and a senior exec at Fortune 100 tech vendors. He is currently Oracle VP of Products and Cloud Services for Data Replication, Streaming Data, and Database Migrations. While at IBM, he was head of all Information Integration, Replication, and Governance products; previously Jeff was an independent architect for the US Defense Department, VP of Technology at Cerebra, and CTO of Modulant. He has been engineering artificial-intelligence-based data platforms since 2001. As a business consultant, Mr. Pollock was a Head Architect at Ernst & Young's Center for Technology Enablement. Jeff is also the author of "Semantic Web for Dummies" and "Adaptive Information," a frequent keynote speaker at industry conferences, an author for books and industry journals, formerly a contributing member of W3C and OASIS, and an engineering instructor with UC Berkeley's Extension covering object-oriented systems, software development process, and enterprise architecture.
A talk presented by Max Schultze from Zalando and Arif Wider from ThoughtWorks at NDC Oslo 2020. Abstract: The Data Lake paradigm is often considered the scalable successor of the more curated Data Warehouse approach when it comes to democratization of data. However, many who went out to build a centralized Data Lake came out with a data swamp of unclear responsibilities, a lack of data ownership, and sub-par data availability. At Zalando - Europe's biggest online fashion retailer - we realised that accessibility and availability at scale can only be guaranteed when moving more responsibilities to those who pick up the data and have the respective domain knowledge - the data owners - while keeping only data governance and metadata information central. Such a decentralized and domain-focused approach has recently been coined a Data Mesh. The Data Mesh paradigm promotes the concept of Data Products which go beyond sharing of files and towards guarantees of quality and acknowledgement of data ownership. This talk will take you on a journey of how we went from a centralized Data Lake to embracing a distributed Data Mesh architecture, and will outline the ongoing efforts to make creation of data products as simple as applying a template.
In this session, learn how to quickly supplement your on-premises Hadoop environment with a simple, open, and collaborative cloud architecture that enables you to generate greater value with scaled application of analytics and AI on all your data. You will also learn five critical steps for a successful migration to the Databricks Lakehouse Platform along with the resources available to help you begin to re-skill your data teams.
Data mesh is a decentralized approach to managing and accessing analytical data at scale. It distributes responsibility for data pipelines and quality to domain experts. The key principles are domain-centric ownership, treating data as a product, and using a common self-service infrastructure platform. Snowflake is well-suited for implementing a data mesh with its capabilities for sharing data and functions securely across accounts and clouds, with built-in governance and a data marketplace for discovery. A data mesh implemented on Snowflake's data cloud can support truly global and multi-cloud data sharing and management according to data mesh principles.
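To make the sharing mechanics concrete, here is a hedged sketch of provider-side Secure Data Sharing using the snowflake-connector-python package; the account, role, database, and object names are all placeholders, not from the document.

```python
import snowflake.connector

# A domain team publishes a curated table as a data product via a share.
# Credentials and names below are illustrative; use a secrets manager in practice.
conn = snowflake.connector.connect(
    account="provider_account",
    user="data_product_owner",
    password="...",
    role="ORDERS_DOMAIN_ADMIN",
)
cur = conn.cursor()

# Create the share and grant access to the published object
# (tables and secure views can be granted to a share).
cur.execute("CREATE SHARE IF NOT EXISTS orders_domain_share")
cur.execute("GRANT USAGE ON DATABASE orders_db TO SHARE orders_domain_share")
cur.execute("GRANT USAGE ON SCHEMA orders_db.published TO SHARE orders_domain_share")
cur.execute(
    "GRANT SELECT ON TABLE orders_db.published.daily_orders "
    "TO SHARE orders_domain_share"
)

# Make the data product available to a consuming account without copying data.
cur.execute("ALTER SHARE orders_domain_share ADD ACCOUNTS = consumer_account")
```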
- Azure Databricks provides a curated platform for data science and machine learning workloads using notebooks, data services, and machine learning tools.
- Only a small fraction of real-world machine learning systems is composed of the actual machine learning code, as vast surrounding infrastructure is required for data collection, feature extraction, model training, and deployment.
- Azure Databricks can be used across many industries for applications like customer analytics, financial modeling, healthcare analytics, industrial IoT, and cybersecurity threat detection through machine learning on structured and unstructured data.
Apache Spark is a fast and general engine for large-scale data processing. It was created at UC Berkeley and is now one of the dominant frameworks in big data. Spark can run programs over 100x faster than Hadoop MapReduce when data fits in memory, or more than 10x faster on disk. It supports Scala, Java, Python, and R. Databricks provides a Spark platform on Azure that is optimized for performance and integrates tightly with other Azure services. Key benefits of Databricks on Azure include security, ease of use, data access, high performance, and the ability to solve complex analytics problems.
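As a concrete illustration, a minimal PySpark sketch; the CSV file and column names are placeholders. Caching is what drives the in-memory speedup claim: iterative workloads reuse the cached dataset instead of re-reading it from disk on every pass, as MapReduce does.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-quickstart").getOrCreate()

# cache() keeps the DataFrame in cluster memory for reuse across actions.
df = spark.read.csv("events.csv", header=True, inferSchema=True).cache()

# A simple aggregation expressed through the DataFrame API.
(
    df.groupBy("country")
    .agg(F.count("*").alias("events"), F.avg("duration").alias("avg_duration"))
    .orderBy(F.desc("events"))
    .show(10)
)
```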
Data Mesh is an innovative concept addressing many data challenges from an architectural, cultural, and organizational perspective. But is the world ready to implement Data Mesh? In this session, we will review the importance of core Data Mesh principles, what they can offer, and when it is a good idea to try a Data Mesh architecture. We will discuss common challenges with implementation of Data Mesh systems and focus on the role of open-source projects for it. Projects like Apache Spark can play a key part in standardized infrastructure platform implementation of Data Mesh. We will examine the landscape of useful data engineering open-source projects to utilize in several areas of a Data Mesh system in practice, along with an architectural example. We will touch on what work (culture, tools, mindset) needs to be done to ensure Data Mesh is more accessible for engineers in the industry. The audience will leave with a good understanding of the benefits of Data Mesh architecture, common challenges, and the role of Apache Spark and other open-source projects for its implementation in real systems. This session is targeted for architects, decision-makers, data-engineers, and system designers.
Organizations that have vast amounts of data in legacy applications often experience difficulties delivering that data to business unit end-users. Register to learn how Eliza Corporation and Scholastic overcame this challenge by leveraging a Data Lake solution from NorthBay on AWS to optimize data analytics and provide greater visibility. AWS and NorthBay will give you an in-depth overview of how you can use a Data Lake in conjunction with your existing on-premises or cloud-based Data Warehouse. NorthBay helps organizations scale their ETL and data warehousing workloads using Amazon EMR and Amazon Redshift.

Join us to learn:
• Best practices for using a Data Lake in conjunction with your existing data warehouse
• The key aspects of introducing agile and scrum methodologies into an enterprise
• The most impactful cost-savings levers that are addressed via a cloud data warehouse migration

Who should attend: Heads of Analytics, Heads of BI, Analytics Managers, BI Teams, Senior Analysts
The document discusses the Common Data Model (CDM) and how to use it. It describes CDM as an open-source definition of standard business entities that provides a common data model that can be shared across applications. It outlines how CDM allows building applications faster by composing analytics, user experiences, and automation using integrated Microsoft services. It also discusses moving data into CDM using the Data Integrator and building applications with CDM using PowerApps, the CDS SDK, Microsoft Flow, and Power BI.
Speaker: Frederik Vandeputte
Download SQL Server 2012: http://www.microsoft.com/sqlserver/en/us/get-sql-server/try-it.aspx
This document discusses feature stores and their role in modern machine learning infrastructure. It begins with an introduction and agenda. It then covers challenges with modern data platforms and emerging architectural shifts towards things like data meshes and feature stores. The remainder discusses what a feature store is, reference architectures, and recommendations for adopting feature stores including leveraging existing AWS services for storage, catalog, query, and more.
Modern data warehouses need to be modernized to handle big data, integrate multiple data silos, reduce costs, and reduce time to market. A modern data warehouse blueprint includes a data lake to land and ingest structured, unstructured, external, social, machine, and streaming data alongside a traditional data warehouse. Key challenges for modernization include making data discoverable and usable for business users, rethinking ETL to allow for data blending, and enabling self-service BI over Hadoop. Common tactics for modernization include using a data lake as a landing zone, offloading infrequently accessed data to Hadoop, and exploring data in Hadoop to discover new insights.
So you got a handle on what Big Data is and how you can use it to find business value in your data. Now you need an understanding of the Microsoft products that can be used to create a Big Data solution. Microsoft has many pieces of the puzzle, and in this presentation I will show how they fit together. How does Microsoft enhance and add value to Big Data? From collecting data, transforming it, and storing it, to visualizing it, I will show you Microsoft's solutions for every step of the way.
Serverless SQL is an analytics platform that allows users to analyze data stored in object storage without having to manage infrastructure. Key features include seamless elasticity, pay-per-query consumption, and the ability to analyze data directly in object storage without having to move it. The platform includes serverless storage, data ingest, data transformation, analytics, and automation capabilities. It aims to create a sharing economy for analytics by giving various users, such as developers, data engineers, and analysts, flexible access to data and analytics.
Self-service BI empowers users to reach analytic outputs through data visualizations and reporting tools. Solution Architect and Cloud Solution Specialist, James McAuliffe, will be taking you through a journey of Azure's Modern Data Estate.
This document provides an overview of several Microsoft Azure cloud data and analytics services:
- Azure Data Factory is a data integration service that can move and transform data between cloud and on-premises data stores as part of scheduled or event-driven workflows.
- Azure SQL Data Warehouse is a cloud data warehouse that provides elastic scaling for large BI and analytics workloads. It can scale compute resources on demand.
- Azure Machine Learning enables building, training, and deploying machine learning models and creating APIs for predictive analytics.
- Power BI provides interactive reports, visualizations, and dashboards that can combine multiple datasets and be embedded in applications.
Bring your data to life with Power BI visualizations and insights! With the changing landscape of Power BI features, it is essential to get hold of the configuration and deployment practices within your data platform that keep you on par with compliance and security requirements. In this session we will move from the basics into advanced tricks across this landscape:
- How do you deploy Power BI?
- How do you implement configuration parameters and package BI features as part of an Office 365 rollout in your organisation?
- What are the newest features and enhancements in the Power BI landscape?
- How do you manage on-premises vs. cloud connectivity?
- How can you help and support the Power BI community?

Cloud computing has made it possible to get data to the end-user within a few clicks. We will review how to manage and connect on-premises data to cloud capabilities that take full advantage of data catalogue features while keeping data secure per Information Governance standards. Beyond the nuts and bolts, performance is another aspect every admin has to keep up with, so we will look into a few settings that maximize performance and optimize access to data. You will gain understanding of and insight into the number of tools available for your Business Intelligence needs, with a showcase of demos on where to begin and how to proceed in the BI world.

- D BI A Consulting, consulting@dbia.uk
This document summarizes a presentation about Microsoft Cloud BI capabilities using Windows Azure. The speaker, Andy Tegethoff, is a Microsoft BI architect with over 12 years of experience building BI solutions. The presentation covers key topics like cloud computing models, Cloud BI, and how Microsoft's Azure platform can be used to implement BI solutions in the public cloud or in hybrid cloud/private cloud environments. It provides examples of using Azure SQL Database, SQL Reporting, and HDInsight for big data, as well as running full SQL Server BI implementations on Azure virtual machines. Power BI, a new self-service BI tool from Microsoft, is also summarized. The document concludes by introducing Perficient, the company hosting the presentation.
By managing Data in Motion, Data at Rest, and Data in Use differently, modern Information Management Solutions are enabling a whole range of architecture and design patterns that allow enterprises to fully harness the value in data flowing through their systems. In this session we explored some of the patterns (e.g. operational data lakes, CQRS, microservices and containerisation) that enable CIOs, CDOs and senior architects to tame the data challenge, and start to use data as a cross-enterprise asset.
The document discusses IBM's cloud data services and analytics offerings. It introduces IBM Cloudant for NoSQL database services, IBM dashDB for a cloud data warehouse with built-in analytics, and how they can be used together. Use cases are provided showing how a payment processor leveraged Cloudant's geospatial capabilities, an investment firm used Cloudant and dashDB to enable real-time access to analytics, and a food distributor analyzed sales data from different business units stored in dashDB.
IBM's Big Data platform provides tools for managing and analyzing large volumes of structured, unstructured, and streaming data. It includes Hadoop for storage and processing, InfoSphere Streams for real-time streaming analytics, InfoSphere BigInsights for analytics on data at rest, and PureData System for Analytics (formerly Netezza) for high performance data warehousing. The platform enables businesses to gain insights from all available data to capitalize on information resources and make data-driven decisions.
Data lakes are providing immense value to organizations embracing data science. In this webinar, William will discuss the value of having broad, detailed, and seemingly obscure data available in cloud storage for purposes of expanding Data Science in the organization.
Today, data lakes are widely used and have become extremely affordable as data volumes have grown. However, they are only meant for storage and by themselves provide no direct value. With up to 80% of data stored in the data lake today, how do you unlock its value? The value lies in the compute engine that runs on top of the data lake. Join us for this webinar where Ahana co-founder and Chief Product Officer Dipti Borkar will discuss how to unlock the value of your data lake with the emerging Open Data Lake analytics architecture. Dipti will cover:
- Open Data Lake analytics: what it is and what use cases it supports
- Why companies are moving to an open data lake analytics approach
- Why the open source data lake query engine Presto is critical to this approach (see the sketch after this list)
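To illustrate the query-engine point, a minimal sketch using the presto-python-client package (prestodb); the coordinator host, catalog, schema, and table are placeholders, not from the webinar.

```python
import prestodb

# Connect to a Presto coordinator; the Hive catalog exposes tables defined
# over open file formats (e.g., Parquet) in object storage.
conn = prestodb.dbapi.connect(
    host="presto-coordinator.example.com",
    port=8080,
    user="analyst",
    catalog="hive",
    schema="analytics",
)
cur = conn.cursor()

# Standard SQL runs directly against data lake files, no loading step needed.
cur.execute("""
    SELECT customer_id, SUM(amount) AS total_spend
    FROM orders
    GROUP BY customer_id
    ORDER BY total_spend DESC
    LIMIT 10
""")
for row in cur.fetchall():
    print(row)
```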