Big Data Analytics from Azure
Data Platform to Power BI
Azure Batch, Azure Data Lake, Azure HDInsight, ML, Power BI
March 22, 2017
Roy Kim
@RoyKimYYZ
roykimtoronto@gmail.com
Agenda
 Overview of Big Data + Azure + Data Insights
 Job Postings demo solution architecture &
implementation
 Mobile Demo with Power BI
 Q&A
Author: Roy Kim
By: Roy Kim
Bio
 Roy Kim
 14+ Years of Microsoft Technology Solutions
 .NET, SharePoint, BI, Office 365, Azure Solutions
 IT Consultant
 University of Toronto – Computer Science Degree
Data to Insight
Big Data
Data Platform
Technologies
Solution Data Insights
Job Postings Demo Solution
References:
https://softwarestrategiesblog.com/2015/09/05/10-ways-big-data-is-revolutionizing-supply-chain-management
Job Postings Azure Data Platform
Data Lake, HDInsight,
SQL, Power BI
Job trends, analysis
Big Data Spectrum
References:
https://softwarestrategiesblog.com/2015/09/05/10-ways-big-data-is-revolutionizing-supply-chain-management
Veracity
Azure Cloud Platform
References:
https://blogs.technet.microsoft.com/cansql/2015/06/03/microsoft-data-platform-overview/
Many services, growing and maturing
Azure Data Platform
References:
https://blogs.technet.microsoft.com/cansql/2015/06/03/microsoft-data-platform-overview/
Two Illustrations:
Analytics Platform Gartner Magic Quadrant
Data Analytics
Job Postings Data Set
Volume
• Many national job sites
• New job postings daily
• Metadata and full text
Velocity
• New job postings created every minute
Variety
• Semi-structured
  • Job Title
  • Location
  • Company
• Unstructured
  • Job Description
Veracity
• Incomplete/imprecise
  • Salary, per hour
  • FT, PT, Temp, Contract, Seasonal
  • Main profession
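The characteristics above suggest a record whose semi-structured fields are usually present, while salary and employment type may be missing or imprecise. A minimal Python sketch of such a record follows. The class name `JobPosting`, the field names, and the helper `is_complete` are hypothetical and are not taken from the demo.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobPosting:
    # Semi-structured metadata (Variety)
    title: str
    location: str
    company: str
    # Unstructured full text (Variety)
    description: str
    # Fields that are often incomplete or imprecise (Veracity)
    salary: Optional[float] = None
    salary_period: Optional[str] = None    # e.g. "hour" or "year"
    employment_type: Optional[str] = None  # FT, PT, Temp, Contract, Seasonal


def is_complete(posting: JobPosting) -> bool:
    """Return True only when none of the veracity-sensitive fields is missing."""
    return None not in (posting.salary, posting.salary_period, posting.employment_type)
```

Keeping the uncertain fields optional lets downstream queries separate complete postings from incomplete ones instead of silently treating a missing value as zero.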
Power BI – Job Postings Demo Reports
Job Postings Big Data Solution Architecture
[Architecture diagram; labels recovered from the slide]
Tiers: Storage Tier, Services Tier, Presentation Tier
Sources: Internet Data Sets
Collection: Azure Batch (.NET Console App), Application Insights
Storage: Storage Account Blob Store (HDFS, WebHDFS), Azure Data Lake Store, Azure SQL Database, SQL Data Warehouse
Services: Azure Data Lake Analytics (U-SQL, Query), Azure HDInsight (Hive, Pig, Sqoop), Azure Data Factory (Pipeline), Machine Learning (ML Studio), Analysis Services Tabular, Azure Active Directory, Visual Studio Online
Presentation: Visualization/Reporting Tools on Desktop, Browser, and Mobile (REST/HTML/..)
Audience: Business Users, Report Builders, Data Analysts
Job Postings from Internet Job Boards
 Web sites that offer APIs
 Use any server-side programming language to retrieve data, such as .NET, Java, Node.js, etc.
 If no APIs, consider HTML web page scraping
REST API
HTTP endpoints typically return JSON or XML data formats
HTML Web Page Scraping
HTML Agility Pack assists in parsing the Document Object Model (DOM) for data points.
https://www.nuget.org/packages/HtmlAgilityPack
Its HTML parser supports XPath to traverse the DOM.
E.g. doc.DocumentNode.SelectSingleNode("//div[@id='Total Sales']")
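The same XPath idea can be sketched in Python with the standard library's `xml.etree.ElementTree`, used here only as a stand-in for HTML Agility Pack. The sample markup, the element ids, and the helper `select_text` are hypothetical. ElementTree requires well-formed markup, so real-world HTML would normally need a more tolerant parser.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical, well-formed page fragment standing in for a job posting page.
SAMPLE_PAGE = """
<html><body>
  <div id="job-title">Data Engineer</div>
  <div id="location">Toronto, ON</div>
</body></html>
"""


def select_text(markup: str, element_id: str) -> Optional[str]:
    """Return the text of the first <div> with the given id, or None if absent."""
    root = ET.fromstring(markup)
    # ElementTree supports a limited XPath subset, including [@attr='value'].
    node = root.find(f".//div[@id='{element_id}']")
    if node is None or node.text is None:
        return None
    return node.text.strip()
```

Calling `select_text(SAMPLE_PAGE, "location")` looks up one data point by id, which mirrors the `SelectSingleNode` call on the slide.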
Job Postings Data Collector .NET Console Application
.NET console application to read data from the internet and store into Azure Storage
accounts
 Concurrent requests to job postings public API and HTML pages
 Multi-threaded to increase speed and throughput
 Parse HTML pages and JSON
 Store JSON files directly into Azure Data Lake Store with ADLS .NET SDK
 Leverages Azure Application Insights for logging trace and exception error
messages.
 To store files in Azure Data Lake Store, the .NET application authenticates as an
Azure AD service principal that has the appropriate access control.
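The concurrent collection pattern described above can be sketched as follows. This is a Python illustration, not the demo's .NET code. The function `fetch_posting` is a hypothetical stand-in for an HTTP request to a job board API and performs no network access.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def fetch_posting(posting_id: int) -> dict:
    """Hypothetical stand-in for an HTTP call that returns one job posting."""
    return {"id": posting_id, "title": f"Job {posting_id}"}


def collect_postings(posting_ids: Iterable[int], max_workers: int = 8) -> List[str]:
    """Fetch postings on a thread pool and return each one as a JSON document."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even though calls run concurrently.
        postings = list(pool.map(fetch_posting, posting_ids))
    return [json.dumps(posting) for posting in postings]
```

Threads suit this workload because the time is spent waiting on network responses rather than on computation.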
Job Postings Data Collector App Architecture
Azure Application Insights
Application Insights Core API. This package provides core functionality for transmission of all
Application Insights Telemetry Types and is a dependent package for all other Application Insights
packages.
Azure Batch
A managed Azure service for executing command-line applications.
For batch processing (batch computing): running a large volume of
similar tasks to get a desired result.
Commonly used by organizations that regularly process, transform, and
analyze large volumes of data.
Put simply, a set of Azure Virtual Machines runs a console application
to process data, on a recurring schedule and in parallel.
References:
https://github.com/Microsoft/azure-docs/blob/master/articles/batch/batch-technical-overview.md
Azure Batch – Demo Implementation
Azure Batch runs the console application on a daily schedule against one node (Virtual
Machine, 2 cores).
To run the console application in parallel across compute nodes, the demo used the sample
ParallelTasks .NET solution, which uses the Azure Batch client SDK.
https://github.com/Azure/azure-batch-samples/tree/master/CSharp/ArticleProjects/ParallelTasks
Azure Batch is an architecture option to support data collection in terms of velocity and
volume.
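Running in parallel across compute nodes implies splitting the input into independent units of work, one list per node. A minimal Python sketch of that split, assuming a simple round-robin assignment; the function name `partition` is hypothetical and the Azure Batch SDK is not used here:

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def partition(items: Sequence[T], node_count: int) -> List[List[T]]:
    """Distribute items round-robin into one work list per compute node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    buckets: List[List[T]] = [[] for _ in range(node_count)]
    for index, item in enumerate(items):
        buckets[index % node_count].append(item)
    return buckets
```

Each resulting list could then be passed to one task as command-line arguments, so that no two nodes collect the same job board page.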
Azure Data Lake
Intended for storing data in its raw format for future analysis, processing, or
data modelling.
For developers, data scientists, and analysts to store data of any size, shape,
and speed.
To do all types of processing and analytics across different platforms and
languages.
Extract and load, with minimal transformations.
To manage data characterized by variety, velocity, and volume.
Two Components
1. Azure Data Lake Store
2. Azure Data Lake Analytics
References:
https://azure.microsoft.com/en-us/solutions/data-lake/
Azure Data Lake Store
 Azure Data Lake Store is a hyper-scale repository for big data analytic
workloads. Azure Data Lake enables you to capture data of any size, type,
and ingestion speed in one single place for operational and exploratory
analytics.
 The Azure Data Lake store is an Apache Hadoop file system compatible with
Hadoop Distributed File System (HDFS)
 Can be accessed from Hadoop (available with HDInsight cluster) using the
WebHDFS-compatible REST APIs
References:
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
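As an illustration of the WebHDFS-compatible REST access mentioned above, the following Python sketch builds a request URL and authorization header without sending anything. It assumes the `<account>.azuredatalakestore.net/webhdfs/v1/` endpoint form of Azure Data Lake Store and a bearer token obtained separately from Azure AD. The account name, path, and helper names are hypothetical.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(account: str, path: str, operation: str) -> str:
    """Build a WebHDFS-style URL for a file or folder in a Data Lake Store account."""
    encoded_path = quote(path.lstrip("/"))  # keeps "/" separators, escapes the rest
    query = urlencode({"op": operation})
    return f"https://{account}.azuredatalakestore.net/webhdfs/v1/{encoded_path}?{query}"


def auth_header(access_token: str) -> dict:
    """Return the HTTP header carrying an Azure AD bearer token."""
    return {"Authorization": f"Bearer {access_token}"}
```

For example, `webhdfs_url("myadls", "/jobpostings/2017/03", "LISTSTATUS")` yields the URL for listing a folder, where `LISTSTATUS` is a standard WebHDFS operation.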
Azure Data Lake Store
Use Cases
 Store social media posts, log files, sensor data
 Store corporate data such as relational databases (as flat files)
References:
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
Azure Data Lake Analytics
 Azure Data Lake Analytics is built to make big data analytics easy.
 Focus on writing, running, and managing jobs rather than on deploying, configuring,
and tuning hardware or operating distributed infrastructure.
 Write queries to transform your data and extract valuable insights. The analytics
service can handle jobs of any scale instantly by setting the dial for how much
power you need.
 U-SQL: a big data query language that combines SQL with C#
 "Schema on read"
 Pay for a job only while it is running, which makes it cost-effective.
 The Data Collector app stores .json files in their respective folders
 U-SQL script logic:
 Reads 1000s of JSON files in a given folder
 Outputs to one TSV (tab-delimited) file
 Creates tables to schematize the TSV files
 Queries the tables to analyze or transform the data into a new output file
References:
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
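The first two steps of the script logic above (many JSON documents in, one tab-delimited file out) can be illustrated in plain Python. This is not U-SQL and does not run on Azure Data Lake Analytics. The column list `FIELDS` and the function name are hypothetical, and the documents are passed in memory rather than read from a folder.

```python
import csv
import io
import json
from typing import Iterable

# Hypothetical column list; the demo's actual schema is not shown on the slide.
FIELDS = ["title", "location", "company"]


def postings_to_tsv(json_documents: Iterable[str]) -> str:
    """Flatten JSON job postings into one tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(FIELDS)
    for document in json_documents:
        posting = json.loads(document)
        # Missing fields become empty cells so every row has the same shape.
        writer.writerow([posting.get(field, "") for field in FIELDS])
    return buffer.getvalue()
```

Applying the column list only when the documents are read is the "schema on read" idea: the raw JSON stays untouched in the lake, and structure is imposed at query time.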
Azure Data Lake Analytics – Demo Implementation
 U-SQL script: processes JSON files into a tab-delimited file
Job run summary: # of JSON files, single output file, 50 compute nodes, 3.4 mins duration
Azure HDInsight
Hadoop refers to an ecosystem of open-source software that is a framework for
distributed processing, storing, and analysis of big data sets on clusters of
commodity computer hardware.
Azure HDInsight makes the Hadoop components from the Hortonworks Data
Platform (HDP) distribution available in Azure, deploys managed clusters with high
reliability and availability, and provides enterprise-grade security and governance
with Active Directory.
HDInsight offers these cluster types: Hadoop, HBase, Spark, Kafka, Interactive Hive,
Storm, customized, etc.
Supports integration with BI tools such as Power BI, Excel, SQL Server Analysis
Services, and SQL Server Reporting Services.
Azure HDInsight – Demo Implementation
Hadoop cluster type
Data sources
 Windows Azure Storage account
 Data Lake Store access: Data Lake Store account
Hive tables
 JobPostings (internal) table: schema definition, with data loaded from the Azure Data
Lake Store .TSV file into the Hadoop cluster's WASB storage
 JobPostings external table: data referenced in the Azure Data Lake Store .TSV file,
which is external to the HDInsight storage account
Azure HDInsight – Demo Implementation
Considerations
 To manage compute costs, script the provisioning and de-provisioning of the
cluster.
 While a cluster is running, execute scripts and query the data into self-service BI
tools and other data warehouses.
 Compared with HDInsight, ADL Analytics may be more cost-effective because it is
pay-per-use at a more granular level: number of nodes and execution time.
E.g. running against 100 nodes may cost a few dollars per minute in ADL
Analytics, whereas 13 nodes of a small VM size may cost a few dollars an hour
in HDInsight.
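The cost comparison above is simple arithmetic: a per-minute rate times job duration versus a per-hour rate times the time the cluster stays provisioned. A Python sketch under stated assumptions follows. The rates in the usage note are hypothetical placeholders for "a few dollars" and are not actual Azure prices.

```python
def adla_job_cost(rate_per_minute: float, job_minutes: float) -> float:
    """Cost of one pay-per-use job: billed only while the job runs."""
    return rate_per_minute * job_minutes


def hdinsight_cluster_cost(rate_per_hour: float, provisioned_hours: float) -> float:
    """Cost of a cluster: billed for every hour it stays provisioned."""
    return rate_per_hour * provisioned_hours


def break_even_minutes(rate_per_minute: float, rate_per_hour: float,
                       provisioned_hours: float) -> float:
    """Minutes of pay-per-use job time that cost the same as the provisioned cluster."""
    return hdinsight_cluster_cost(rate_per_hour, provisioned_hours) / rate_per_minute
```

With an assumed 3 dollars per minute, a 3.4-minute job is evaluated as `adla_job_cost(3.0, 3.4)`. With an assumed 3 dollars per hour, a cluster left running for a day is `hdinsight_cluster_cost(3.0, 24)`. The comparison reverses once total job time exceeds the break-even point, which is why scripting cluster de-provisioning matters.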
Azure SQL Database
A relational database-as-a-service in the cloud built on the Microsoft SQL Server
engine
No need to manage the infrastructure.
Scale up or down based on Database Transaction Units (DTUs).
1TB storage maximum
Can be used as a simpler data warehouse.
Azure SQL Database – Demo Implementation
Developed a simple data warehouse model
Job Postings data loaded from ADLS
Star schema
Added a date dimension table
Table of the number of jobs for each province by a date hierarchy
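A date dimension table holds one row per calendar day with the attributes that form the date hierarchy. A minimal Python sketch of generating such rows, assuming a year, quarter, month, day hierarchy and a `YYYYMMDD` integer key; the column names are hypothetical and not taken from the demo schema:

```python
from datetime import date, timedelta
from typing import Dict, List


def build_date_dimension(start: date, end: date) -> List[Dict[str, object]]:
    """Return one row per day from start to end (inclusive) with hierarchy columns."""
    rows: List[Dict[str, object]] = []
    current = start
    while current <= end:
        rows.append({
            "date_key": int(current.strftime("%Y%m%d")),  # surrogate key, e.g. 20170322
            "date": current,
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day": current.day,
        })
        current += timedelta(days=1)
    return rows
```

Joining the job counts to these rows on `date_key` allows the number of jobs per province to be rolled up by day, month, quarter, or year.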
Azure Data Factory
 Cloud-based data integration service that orchestrates and automates
the movement and transformation of data.
 Create data pipelines that move and transform data, and then run the pipelines on a specified
schedule (hourly, daily, weekly, etc.)
Azure Data Factory
Category    Data store                     Supported as a source   Supported as a sink
Azure       Azure Blob storage             ✓                       ✓
            Azure Data Lake Store          ✓                       ✓
            Azure SQL Database             ✓                       ✓
            Azure SQL Data Warehouse       ✓                       ✓
            Azure Table storage            ✓                       ✓
            Azure DocumentDB               ✓                       ✓
            Azure Search Index                                     ✓
Databases   SQL Server*                    ✓                       ✓
            Oracle*                        ✓                       ✓
            MySQL*                         ✓
            DB2*                           ✓
            Teradata*                      ✓
            PostgreSQL*                    ✓
            Sybase*                        ✓
            Cassandra*                     ✓
            MongoDB*                       ✓
            Amazon Redshift                ✓
File        File System*                   ✓                       ✓
            HDFS*                          ✓
            Amazon S3                      ✓
            FTP                            ✓
Others      Salesforce                     ✓
            Generic ODBC*                  ✓
            Generic OData                  ✓
            Web Table (table from HTML)    ✓
            GE Historian*                  ✓
Azure Machine Learning – Demo Implementation
Predicting salary for a given set of parameters, such as job title and location
Power BI Mobile
Power BI App Service
The main features of your Power BI service UI:
1. navigation bar
2. dashboard with tiles
3. Q&A question box
4. help and feedback buttons
5. dashboard title
6. Office 365 app launcher
7. Power BI home buttons
8. additional dashboard actions
Key Mobile Scenarios
• Frequently updated and accessed reports
• Minutes, hours, daily, weekly
• Fast and easy access of reports and dashboards
• IoT and sensor data
• Retail and customer analytics
• Team and organizational performance and productivity, e.g. ticket management
• Collaborative analysis and decision making
• Not always in front of a large screen device
Mobile App iOS Key Features & Demo
• Navigation
• Dashboards and Reports
• Responsive design
• Visualization interaction
• Sharing
• Annotations
• Q&A
• Alerts
• Favourites
Architecture
[Diagram] Governance, Security, Capacity, Business Processes, Data, Performance, Application Design, Operations

Security Architecture
[Diagram] Tiers: Storage Tier, Services Tier.
Components: Internet data sources (REST/HTML/..), Azure Batch (.NET Console App), Storage Account (Blob Store, HDFS), Azure Data Lake Store, Azure Data Lake Analytics (U-SQL), Azure HDInsight (Hive, Pig, Sqoop), Azure Data Factory pipeline, Azure SQL database, SQL Data Warehouse, Analysis Services (preview, Tabular), Azure Active Directory, Visual Studio Online, Application Insights, visualization/reporting tools (Desktop, BI).
Credentials shown: API key, account key, AAD app service principal, AAD user, SQL account.
Users: developers, IT ops, end users.
Data Processing & Formats
[Diagram] Rows: Data Format, Services Tier.
Components: Internet data sources (REST/HTML/..), Azure Batch (.NET Console App), Azure Data Lake Analytics (U-SQL), Azure HDInsight (Hive, Pig, Sqoop), Azure Data Factory pipeline, Azure SQL database, SQL Data Warehouse, Analysis Services (preview, Tabular).
Data formats shown: HTML, JSON, TSV, Hive external table, Hive internal table, relational DB table.
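The diagram lists JSON and TSV among the data formats, with a Hive external table downstream. One plausible step between them is flattening JSON records into tab-separated rows that a tab-delimited Hive table can read. The Python sketch below shows that conversion under stated assumptions: the field names, sample records, and `json_to_tsv` function are hypothetical, and the demo itself uses a .NET console app and U-SQL rather than this code.

```python
import csv
import io
import json

# Hypothetical JSON job postings; the field names are assumptions for this example.
RAW = '[{"title": "Data Analyst", "location": "Toronto", "salary": 70000},' \
      ' {"title": "Developer", "location": "Ottawa", "salary": null}]'

FIELDS = ["title", "location", "salary"]


def json_to_tsv(raw_json: str) -> str:
    """Flatten a JSON array of objects into tab-separated rows (no header row)."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for record in json.loads(raw_json):
        # Missing or null values become empty fields.
        writer.writerow(["" if record.get(f) is None else record[f] for f in FIELDS])
    return out.getvalue()


if __name__ == "__main__":
    print(json_to_tsv(RAW), end="")
```

Fixing the column order in `FIELDS` keeps every row aligned with the table definition even when a record omits a field.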
Closing Remarks
• Cloud services such as the Azure Data Platform provide new capabilities in data analytics in terms of scale, cost, and agility.
• Azure Data Lake is a productive option for organizations new to Hadoop, but continue to plan for other Hadoop offerings that are a better fit for other scenarios.
• Many Azure services fit together to make the appropriate solution: SaaS, PaaS, IaaS, data, app, operational, etc.
• As part of planning and design, be aware of the Microsoft roadmap and industry trends.
Q&A
• @RoyKimYYZ
• roykimtoronto@gmail.com
• roykim.ca

Appendix – Azure Data Lake Analytics
50 assigned DLAU for job
Appendix – BI Tooling

More Related Content

What's hot

Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)
James Serra
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)
James Serra
 
Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft Azure
Mark Kromer
 
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
James Serra
 
Data Vault Vs Data Lake
Data Vault Vs Data LakeData Vault Vs Data Lake
Data Vault Vs Data Lake
Calum Miller
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
sambiswal
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
James Serra
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
James Serra
 
Finding business value in Big Data
Finding business value in Big DataFinding business value in Big Data
Finding business value in Big Data
James Serra
 
Azure data stack_2019_08
Azure data stack_2019_08Azure data stack_2019_08
Azure data stack_2019_08
Alexandre BERGERE
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategy
James Serra
 
Taming the shrew Power BI
Taming the shrew Power BITaming the shrew Power BI
Taming the shrew Power BI
Kellyn Pot'Vin-Gorman
 
Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2
Carole Gunst
 
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWSAWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
Amazon Web Services
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
James Serra
 
Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage
CCG
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive
 
Chug building a data lake in azure with spark and databricks
Chug   building a data lake in azure with spark and databricksChug   building a data lake in azure with spark and databricks
Chug building a data lake in azure with spark and databricks
Brandon Berlinrut
 
Azure Analysis Services (Azure Bootcamp 2018)
Azure Analysis Services (Azure Bootcamp 2018)Azure Analysis Services (Azure Bootcamp 2018)
Azure Analysis Services (Azure Bootcamp 2018)
Turner Kunkel
 
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Michael Rys
 

What's hot (20)

Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)Introduction to Microsoft’s Hadoop solution (HDInsight)
Introduction to Microsoft’s Hadoop solution (HDInsight)
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)
 
Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft Azure
 
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
AI for an intelligent cloud and intelligent edge: Discover, deploy, and manag...
 
Data Vault Vs Data Lake
Data Vault Vs Data LakeData Vault Vs Data Lake
Data Vault Vs Data Lake
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Finding business value in Big Data
Finding business value in Big DataFinding business value in Big Data
Finding business value in Big Data
 
Azure data stack_2019_08
Azure data stack_2019_08Azure data stack_2019_08
Azure data stack_2019_08
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategy
 
Taming the shrew Power BI
Taming the shrew Power BITaming the shrew Power BI
Taming the shrew Power BI
 
Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2
 
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWSAWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
 
Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
 
Chug building a data lake in azure with spark and databricks
Chug   building a data lake in azure with spark and databricksChug   building a data lake in azure with spark and databricks
Chug building a data lake in azure with spark and databricks
 
Azure Analysis Services (Azure Bootcamp 2018)
Azure Analysis Services (Azure Bootcamp 2018)Azure Analysis Services (Azure Bootcamp 2018)
Azure Analysis Services (Azure Bootcamp 2018)
 
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
 

Similar to Big Data Analytics from Azure Cloud to Power BI Mobile

Serverless Data Platform
Serverless Data PlatformServerless Data Platform
Serverless Data Platform
Shu-Jeng Hsieh
 
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Lace Lofranco
 
Azure Data.pptx
Azure Data.pptxAzure Data.pptx
Azure Data.pptx
FedoRam1
 
Modern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced AnalyticsModern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced Analytics
Collective Intelligence Inc.
 
Building Cloud-Native Applications with Microsoft Windows Azure
Building Cloud-Native Applications with Microsoft Windows AzureBuilding Cloud-Native Applications with Microsoft Windows Azure
Building Cloud-Native Applications with Microsoft Windows Azure
Bill Wilder
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Precisely
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Amazon Web Services
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Amazon Web Services
 
Scaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with DatabricksScaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with Databricks
Databricks
 
ABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS GlueABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS Glue
Amazon Web Services
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data Lake
DATAVERSITY
 
Power BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle CloudPower BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle Cloud
Kellyn Pot'Vin-Gorman
 
ArcReady - Architecting For The Cloud
ArcReady - Architecting For The CloudArcReady - Architecting For The Cloud
ArcReady - Architecting For The Cloud
Microsoft ArcReady
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
James Serra
 
Sky High With Azure
Sky High With AzureSky High With Azure
Sky High With Azure
Clint Edmonson
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
RTTS
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
Alex Ivy
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
Amazon Web Services
 
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
Amazon Web Services
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
Slava Kokaev
 

Similar to Big Data Analytics from Azure Cloud to Power BI Mobile (20)

Serverless Data Platform
Serverless Data PlatformServerless Data Platform
Serverless Data Platform
 
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
 
Azure Data.pptx
Azure Data.pptxAzure Data.pptx
Azure Data.pptx
 
Modern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced AnalyticsModern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced Analytics
 
Building Cloud-Native Applications with Microsoft Windows Azure
Building Cloud-Native Applications with Microsoft Windows AzureBuilding Cloud-Native Applications with Microsoft Windows Azure
Building Cloud-Native Applications with Microsoft Windows Azure
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
 
Scaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with DatabricksScaling and Modernizing Data Platform with Databricks
Scaling and Modernizing Data Platform with Databricks
 
ABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS GlueABD315_Serverless ETL with AWS Glue
ABD315_Serverless ETL with AWS Glue
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data Lake
 
Power BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle CloudPower BI with Essbase in the Oracle Cloud
Power BI with Essbase in the Oracle Cloud
 
ArcReady - Architecting For The Cloud
ArcReady - Architecting For The CloudArcReady - Architecting For The Cloud
ArcReady - Architecting For The Cloud
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
 
Sky High With Azure
Sky High With AzureSky High With Azure
Sky High With Azure
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
 
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
Semplificare l'analisi dei dati con architetture "Serverless": architetture e...
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
 

More from Roy Kim

Microsoft Reactor Toronto 5/5/2020 | Azure Kubernetes In Action - Running and...
Microsoft Reactor Toronto 5/5/2020 | Azure Kubernetes In Action - Running and...Microsoft Reactor Toronto 5/5/2020 | Azure Kubernetes In Action - Running and...
Microsoft Reactor Toronto 5/5/2020 | Azure Kubernetes In Action - Running and...
Roy Kim
 
Azure AD App Proxy Login Scenarios with an On Premises Applications - TSPUG
Azure AD App Proxy Login Scenarios with an On Premises Applications - TSPUGAzure AD App Proxy Login Scenarios with an On Premises Applications - TSPUG
Azure AD App Proxy Login Scenarios with an On Premises Applications - TSPUG
Roy Kim
 
Azure Key Vault with a PaaS Architecture and ARM Template Deployment
Azure Key Vault with a PaaS Architecture and ARM Template DeploymentAzure Key Vault with a PaaS Architecture and ARM Template Deployment
Azure Key Vault with a PaaS Architecture and ARM Template Deployment
Roy Kim
 
Azure App Gateway and Log Analytics under Penetration Tests
Azure App Gateway and Log Analytics under Penetration TestsAzure App Gateway and Log Analytics under Penetration Tests
Azure App Gateway and Log Analytics under Penetration Tests
Roy Kim
 
Applying Advanced Techniques to Azure Web Apps
Applying Advanced Techniques to Azure Web AppsApplying Advanced Techniques to Azure Web Apps
Applying Advanced Techniques to Azure Web Apps
Roy Kim
 
Design and Configure Azure App Service Web Apps
Design and Configure Azure App Service Web AppsDesign and Configure Azure App Service Web Apps
Design and Configure Azure App Service Web Apps
Roy Kim
 
SharePoint 2016 Hybrid Overview
SharePoint 2016 Hybrid OverviewSharePoint 2016 Hybrid Overview
SharePoint 2016 Hybrid Overview
Roy Kim
 
SharePoint Hosted Add-in with AngularJS and Bootstrap
SharePoint Hosted Add-in with AngularJS and BootstrapSharePoint Hosted Add-in with AngularJS and Bootstrap
SharePoint Hosted Add-in with AngularJS and Bootstrap
Roy Kim
 
Designing for SharePoint Provider Hosted Apps
Designing for SharePoint Provider Hosted AppsDesigning for SharePoint Provider Hosted Apps
Designing for SharePoint Provider Hosted Apps
Roy Kim
 
Microsoft Azure For Solutions Architects
Microsoft Azure For Solutions ArchitectsMicrosoft Azure For Solutions Architects
Microsoft Azure For Solutions Architects
Roy Kim
 
SharePoint 2013 Hosted App Presentation by Roy Kim
SharePoint 2013 Hosted App Presentation by Roy KimSharePoint 2013 Hosted App Presentation by Roy Kim
SharePoint 2013 Hosted App Presentation by Roy Kim
Roy Kim
 
Networking For Application Developers by Roy Kim
Networking For Application Developers by Roy KimNetworking For Application Developers by Roy Kim
Networking For Application Developers by Roy Kim
Roy Kim
 
SharePoint Saturday 2010 - SharePoint 2010 Content Organizer Feature
SharePoint Saturday 2010 - SharePoint 2010 Content Organizer FeatureSharePoint Saturday 2010 - SharePoint 2010 Content Organizer Feature
SharePoint Saturday 2010 - SharePoint 2010 Content Organizer Feature
Roy Kim
 




Big Data Analytics from Azure Cloud to Power BI Mobile

  • 1. Big Data Analytics from Azure Data Platform to Power BI: Azure Batch, Azure Data Lake, Azure HDInsight, ML, Power BI. March 22, 2017. Roy Kim, @RoyKimYYZ, roykimtoronto@gmail.com
  • 2. Agenda: Overview of Big Data + Azure + Data Insights; Job Postings demo solution architecture & implementation; Mobile Demo with Power BI; Q&A
  • 3. Bio: Roy Kim; 14+ Years of Microsoft Technology Solutions; .NET, SharePoint, BI, Office 365, Azure Solutions; IT Consultant; University of Toronto – Computer Science Degree
  • 4. Data to Insight: Big Data; Data Platform Technologies; Solution; Data Insights
  • 5. Job Postings Demo Solution: Job Postings; Azure Data Platform (Data Lake, HDInsight, SQL, Power BI); Job trends, analysis. References: https://softwarestrategiesblog.com/2015/09/05/10-ways-big-data-is-revolutionizing-supply-chain-management
  • 9. Analytics Platform Gartner Magic Quadrant
  • 11. Job Postings Data Set
      • Volume: many national job sites; new job postings daily; metadata and full text.
      • Velocity: new job postings created every minute.
      • Variety: semi-structured (Job Title, Location, Company); unstructured (Job Description).
      • Veracity: incomplete/imprecise (Salary, per hour; FT, PT, Temp, Contract, Seasonal; Main profession).
  • 12. Power BI – Job Postings Demo Reports
  • 13. Power BI – Job Postings Demo Reports
  • 14. Job Postings Big Data Solution Architecture [diagram]
      • Sources: Internet data sets (REST/HTML/..).
      • Services tier: Azure Batch (.NET console app); Azure Data Lake Analytics (U-SQL); Azure HDInsight (Hive, Pig, Sqoop); Azure Machine Learning (ML Studio); Azure Data Factory (pipeline); Azure Analysis Services (Tabular); Application Insights; Azure Active Directory; Visual Studio Online.
      • Storage tier: Storage Account Blob Store (WebHDFS); Storage Account Blob Store (HDFS); Azure Data Lake Store; Azure SQL Database / SQL Data Warehouse.
      • Presentation tier: visualization and reporting tools on browser, desktop and mobile.
      • Users: business users, report builders, data analysts.
  • 15. Job Postings from Internet Job Boards
      • Web sites that offer APIs: use any server-side programming language to retrieve data, such as .NET, Java, Node.js, etc.
      • If there are no APIs, consider HTML web page scraping.
      • REST API: HTTP endpoints typically return JSON or XML data formats.
      • HTML web page scraping: HTML Agility Pack assists in parsing the Document Object Model (DOM) for data points, with HTML parsing that supports XPath to traverse the DOM. https://www.nuget.org/packages/HtmlAgilityPack
      • E.g. doc.DocumentNode.SelectSingleNode("//div[@id='Total Sales']")
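The two retrieval paths on this slide (a JSON API response and an XPath lookup into a scraped page) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the deck's .NET / HTML Agility Pack code; the payload, the HTML fragment, the element ids and the function names are all hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical JSON payload, shaped like a job-board API response.
API_RESPONSE = (
    '{"jobs": [{"title": "Data Analyst", '
    '"location": "Toronto, ON", "company": "Contoso"}]}'
)

# Hypothetical, well-formed HTML fragment standing in for a scraped page.
# Real pages are often not well-formed and need a tolerant HTML parser.
HTML_PAGE = """
<html><body>
  <div id="job-title">Data Analyst</div>
  <div id="job-location">Toronto, ON</div>
</body></html>
"""


def parse_api_response(payload: str) -> list:
    """Return the list of job records from a JSON API payload."""
    return json.loads(payload)["jobs"]


def scrape_field(html: str, element_id: str) -> Optional[str]:
    """Return the text of the <div> with the given id, or None if absent."""
    root = ET.fromstring(html.strip())
    # ElementTree supports a small XPath subset, including [@attr='value'].
    node = root.find(f".//div[@id='{element_id}']")
    return node.text if node is not None else None


if __name__ == "__main__":
    print(parse_api_response(API_RESPONSE)[0]["title"])
    print(scrape_field(HTML_PAGE, "job-location"))
```

The XPath predicate has the same shape as the slide's `//div[@id='...']` example; only the parser differs.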
  • 16. Job Postings Data Collector: .NET Console Application
      • A .NET console application that reads data from the internet and stores it in Azure Storage accounts.
      • Makes concurrent requests to the job postings public API and HTML pages.
      • Multi-threaded to increase speed and throughput.
      • Parses HTML pages and JSON.
      • Stores JSON files directly into Azure Data Lake Store with the ADLS .NET SDK.
      • Uses Azure Application Insights to log trace and exception messages.
      • To store files in Azure Data Lake Store, the .NET application authenticates as an Azure AD service principal with the appropriate access control.
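The collector's shape (fetch many postings concurrently, then persist each as a JSON file) might look roughly like the sketch below. It is written in Python rather than .NET, `fetch_posting` is a hypothetical stand-in for the real HTTP calls, a local folder stands in for Azure Data Lake Store, and one file per posting is an assumption, not something the slides state.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch_posting(job_id: int) -> dict:
    """Hypothetical stand-in for an HTTP call to a job board API or page."""
    return {"id": job_id, "title": f"Job {job_id}", "location": "Toronto, ON"}


def save_posting(posting: dict, out_dir: Path) -> Path:
    """Write one posting to its own JSON file (assumed layout)."""
    path = out_dir / f"posting_{posting['id']}.json"
    path.write_text(json.dumps(posting), encoding="utf-8")
    return path


def collect(job_ids, out_dir: Path, max_workers: int = 8) -> list:
    """Fetch postings on a thread pool, then persist each result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order even though calls run concurrently.
        postings = list(pool.map(fetch_posting, job_ids))
    return [save_posting(p, out_dir) for p in postings]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        written = collect(range(5), Path(tmp))
        print(f"wrote {len(written)} files")
```

A thread pool suits this workload because the time is spent waiting on network responses rather than on CPU.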
  • 17. Job Postings Data Collector App Architecture
  • 18. Azure Application Insights: Application Insights Core API. This package provides core functionality for transmission of all Application Insights telemetry types and is a dependency of all other Application Insights packages.
  • 19. Azure Batch: a managed Azure service for executing command-line applications. It is meant for batch processing or batch computing, that is, running a large volume of similar tasks to get some desired result. Commonly used by organizations that regularly process, transform, and analyze large volumes of data. Put simply, it is a set of Azure Virtual Machines running a console application to process data, which can run on a recurring schedule and in parallel. References: https://github.com/Microsoft/azure-docs/blob/master/articles/batch/batch-technical-overview.md
  • 20. Azure Batch – Demo Implementation: Azure Batch runs the console application on a daily schedule against one node (virtual machine, 2 cores). To run the console application in parallel across compute nodes, the demo used the sample Parallel Tasks .NET solution, which uses the Azure Batch Client SDK. https://github.com/Azure/azure-batch-samples/tree/master/CSharp/ArticleProjects/ParallelTasks Azure Batch is an architecture option to support data collection in terms of velocity and volume.
  • 21. Azure Data Lake
      • Intended for storing data in its raw format for future analysis, processing or data modelling.
      • Lets developers, data scientists, and analysts store data of any size, shape, and speed, and do all types of processing and analytics across different platforms and languages.
      • Extract and load, with minimal transformations.
      • Manages data characterized by variety, velocity and volume.
      • Two components: 1. Azure Data Lake Store 2. Azure Data Lake Analytics
      • References: https://azure.microsoft.com/en-us/solutions/data-lake/
  • 22. Azure Data Lake Store
      • Azure Data Lake Store is a hyper-scale repository for big data analytic workloads. Azure Data Lake enables you to capture data of any size, type, and ingestion speed in one single place for operational and exploratory analytics.
      • The Azure Data Lake Store is an Apache Hadoop file system compatible with the Hadoop Distributed File System (HDFS).
      • It can be accessed from Hadoop (available with an HDInsight cluster) using the WebHDFS-compatible REST APIs.
      • References: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
  • 23. Azure Data Lake Store Use Cases
      • Store social media posts, log files, sensor data.
      • Store corporate data such as relational databases (as flat files).
      • References: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
  • 24. Azure Data Lake Analytics
      • Azure Data Lake Analytics is built to make big data analytics easy.
      • Focus on writing, running, and managing jobs rather than on operating distributed infrastructure: there is no hardware to deploy, configure, or tune.
      • Write queries to transform your data and extract valuable insights. The analytics service can handle jobs of any scale instantly by setting the dial for how much power you need.
      • U-SQL: a big data query language that resembles SQL combined with C#.
      • "Schema on read".
      • Pay for your job only while it is running, which makes it cost-effective.
      • The Data Collector app stores .json files in their respective folders.
      • U-SQL script logic: reads thousands of JSON files in a given folder; outputs one TSV (tab-delimited) file; creates tables to schematize the TSV files; queries against the tables to analyze the data or transform it into a new output file.
      • References: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
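The first two steps of the script logic above (read every JSON file in a folder, emit one tab-delimited file) can be illustrated locally. The following is a plain Python sketch of that transformation, not U-SQL and not the author's script; the column names and the function name are assumptions.

```python
import csv
import json
from pathlib import Path

# Hypothetical column set; the real schema is not given in the slides.
FIELDS = ["title", "location", "company"]


def json_folder_to_tsv(folder: Path, out_file: Path) -> int:
    """Read every *.json file in `folder` and write one TSV row per file.

    Returns the number of data rows written (header excluded).
    """
    rows = 0
    with out_file.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter="\t")
        writer.writerow(FIELDS)
        for path in sorted(folder.glob("*.json")):
            record = json.loads(path.read_text(encoding="utf-8"))
            # Schema is applied at read time: absent fields become empty cells.
            writer.writerow([record.get(name, "") for name in FIELDS])
            rows += 1
    return rows
```

Calling `json_folder_to_tsv(Path("postings"), Path("jobs.tsv"))` on a folder of posting files would produce a single file that a table definition can then be laid over.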
  • 25. Azure Data Lake Analytics – Demo Implementation: U-SQL script that processes JSON files into a tab-delimited file
  • 26. Azure Data Lake Analytics – Demo Implementation [job screenshot]: # of JSON files; single output file; 50 compute nodes; 3.4 min duration
  • 27. Azure HDInsight: Hadoop refers to an ecosystem of open-source software that is a framework for distributed processing, storing, and analysis of big data sets on clusters of commodity computer hardware. Azure HDInsight makes the Hadoop components from the Hortonworks Data Platform (HDP) distribution available in Azure, deploys managed clusters with high reliability and availability, and provides enterprise-grade security and governance with Active Directory. HDInsight offers cluster types including Hadoop, HBase, Spark, Kafka, Interactive Hive, Storm, and customized clusters. It supports integration with BI tools such as Power BI, Excel, SQL Server Analysis Services, and SQL Server Reporting Services.
  • 28. Azure HDInsight – Demo Implementation
      • Hadoop cluster type.
      • Data source: Windows Azure Storage Account.
      • Data Lake Store access: Data Lake Store account.
      • Hive table JobPostings (internal): table schema definition; data loaded from the Azure Data Lake Store .TSV file into the Hadoop cluster's WASB storage.
      • Hive table JobPostings (external): data referenced in the Azure Data Lake Store .TSV file, which is external to the HDInsight storage account.
  • 29. Azure HDInsight – Demo Implementation Considerations
      • To manage compute costs, script the provisioning and de-provisioning of the cluster.
      • While a cluster is running, execute scripts and query the data into self-service BI tools and other data warehouses.
      • Compared with HDInsight, Azure Data Lake Analytics may be more cost-effective because it is pay-per-use at a more granular level: number of nodes and execution time. E.g. running against 100 nodes may cost a few dollars per minute in ADL Analytics, whereas in HDInsight 13 nodes of a small VM size may cost a few dollars an hour.
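The pay-per-use comparison above comes down to nodes times time times a unit rate, with the difference being whether you pay for job minutes or for provisioned cluster hours. A small Python sketch of that arithmetic follows; the rates and the 8-hour provisioning window are placeholders chosen for illustration and are not Azure prices, and the function names are hypothetical. The 50 nodes and 3.4 minutes echo the demo job on slide 26.

```python
def per_job_cost(nodes: int, minutes: float, rate_per_node_hour: float) -> float:
    """Cost of a job billed only for the node-time it actually consumes."""
    return nodes * (minutes / 60.0) * rate_per_node_hour


def cluster_cost(nodes: int, hours_provisioned: float, rate_per_node_hour: float) -> float:
    """Cost of a cluster billed for every hour it stays provisioned."""
    return nodes * hours_provisioned * rate_per_node_hour


if __name__ == "__main__":
    # Placeholder rates, NOT real Azure pricing.
    job = per_job_cost(nodes=50, minutes=3.4, rate_per_node_hour=2.00)
    cluster = cluster_cost(nodes=13, hours_provisioned=8, rate_per_node_hour=0.25)
    print(f"per-job billing: {job:.2f}")
    print(f"provisioned cluster: {cluster:.2f}")
```

Which model is cheaper depends on how long the cluster would otherwise sit idle, which is why the slide recommends scripting its provisioning and de-provisioning.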
  • 30. Azure SQL Database: a relational database-as-a-service in the cloud, built on the Microsoft SQL Server engine. No need to manage the infrastructure. Scale up or down based on Database Transaction Units (DTUs). 1 TB storage maximum. Can be used as a simpler data warehouse.
  • 31. Azure SQL Database – Demo Implementation
      • Developed a simple data warehouse modelling Job Postings data loaded from ADLS.
      • Star schema.
      • Added a date dimension table.
      • Table of # of jobs for each province by a date hierarchy.
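A star schema with a date dimension, and a count of jobs per province rolled up along the date hierarchy, can be sketched as below. This uses Python's built-in SQLite as a stand-in for Azure SQL Database; the table names, column names and sample rows are all made up for illustration and do not come from the demo.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory star schema with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER, day INTEGER
        );
        CREATE TABLE dim_location (
            location_key INTEGER PRIMARY KEY, province TEXT
        );
        CREATE TABLE fact_job_posting (
            posting_id INTEGER PRIMARY KEY,
            date_key INTEGER REFERENCES dim_date(date_key),
            location_key INTEGER REFERENCES dim_location(location_key)
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
        [(20170301, 2017, 3, 1), (20170302, 2017, 3, 2)],
    )
    conn.executemany(
        "INSERT INTO dim_location VALUES (?, ?)",
        [(1, "Ontario"), (2, "Alberta")],
    )
    conn.executemany(
        "INSERT INTO fact_job_posting VALUES (?, ?, ?)",
        [(1, 20170301, 1), (2, 20170301, 1), (3, 20170302, 2)],
    )
    return conn


def jobs_by_province_and_month(conn: sqlite3.Connection) -> list:
    """Count postings per province at the year/month level of the date hierarchy."""
    return conn.execute(
        """
        SELECT l.province, d.year, d.month, COUNT(*) AS job_count
        FROM fact_job_posting AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        JOIN dim_location AS l ON l.location_key = f.location_key
        GROUP BY l.province, d.year, d.month
        ORDER BY l.province
        """
    ).fetchall()
```

Changing the `GROUP BY` to include `d.day`, or to drop `d.month`, moves the same query down or up the date hierarchy.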
  • 32. Azure Data Factory: a cloud-based data integration service that orchestrates and automates the movement and transformation of data. Create data pipelines that move and transform data, and then run the pipelines on a specified schedule (hourly, daily, weekly, etc.).
  • 33. Azure Data Factory: supported data stores, by category
      • Azure, supported as source and sink: Azure Blob storage, Azure Data Lake Store, Azure SQL Database, Azure SQL Data Warehouse, Azure Table storage, Azure DocumentDB. Supported as sink only: Azure Search Index.
      • Databases, supported as source and sink: SQL Server*, Oracle*. Supported as source only: MySQL*, DB2*, Teradata*, PostgreSQL*, Sybase*, Cassandra*, MongoDB*, Amazon Redshift.
      • File, supported as source and sink: File System*. Supported as source only: HDFS*, Amazon S3, FTP.
      • Others, supported as source only: Salesforce, Generic ODBC*, Generic OData, Web Table (table from HTML), GE Historian*.
  • 34. Azure Machine Learning – Demo Implementation: predicting salary for a given set of parameters such as job title and location
  • 35. Azure Machine Learning – Demo Implementation
  • 37. Power BI App Service. The main features of the Power BI service UI: 1. navigation bar; 2. dashboard with tiles; 3. Q&A question box; 4. help and feedback buttons; 5. dashboard title; 6. Office 365 app launcher; 7. Power BI home buttons; 8. additional dashboard actions
  • 38. Key Mobile Scenarios
      • Frequently updated and accessed reports: minutes, hours, daily, weekly
      • Fast and easy access to reports and dashboards
      • IoT and sensor data
      • Retail and customer analytics
      • Team and organizational performance and productivity, e.g. ticket management
      • Collaborative analysis and decision making
      • Not always in front of a large-screen device
  • 39. Mobile App iOS Key Features & Demo: Navigation; Dashboards and Reports; Responsive design; Visualization interaction; Sharing; Annotations; Q&A; Alerts; Favourites
  • 41. Security Architecture [diagram]: the solution architecture (Internet data sources, Azure Batch, Azure Data Lake Store and Analytics, Azure HDInsight, Azure SQL Database / SQL Data Warehouse, Analysis Services (preview), Azure Data Factory, Application Insights, Azure Active Directory) annotated with the credential types used between components: API key, AAD app service principal, AAD user, SQL account, account key. Roles shown: BI developers, IT ops, end users.
  • 42. Data Processing & Formats [diagram]: the services tier (Internet data sources via REST/HTML, Azure Batch .NET console app, Azure Data Lake Analytics U-SQL, Azure HDInsight Hive/Pig/Sqoop, Azure SQL Database / SQL Data Warehouse, Analysis Services (preview) Tabular, Azure Data Factory pipeline) annotated with the data format at each stage: JSON, HTML, JSON, TSV, Hive external table, Hive internal table, relational DB table.
  • 43. Closing Remarks
      • Cloud services such as the Azure Data Platform provide new capabilities in data analytics in terms of scale, cost and agility.
      • Azure Data Lake is a productive option for organizations new to Hadoop. Still, continue to plan for other Hadoop offerings that are a better fit for other scenarios.
      • Many Azure services fit together to make the appropriate solution: SaaS, PaaS, IaaS, data, app, operational, etc.
      • As part of planning and design, be aware of the Microsoft roadmap and industry trends.
  • 44. Q&A. Roy Kim: @RoyKimYYZ, roykimtoronto@gmail.com, roykim.ca
  • 45. Appendix - Azure Data Lake Analytics: 50 DLAU assigned for the job
  • 46. Appendix - Azure Data Lake Analytics