Anzo Smart Data Lake® 4.0
A Data Lake Platform for the Enterprise Information Fabric
Ben Szekely – Vice President of Solution Engineering
Austin Meyer – Solutions Engineer, Pre-Sales
“The complex process of bringing together
different data sources throughout an
organization is now being automated
creating a single, semantic layer of an
organization’s data.”
WIRED News
The Semantic Layer
“Semantic approaches are the future of
the enterprise information fabric”
Michele Goetz - Principal Analyst - Forrester Research
The Information Fabric
Anzo Smart Data Lake® 4.0
The industry leading platform for building a Semantic Layer
Open Standards
End-To-End
Enterprise Scale
The Case for a Semantic Layer
The Drive for Insight
Ask more questions - faster
Stay ahead of
the competition
Uncover revenue
growth opportunities
What is the market
landscape for lung cancer
therapies in 2020?
How do I design my
clinical trial to benefit
the most patients?
Conversational data exploration is not practical
Each question takes too long to ask and answer.
[Timeline: Business Question → Locate & Load (ETL) Data → Prep the Data Warehouse → Build Analytics → Insights Delivered, over TIME]
The data driven business must
minimize costs to deliver on
business commitments
Shrink Execution Costs
Which is increasingly difficult to
achieve in complex data and
regulatory environments
In Complex Environments
The Urgency of Requirements
“I need to deliver on-
time Adverse Event
reports to the FDA”
“The complexity of
clinical data standards
inhibits the time it takes
to design trials”
Cost effective flexibility
Executing in dynamic and complex data
environments is costly,
requiring brittle one-off solutions and
manual efforts to combine data
that simply can't keep up…
REQUIREMENTS
Enterprise Data
Tying the information
together for on demand
access to data and
insights
The Semantic Layer is the
Rosetta Stone for data and
the business
The Semantic Layer disrupts business inhibitors
[Timeline without a Semantic Layer: Business Question → Locate & Load (ETL) Data → Prep the Data Warehouse → Build Analytics → Insights Delivered, over TIME]
[Timeline with a Semantic Layer: Business Question → Locate & Ingest Data → Semantic Layer → Insights Delivered, over TIME]
How do we Realize a Semantic Layer?
©2017 Cambridge Semantics Inc. All rights reserved.
First Gen
The Semantic Layer requires more than data
alone.
Where Data Lakes Fall Short
High value Data Lakes must tie information together, in
the language of the business.
“Last Mile”
Analytics
Enterprise
Data Sources
Cloud
Storage
...which requires custom coding
and tool integration
Data Ingestion
Structured Data
Ingest and ETL
Storage
Infrastructure
Data Cataloging
Basic Metadata
Cataloging
Self Service
Access to Raw
Data Sets (SQL)
Second Gen
Unstructured
Content
Diverse Data
Sources
Enterprise
Data Lakes
Internal
Databases &
Applications
Industry
Standards &
Open Data
Third-Party
Data
Open Standards Scalability Security Governance
The Enterprise Knowledge Graph
DATA WAREHOUSES
1ST GEN DATA LAKE (HADOOP)
Semantic Layer
Connects data with
business meaning
Data On Demand
Gives the business
users access to data
Enterprise
Knowledge
Graph
ANZO SMART DATA LAKE®
How does Anzo Smart Data Lake® work?
Automated
ETL & Ingestion
Data Catalog
Manages and secures
the semantic layer
Automated ETL Generation and Execution
Product
Product Name | Product ID | Opportunity Product ID
Product Name 1 | Product ID 1 | Product ID 1
Product Name 2 | Product ID 2 | Product ID 2
… | … | …
Marketing
Opportunity Product ID | Account ID | Geo
Product ID 1 | Acc ID 1 | Americas
… | … | …
Revenue
Product ID | Account ID | Geo
Product ID 1 | Acc ID 1 | Americas
… | … | …
Bookings
Product ID 1
Product ID
Revenue
Product ID 1
Product ID
Acc ID 1
Account ID
Product
Product ID 1
Product ID
Product ID 1
Product ID
Product Name 1
Product
Name
Marketing
Product ID 1
Product ID
Americas
Geo
Acc ID 1
Account ID
Graphmarts and Data Layers
Loads knowledge graph into
memory for layered prep and
analytics
Data Catalog
Manages and secures
the semantic layer
Anzo
Graphmart
Go Live
Base Layer
Graph Data Loaded
from Catalog
Data Prep Layer
Rule Layer
Access Control
Layer
Cleanse
Layer
Relationship Layer Rule Layer
Anzo Graphmarts and
Data Layers
Create new relationships
Transformation and conformance
Transform onto canonical models
Define granular access control
Deploy in the cloud on-demand
Data on demand services
Data Layers – Create Dynamic Relationships
Relationship Layer
Product
Product Name | Product ID | Opportunity Product ID
Product Name 1 | Product ID 1 | Oppty Product ID 1
Product Name 2 | Product ID 2 | Oppty Product ID 2
… | … | …
Marketing
Opportunity Product ID | Account ID | Geo
Oppty Product ID 1 | Acc ID 1 | Americas
… | … | …
Revenue
Product ID | Account ID | Geo
Oppty Product ID 1 | Acc ID 1 | Americas
… | … | …
Bookings
Product ID 1
Product ID
Product
Product
Revenue
Product ID 1
Product ID
Acc ID 1
Account ID
Product
Product ID 1
Product ID
Opportunity
Product ID 1
Opportunity
Product ID
Product Name 1
Product
Name
Marketing
Opportunity
Product ID 1
Opportunity
Product ID
Americas
Geo
Acc ID 1
Account ID
Product ID 1
Product ID 1
Acc ID 1
Acc ID 1
Opportunity
Product ID 1
Opportunity
Product ID 1
Product ID 1
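The relationship this slide describes can be sketched in a few lines of Python. This is only an illustration of linking rows on a shared key; it is not Anzo's implementation or API. The field names, the `link_by_key` helper, and the `product` edge label are hypothetical, and the sample values are taken from the slide's Product and Marketing tables.

```python
# Hypothetical rows modelled on the slide's Product and Marketing tables.
products = [
    {"product_name": "Product Name 1", "product_id": "Product ID 1",
     "opportunity_product_id": "Oppty Product ID 1"},
    {"product_name": "Product Name 2", "product_id": "Product ID 2",
     "opportunity_product_id": "Oppty Product ID 2"},
]
marketing = [
    {"opportunity_product_id": "Oppty Product ID 1",
     "account_id": "Acc ID 1", "geo": "Americas"},
]


def link_by_key(source_rows, target_rows, key, predicate):
    """Return (source_index, predicate, target_index) edges where `key` matches."""
    # Index the target rows by key value so each source row is a single lookup.
    index = {}
    for j, row in enumerate(target_rows):
        index.setdefault(row[key], []).append(j)
    edges = []
    for i, row in enumerate(source_rows):
        for j in index.get(row[key], []):
            edges.append((i, predicate, j))
    return edges


# Derive new edges without modifying either source table.
edges = link_by_key(marketing, products, "opportunity_product_id", "product")
for s, p, t in edges:
    print(marketing[s]["account_id"], p, products[t]["product_name"])
```

Keeping the derived edges separate from the source rows mirrors the idea of a layer: the relationship can be added or dropped without reloading the underlying data.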
Data Layers – Create Dynamic Relationships
Relationship Layer
Holdings
AccountCode | SecurityCode | VersionDate
1-3WGC-0 | 1006003 | 6/30/2016
1-3WGC-2 | 1013967 | 7/31/2017
… | … | …
Account Reference
AccountCode | VersionDate | AccountName
1-3WGC-0 | 6/30/2016 | BLDRS Asia 50 ADR Index Fund
1-3WGC-2 | 7/31/2017 | BLDRS Emerging Markets 50 ADR Index Fund
… | …
Security Reference
SecurityCode | VersionDate | Ticker
1006003 | 6/30/2016 | RYAAY US
1013967 | 7/31/2017 | AAAP US
… | …
Graph nodes
Holding: AccountCode 1-3WGC-0, SecurityCode 1006003, VersionDate 6/30/2016
Account: AccountCode 1-3WGC-0, AccountName BLDRS Asia 50 ADR Index Fund, VersionDate 6/30/2016
Security: SecurityCode 1006003, Ticker RYAAY US, VersionDate 6/30/2016
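In the holdings example, each reference table is versioned, so a holding appears to match on a code plus its VersionDate rather than on the code alone. A minimal Python sketch of that composite-key lookup follows. It assumes the match is on (code, VersionDate), which the slide suggests but does not state; `index_by` is a hypothetical helper and none of this is Anzo's API.

```python
# Sample rows copied from the slide's Holdings and reference tables.
holdings = [
    {"AccountCode": "1-3WGC-0", "SecurityCode": "1006003", "VersionDate": "6/30/2016"},
    {"AccountCode": "1-3WGC-2", "SecurityCode": "1013967", "VersionDate": "7/31/2017"},
]
accounts = [
    {"AccountCode": "1-3WGC-0", "VersionDate": "6/30/2016",
     "AccountName": "BLDRS Asia 50 ADR Index Fund"},
    {"AccountCode": "1-3WGC-2", "VersionDate": "7/31/2017",
     "AccountName": "BLDRS Emerging Markets 50 ADR Index Fund"},
]
securities = [
    {"SecurityCode": "1006003", "VersionDate": "6/30/2016", "Ticker": "RYAAY US"},
    {"SecurityCode": "1013967", "VersionDate": "7/31/2017", "Ticker": "AAAP US"},
]


def index_by(rows, keys):
    """Index rows by a tuple of field values (a composite key)."""
    return {tuple(row[k] for k in keys): row for row in rows}


account_index = index_by(accounts, ("AccountCode", "VersionDate"))
security_index = index_by(securities, ("SecurityCode", "VersionDate"))

# Resolve each holding against the reference version with the same date.
resolved = []
for h in holdings:
    account = account_index.get((h["AccountCode"], h["VersionDate"]))
    security = security_index.get((h["SecurityCode"], h["VersionDate"]))
    resolved.append({
        "AccountName": account["AccountName"] if account else None,
        "Ticker": security["Ticker"] if security else None,
        "VersionDate": h["VersionDate"],
    })
```

Matching on the date as well as the code keeps a holding tied to the reference record that was current for that version, instead of to whichever record happens to share the code.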
Data Layers for Data Prep Layer
Data Prep Layer
RAW
DEMOGRAPHICS
PATIENTID | TRIALID | LOCATIONID
21 | CSI_AZ_001 | 9
41 | CSI_AZ_001 | 1
… | … | …
SIDEEFFECTRECORD
PATIENTID | PREFERRED | TOXICITY
21 | Abscess – Abdominal | 4
41 | HA | 2
… | … | …
CONFORMED
Subjects
SUBJECTID | STUDYID | SITEID
2001 | CSI_AZ_002 | 8
15024 | CSI_AZ_006 | 9
… | … | …
AdverseEvent
SUBJECTID | PREFERREDTERM | TOXICITYGRADE
2001 | Abdominal abscess | 4
15024 | Periorbital oedema | 5
… | … | …
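The raw-to-conformed step on this slide can be sketched as a field rename plus a term lookup. This is a minimal illustration, not how Anzo's data prep layer is configured. The pairing of raw and conformed field names is inferred from the column order on the slide, and `FIELD_MAP`, `TERM_MAP`, and `conform` are hypothetical names.

```python
# Assumed raw -> conformed field names, paired by column position on the slide.
FIELD_MAP = {
    "PATIENTID": "SUBJECTID",
    "TRIALID": "STUDYID",
    "LOCATIONID": "SITEID",
    "PREFERRED": "PREFERREDTERM",
    "TOXICITY": "TOXICITYGRADE",
}

# Raw term -> preferred term; only the pair visible on the slide is listed.
TERM_MAP = {"Abscess - Abdominal": "Abdominal abscess"}


def conform(record):
    """Rename fields onto the canonical model and normalize the preferred term."""
    out = {}
    for field, value in record.items():
        target = FIELD_MAP.get(field, field)  # unknown fields pass through
        if target == "PREFERREDTERM":
            value = TERM_MAP.get(value, value)  # unknown terms pass through
        out[target] = value
    return out


raw_event = {"PATIENTID": 21, "PREFERRED": "Abscess - Abdominal", "TOXICITY": 4}
conformed = conform(raw_event)
```

The sketch leaves identifier values untouched. The slide shows different IDs on the raw and conformed sides, and how one maps to the other is not stated, so no ID mapping is assumed here.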
Anzo Graph Query Engine
In-memory MPP architecture
Managed through Graphmarts and Data Layers
Horizontal scale with on-demand cloud computing
Interactive data preparation and analytics
Scales to trillions of facts
Self-guided exploratory analytics
Secure and governed
Model driven configuration, captured in Data Catalog
Data Layers and extracts
Discovery for unstructured and structured data
No query writing – automated query generation
Hi-Res Analytics
Hi-Res Analytics - Model-Driven Exploration
• Learn about navigating the business friendly graph
model
• Learn how to clean and prepare data with basic
formulas
• Learn how to assemble data sets to answer questions
in analytic tools
Claim
Patient
Record
Drug
Note
Subscriber
A Citizen Data Scientist comes up to speed in hours in
collaborative workshops
Hi-Res Analytics for the Citizen Data Scientist
Data
Layers
Hi-Res Analytics
Data Layers
Unstructured Data is a First-Class Citizen in ASDL
Claim ID Process Date Subscriber ID
44223 10/3/2015 C12345
44224 10/7/2015 C23412
… … …
Claims
On July 3, 2016 Patient BA213 seemed
frustrated after experiencing headache and
nausea following 500mg dosage of sleep aid
therapeutic, Narcoleptol.
On Site Doctor Note
Building and Expanding the Enterprise Knowledge Graph
Patient ID Condition Drug Name
BA213 Sleep Apnea Narcoleptol
CS289 Type II Diabetes Insulin
… …
EHR
BA213
Patient ID
Drugprescribed
Narcoleptol
brand
name
Sleep
Apnea
Condition
10/3/2015
Process
Date Subscriber
C12345
Subs. ID
Patient
Record
about
500mg
Dosage
Note
3/7/2016
when
Headache
and nausea
event
-.05
Sentiment
score
Claim
44223
Claim ID
about
Capability 2 – Unstructured
Analytics Endpoints for On-Demand Access in any Tool
Analytics Endpoint
My Product Set
Product ID | Geo | Account ID | Opportunity Prod ID
Product ID 1 | Americas | Acc ID 1 | Opportunity Prod ID
… | …
Bookings
Product ID 1
Product ID
Product
Product
Revenue
Product ID 1
Product ID
Acc ID 1
Account ID
Product
Product ID 1
Product ID
Opportunity
Product ID 1
Opportunity
Product ID
Product Name 1
Product
Name
Marketing
Opportunity
Product ID 1
Opportunity
Product ID
Americas
Geo
Acc ID 1
Account ID
Enterprise Knowledge Graph
Enabling on-demand
access to data
by those seeking
answers and insight
Scalability
Security
Governance
Lineage
Automated
Structured
Data Ingestion
Natural Language
Processing and
Text Analytics
Rich
models
Hi-Res Analytics
Anzo Smart Data Lake®
Data On Demand
Enterprise Knowledge Graph Data On Demand
Automated Ingestion of
Patient Data
Patient
Safety
Clinical
Trial Ops
R & D
Health
Economics
The Smart Data Lake for Digital Patient Health
Insight for Decision
Makers
Improving patient
outcomes, safety, and
comfort
Reducing the time to bring
medicines to patients
Lowering the cost of
healthcare
Insurance Claims
Clinical Trials
Rx Data
Health Records
Genetic Data
Wearables
+
Enterprise Knowledge Graph Data On Demand
Automated Ingestion of
Customer Data
Sales
Compliance
Marketing
Risk
Management
The Smart Data Lake for Customer 360
Insight for Decision
Makers
Connect with your
customers
Reduce risk
Increase revenue
Account Data
Trading Data
Marketing Data
Relationship
Data
Node 1 Node 2
GQE Cluster
Node N
…
Node 1 Node 2
Hadoop/Spark/HDFS Cluster
Node M
…
…
Anzo Enterprise Server
Node 1 Node 2 Node P
…
ASDL Server
Anzo Ingest Servers
Node 1 Node 2 Node P
…
Client
Browser
Active Directory
Anzo on the Web App
ASDL Web App
HTTP/ODATA/SPARQL
Structured,
Graph Data
SPARQL
HTTP/GRPC
HTTP/HTTPS
HTTP/JMS Metadata
Synchronization
HTTP/JMS
Metadata
Synchronization
HDFS
Fuse
Apache Livy
Metadata
HTTP/HTTPS
Elastic Search Cluster
Node 1 Node 2 Node N
…
DS1 DS2 DS3…
…
JDBC
…
Schema
Job Execution
HTTP/HTTPS
Unstructured
Data
Documents
Anzo Smart Data Lake Architecture
A Data Journey of Differentiating Capabilities
Unstructured Data
Notes, Docs,
Emails, Articles
Structured Data
Relational, CSV,
HDFS, External
Data Feeds
Access | Prepare | Catalog | Ingest
NLP, Text Analytics,
Sentiment Analysis
Hi-Res Analytics
Data Catalog
Graphmarts
Data Layers
Data Lake
[Metadata or Data]
Semantic
Layer
HTTP
ODATA
Services
Business
User
IT
User
Capability 1 - Ingestion and Cataloging
Capability 2 – Unstructured Data Ingestion
[Capability diagram repeated, with the label "NoSQL" added]
Capability 3 – Graphmarts and Data Layers
[Capability diagram repeated, with the labels "NoSQL" and "Can Be Tabular" added]
Virtual Hub and Spoke ETL
Structured Data
Relational, CSV,
HDFS, Data Feeds
External and Internal
Access | Prepare | Catalog | Ingest
Data Catalog and
Metadata Capture
On Demand Access
to Data
Big Data Stores
Mapping
Mapping
Semantic
Layer
Capability 4 – Hi-Res Analytics™
[Capability diagram repeated, with the labels "NoSQL" and "Can Be Tabular" added]
Anzo Smart Data Lake – Strategic Benefits
Enhance Digital Transformation
Creates a high-resolution digital twin of diverse and complex data sets, structured and unstructured, using open W3C standards.
Empower Citizen Data Scientists
Makes it easy for aspiring citizen data scientists to ask questions or extract data using sophisticated but intuitive auto-generation of queries.
Make Data Understandable
Uses the language of the business to let users create and share insights quickly by working the way they think.
Build a Bridge to the Future
A future-proof layer for feeding data into emerging technologies, including ML and text analytics.
[Figure: drug development pipeline]
Stages: Drug Discovery, Preclinical, Product Development, FDA Review, Scale-Up to Mfg., Post-Marketing Surveillance
Durations shown: 3 – 6 YEARS, 6 – 7 YEARS, 0.5 – 2 YEARS, INDEFINITE
PRE-DISCOVERY
COMPOUNDS: ~ 5,000 – 10,000, 250, 5
NUMBER OF VOLUNTEERS: PHASE 1: 20–100, PHASE 2: 100–500, PHASE 3: 1,000–5,000
IND SUBMITTED
NDA/BLA SUBMITTED
ONE FDA-APPROVED DRUG
The Information Fabric – A Semantic Layer for the Enterprise
R & D
Intelligence
(CI)
Product
Development
& Regulatory
PV & Safety
Case
Management
Source of Influence
Commercial
Analytics
Clinical Trial
Operations
Medical
Advisory Board
Analytics
Real World
Research
Clinical Data
Standards
Management
Voice of the
Customer
Analytics
Clinical Trial
Exploratory
Analytics
How long does it take to implement?
Does ASDL replace my data lake?
Where can I find out more?
• Get started building your Semantic Layer today
• Build on the data lake investments you have already made
• Stop by our booth - 441
Getting Started
Thank You!
Click here to request a demo
