Rafael Richards MD MS 2014 
Linked 
Vitals 
1 
Linked Vitals  
 
A Linked Data Translation Approach  
to Semantic Interoperability 
November 12, 2014 
Dataversity Webinar 
Rafael M Richards MD MS 
Physician Informaticist 
Veterans Health Administration 
U.S. Department of Veterans Affairs
Problem Statement: General 
How does one semantically integrate data such as vital 
signs between different patient information systems?
Problem Statement: Specific 
How does one integrate vital sign data between the VA VISTA electronic 
health record (EHR) system and a potential exchange partner that uses 
the HL7 FHIR standard? 
Language barriers to exchange
Approach: Linked Data Foundation 
One step towards minimizing data friction between systems is to 
provide a common, model-neutral expression language such as RDF. 
A common exchange language
Summary of Linked Data Translation Process 
[Diagram: translation pipeline from Source Data to Integrated Data] 
• Source Data: Different Syntax, Different Models, Different Vocabularies 
– Source A: Syntax A, Model A, Vocabulary A 
– Source B: Syntax B, Model B, Vocabulary B 
• Syntactic translation (Fileman): Common syntax within model-neutral medium (Linked Data) 
• Semantic translation: Rule-based mapping, Model alignment, Vocabulary alignment 
• Integrated Data (Neutral Model): Common Syntax, Common Model, Common Vocabulary, Common Meaning
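
The two stages summarized on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source records, the `a:`, `b:`, and `n:` prefixes, and the rule tables are hypothetical and are not taken from VISTA, FHIR, or the presenter's implementation. Triples are plain tuples rather than real RDF.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical source A: JSON with its own field names and vocabulary.
SOURCE_A = '{"id": "a1", "type": "PULSE", "rate": "72"}'
# Hypothetical source B: XML with a different model and vocabulary.
SOURCE_B = '<obs id="b1"><code>8867-4</code><value>72</value></obs>'


def syntactic_a(text):
    """Syntax A -> (subject, predicate, object) triples, keeping model A's terms."""
    rec = json.loads(text)
    subject = "a:" + rec["id"]
    return [(subject, "a:" + key, value) for key, value in rec.items() if key != "id"]


def syntactic_b(text):
    """Syntax B -> triples, keeping model B's terms."""
    root = ET.fromstring(text)
    subject = "b:" + root.get("id")
    return [(subject, "b:" + child.tag, child.text) for child in root]


# Model alignment: source predicate -> neutral predicate (assumed names).
PREDICATE_RULES = {
    "a:type": "n:vitalType",
    "a:rate": "n:value",
    "b:code": "n:vitalType",
    "b:value": "n:value",
}
# Vocabulary alignment: source term -> neutral term (assumed names).
VOCABULARY_RULES = {"PULSE": "heart-rate", "8867-4": "heart-rate"}


def semantic(triples):
    """Apply the model and vocabulary rules to reach the neutral model."""
    out = []
    for s, p, o in triples:
        if p in PREDICATE_RULES:
            out.append((s, PREDICATE_RULES[p], VOCABULARY_RULES.get(o, o)))
    return out


def neutral_view(triples):
    """Drop subjects so records from both sources can be compared by meaning."""
    return sorted((p, o) for _, p, o in triples)
```

After both stages, `neutral_view(semantic(syntactic_a(SOURCE_A)))` and `neutral_view(semantic(syntactic_b(SOURCE_B)))` produce the same list, which is the "common meaning" the slide refers to.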
VISTA: Overview 
• Veterans Information Systems and Technology Architecture 
• Information system of all VA hospitals 
• Foundation of several public healthcare systems 
– VA (VISTA): 1200+ care sites 
– DoD (CHCS): 900+ care sites 
– IHS (RPMS): 500+ care sites 
– NY State: 24 hospitals 
• Most familiar EHR in U.S. 
– Over 60% of U.S.-trained physicians have used VISTA 
• Open source 
– Deployed in many other settings in U.S. and internationally 
– Many developments by open source community
VISTA deployments in the U.S.
VISTA: A Patient-Centric EHR 
VISTA is an integrated patient-centric EHR. 
The data architecture of VISTA consists of over 150 applications for clinical care 
integrated within a single common multidimensional database (M DB). 
In VISTA, both business logic (Applications) and data (Database) are managed 
within the M data engine, which provides tight integration of applications 
with each other and with shared data. 
The data flows and integration agreements between VISTA applications 
(outer ring) are visualized as blue lines. 
One Patient. One Database. All Apps. All Data. Integrated.
VISTA Data Model 
VISTA is based on a hybrid NoSQL database. Unlike some NoSQL stores, VISTA is 
schema-driven, not schema-less. 
Inside every VISTA is File Manager (Fileman), a hybrid hierarchical-graph data store, 
which is overlaid on top of the M multidimensional store. A comprehensive 
definition of the types of data stored in every VA FileMan represents the VA's 
Enterprise Data Model. 
With an exposed data model, VISTA’s native schema can be rendered in a standard 
definition format and analyzed for use and improvement. A schema-flexible 
information model representation language that is fully machine-processable, such as 
RDF, provides this capability. 
[Diagram layers] 
• Fileman: FM hierarchical-graph store 
• M: Multidimensional NoSQL data engine
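
As a rough illustration of rendering a hierarchical file and subfile schema in a graph form, the sketch below walks a nested schema definition and emits one triple per fact. The schema contents, the `ex:` prefix, and the predicate names (`ex:subfileOf`, `ex:hasField`) are assumptions made for this example; they are not Fileman's actual data dictionary or FMQL's vocabulary.

```python
# Hypothetical schema: one file with three fields and one nested subfile.
SCHEMA = {
    "name": "Vital Measurement",
    "fields": ["date_taken", "vital_type", "rate"],
    "subfiles": [
        {"name": "Qualifier", "fields": ["qualifier_name"], "subfiles": []},
    ],
}


def slug(name):
    """Turn a display name into an identifier-friendly local name."""
    return name.lower().replace(" ", "_")


def schema_to_triples(file_def, parent=None):
    """Walk the file/subfile hierarchy and emit (subject, predicate, object) triples."""
    node = "ex:" + slug(file_def["name"])
    triples = [(node, "rdf:type", "ex:File")]
    if parent is not None:
        # The hierarchy itself becomes an explicit, queryable link.
        triples.append((node, "ex:subfileOf", parent))
    for field in file_def["fields"]:
        triples.append((node, "ex:hasField", "ex:" + slug(field)))
    for sub in file_def["subfiles"]:
        triples.extend(schema_to_triples(sub, parent=node))
    return triples
```

Because every fact is a separate triple, adding a field or a subfile to the schema adds triples without changing the shape of the existing ones, which is one way to read "schema-flexible" here.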
VISTA Data Model 
FileMan: VistA’s data conductor 
Fileman is the database management system of VistA. It 
manages the data access, models, and querying of all application data in VistA. 
VistA Fileman: connects to nearly every application in VistA, with 
green (incoming) links from all applications. 
VistA standard package (Oncology): far fewer 
connections, and mostly outgoing links (red = dependencies). 
Reference: http://code.osehra.org/vivian/vista_pkg_dep.php
VISTA Data Model 
VISTA’s native data model is composed of hierarchical 
files and subfiles, each of which addresses a specific 
M Global storage.
VISTA Query: Fileman Query Language 
FMQL is the Fileman Query Language 
that leverages the native hierarchical-graph 
model of VISTA. This provides 
real-time web-based query access to 
the entirety of VistA’s data. 
This exposes the native hierarchical 
data model of Fileman in web 
standard forms including HTML, 
JSON, and RDF. 
HTML: Hypertext markup language (visual document markup) 
JSON: JavaScript Object Notation (data serialization / packaging) 
RDF: Resource Description Framework (linked data / semantics) 
JSON-LD: JSON-based serialization of Linked Data (RDF)
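
To make the difference between the JSON and RDF forms concrete, the sketch below serializes one record both ways using only the Python standard library. The record, its field names, and the `http://example.org/vista/` namespace are hypothetical, and this is not FMQL's actual output format. The RDF form is written as N-Triples lines; reusing `json.dumps` for literal quoting is a shortcut that is adequate only for simple string values.

```python
import json

BASE = "http://example.org/vista/"  # hypothetical namespace, not a real VISTA URI

# Hypothetical vital sign record with illustrative field names.
record = {"id": "vital-1", "vital_type": "PULSE", "rate": "72"}


def to_json(rec):
    """Package the record as a JSON document (field names carry no global meaning)."""
    return json.dumps(rec, sort_keys=True)


def to_ntriples(rec):
    """Emit one N-Triples statement per field, with URIs for subject and predicate."""
    subject = "<" + BASE + rec["id"] + ">"
    lines = []
    for key in sorted(rec):
        if key == "id":
            continue
        predicate = "<" + BASE + key + ">"
        literal = json.dumps(rec[key])  # adds quotes and escapes simple strings
        lines.append(f"{subject} {predicate} {literal} .")
    return "\n".join(lines)
```

In the JSON form, `rate` is just a local key. In the triple form, the same field is a full URI that another dataset can refer to, which is what makes linking possible.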
VISTA Query: HTML output 
Fileman query of VistA for vital signs with output in HTML. 
HTML output: 
Human-readable
VISTA Query: RDF output 
Fileman query of VistA for vital signs with output in RDF. 
RDF output: 
Machine-readable
Linked Data: What is it? 
The World Wide Web Consortium (W3C) standard for 
semantic information integration for the Internet of Data. 
• HTML (hypertext markup language): for humans to exchange information 
– enables Linked Documents (Document Web) 
• RDF (resource description framework): for computers to exchange information 
– enables Linked Data (Semantic Web) 
“The Semantic Web [Linked Data] provides a common framework that 
allows data to be shared and reused across application, enterprise, and 
community boundaries.” 
Tim Berners-Lee, MIT Professor and Inventor of the World Wide Web 
(HTML and RDF protocols)
Linked Data and the Web of Data 
Linking media, geographic, publications, government, and life sciences... 
This represents over 300 linked data 
sources and databases, comprising 
billions of data elements and 
millions of semantic links. 
Each one of these circles represents 
a data source, which is semantically 
linked to other data sources, 
creating one virtual, federated, 
queryable web of data. 
Wikipedia is one of the resources 
converted to Linked Data, and is 
called DBpedia (center circle). 
Why not link healthcare?
VISTA Vitals in RDF 
239 instances in the sample dataset
VISTA Vitals in RDF
FHIR: Native model 
• FHIR - Observation 
• XML model in XML Schema
FHIR in RDF 
Automated transformation from FHIR XML Schema to RDF
RDF Translation Rules options 
There are many options for RDF translation. For this case study we will use 
the SPARQL Inferencing Notation (SPIN) because it builds on the W3C SPARQL 
standard and has been published as a W3C Member Submission.
SPIN: SPARQL Inferencing Notation 
http://spinrdf.org/spin-architecture.html
SPINMap: Data mapping rules engine
SPINMap: Data mapping rules engine 
Motivation: 
– Simplifies mappings between different models 
Key Features: 
– Creates executable transformations
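
The idea of a mapping that is also an executable transformation can be sketched in Python. This is an analogue of the concept only, not SPIN or SPINMap syntax: the rule format, the `vista:` and `hpvs:` property names, and the `-mapped` suffix for target instances are all assumptions made for illustration.

```python
# A mapping rule says: for every instance of source_class, create an instance
# of target_class and copy the listed properties under their target names.
# All class and property names here are hypothetical.
RULES = [
    {
        "source_class": "vista:Vital",
        "target_class": "hpvs:VitalSign",
        "properties": {"vista:rate": "hpvs:value", "vista:taken": "hpvs:time"},
    },
]


def apply_rules(triples, rules):
    """Run every rule over the source triples and return only the new target triples."""
    out = []
    for rule in rules:
        instances = [
            s for s, p, o in triples
            if p == "rdf:type" and o == rule["source_class"]
        ]
        for inst in instances:
            target = inst + "-mapped"  # assumed naming scheme for target instances
            out.append((target, "rdf:type", rule["target_class"]))
            for s, p, o in triples:
                if s == inst and p in rule["properties"]:
                    out.append((target, rule["properties"][p], o))
    return out
```

Because the rules are data, a new source model can be supported by adding a rule entry rather than by writing new transformation code. Properties with no rule are simply not carried over.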
SPINMap: Field mapping with rules 
Easier to create deep nested structures in the target
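
One way to picture field mapping into a deeply nested target is to let each rule name a path in the target structure. The sketch below is a Python illustration, not SPINMap: the flat field names and the target paths are assumptions, and the target shape is only loosely modeled on a FHIR Observation without following any particular FHIR release.

```python
# Each rule maps a flat source field to a path in the nested target.
# Field names and paths are hypothetical.
FIELD_RULES = {
    "loinc": ("code", "coding", "code"),
    "rate": ("valueQuantity", "value"),
    "units": ("valueQuantity", "unit"),
}


def set_path(target, path, value):
    """Create intermediate dictionaries as needed, then set the leaf value."""
    node = target
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value


def map_fields(flat, rules):
    """Build the nested target from a flat record, one rule at a time."""
    target = {}
    for field, path in rules.items():
        if field in flat:
            set_path(target, path, flat[field])
    return target
```

For example, `map_fields({"loinc": "8867-4", "rate": "72", "units": "/min"}, FIELD_RULES)` yields a structure with `valueQuantity` holding both `value` and `unit`, even though no rule mentions the intermediate level explicitly.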
SPINMap: Rules for LOINC terminology
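
A terminology rule of this kind amounts to a lookup from a local vital sign name to a LOINC code. The sketch below assumes the local names shown (they are illustrative and may not match the labels in a given VISTA instance); the LOINC codes are the standard ones for these observations.

```python
# Local vital type name (assumed) -> LOINC code.
LOINC_BY_LOCAL_NAME = {
    "PULSE": "8867-4",        # Heart rate
    "TEMPERATURE": "8310-5",  # Body temperature
    "RESPIRATION": "9279-1",  # Respiratory rate
    "HEIGHT": "8302-2",       # Body height
    "WEIGHT": "29463-7",      # Body weight
}


def to_loinc(local_name):
    """Return the LOINC code for a local vital type, or None if there is no rule."""
    return LOINC_BY_LOCAL_NAME.get(local_name.strip().upper())
```

Returning `None` for an unmapped name, rather than guessing, keeps gaps in the vocabulary alignment visible so that a rule can be added deliberately.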
SPINMap Output: Linked Vitals 
Same As
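
"Same As" links state that two identifiers denote the same thing. A small sketch of how such links can be grouped into sets of equivalent records is shown below, using a union-find structure. The identifiers are hypothetical, and this is a generic technique for illustration, not the method used in the demonstration.

```python
def same_as_clusters(pairs):
    """Group identifiers connected by 'same as' links into sorted clusters."""
    parent = {}

    def find(x):
        # Follow parent links to the cluster representative, shortening the path.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(members) for members in clusters.values())
```

Given the links `("vista:v1", "hpvs:o1")` and `("fhir:obs9", "hpvs:o1")`, the VISTA and FHIR records end up in one cluster even though no link connects them directly, because "same as" is transitive.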
Summary of Translation Approach 
[Diagram: translation pipeline from Source Data to Integrated Data] 
• Source Data: Different Syntax, Different Models, Different Vocabularies 
– Source A: Syntax A, Model A, Vocabulary A 
– Source B: Syntax B, Model B, Vocabulary B 
• Syntactic alignment (Fileman): Common syntax within model-flexible medium (Linked Data) 
• Semantic alignment: Rule-based mapping, Model alignment, Vocabulary alignment 
• Integrated Data (Neutral Model): Common Syntax, Common Model, Common Vocabulary, Common Meaning
Linked Vitals: A step towards Linked Health
In the works... 
Web-based automation 
of semantic alignment
VISTA-FHIR web-based translation 
The VISTA-FHIR prototype is a web-based application built with TopBraid and Semantic 
Web Page technology. The application demonstrates semantic data integration of VistA 
records and FHIR records. 
The control bar shows the steps in the demonstration. 
Sub-steps are shown as 
buttons. An “Explain” button is 
provided for each sub-step. 
The number of vital signs 
records is shown across all 
patients in the dataset.
VISTA-FHIR: Retrieve VISTA Records 
VistA records are retrieved directly in RDF in a neutral ontology. For the purposes of 
the demonstrator, only the vital signs records are processed. 
The number of vital signs records is shown across 
all patients in the dataset.
VISTA-FHIR: Retrieve FHIR Records 
FHIR records are normalized using FHIR and other HL7-based ontologies 
to a neutral model for healthcare patient vital signs, HPVS.
VISTA-FHIR: Run Inferences 
Semantic alignment of VistA and FHIR is achieved using a combination of 
transformation rules and inferencing based on SPARQL and SPIN. 
The results of the inferencing are shown here
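
The combination of transformation rules and inferencing can be illustrated with a tiny forward-chaining loop that keeps applying rules until no new facts appear. This is a Python sketch of the general idea, not SPARQL or SPIN: the property alignment table and all identifiers are hypothetical, and only two kinds of rule are shown (property alignment and "same as" propagation).

```python
# Hypothetical property alignment: source property -> neutral property.
SUB_PROPERTY = {"vista:rate": "hpvs:value", "fhir:valueQuantity": "hpvs:value"}


def infer(triples):
    """Apply the rules repeatedly until a fixpoint is reached."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            if p in SUB_PROPERTY:
                # Property alignment: restate the fact with the neutral property.
                new.add((s, SUB_PROPERTY[p], o))
            if p == "owl:sameAs":
                # 'Same as' is symmetric.
                new.add((o, p, s))
                # Facts about s also hold for the resource it is the same as.
                for s2, p2, o2 in known:
                    if s2 == s and p2 != "owl:sameAs":
                        new.add((o, p2, o2))
        if new <= known:
            return known
        known |= new
```

Starting from a VISTA measurement and a single "same as" link to a FHIR record, the loop derives the neutral-model value for both records. The loop terminates because the rules only recombine existing terms, so the set of possible triples is finite.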
Appendix 
Review of Linked Data and its features 
as a semantic interoperability language
Linked Data: Accommodates both Standards and Innovation 
Information Models: 
An apparent conflict between standardization and innovation? 
Standardization: needs to remain static in 
order not to be disruptive for adopters. 
• Static, Brittle 
• Centralized 
• General (Common Denominator) 
• Committee-driven 
• Large, “all-or-nothing”, disruptive updates 
vs. 
Innovation: requires continuous evolution 
of thousands of new information models. 
• Adaptive / Evolutionary 
• Decentralized 
• Highly Specialized, “Best of breed” 
• End-user / specialist-driven 
• Small, continuous, low-impact updates 
What are the options? 
Centralized, model-rigid approach (the current approach to healthcare data): 
For exchange of information to occur, all models must remain fixed, and data 
must go through only one central ‘broker’ model. Technologies that support 
this method are HL7 and XML. 
Decentralized, model-flexible approach (Linked Data supports both standardization 
and innovation): Multiple models peacefully co-exist and evolve, mediated by their 
ability to freely link to any model at all times. In this approach, all models are 
free to evolve AND are capable of resolving to a common standard model at all times. 
The only technology that currently supports this approach is RDF (Linked Data). 
http://www.carlsterner.com/research/2009_resilience_and_decentralization.shtml
Data Integration: Legacy vs. Linked Data Approach 
Architectural issues: Legacy Data (HL7, XML) compared with Linked Data (RDF) 
• Function 
– Legacy Data: Serialization format 
– Linked Data: Data model 
• Granularity 
– Legacy Data: Message-centric (documents) 
– Linked Data: Data-centric (data elements) 
• Semantics 
– Legacy Data: Weak semantics. Extrinsic to the data. Depends on an external data model. 
– Linked Data: Strong semantics. Intrinsic to the data. 
• Data model characteristics 
– Legacy Data: Model-rigid architecture. Only the least common denominator of model unifies information; must remain unchanged to orchestrate. Restrictive expression. 
– Linked Data: Model-flexible architecture. All data models may independently evolve. Maximizes expressivity. 
• Data model compatibility 
– Legacy Data: No model diversity permitted. A one-size-fits-all mega-model. 
– Linked Data: Multiple models peacefully coexist. Data model agnostic. 
• Data model evolution 
– Legacy Data: Costly and difficult to evolve models, due to model-rigid architecture. 
– Linked Data: Cheaper and easier to evolve models, due to model-flexible architecture. 
• Data access method 
– Legacy Data: Downloading + Aggregating 
– Linked Data: Linking + Federating 
• Scalability: incremental effort required to add new data sources 
– Legacy Data: Common model must be updated. 
– Linked Data: Individual models can be independently and incrementally semantically linked.
