This document summarizes Rafael Richards' presentation on using linked data and semantic web technologies to enable semantic interoperability between health data systems like the VA's VISTA EHR and the HL7 FHIR standard. It describes the challenges of integrating data between different models and vocabularies. The approach taken was to map both VISTA and FHIR data to a common linked data model using rules, then apply semantic reasoning techniques to align the models and vocabularies. This allows the data from both systems to be queried and integrated while supporting independent evolution of the source models.
Linked Vitals-20141112-v1a
1. Rafael Richards MD MS 2014
Linked
Vitals
1
Linked Vitals
A Linked Data Translation Approach
to Semantic Interoperability
November 12, 2014
Dataversity Webinar
Rafael M Richards MD MS
Physician Informaticist
Veterans Health Administration
U.S. Department of Veterans Affairs
2. Problem Statement: General
How does one semantically integrate data such as vital
signs between different patient information systems?
3. Problem Statement: Specific
How does one integrate vital sign data between the VA VISTA electronic
health record (EHR) system and a potential exchange partner using
the HL7 FHIR standard?
Language barriers to exchange
4. Approach: Linked Data Foundation
One step towards minimizing data friction between systems is to
provide a common, model-neutral expression language such as RDF.
A common exchange language
5. Summary of Linked Data Translation Process
[Diagram: Linked Data translation pipeline]
Source Data: Different Syntax, Different Models, Different Vocabularies
– System A: Syntax A, Model A, Vocabulary A
– System B: Syntax B, Model B, Vocabulary B
Syntactic translation (Fileman): Common syntax within model-neutral medium (Linked Data)
Semantic translation: Rule-based mapping to a Neutral Model
– Model alignment
– Vocabulary alignment
Integrated Data: Common Syntax, Common Model, Common Vocabulary, Common Meaning
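The two stages in this summary can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the presenter's implementation: the record fields, the `vista:`, `fhir:`, and `neutral:` prefixes, and both rule tables are hypothetical, and triples are modeled as simple tuples rather than with an RDF library.

```python
# Illustrative records only; field names, ids, and prefixes are assumptions.
vista_record = {"id": "v1", "vital_type": "PULSE", "rate": "72"}
fhir_record = {"id": "o1", "code": "8867-4", "value": 72}

# Model alignment rules: source predicate -> neutral predicate (hypothetical).
MODEL_RULES = {
    "vista:vital_type": "neutral:type",
    "vista:rate": "neutral:value",
    "fhir:code": "neutral:type",
    "fhir:value": "neutral:value",
}

# Vocabulary alignment rules: source term -> shared term (hypothetical).
VOCAB_RULES = {"PULSE": "loinc:8867-4", "8867-4": "loinc:8867-4"}


def syntactic_translation(source, record):
    """Stage 1: express a record as (subject, predicate, object) triples."""
    subject = f"{source}:{record['id']}"
    return [(subject, f"{source}:{key}", value)
            for key, value in record.items() if key != "id"]


def semantic_translation(triples):
    """Stage 2: apply model and vocabulary rules to reach the neutral model."""
    aligned = []
    for subject, predicate, obj in triples:
        neutral = MODEL_RULES.get(predicate)
        if neutral is None:
            continue  # no rule for this predicate; leave it out
        if neutral == "neutral:type":
            obj = VOCAB_RULES[obj]  # vocabulary alignment
        else:
            obj = float(obj)  # normalize numeric values to one type
        aligned.append((subject, neutral, obj))
    return aligned


integrated = (
    semantic_translation(syntactic_translation("vista", vista_record))
    + semantic_translation(syntactic_translation("fhir", fhir_record))
)
```

After both stages, the two records share the same predicates and the same code for heart rate, so they can be queried together even though the source models differ.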
6. VISTA: Overview
• Veterans Information Systems and Technology Architecture
• Information system of all VA hospitals
• Foundation of several public healthcare systems
– VA (VISTA): 1200+ care sites
– DoD (CHCS): 900+ care sites
– IHS (RPMS): 500+ care sites
– NY State: 24 hospitals
• Most familiar EHR in U.S.
– Over 60% of U.S.-trained physicians have used VISTA
• Open source
– Deployed in many other settings in U.S. and internationally
– Many developments by open source community
8. VISTA: A Patient-Centric EHR
VISTA is an integrated, patient-centric EHR.
The data architecture of VISTA consists of over 150 applications for clinical care
integrated within a single common multidimensional database (M DB).
In VISTA both business logic (Applications) and data (Database) are managed
within the M data engine, which provides the tight integration of applications
with each other and with shared data.
The data flows and integration agreements between VISTA applications
(outer ring) are visualized as blue lines.
One Patient. One Database. All Apps. All Data. Integrated.
9. Fileman: FM hierarchical-graph store
VISTA is based on a hybrid NoSQL database. Unlike some NoSQL stores, VISTA is
schema-driven, not schema-less.
Inside every VISTA is File Manager (Fileman), a hybrid hierarchical-graph data store,
which is overlaid on top of the M multidimensional store. A comprehensive
definition of the types of data stored in every VA FileMan represents the VA's
Enterprise Data Model.
With an exposed data model, VISTA’s native schema can be rendered in a standard
definition format and analyzed for use and improvement. A schema-flexible,
fully machine-processable information model representation language such as
RDF provides this capability.
M: Multidimensional NoSQL data engine
VISTA Data Model
10. VISTA Data Model
FileMan: VistA’s data conductor
Fileman is the database management system of VistA. It
manages data access, data models, and queries for all application data in VistA.
VistA Fileman: connects to nearly every application in VistA, with
green (incoming) links from all applications.
VistA Standard package (Oncology): far fewer
connections, and mostly outgoing links (red = dependencies).
Reference: http://code.osehra.org/vivian/vista_pkg_dep.php
11. VISTA Data Model
VISTA’s native data model is composed of hierarchical files and subfiles,
each of which addresses a specific M Global storage location.
12. VISTA Query: Fileman Query Language
FMQL is the Fileman Query Language. It leverages the native
hierarchical-graph model of VISTA to provide real-time, web-based
query access to all of VistA’s data.
It exposes the native hierarchical data model of Fileman in
standard web formats including HTML, JSON, and RDF.
HTML: Hypertext markup language (visual document markup)
JSON: JavaScript object notation (data serialization / packaging)
RDF: Resource description framework (linked data / semantics)
JSON-LD: JSON-based serialization of Linked Data (RDF)
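To make the difference between these output forms concrete, here is a minimal sketch that serializes one record as JSON and as N-Triples (a line-based RDF syntax). It is not FMQL's actual output: the record, its field names, and the `http://example.org/vista/` namespace are all assumptions made for illustration.

```python
import json

BASE = "http://example.org/vista/"  # hypothetical namespace, not a real VISTA URI

# Hypothetical vital sign record; the id and field names are illustrative.
record = {"id": "120_5-1", "vital_type": "PULSE", "rate": "72"}


def to_json(rec):
    """Package the record as JSON: structure only, meaning left to the reader."""
    return json.dumps(rec, sort_keys=True)


def to_ntriples(rec):
    """Express the record as N-Triples: one self-describing statement per line."""
    subject = f"<{BASE}{rec['id']}>"
    lines = []
    for field, value in sorted(rec.items()):
        if field == "id":
            continue  # the id becomes the subject URI, not a statement
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} <{BASE}{field}> "{escaped}" .')
    return "\n".join(lines)


print(to_json(record))
print(to_ntriples(record))
```

In the JSON form, the keys are local names whose meaning depends on the consumer knowing the source model. In the RDF form, every subject and predicate is a URI that can be linked to other vocabularies.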
13. VISTA Query: HTML output
Fileman query of VistA for vital signs with output in HTML.
HTML output:
Human-readable
14. VISTA Query: RDF output
Fileman query of VistA for vital signs with output in RDF.
RDF output:
Machine-readable
15. Linked Data: What is it?
The World Wide Web Consortium (W3C) standard for
semantic information integration for the Internet of Data.
HTML (hypertext markup language): for humans to exchange information; enables Linked Documents (Document Web)
RDF (resource description framework): for computers to exchange information; enables Linked Data (Semantic Web)
“The Semantic Web [Linked Data] provides a common framework that
allows data to be shared and reused across application, enterprise, and
community boundaries.”
Tim Berners-Lee, MIT Professor and Inventor of the World Wide Web
(HTML and RDF protocols)
16. Linked Data and the Web of Data
Linking media, geographic, publications, government, and life sciences...
This diagram represents over 300 linked data sources and databases, comprising
billions of data elements and millions of semantic links.
Each one of these circles represents a data source, which is semantically
linked to other data sources, creating one virtual, federated,
queryable web of data.
Wikipedia is one of the resources converted to Linked Data; the result is
called DBpedia (center circle).
Why not link healthcare?
17. VISTA Vitals in RDF
239 instances in the sample dataset
19. FHIR: Native model
• FHIR - Observation
• XML model in XML Schema
20. FHIR in RDF
Automated transformation from FHIR XML Schema to RDF
21. RDF Translation Rules options
There are many options for RDF translation. For this case study we will use
the SPARQL Inferencing Notation (SPIN) because it is a W3C Member Submission
that builds on the SPARQL standard.
22. SPIN: SPARQL Inferencing Notation
http://spinrdf.org/spin-architecture.html
23. SPINMap: Data mapping rules engine
24. SPINMap: Data mapping rules engine
Motivation:
– Simplifies mappings between different models
Key Features:
– Creates executable transformations
25. SPINMap: Field mapping with rules
Easier to create deep nested structures in the target
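The idea of declarative field rules that build nested target structures can be sketched as follows. This is plain Python standing in for SPINMap, which expresses such rules in SPARQL/SPIN. The source field names and the target paths (loosely modeled on a FHIR Observation, not checked against any FHIR release) are assumptions.

```python
def set_path(target, path, value):
    """Set a value at a dotted path, creating nested dicts as needed."""
    keys = path.split(".")
    node = target
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


# Each rule: (source field, dotted target path, value transform).
# All names here are hypothetical and for illustration only.
FIELD_RULES = [
    ("rate", "valueQuantity.value", float),
    ("units", "valueQuantity.unit", str),
    ("patient", "subject.reference", lambda p: f"Patient/{p}"),
]


def apply_rules(source, rules):
    """Run every rule whose source field is present and build the target."""
    target = {}
    for field, path, transform in rules:
        if field in source:
            set_path(target, path, transform(source[field]))
    return target


mapped = apply_rules({"rate": "72", "units": "/min", "patient": "9"}, FIELD_RULES)
print(mapped)
```

Because each rule names its own target path, adding a deeper level of nesting only requires a longer path, not new traversal code.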
26. SPINMap: Rules for LOINC terminology
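A terminology rule of this kind amounts to a lookup from a local term to a LOINC code. The sketch below assumes the local vital type names shown (they are illustrative, not taken from the slides); the LOINC codes listed are the commonly used ones for these vital signs. It is a plain-Python stand-in, not the SPINMap rules themselves.

```python
LOINC_SYSTEM = "http://loinc.org"

# Local vital type name -> (LOINC code, display name).
# The local names on the left are assumptions made for this sketch.
VITAL_TYPE_TO_LOINC = {
    "PULSE": ("8867-4", "Heart rate"),
    "TEMPERATURE": ("8310-5", "Body temperature"),
    "RESPIRATION": ("9279-1", "Respiratory rate"),
    "HEIGHT": ("8302-2", "Body height"),
    "WEIGHT": ("29463-7", "Body weight"),
}


def to_loinc(local_name):
    """Return a coded concept for a local vital type, or None if unmapped."""
    key = local_name.strip().upper()
    if key not in VITAL_TYPE_TO_LOINC:
        return None  # surface unmapped terms instead of guessing a code
    code, display = VITAL_TYPE_TO_LOINC[key]
    return {"system": LOINC_SYSTEM, "code": code, "display": display}


print(to_loinc("pulse"))
```

Returning `None` for unmapped terms keeps gaps in the vocabulary alignment visible so they can be reviewed rather than silently miscoded.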
27. SPINMap Output: Linked Vitals
Same As
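Assuming the "Same As" label refers to equivalence links in the style of `owl:sameAs`, the effect of such links can be sketched by grouping identifiers into equivalence clusters. The identifiers below are hypothetical, and the union-find grouping is an illustration of the idea, not the mechanism used in the presentation.

```python
def same_as_clusters(links):
    """Group identifiers connected by same-as links into sorted clusters."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical links from each source vocabulary to a neutral concept.
links = [
    ("vista:PULSE", "neutral:HeartRate"),
    ("fhir:8867-4", "neutral:HeartRate"),
    ("vista:WEIGHT", "neutral:BodyWeight"),
]
clusters = same_as_clusters(links)
print(clusters)
```

Although no link directly connects the VISTA and FHIR terms for pulse, they end up in the same cluster because both are linked to the same neutral concept.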
28. Summary of Translation Approach
[Diagram: Linked Data translation pipeline]
Source Data: Different Syntax, Different Models, Different Vocabularies
– System A: Syntax A, Model A, Vocabulary A
– System B: Syntax B, Model B, Vocabulary B
Syntactic alignment (Fileman): Common syntax within model-flexible medium (Linked Data)
Semantic alignment: Rule-based mapping to a Neutral Model
– Model alignment
– Vocabulary alignment
Integrated Data: Common Syntax, Common Model, Common Vocabulary, Common Meaning
29. Linked Vitals: A step towards Linked Health
30. In the works...
Web-based automation
of semantic alignment
31. VISTA-FHIR web-based translation
The VISTA-FHIR prototype is a web-based application built with TopBraid and Semantic
Web Page technology. The application demonstrates semantic data integration of VistA
records and FHIR records.
The control bar shows the steps in the demonstration.
Sub-steps are shown as buttons. An “Explain” button is provided for each sub-step.
The number of vital signs records is shown across all patients in the dataset.
32. VISTA-FHIR: Retrieve VISTA Records
VISTA records are retrieved directly in RDF in a neutral ontology. For the purposes of
the demonstrator, only the vital signs records are processed.
The number of vital signs records is shown across
all patients in the dataset.
33. VISTA-FHIR: Retrieve FHIR Records
FHIR records are normalized using FHIR and other HL7-based ontologies
to a neutral model for healthcare patient vital signs, HPVS.
34. VISTA-FHIR: Run Inferences
Semantic alignment of VistA and FHIR is achieved using a combination of
transformation rules and inferencing based on SPARQL and SPIN.
The results of the inferencing are shown here
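The prototype performs this step with SPARQL and SPIN. As a rough stand-in, the sketch below shows the general shape of rule-based inferencing: a single rule (if property p is a sub-property of q, every statement using p also holds for q) applied repeatedly until no new facts appear. The property names under `vista:`, `fhir:`, `hpvs:`, and `neutral:` are hypothetical.

```python
def infer(triples):
    """Apply the sub-property rule to a fixpoint and return all facts."""
    facts = set(triples)
    while True:
        derived = set()
        for sub_prop, relation, super_prop in facts:
            if relation != "rdfs:subPropertyOf":
                continue
            for subject, predicate, obj in facts:
                if predicate == sub_prop:
                    derived.add((subject, super_prop, obj))
        if derived <= facts:
            return facts  # nothing new was derived, so stop
        facts |= derived


# Hypothetical alignment statements plus one record from each source.
facts = [
    ("vista:rate", "rdfs:subPropertyOf", "hpvs:value"),
    ("fhir:value", "rdfs:subPropertyOf", "hpvs:value"),
    ("hpvs:value", "rdfs:subPropertyOf", "neutral:measurement"),
    ("vista:v1", "vista:rate", 72),
    ("fhir:o1", "fhir:value", 72),
]
inferred = infer(facts)
```

The source statements are left untouched. The aligned view is derived alongside them, which is what lets each source model keep evolving independently.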
35. Appendix
Review of Linked Data and its features
as a semantic interoperability language
36. Information Models:
An apparent conflict between standardization and innovation?
Standardization: needs to remain static in order not to be disruptive for adopters.
• Static, Brittle
• Centralized
• General (Common Denominator)
• Committee-driven
• Large, “all-or-nothing”, disruptive updates
vs.
Innovation: requires continuous evolution of thousands of new information models.
• Adaptive / Evolutionary
• Decentralized
• Highly Specialized, “Best of breed”
• End-user / specialist-driven
• Small, continuous, low-impact updates
Linked Data: Accommodates both Standards and Innovation
What are the options?
Centralized Model-rigid approach: For exchange of information to
occur all models must remain fixed, and data must go through only one central
‘broker’ model. Technologies that support this method are HL7 and XML.
Decentralized Model-flexible approach: Multiple models peacefully
co-exist and evolve, mediated by their ability to freely link to any model at all
times. In this approach, all models are free to evolve AND are capable of
resolving to a common standard model at all times. The only technology that
currently supports this approach is RDF (Linked Data).
http://www.carlsterner.com/research/2009_resilience_and_decentralization.shtml
The current approach to healthcare data
Linked Data supports both standardization and innovation
37. Data Integration: Legacy vs. Linked Data Approach
Architectural Issues: Legacy Data (HL7, XML) vs. Linked Data (RDF)
• Function
– Legacy Data: Serialization format
– Linked Data: Data model
• Granularity
– Legacy Data: Message-centric (documents)
– Linked Data: Data-centric (data elements)
• Semantics
– Legacy Data: Weak semantics. Extrinsic to the data. Depends on an external data model.
– Linked Data: Strong semantics. Intrinsic to the data.
• Data model characteristics
– Legacy Data: Model-rigid architecture. Only the least common denominator of model unifies information; must remain unchanged to orchestrate. Restrictive expression.
– Linked Data: Model-flexible architecture. All data models may independently evolve. Maximizes expressivity.
• Data model compatibility
– Legacy Data: No model diversity permitted. A one-size-fits-all mega-model.
– Linked Data: Multiple models peacefully coexist. Data model agnostic.
• Data model evolution
– Legacy Data: Costly and difficult to evolve models, due to model-rigid architecture.
– Linked Data: Cheaper and easier to evolve models, due to model-flexible architecture.
• Data access method
– Legacy Data: Downloading + Aggregating
– Linked Data: Linking + Federating
• Scalability: incremental effort required to add new data sources
– Legacy Data: Common model must be updated.
– Linked Data: Individual models can be independently and incrementally semantically linked.