Patterns of Semantic Integration Riding the Next Wave April 2006 Dan McCreary President Dan McCreary & Associates [email_address] (952) 931-9198 Managed Metadata Solutions
Creative Commons 2.5 Attribution. You must attribute the work in the manner specified by the author or licensor. Noncommercial. You may not use this work for commercial purposes. Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
Patterns of Semantic Integration Our ever-increasing understanding of solid-state physics has allowed Moore’s Law to proceed unabated for the last 40 years. Exciting developments in quantum physics, nanotechnology and molecular self-assembly will continue this trend for the foreseeable future. But why is it that an instructor can’t quickly import a database of 10,000 subject-appropriate lesson plans and quiz items into their learning-management system and dynamically adjust classroom content and assessments to individual student learning styles and interests? The key to this and other computer-to-computer interoperability challenges lies in the difficulty computer systems have in finding and precisely exchanging data. Enter the Semantic Web. The designers of the current World Wide Web realized that the gateway to this does not require faster computers and networks but instead lies in the careful publishing and exchange of data semantics (or meaning) and the precise publishing of data-that-describes-data (metadata) in a machine-readable structure. This presentation will review patterns that researchers around the world are using to make the job of computer integration easier, allowing even ultimate frisbee™ coaches access to vast amounts of structured information.
Background for Dan McCreary Computer Consultant in Minneapolis Became obsessed at a young age with computer-to-computer communications Interested in OO, XML, semantics and business strategy
Pattern Themes We learn how to create and use models of the world to discover underlying patterns of nature Computer-to-computer communication also uses models and allows us to find underlying patterns to solve these problems
Agenda The steps required for precise exchange of information between computer systems Define “semantics” and key concepts in the semantic web HTML, XML, RDF Discuss limitations of current HTML web and XML Show how Semantic Web technologies solve many of these problems Semantic patterns Predictions References
1970 Sci-Fi Classic: “The Forbin Project” A New Intersystem Language! Lesson: Before you take over the world you must exchange  semantically precise  metadata!
Moore’s Law Creative Commons 1.0 Courtesy of Ray Kurzweil and Kurzweil Technologies, Inc
Thesis: We Need Semantics For the next revolution in computing We don’t need faster CPUs We don’t need larger hard drives We don’t need faster networks We don’t need more HTML linking We need to link our  concepts  using semantic technologies
The Agent Vision The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users. The Semantic Web  A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities  By Tim Berners-Lee, James Hendler and Ora Lassila
Overlapping Terminology Data Warehouse Data Mining Enterprise Application Integration (EAI) Metadata Discovery Statistical Analysis Pattern Discovery Relational Database Metadata Semantic Web Business Semantics Data Dictionary HTML Web
Computer Science Is About Abstraction Time Level of Abstraction 10100101 Machine Language MOV R0, A1 BNE F32C Assembly Language DO I=1, 100 I=I+1 FORTRAN Proc(i1, i2, o1) Structured Programming Object-oriented Programming XML GUI
Person to Person Dialog Sound Words Concepts Sentences Conversation Problem Solving higher abstraction
Computer to Computer Dialog Internet XML Tags Documents/XML Schema Graphs/Ontologies/RDF/OWL Semantic Integration Agents You Are Here
Semantic Triangle Concept Referent Refers To Symbolizes Stands For “cat” Physical Objects A pattern of neural activity in our brain Symbol Ogden, C. K., & Richards, I. A. (1923) The Meaning of Meaning “Katze” (German) “gato” (Spanish)
Symbols Can Only Directly Link to Concepts The link between a symbol and its referent is an INDIRECT link The path to the referent MUST pass through the Concept Only symbols can be transmitted between computers Ogden, C. K., & Richards, I. A. (1923) The Meaning of Meaning Concept Referent “cat” Symbol
The Problem of Semantic Ambiguity Did you say you were looking for  mixed nuts ? context=food context=hardware People use  context  to derive the correct meaning.
59 meanings of "run" "run" 18 noun "senses" 41 verb "senses" tally test footrace streak play … move fast scat go operate has form … "the kids ran to the store" "the Yankees scored a run in the bottom of the 9th" "The experiment ran for over an hour" "her run of luck was just starting" "she broke the mile run record" "the football 3rd down play was a run" "13 other noun meanings…" "I would run from a ticking bomb." "The path runs up the hill." "you need training to run this machine." "the movie plot runs like this." "36 other verb meanings…" Source: WordNet at http://wordnet.princeton.edu/ Context
Analogy: English Dictionary source:  www.m-w.com Note:  people use context to find the correct meaning. Term Metadata (data about data) Definitions
Word Senses “run” tally test footrace streak play move fast scat go operate has form duration A single word maps to many concepts
Synonym Ring <Person> Joe Smith </Person> <Individual> Joe Smith </Individual> <Human> Joe Smith </Human> Joe Smith Many symbols for the same object Refers To Symbolizes Stands For
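The one-word-to-many-concepts mapping can be sketched as a small lookup keyed by context. This is only an illustration: the context labels ("baseball", "track", and so on) are assumptions made up for the example, and the sense names are a handful of those listed on the slide, not WordNet's actual data structure.

```python
# Hypothetical sense inventory: one symbol ("run") mapped to several concepts.
# The context keys are invented for this sketch; the sense names come from the slide.
SENSES = {
    "run": {
        "baseball": "tally",
        "track": "footrace",
        "luck": "streak",
        "machinery": "operate",
    }
}

def resolve_sense(word, context):
    """Return the concept a word stands for in a given context, or None."""
    return SENSES.get(word, {}).get(context)

print(resolve_sense("run", "baseball"))   # tally
print(resolve_sense("run", "machinery"))  # operate
print(resolve_sense("run", "cooking"))    # None
```

Without a context the symbol alone cannot be resolved to a single concept, which is the ambiguity the slide describes.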
I’m Thinking of an Animal… It has four legs It has fur It chases mice It goes “meow” If you describe enough of the properties of a concept, you can have reasonable assurances that they are the same Note: since “concepts” are neural patterns in the brain the concept of “exact” is difficult to measure
Concept Linking Question: How can you tell if two concepts are the same if two systems don’t share the same symbol? Answer: If they have the same properties (and relationships) you can assume with reasonable probability they are the same concepts. symbol
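One way to picture "same properties, therefore probably the same concept" is a set-overlap score. The sketch below uses Jaccard similarity, which is a technique chosen for this illustration and not one named in the slides; the property names, values and the 0.75 threshold are all assumptions.

```python
def property_overlap(a, b):
    """Jaccard similarity between two sets of (property, value) pairs."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical descriptions held by two systems that share no symbol names.
system_a_cat = {("legs", 4), ("covering", "fur"), ("chases", "mice"), ("sound", "meow")}
system_b_gato = {("legs", 4), ("covering", "fur"), ("chases", "mice"),
                 ("sound", "meow"), ("tail", True)}
system_b_perro = {("legs", 4), ("covering", "fur"), ("chases", "cars"), ("sound", "woof")}

THRESHOLD = 0.75  # arbitrary cut-off chosen for this illustration

for label, other in [("gato", system_b_gato), ("perro", system_b_perro)]:
    score = property_overlap(system_a_cat, other)
    verdict = "probably the same concept" if score >= THRESHOLD else "probably different"
    print(label, round(score, 2), verdict)
```

The result is a probability-like score, not a proof of identity, which matches the slide's "reasonable probability" wording.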
Semantics is About Concept Linking Wouldn’t it be nice… If computers could  name  things internally or on a web site however they liked (keep using the current web) But we could always link those names back to a centralized database of  concepts Computers could do this  automatically  just like they translate domain names (www.google.com) into IP addresses (64.233.187.99) Then we could communicate precisely without dictating the names that are used inside a computer system or on a web page
HTML Sample <title> The Problem of Semantics </title> <p> This is a standard document that is sent between two computers using the <a href="http://w3c.org/Protocols"> HTTP </a> protocol. Note that other than the markup tags like <b> bold </b> there is very little that a computer can do to understand the meaning of the text. </p> Unless computers "understand" the words in the English language it will be very difficult for them to understand the meaning or semantics of the web.
What Computers "See" Today <title> </title> <p> <a href="http://w3c.org"> </a> <b> </b> </p> Unless computers "understand" the words in the English language it will be very difficult for them to understand the meaning or semantics of the web.
XML allows you to create new “tags” <PersonGivenName> Joe </PersonGivenName> <PersonFamilyName> Smith </PersonFamilyName> <Address> 123 Main Street </Address> <City> Anytown </City> <State> Minnesota </State> <Phone> (651) 555-1234 </Phone> Without a data dictionary, it is difficult to know what the meaning of the data elements is.  The tags appear in patterns but what they mean is still a mystery to a computer. <tag> </tag> data
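A short sketch of what a parser actually gets from such a document, using Python's standard `xml.etree.ElementTree`. The `<Contact>` wrapper element is an assumption added so that the fragment has a single root and parses; it is not on the slide.

```python
import xml.etree.ElementTree as ET

# The slide's elements, wrapped in a hypothetical <Contact> root so they parse.
doc = """<Contact>
  <PersonGivenName>Joe</PersonGivenName>
  <PersonFamilyName>Smith</PersonFamilyName>
  <City>Anytown</City>
</Contact>"""

root = ET.fromstring(doc)

# All the parser can report is a tag string and a text string per element.
pairs = [(child.tag, child.text) for child in root]
for tag, text in pairs:
    print(tag, "->", text)
```

The program recovers structure (which string is a tag, which is data) but nothing tells it that `PersonGivenName` denotes a person's first name; that knowledge has to come from a data dictionary.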
Which external computers may not understand <  >  </  > <  >  </  > <  >  </  > <  >  </  > <  >  </  > Without a “data dictionary”, it is difficult to know what the meaning of the data elements is.  The tags appear in patterns but what they mean is still a mystery to a computer.
Metadata Metadata is any data that describes  other  data Metadata is itself data and is stored in specialized structures (directed graphs) to aid comparison with other metadata A controlled store of metadata is called a “registry” Data describes RDBMS document keywords tables web navigation columns source-code org-chart product-specs Metadata
Hypertext Links and Data Element Links The Semantic Web Metadata Registry A Metadata Registry B The semantic web is about linking  conceptual  data elements in published  metadata registries The current HTML web is focused on linking published  documents  with  HTML The Hypertext Web
Enter the URI… Today's web allows documents to be accessed by people if people put links in between documents – the hypertext web But it is very difficult for machines to "understand" what we are saying and what we mean and what to do with the data But machines CAN determine if two URIs match: <SurName>Smith</SurName> <LastName>Smith</LastName> http://www.shared_dictionary.com/PersonGivenName MDR Hey, you both “mean” the same thing!
Subject-Verb-Object Triple Person “Joe” Has-a-Given-Name The person is named “Joe”. <PersonGivenName> Joe </PersonGivenName>
Triples are Almost all URIs http://MyDictionay/DataElement/Person “Dan” http://MyDictionay/DataElement/PersonGivenName URIs can point to a standard location in a metadata registry. The “type” of link.
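The URI-matching idea can be sketched with two lookup tables. Everything here is hypothetical: the `example.org` registry URIs and the two per-system tables are assumptions for illustration, standing in for whatever a real metadata registry (MDR) would publish.

```python
# Hypothetical per-system tables: local element name -> concept URI in a shared registry.
SYSTEM_A = {"SurName": "http://example.org/registry/PersonFamilyName"}
SYSTEM_B = {
    "LastName": "http://example.org/registry/PersonFamilyName",
    "FirstName": "http://example.org/registry/PersonGivenName",
}

def same_meaning(name_a, name_b):
    """Two local names mean the same thing if they resolve to the same URI."""
    uri_a = SYSTEM_A.get(name_a)
    uri_b = SYSTEM_B.get(name_b)
    return uri_a is not None and uri_a == uri_b

print(same_meaning("SurName", "LastName"))   # True
print(same_meaning("SurName", "FirstName"))  # False
```

The comparison is a plain string match on URIs, which is exactly the kind of test a machine can perform without understanding English.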
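A triple can be modelled as a three-field record in which the subject and predicate are URIs and the object is either a URI or a literal. The sketch below assumes a made-up `example.org` registry base and an invented person identifier; neither comes from the slides.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # URI of the thing described
    predicate: str  # URI naming the "type" of link
    obj: str        # URI or literal value

BASE = "http://example.org/DataElement/"  # hypothetical registry location

triples = [
    Triple(BASE + "Person/42", BASE + "PersonGivenName", "Dan"),
    Triple(BASE + "Person/42", BASE + "PersonFamilyName", "McCreary"),
]

def objects_of(subject, predicate):
    """Return every object linked to a subject by a given predicate."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]

print(objects_of(BASE + "Person/42", BASE + "PersonGivenName"))  # ['Dan']
```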
Sample RDF Document <?xml version="1.0"?> <RDF> <Description about="http://www.danmccreary.com/Training/Classes/Semantic_Web"> <author>Dan McCreary</author> <created>2006-01-01</created> <modified>2006-03-15</modified> </Description> </RDF>
Massive Databases of "Triple Stores" Triple store is: - A database with just 3 Columns - but millions/billions of rows May require specialized hardware Key Metrics: - Time to load triples into application - Time to save triples into database - Time to browse to an element - Time to configure system Sample Projects: Kowari 3Store Sesame RDF "Triple Store" See: http://simile.mit.edu/reports/stores/ Object Predicate Subject
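The sample document reads as three triples about one subject. A minimal sketch of extracting them with the standard library follows. The slide omits namespaces, so the version below adds the standard RDF namespace plus a hypothetical `http://example.org/terms/` vocabulary for `author`, `created` and `modified`; a real project would more likely use a dedicated RDF parser than hand-walk the XML.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/terms/"  # hypothetical vocabulary, not from the slide

rdf_xml = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
  <rdf:Description rdf:about="http://www.danmccreary.com/Training/Classes/Semantic_Web">
    <ex:author>Dan McCreary</ex:author>
    <ex:created>2006-01-01</ex:created>
    <ex:modified>2006-03-15</ex:modified>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(rdf_xml)
triples = []
for desc in root.findall(f"{{{RDF_NS}}}Description"):
    subject = desc.get(f"{{{RDF_NS}}}about")
    for prop in desc:
        # ElementTree writes tags as "{namespace}local"; join them into one URI.
        predicate = prop.tag.replace("{", "").replace("}", "")
        triples.append((subject, predicate, prop.text))

for t in triples:
    print(t)
```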
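The "three columns, many rows" description maps directly onto a single relational table. The sketch below uses an in-memory SQLite database as a stand-in; it is not how Kowari, 3Store or Sesame are implemented, and the `ex:` identifiers are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
# An index on (subject, predicate) keeps "browse to an element" fast as rows grow.
conn.execute("CREATE INDEX idx_sp ON triples (subject, predicate)")

rows = [
    ("ex:dan", "ex:givenName", "Dan"),
    ("ex:dan", "ex:familyName", "McCreary"),
    ("ex:bill", "ex:givenName", "Bill"),
]
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)

def browse(subject):
    """Return every (predicate, object) pair recorded for one subject."""
    cur = conn.execute(
        "SELECT predicate, object FROM triples WHERE subject = ? ORDER BY predicate",
        (subject,),
    )
    return cur.fetchall()

print(browse("ex:dan"))
```

Load time, save time and browse time from the slide's key metrics correspond here to the `executemany` call, the commit to disk (skipped for an in-memory database) and the `browse` query.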
Semantic Web Standards Stack Source: Tim Berners-Lee www.w3c.org http://www.w3.org/Consortium/Offices/Presentations/SemanticWeb/34.html URI/IRI Unicode XML Namespaces XML Query XML Schema RDF Model & Syntax Ontology (OWL) Rules/Query Logic Proof Trusted Semantic Web Signature Encryption
Example of Metadata Registry
Metaphor: The Translator Agent May I have a beer? Me gustaría una cerveza Customer (Spanish Only) Translation Service (Speaks Spanish and English) Internal Server (English Only) Coming right up!
Cost of Mapping Goal: create semantic maps to a few metadata standards, not many standards Mapping from each metadata registry (R1 … RN) to N other metadata registries: the O(N²) problem Mapping each registry (R1 … RN) to one metadata registry through an ESB (Enterprise Service Bus): the O(N) problem
Semantic Mappers and Semantic Brokers Report Request In Model A Gartner: Vocabulary-based transformation XMLA: XML for Analysis Metadata Translation Service XML Response In Model A TDS In Model B Metadata Registry Model A Model B Metadata Mappings RDF Queries XML Results Data Warehouse (RDBMS) SQL or XMLA Queries In Model B
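The difference between the two topologies is simple arithmetic. Counting one two-way mapping per connected pair (an assumption made here for simplicity), point-to-point integration needs a mapping for every pair of registries, while a hub needs one per registry:

```python
def point_to_point_mappings(n):
    """One mapping per pair of registries: n * (n - 1) / 2, which grows as O(N^2)."""
    return n * (n - 1) // 2

def hub_mappings(n):
    """One mapping from each registry to the shared registry: grows as O(N)."""
    return n

for n in (5, 10, 100):
    print(n, point_to_point_mappings(n), hub_mappings(n))
```

At 10 registries the counts are 45 versus 10; at 100 they are 4950 versus 100, which is why the slide recommends mapping to a few shared standards.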
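A metadata translation service of this kind can be sketched as a rename through a shared concept. The field names in both models and the `concept:` identifiers below are hypothetical; the slide does not specify any of them.

```python
# Hypothetical registry content: each model's field names mapped to shared concepts.
MODEL_A = {"GivenName": "concept:PersonGivenName", "FamilyName": "concept:PersonFamilyName"}
MODEL_B = {"first_nm": "concept:PersonGivenName", "last_nm": "concept:PersonFamilyName"}

def translate(record, source_map, target_map):
    """Rename a record's fields from one model to another via shared concepts.

    Fields with no mapping on either side are dropped.
    """
    concept_to_target = {concept: name for name, concept in target_map.items()}
    out = {}
    for name, value in record.items():
        target = concept_to_target.get(source_map.get(name))
        if target is not None:
            out[target] = value
    return out

request_in_a = {"GivenName": "Joe", "FamilyName": "Smith", "Nickname": "J"}
print(translate(request_in_a, MODEL_A, MODEL_B))
# {'first_nm': 'Joe', 'last_nm': 'Smith'}
```

Neither model needs to know the other's field names; each only maintains its own mapping to the registry, which is the O(N) arrangement.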
Wikipedia Rocks! It is currently burdensome to add new metadata to the registry Would like to add “Edit this data element” (ala Wikis) Ideally a “Semantic Wiki” See: Wikipedia: “Semantic Wiki”
Retrieving Data: An Evolution Shorten the time-to-report interval Allow users to "browse" data sets interactively Remove programmers with "backlogs" of reports Users frequently waited days, weeks or months to get a custom report created Monthly “Green Bar” Reports Browseable Graphical Interface (Cognos) Increasing Responsiveness
Classification and Categorization Whenever we decide to break the continuous observable world into a predefined list of categories, where each category has a label, we call each label a categorical value. These will then become the "dimensions" of our cube. "red" "green" "blue" George Lakoff: Women, Fire, and Dangerous Things: What Categories Reveal about the Mind Note: NO OVERLAP!
Metadata Discovery Tools that “scan” data sources and create new ontologies or mappings to existing ontologies Metadata Registry Data Source  Mappings Relational Database
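A very small version of a "scan" step can be sketched against SQLite, whose `sqlite_master` table and `PRAGMA table_info` expose table and column definitions. The `person` table is invented sample data, and the output format (a list of element/datatype dictionaries) is an assumption; real discovery tools also infer mappings to existing ontologies, which this sketch does not attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (given_name TEXT, family_name TEXT, birth_year INTEGER)")

def discover(conn):
    """List every table.column in the database as a candidate data element."""
    elements = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # table_info rows are (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, col_type, *_rest in conn.execute(f'PRAGMA table_info("{table}")'):
            elements.append({"element": f"{table}.{column}", "datatype": col_type})
    return elements

for element in discover(conn):
    print(element)
```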
Federated Ontologies What do you do when you have more than one Ontology? 1) Combine 2) Map 3) Federate Tools for combination and federation Multiple Overlapping Ontologies
Cost of Poor Semantics IT Departments spend 40-60% of their costs on Integration 90% of integration costs are due to poor semantics If every application used and "published" a machine-readable ontology with mappings to published ontologies, integration could be almost "automatic"
Gartner Metadata cast into formal logics will drive interoperability, automation, cost cutting, better search capabilities and new business opportunities. Semantic Web Drives Data Management, Automation and Knowledge and Discovery Alexander Linder March 2005 G00125145
Semantic Spectrum Time/Money High Semantic Clarity Strong Semantics Weak Semantics UML, XMI Taxonomies Ontologies Thesaurus RDF XML, XSLT See also: Wikipedia/semantic spectrum Glossaries OWL Controlled Vocabularies Word/HTML Concept Maps Enterprise Data Models
Structures for Increased Semantics HTML  PDF  Word PowerPoint Excel Access Server  XML  RDBMS  RDF  Taxonomies Ontologies SOA WSDL Increased Semantic Precision Source: Network Inference
Friend of a Friend A "Proof of Concept for RDF" Requires each person to put an RDF file on their web pages System in place to prevent spammers from getting e-mail accounts Sample RDF vocabulary Sample FoaF file: <foaf:Person> <foaf:name>Dan McCreary</foaf:name> <foaf:knows> <foaf:Person> <foaf:name>Bill Titus</foaf:name> </foaf:Person> </foaf:knows> </foaf:Person> © emode.com
Ontology Architectures One "big" ontology (see Cycorp cyc.com) Using a single "Uber-Ontology" Akin to "Boiling the Ocean" Compared to: Many smaller ontologies Micro-formats (RDF/A) How to combine? CYC contains over 3 Million "assertions" Source: cyc.com
If You Give A Kid A Hammer… … the whole world becomes a nail People solve problems with the tools they know Semantics are new tools for solving computer-to-computer communication problems Intelligent agents will be prevalent when we teach organizations to publish their metadata
Cognitive Styles The way we solve problems is dependent on the tools we know how to use. Shoshana Zuboff (1988) In the Age of the Smart Machine Technology creates: - new ways of thinking - new ways of approaching and solving problems - new sets of "Cognitive Styles" It is only if we share these cognitive styles that we will be able to create a coherent technology strategy that everyone understands
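The "knows" relationship in the sample file can be read out with the standard library. The slide shows only the inner elements, so the `rdf:RDF` wrapper and the namespace declarations below are added so the fragment parses; the two namespace URIs are the published RDF and FOAF ones.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

foaf_xml = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}">
  <foaf:Person>
    <foaf:name>Dan McCreary</foaf:name>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Bill Titus</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""

ns = {"foaf": FOAF}
root = ET.fromstring(foaf_xml)

# Collect (person, friend) name pairs from each top-level foaf:Person.
knows = []
for person in root.findall("foaf:Person", ns):
    name = person.findtext("foaf:name", namespaces=ns)
    for friend in person.findall("foaf:knows/foaf:Person", ns):
        knows.append((name, friend.findtext("foaf:name", namespaces=ns)))

print(knows)  # [('Dan McCreary', 'Bill Titus')]
```

Merging such pairs from many people's files is what would turn individual FOAF documents into a social graph.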
Open The Door To The Semantic Web! Metadata publishing is hard It is a foundation upon which the Semantic Web will be built The benefits are indirect and  need strong executive sponsorship Metadata publishing is no “silver bullet” I believe it is the most direct way to get to the Semantic Web This will be the most practical way to build intelligent agents Agents Metadata Publishing
Questions & Answers If software is ever going to be able to effectively inter-operate (in ways that were not explicitly preconceived and engineered), it will be because applications  share  enough of the semantics of their data elements. Doug Lenat, Cycorp Semantic Technology Conference 2005
