Creating Knowledge out of Interlinked Data




                      Intelligent Information
                            Management
                                 Collaborative Project 2010-2014
                         in Information and Communication Technologies



       Project No. 257943
       Start Date 01/09/2010




                                                                         http://lod2.eu
EU-FP7 LOD2 Project Overview . 02.09.2010 . Page 1                        http://lod2.eu



     The emerging Web of Data: achievements and challenges

• Web - a global, distributed platform for data, information and knowledge integration
• exposing, sharing, and connecting pieces of data, information, and knowledge on the
  Semantic Web using URIs and RDF
[Figure: snapshots dated July 2007, April 2008, September 2008 and July 2009]

Achievements
  1. Extension of the Web with a data commons (currently amounting to 25 billion facts)
  2. Vibrant, global RTD community
  3. Industrial uptake begins (e.g. BBC, Thomson Reuters, Eli Lilly)
  4. Emerging governmental adoption in sight
  5. Establishing Linked Data as a deployment path for the Semantic Web

Challenges
  1. Coherence: relatively few, expensively maintained links
  2. Quality: partly low-quality data and inconsistencies
  3. Performance: still substantial penalties compared to relational databases
  4. Data consumption: large-scale processing, schema mapping and data fusion still in its infancy
  5. Usability: missing direct end-user tools and network effect

These issues are closely related and should ultimately lead to an ecosystem of interlinked knowledge!
                              LOD2 in a Nutshell
Research focus
• Very large RDF data management
• Knowledge Enrichment & Interlinking
• Fusion & Information Quality
• Adaptive, semantic user interfaces
Use Cases
• Media & Publishing
• Enterprise Data Webs
• Open Gov Data
Main Result
• Integrated LOD2-Stack for Linked Data lifecycle management
Partners
Uni Leipzig, CWI, DERI Galway, FU Berlin, Semantic Web Company, OpenLink, Tenforce, Exalead, Wolters Kluwer, OKFN




               LOD2
               EC-funded collaborative project that aims to utilize the Web as an integration platform for
               data and information



               Linked Data
               Linked Data provides the necessary basic technologies and standards to realize the goal
               of LOD2.

               Linked Open Data
               publicly accessible data which is to be integrated into the web and linked among one
               another and with non-public contents such as enterprise intranets



               Project Highlights
               Open Government Linked Data Initiative
               Common European platform publicdata.eu

               Leading Web 3.0 technologies are combined in the project into the coherent LOD2
               stack (e.g. DBpedia, Virtuoso, Sindice, Silk)




WP1: Requirements, Design & LOD2 Stack Prototype

 Use Case High-Level Abstraction







     WP1: Use Case Objectives
Objective of WP7 (Media & Publishing, Wolters Kluwer Germany):
Supporting content-related production workflows in the media & publishing industry.

Objective of WP8 (Enterprise Applications, Exalead):
Applying Linked Data technologies in an enterprise stack to support Human Resources-related issues.

Objective of WP9 (Open Government Data, Open Knowledge Foundation):
Improving accessibility, findability and reusability of Open Government Data.







WP2: Storing & Querying Very Large Knowledge Bases

Goal:
 Enabling large-scale, feature-rich & enterprise-ready Linked Data
  management solutions


Database Partners in LOD2:
 CWI - Leading open source analytics RDBMS
 OpenLink - Leading Linked data deployment platform


Technological Excellence:
 Creating and publishing metrics for choosing RDF solutions
 Bringing Column Store Technology for Business Intelligence on RDF
 Ground-breaking database innovations for RDF stores
  (Dynamic Query optimization, Adaptive Caching of Joins, Optimized Graph
   Processing, Cluster/Cloud scalability)






WP2: Linked Open Data For Real In Your Apps

Business Advantages:
   Enrich your application with (free & rich) Linked Open Data
   RDF store technology has 10x lower deployment costs than relational for ragged data

Technological Flexibility:
  Deliver Schema-Last Flexibility and Inference at Relational Data Warehouse Cost and Performance
  Grow as you go: the LOD2 platform dynamically adapts to your usage patterns and the structure of your data
 Integrate, resolve, align anything: Schema, instance identity

Rich Features for complex Applications:
   Advanced SPARQL and SQL query processing
   SPARQL and SQL Federation
   Full Text, Geospatial, Text Search
   Scale-Out on Clusters, Replication





    WP3: Goals

General Goal:
      Creation, improvement, repair of knowledge bases


Focus:
      Very large knowledge bases, diverse knowledge, web data
      Refine existing triplification approaches (Virtuoso Sponger, RDF Views, Triplify, D2R)
      Improve schema of knowledge based on data
      Fix problems in knowledge bases e.g. inconsistencies


Techniques:
      semi-automatic machine learning, ontology debugging, NLP, shallow parsing etc.
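
To make the idea of triplification concrete, here is a minimal sketch that maps relational rows to N-Triples: each row becomes a subject IRI and each non-key column becomes one triple. It does not reproduce how Triplify, D2R or the Virtuoso Sponger work; the base namespace, table name and sample rows are hypothetical.

```python
from urllib.parse import quote

# Hypothetical base namespace; not taken from the LOD2 project.
BASE = "http://example.org/"

def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))

def triplify(table: str, key: str, rows: list[dict]) -> list[str]:
    """Map each row to a subject IRI and each non-key column to one triple."""
    triples = []
    for row in rows:
        subject = f"<{BASE}{table}/{quote(str(row[key]), safe='')}>"
        for column, value in row.items():
            if column == key or value is None:
                continue  # the key lives in the subject IRI; NULLs yield no triple
            predicate = f"<{BASE}vocab/{table}#{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return triples

# Invented sample rows standing in for a relational table.
rows = [
    {"id": 1, "name": "Leipzig", "country": "Germany"},
    {"id": 2, "name": "Galway", "country": None},
]
for line in triplify("city", "id", rows):
    print(line)
```

All values are emitted as plain string literals here; a fuller mapping would add datatypes and turn foreign keys into links between resources.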







WP3: Knowledge Base Improvement Cycle


Mutual Refinement Cycle (with optional Extraction phase)

[Figure: cycle diagram with the following stages]
• Extraction: from structured, semi-structured and unstructured sources
• Enrichment: definitions, disjointness
• Repair: modelling problems, performance problems
• Linkage Validation







WP3: Task Overview
Provenance-Aware Extraction of Linked Data from Existing Structured Formats
   Relational databases, spreadsheets, CMS, logs, XML documents
   Development of D2R, Triplify, Virtuoso Sponger

Provenance-Aware Extraction of Linked Data from Unstructured and Semi-Structured Sources
   HTML, PDF and Office documents with meta-data, wiki code, plain text
   Development of NLP2RDF and DBpedia

Knowledge Base Schema Enrichment
   Learn axioms in knowledge bases, e.g. disjointness, definitions, super-classes
   Development of ORE and DL-Learner

Knowledge Base Repair
   Fix inconsistencies, modeling problems, reasoning performance problems
   Development of ORE

Web Linkage Validator
   Reports whether knowledge base is suitable to be interlinked with others
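
As a toy illustration of schema enrichment, the sketch below proposes disjointness axioms between classes that share no instances. This is a deliberately naive heuristic under stated assumptions, not the learning approach of ORE or DL-Learner; the class names, instances and the `min_instances` threshold are invented for the example.

```python
from itertools import combinations

def disjointness_candidates(instances_by_class: dict[str, set[str]],
                            min_instances: int = 2) -> list[tuple[str, str]]:
    """Suggest class pairs that share no instance and each have enough instances."""
    candidates = []
    for a, b in combinations(sorted(instances_by_class), 2):
        members_a, members_b = instances_by_class[a], instances_by_class[b]
        if len(members_a) < min_instances or len(members_b) < min_instances:
            continue  # too little evidence to propose an axiom
        if members_a.isdisjoint(members_b):
            candidates.append((a, b))
    return candidates

# Invented sample knowledge base: class name -> set of instance identifiers.
data = {
    "Person": {"alice", "bob", "carol"},
    "City": {"leipzig", "galway"},
    "Author": {"alice", "carol"},
}
candidates = disjointness_candidates(data)
print(candidates)
```

Suggestions like these would still need human review, since missing overlap in the data does not prove that two classes are disjoint.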






WP4: Reuse, Interlinking and Knowledge Fusion (1)

Goal: Provide open-source software components for link generation, schema mapping, data quality assessment and knowledge fusion.







WP4: Reuse, Interlinking and Knowledge Fusion (2)


Technological Excellence:
   Ease the creation of RDF links using machine learning and a link quality assessment workbench
   Provide for the flexible integration of Web data based on mappings discovered on the Web
   Provide for assessing the quality of Web data and fusing high-quality data.
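
A minimal sketch of link discovery between two datasets follows. It uses a plain string-similarity threshold from the Python standard library as a stand-in for the machine-learning approach named above, and it is not the algorithm of any LOD2 tool; the URIs, labels and threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def discover_links(source: dict[str, str], target: dict[str, str],
                   threshold: float = 0.9) -> list[tuple[str, str, float]]:
    """For each source resource, keep its best-scoring target if it passes the threshold."""
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, best_uri, round(best_score, 2)))
    return links

# Hypothetical datasets: URI -> label.
source = {"http://example.org/a/Leipzig": "Leipzig",
          "http://example.org/a/Galway": "Galway City"}
target = {"http://example.org/b/1": "leipzig",
          "http://example.org/b/2": "Berlin"}

links = discover_links(source, target)
for s_uri, t_uri, score in links:
    print(f"<{s_uri}> <{OWL_SAME_AS}> <{t_uri}> .")
```

The score kept alongside each link is the kind of evidence a link quality assessment step could use to accept or reject a candidate.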


Expected Outcomes:
   Link discovery tools, linking assist and workbench
   Framework for publishing and discovering expressive mappings on the Web
   Data quality assessment framework providing for a wide range of different quality assessment policies
   Data fusion components providing various conflict resolution strategies
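
The idea of fusing conflicting values under a configurable policy can be sketched as below. The strategy names, the per-property policy format and the sample records are assumptions made for this example; they do not describe the interface of the LOD2 fusion components.

```python
from collections import Counter
from statistics import median

def most_frequent(values):
    """Return the value that occurs most often."""
    return Counter(values).most_common(1)[0][0]

# Hypothetical conflict resolution strategies, selectable per property.
STRATEGIES = {
    "vote": most_frequent,
    "max": max,
    "median": median,
    "first": lambda values: values[0],
}

def fuse(records: list[dict], policy: dict[str, str]) -> dict:
    """Merge several descriptions of one entity into a single record."""
    fused = {}
    properties = {prop for record in records for prop in record}
    for prop in sorted(properties):
        values = [record[prop] for record in records if prop in record]
        strategy = STRATEGIES[policy.get(prop, "first")]
        fused[prop] = strategy(values)
    return fused

# Invented sample: three sources describing the same city, with conflicts.
records = [
    {"name": "Leipzig", "population": 515000},
    {"name": "Leipzig", "population": 522000},
    {"name": "Leipzg", "population": 518000},
]
result = fuse(records, {"name": "vote", "population": "median"})
print(result)
```

Keeping the strategies in a lookup table means a new policy only needs one more entry, which matches the goal of offering various conflict resolution strategies side by side.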







WP5: Linked Data Visualization, Browsing and Authoring (1)

WP5 aims to build on and go beyond existing approaches for realizing adaptive Web
user interfaces by:

   Automatic content adaptation

A subset of the domain knowledge is identified, possibly through some reasoning
mechanism, as relevant to the current user and context.

   Adaptive Browsing

Faceted spatial semantic browsing: reusable component for browsing spatial content
in a faceted way (also for mobiles).
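
Faceted spatial browsing can be sketched in a few lines: restrict the items to a map bounding box, count the values of each facet, then narrow by a selected facet value. The data model, field names and approximate coordinates below are assumptions for illustration, not the design of the WP5 component.

```python
from collections import Counter

def within_bbox(item, bbox):
    """True if the item lies inside (min_lat, min_lon, max_lat, max_lon)."""
    min_lat, min_lon, max_lat, max_lon = bbox
    return min_lat <= item["lat"] <= max_lat and min_lon <= item["lon"] <= max_lon

def facet_counts(items, facets):
    """Count how many items carry each value of each facet."""
    return {facet: Counter(item[facet] for item in items if facet in item)
            for facet in facets}

def apply_filters(items, selected):
    """Keep only items matching every selected facet value."""
    return [item for item in items
            if all(item.get(facet) == value for facet, value in selected.items())]

# Invented sample items with approximate coordinates.
places = [
    {"label": "Leipzig Zoo", "type": "Zoo", "country": "DE", "lat": 51.35, "lon": 12.37},
    {"label": "Leipzig University", "type": "University", "country": "DE", "lat": 51.34, "lon": 12.38},
    {"label": "NUI Galway", "type": "University", "country": "IE", "lat": 53.28, "lon": -9.06},
]

visible = [p for p in places if within_bbox(p, (51.0, 12.0, 52.0, 13.0))]
counts = facet_counts(visible, ["type", "country"])
universities = apply_filters(visible, {"type": "University"})
print(counts)
print([p["label"] for p in universities])
```

Recomputing the counts after each filter is what lets an interface show how many results each remaining facet value would yield.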







WP5: Linked Data Visualization, Browsing and Authoring (2)

Adaptive Semantic Authoring


     semantic widget interface: allows the creation of small reusable interface components for domain-specific user interfaces (also for mobiles).

     adaptive widget choreography: enables the automatic generation of user interfaces.

     social networking interfaces: enable users to subscribe to arbitrary information adhering to certain semantically defined filter criteria.







WP5: Technologies & Methods (1)

   Semantic pipes:

An engine and graphical
environment for general Web
Data transformations and Mashup.


   Sig.ma:

A service and an end-user
application to access the
Web of Data as an integrated
information space.







WP5: Technologies & Methods (2)

Site Services: Site Search and Site Widgets
   Widgets (right) provide relevant information,
    from Sindice, about the topic of the site.

   Site search (below) provides a rich faceted-browsing
    functionality of the site’s widgets.







WP6: Interfaces, Integration & LOD2 Stack (1)

This work package deploys the LOD2 stack, based on the requirements and prerequisites defined in work package 1.


The LOD2 stack will be made available as downloadable packages.

While leading work package 6, the following will be delivered:

   Integrated user-interface components
   Integrated LOD2 Stack API components
   Evaluation, Documentation, Tutorials







WP6: Interfaces, Integration & LOD2 Stack (2)

Output WP1: requirements and prerequisites from the three use cases
• Use Case Media & Publishing
• Use Case Enterprise Data Web
• Use Case Government Data

SAF (Software Assembly Factory)
• Starts on 09/2011
• Generates packages with integrated tools
• Yearly releases

Open Source Package LOD2 stack
• Released on 09/2012, 09/2013 and 09/2014

Applied on
• WP7: Media & Publishing
• WP8: Enterprise Data Web
• WP9: Open Government Data






WP7: LOD2 for Publishers

Wolters Kluwer Deutschland (WKD):
“Semantic Technologies and Standards are an enabler for the media and publishing
industry to create added-value for their customers with reasonable costs.“

WKD is part of Wolters Kluwer B.V.

WKD Legal & Regulatory
Companies/Brands               Products (Examples)
- Carl Heymanns Verlag         - IP, Administrative Law
- Luchterhand                  - Civil, Family, Labor Law
- Werner Verlag                - Construction Law
- Carl Link                    - Publications for Schools/KiTas
- CW Haarfeld                  - Public Health Insurance
- Deutscher Wirtschaftsdienst  - Magazine „Personalwirtschaft“ (HR Management)
- AnNoText                     - SW for Lawyers and Notaries
- Trigon Data

WKD Tax & Accounting
Companies/Brands                          Products (Examples)
- Akademische Arbeitsgemeinschaft Verlag  - Tax SW for Consumers
- Addison Group                           - SW for Tax Accountants
- Schleupen Tax                           - SW for SMEs with focus on
- Wago Curadata                             Controlling and Accounting

Customer orientation
- Lawyers
- Tax Accountants
- Corporations and SMEs
- Financial institutions
- Health Providers
- Public Sector

Worldwide reach
- Europe
- North America
- Asia/Pacific

Economic success
- Revenue 2009 EUR 3,4 bln.
- 18.000 Employees
- Listed Amsterdam SE






WP7: WKD as a Consumer of LOD Data


Content Supply Chain of Wolters Kluwer Deutschland (WKD)
Stages: Content Acquisition, Editing, Composing/Bundling, Publishing/Interfacing, Sales, Customer Service, Customer

Content Acquisition
Acquisition of LOD governmental data
- Laws & Regulations
- Court cases
- Administrative Rulings
- Statistical information
Based on:
- Adequate delivery format
- Adequate metadata
- Adequate Licensing and IPR

Content Enrichment
Enrichment of WKD data
- Enrichment with additional metadata from the LOD cloud
- Automatic Interlinking within WKD data, but also into the LOD cloud
Based on:
- Adequate delivery format
- Adequate metadata
- Adequate functionality
- Adequate Licensing and IPR

Enterprise Applications
Data integration in Enterprise and other Customer Applications
- Integration of customer and WKD data with data from the LOD cloud
- Development of new services, e.g. around metadata economics
Based on:
- Adequate functionality
- Adequate APIs
- Adequate Licensing and IPR





WP7: WKD as a Publisher of LOD Data


Content Supply Chain of Wolters Kluwer Deutschland (WKD)
Stages: Content Acquisition, Editing, Composing/Bundling, Publishing/Interfacing, Sales, Customer Service, Customer

Cloud - Publishing
Development of WKpedia
- Publishing of enriched governmental information
- Publishing of legal domain thesauri
- Motivating contextualisation in LOD cloud
Based on:
- Adequate functionality
- Adequate APIs
- Adequate Licensing and IPR

Marketing measures
Integration in overall marketing strategy of WKD
- Dissemination of LOD2 in media and publishing sector
- Launching surveys
- Permanent information of customers
- Sponsoring of conferences
Based on:
- Clear scope of LOD2 project to support future publishing paradigms





WP8: Towards Linked Enterprise Data Webs (1)
• Linked Enterprise Intra Data Webs can fill the gap between Intra-/Extranets and ERP systems
• They facilitate data integration along value chains within and across enterprises
• The pragmatic, incremental, vocabulary-based Linked Data approach reduces data integration costs significantly

Objectives:
• Promote openness and standards in enterprise data workflows and applications

[Figure label: Web publishing]







WP8: Linked Enterprise Data Use Case Scenario (2)

Wage policy EBI:
     Build an application for surveying wage policy in a company, domain, sector, region, etc.


Scenarios:
     A company wants to know if its wage policy is consistent with the market (in similar and related companies and sectors).

     A job applicant would like an idea of the wage to expect given their expertise, profile and educational background.

     A governmental agency would like to survey the salaries in a particular region according to an economic branch and other parameters.






WP8: Linked Enterprise Data (3)


Targeted service:

        A SaaS service with different levels of subscription.

        The service is a mashup of payroll and HR data of enterprises subscribing to the service, used to build an index store of data facts about wages.

        Different consolidation parameters and key performance indicators (KPI) will be studied to provide relevant reports and visualisation interfaces.

        Integration of external datasets in a particular survey: public datasets in the web cloud or private datasets of participating companies.

        Privacy issues management: make private and nominative data anonymous.
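
The combination of privacy handling and wage indicators described above can be sketched as follows. The record fields, the salt and the minimum group size are assumptions for illustration. Note that salted hashing is pseudonymisation rather than full anonymisation, so a real service would need stronger safeguards.

```python
import hashlib
from collections import defaultdict
from statistics import median

def pseudonymise(employee_id: str, salt: str) -> str:
    """Replace an identifier with a salted, truncated SHA-256 digest."""
    return hashlib.sha256((salt + employee_id).encode("utf-8")).hexdigest()[:16]

def anonymise(record: dict, salt: str) -> dict:
    """Drop nominative fields and keep only what the survey needs."""
    return {
        "pid": pseudonymise(record["employee_id"], salt),
        "sector": record["sector"],
        "region": record["region"],
        "wage": record["wage"],
    }

def median_wage_by(records: list[dict], key: str, min_group_size: int = 2) -> dict:
    """Median wage per group; small groups are suppressed to limit re-identification."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["wage"])
    return {group: median(wages) for group, wages in groups.items()
            if len(wages) >= min_group_size}

# Invented payroll records from hypothetical subscribing enterprises.
payroll = [
    {"employee_id": "e1", "name": "A. Example", "sector": "IT", "region": "Saxony", "wage": 4000},
    {"employee_id": "e2", "name": "B. Example", "sector": "IT", "region": "Saxony", "wage": 5000},
    {"employee_id": "e3", "name": "C. Example", "sector": "Retail", "region": "Saxony", "wage": 2500},
]

anonymous = [anonymise(record, salt="demo-salt") for record in payroll]
by_sector = median_wage_by(anonymous, "sector")
print(by_sector)
```

The same aggregation could be keyed on region or any other consolidation parameter to produce the indicators shown in reports.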



        WP8: Linked Enterprise Data (4)
             Preliminary overview:

[Figure: architecture diagram; component labels:]
• Employees and HR database
• Payroll software
• Taxonomies
• Data crawler
• Data LODification and anonymisation
• Data cleaning and uniformisation
• Data enrichment, annotation
• Data consolidation
• RDF store
• SPARQL endpoint
• Indexer
• Full text index
• Search and EBI interface



WP9: Open Government Data Use Case

publicdata.eu - find and reuse datasets from local, regional and national public
bodies across Europe from a single place







WP9: Who is this for?

   Data literate citizenry
   Data journalists
   Policy experts
   Decision makers
   Mobile and web developers
   Academics / researchers
   Public bodies
   Companies
   Civic society / NGOs
   And so on...







WP9: What will it involve?


     Enabling exchange of metadata between different data catalogues


     Aggregating datasets from existing data catalogues


     Creating a European community of reusers to improve metadata


     Creating mechanisms for capturing derived / related datasets


     Bridging language and topical gaps to associate related information from all Member States
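
Aggregating dataset metadata from several catalogues could look roughly like the sketch below: each catalogue's fields are renamed to a shared schema, and records pointing at the same dataset URL are merged. The field names, the field mappings and the catalogue identifiers are hypothetical and do not describe how publicdata.eu is built.

```python
def normalise(record: dict, catalogue: str, field_map: dict[str, str]) -> dict:
    """Rename catalogue-specific fields to a shared schema and note the source."""
    common = {target: record[source]
              for source, target in field_map.items() if source in record}
    common["catalogues"] = [catalogue]
    return common

def aggregate(records: list[dict]) -> list[dict]:
    """Merge records that describe the same dataset, identified by its URL."""
    merged = {}
    for record in records:
        key = record["url"].rstrip("/")
        if key in merged:
            existing = merged[key]
            existing["catalogues"].extend(record["catalogues"])
            for field, value in record.items():
                existing.setdefault(field, value)  # fill gaps, keep the first value seen
        else:
            merged[key] = {**record, "catalogues": list(record["catalogues"])}
    return list(merged.values())

# Two hypothetical catalogues describing the same dataset with different field names.
a = normalise({"title": "Air quality", "link": "http://example.org/data/air/"},
              "catalogue-a", {"title": "title", "link": "url"})
b = normalise({"name": "Air quality measurements",
               "homepage": "http://example.org/data/air", "licence": "CC-BY"},
              "catalogue-b", {"name": "title", "homepage": "url", "licence": "license"})

datasets = aggregate([a, b])
print(datasets)
```

Recording which catalogues contributed to each merged entry keeps the provenance visible, which helps when the community later corrects or improves the metadata.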







WP10: Training, Dissemination, Community Building & Fertilization

The general aim of this work package is to establish a worldwide focal point for
academic and industry parties interested in contributing to or taking advantage of
the novel Linked Data methodologies and components, which will emerge in the
project.

In particular, our activities will be targeted at:

      informing the community of the state-of-the-art developments taking place in the field,

      disseminating the project results in order to foster community building and to create an impact on industry and research in Europe and worldwide,

      providing training to interested audiences in the technologies developed throughout the project.



WP10: Tasks & Timeline
Task 10.1 Training
(M14 – M37)

   Internal face-to-face training
   External training
   PhD programme



Task 10.2 Dissemination, Community Building & Cross-Fertilization
(M1 – M48)

   Scientific dissemination
   Industrial dissemination
   Online marketing activities across all identified target groups






 WP10: LOD2 Dissemination Resources


 Website:          http://lod2.eu

 Weblog:           http://lod2.eu/BlogPost

 Twitter:          http://twitter.com/lod2project

 SlideShare: http://www.slideshare.net/lod2project

 PUBLINK:          http://lod2.eu/Article/Publink.html




Remark: please use #lod2 on Twitter for your posts and connect with the account lod2project. Many thanks in advance!




PubLink – LOD2’s Linked Open Data Starter Service
• PubLink helps selected organizations with a focused consulting effort of 10-15 days
  to publish and make use of Linked Data

• PubLink helps to evaluate the LOD2 technologies and to increase the wealth of
  Linked Data

• Yearly application deadline in Winter

• 2011 PubLink participants include:

    1.   Umweltbundesamt GmbH, Austria
    2.   Greater London Authority
    3.   Deutsch Bibliographie, Historische Kommission
    4.   The Parliament of Finland
    5.   City of Vienna
    6.   Instituto Canario de Estadística (ISTAC)

• See: http://lod2.eu/Article/Publink.html





WP11: Exploitation and Standardization (1)


Objectives:
     Realizing the vision of the LOD2 project and use case studies
     Standardisation of LOD2 architecture
     Exploitation of knowledge and technical results



Exploitation:
     Use case studies and the industrial and end-user community partners will drive the exploitation.
     Tracking important technical and commercial developments in information retrieval and data management, including news and media.
     Publish exploitation plan identifying opportunities, benefits and impact of LOD2 consortium.







WP11: Exploitation and Standardization (2)


Intellectual Property Rights (IPR):
   Core components of the LOD2 stack will be published under an open-source license.
   Domain adaptations of the LOD2 stack considered on a case-by-case basis to protect IPR.
   Strategy ensures that all components of LOD2 are royalty-free.


Standardization:
   Actively participating in appropriate standards bodies.
   Establishing a W3C Linked data interest group.


Orchestration with other projects:
   Encourage take-up of LOD2 technologies by other projects.
   Foster input from other EU projects relevant for the development of LOD2







WP12: Fact Sheet

Project
   Instrument:        Large-scale Integrating Project
   Objective:         Intelligent Information Management
   Call:              FP7-ICT-2009-5
   Duration:          09/2010 – 08/2014

Means
   Total Budget:      8,58 M€
   Total Funding:     6,45 M€
   Total Resources:   844 PM




Consortium (10 partners from 7 European countries)
     Universität Leipzig (Coordinator)
     Centrum Wiskunde & Informatica
     National University of Ireland in Galway
     Freie Universität Berlin
     OpenLink Software
     Semantic Web Company
     TenForce
     Exalead
     Wolters Kluwer Deutschland
     Open Knowledge Foundation







Contact

Address

University of Leipzig
Faculty of Mathematics and Computer Science
Institute of Computer Science
Department of Business Information Systems
Postfach 100920
04009 Leipzig
Germany

Coordinator

Dr Sören Auer
Scientific Project Leader
Phone: +49 (341) 97-32367
Fax: +49 (341) 97-32329
Email: auer@uni-leipzig.de
http://www.informatik.uni-leipzig.de/~auer

Nadine Jänicke
Project Manager
Phone: +49 (341) 97-32310
Fax: +49 (341) 97-32329
Email: jaenicke@uni-leipzig.de
http://bis.informatik.uni-leipzig.de/NadineJaenicke




Thanks for your attention!



Editor's Notes

  1. The LOD2 consortium partners bring the essential knowledge and tools to build the LOD2 stack …