A Semantic Big Data Companion
Stefano Bortoli
bortoli@okkam.it
Flavio Pompermaier
pompermaier@okkam.it
The company (briefly)
• Okkam is
– an SME based in Trento, Italy.
– started as a spin-off of the
University of Trento and FBK (2010)
• Okkam's core business is
– large-scale data integration using
semantic technologies and
an Entity Name System
• Okkam's operating sectors
– Services for public administration
– Services for restaurants (and more)
– Research projects
• FP7, H2020, and local agencies
Who we are
• Stefano Bortoli, PhD
– works as technical director and researcher at Okkam S.R.L.
(Trento, Italy). His research and development interests are in the
area of Information Integration, with a special focus on
entity-centric applications that exploit semantic technologies.
• Flavio Pompermaier, MSc.
– works as senior software engineer at Okkam S.R.L. (Trento, Italy).
Flavio is a passionate developer working with state-of-the-art
technologies, combining semantic and big data technologies.
What we do
Why we need Flink
[Diagram: Entiton data model. Source labels: database record, RDF statement. Storage labels: triplestore, NoSQL & index, expensive data warehouse. Quad fields: provenance IRI, predicate, object, object type, subject local IRI, subject ENS IRI, RDF type.]
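The field labels on this slide (provenance IRI, predicate, object, object type, subject local IRI, subject ENS IRI) suggest a quad-like record. The following is a minimal sketch of such a record, not Okkam's actual schema: the class name `Quad`, the helper `group_by_entity`, the field names, and the idea that statements sharing an ENS IRI form one entity are all assumptions made for illustration. The "RDF type" label is left out because the slide does not say where it attaches.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Quad:
    """One statement plus its provenance (hypothetical layout)."""
    subject_local_iri: str  # identifier of the subject in its source dataset
    subject_ens_iri: str    # global identifier from the Entity Name System
    predicate: str
    object_value: str
    object_type: str        # e.g. a datatype IRI, or a marker for IRI objects
    provenance_iri: str     # where the statement came from


def group_by_entity(quads: Iterable[Quad]) -> Dict[str, List[Quad]]:
    """Collect statements by ENS IRI (assumed to identify one entity)."""
    entities: Dict[str, List[Quad]] = defaultdict(list)
    for quad in quads:
        entities[quad.subject_ens_iri].append(quad)
    return dict(entities)
```

Under these assumptions, two records that describe the same person in different sources keep their own local IRIs and provenance but land in the same group because they share an ENS IRI.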
Why we are here
• We want to build and manage (very) large
entity-centric knowledge bases
• We have backed Flink as our data processing framework
since its Stratosphere days (during the DOPA FP7 project)
• Our use cases for Apache Flink:
– Domain reasoning (Flink + Parquet + Thrift)
– RDF data lifecycle (Flink + Parquet + Jena/Sesame)
– RDF data intelligence (Flink + ELKiBi)
– Duplicate record detection (Flink + HBase + Solr)
– Entiton record linkage (Flink + MongoDB + Kryo)
– Telemetry analysis (Flink + MongoDB + Weka)
Come to our session!
• We are the last to present, don't leave us ALONE!
• We are hiring! (maybe ;-)

Flink Case Study: OKKAM