DBpedia's Triple Pattern Fragments

Ruben Verborgh

Building applications with 
Linked Data from the Web 
should become as realistic 
and easy as using Web APIs.

<a class="twitter-timeline"
href="https://twitter.com/twitterdev"
data-widget-id="YOUR-WIDGET-ID-HERE">
Tweets by @twitterdev
</a>
<script>
window.twttr=(function(d,s,id){var
js,fjs=d.getElementsByTagName(s)[0],
t=window.twttr||{};
if(d.getElementById(id))return t;
js=d.createElement(s);js.id=id;js.src=
"https://platform.twitter.com/widgets.js";
fjs.parentNode.insertBefore(js,fjs);
t._e=[];t.ready=function(f)
{t._e.push(f);};return t;}
(document,"script","twitter-wjs"));
</script>
Tomasz Pluskiewicz @tpluscode · Jan 28
I finally really understood @LDFragments videolectures.net/iswc2014_verbo… via @videolectures
Miel Vander Sande @Miel_vds · Jan 23
There’s a @LDFragments #Java client bit.ly/1yWVCWW & server bit.ly/1AXPpVK. Under develop, help welcome! #semweb #LinkedData
Ruben Verborgh @RubenVerborgh · Jan 22
My #ISWC2014 talk “Querying datasets on the Web with high availability”: explains #RDF #API trade-offs. videolectures.net/iswc2014_verbo… @LDFragments
Ruben Verborgh @RubenVerborgh · Jan 12
“The @LDFragments server design lets us publish and query @OrgRef data using a free Heroku instance.” - @tedlawless lawlesst.github.io/notebook/orgre…
Ted Lawless @tedlawless · Jan 11
Notebook: mapping @OrgRef to RDF and publishing with @LDFragments. Next up matching to @VIVOcollab instances. lawlesst.github.io/notebook/orgre…
LinkedDataFragments @LDFragments · Dec 17
First Triple Pattern Fragments server running on Heroku, set up by @tedlawless. github.com/LinkedDataFrag… #affordable #LinkedData
LinkedDataFragments @LDFragments · Nov 26
Querying Linked Data in HTML5 has never been easier: via a simple tag, you can query in a streaming or polling way github.com/tomayac/polyme…

The easy part is covered.
All triples look the same: 
subject – predicate – object.
That’s a major advantage 
over 15,000+ different APIs.

But is it also realistic?
More than half of public SPARQL endpoints 
have less than 95% availability.
Source: Buil-Aranda, Hogan, Umbrich, Vandenbussche. 
“SPARQL Web-Querying Infrastructure: Ready for Action?”

We design simpler server interfaces 
to publish Linked Data at low cost,
making clients solve complex queries.
We host a DBpedia interface,
and hope to inspire you to 
build cool DBpedia applications.

About DBpedia’s fragments
Usage analysis so far
Fragments in the future

Classic Linked Data publishing: 
simple client, hard-working server.
client server
SELECT * WHERE {
  ?person a dbpedia-owl:Writer;
          rdfs:label ?name;
          dbpedia-owl:almaMater [ rdfs:label "Trinity College, Dublin"@en ].
  FILTER LANGMATCHES(LANG(?name), "EN")
}

SPARQL endpoints have to work hard 
in comparison to other servers.
server
highly individualized requests
low cacheability
large per-request processing cost

SPARQL endpoints might bring 
high costs for possibly low gain.
server
“Fine, we’ll publish a data dump.”
Data dumps don’t allow 
Web applications on live data.
You already provide data for free; 
is it realistic to pay for users’ queries, too?

There’s more to Linked Data publishing 
than just the two extremes.
data dump: high client effort
Triple Pattern Fragments: in between
SPARQL endpoint: high server effort

Future Linked Data publishing: 
simple servers, clever clients.
client server
?person a dbpedia-owl:Writer.
Triple Pattern 
Fragments
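A rough sketch of what the request on this slide could look like from the client's side. The helper `fragmentUrl` and the query parameter names `subject`, `predicate`, and `object` are assumptions for illustration; a real client should read the request form from the controls inside each fragment rather than hard-code it.

```javascript
// Hypothetical helper: build the URL of the fragment for one triple pattern.
// Unbound positions (variables) are simply left out of the query string.
// The parameter names are an assumption, not taken from the slides.
function fragmentUrl(base, { subject, predicate, object } = {}) {
  const url = new URL(base);
  if (subject) url.searchParams.set('subject', subject);
  if (predicate) url.searchParams.set('predicate', predicate);
  if (object) url.searchParams.set('object', object);
  return url.toString();
}

// The pattern "?person a dbpedia-owl:Writer." with prefixes expanded.
const writersUrl = fragmentUrl('http://fragments.dbpedia.org/2014/en', {
  predicate: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
  object: 'http://dbpedia.org/ontology/Writer',
});

console.log(writersUrl);
```

Because every client asking for writers ends up with the same URL, the response is easy to cache.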

Triple Pattern Fragment servers 
cannot work hard, by definition.
server
highly reusable requests
high cacheability
low per-request processing cost

Complex queries are efficiently 
executed on the client-side.
client
SELECT * WHERE {
  ?person a dbpedia-owl:Writer;
          rdfs:label ?name;
          dbpedia-owl:almaMater dbpedia:Trinity_College_Dublin.
}
Query now at fragments.dbpedia.org.
In your browser—in pure JavaScript!
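To give a feel for how a client might solve such a query using only triple-pattern lookups, here is a minimal sketch. It is not the actual client algorithm: an in-memory array stands in for the server, the `ex:` and `dbo:` names and the sample triples are made up, and the strategy (start with the pattern that has the fewest matches, then fill in its bindings) is a simplified assumption.

```javascript
// Hypothetical sample data standing in for a fragments server.
const triples = [
  ['ex:Wilde', 'rdf:type', 'dbo:Writer'],
  ['ex:Wilde', 'rdfs:label', 'Oscar Wilde'],
  ['ex:Wilde', 'dbo:almaMater', 'ex:Trinity'],
  ['ex:Austen', 'rdf:type', 'dbo:Writer'],
  ['ex:Austen', 'rdfs:label', 'Jane Austen'],
];

const isVar = term => term.startsWith('?');

// "Server" side: return all triples matching one pattern.
function fragment(pattern) {
  return triples.filter(t => pattern.every((term, i) => isVar(term) || term === t[i]));
}

// Substitute already-bound variables into a pattern.
function bind(pattern, binding) {
  return pattern.map(term => (isVar(term) && term in binding ? binding[term] : term));
}

// "Client" side: solve a list of patterns with pattern lookups only.
function solve(patterns, binding = {}) {
  if (patterns.length === 0) return [binding];
  const bound = patterns.map(p => bind(p, binding));
  // Pick the pattern with the fewest matches (a server could report this count).
  const counts = bound.map(p => fragment(p).length);
  const best = counts.indexOf(Math.min(...counts));
  const rest = bound.filter((_, i) => i !== best);
  const results = [];
  for (const triple of fragment(bound[best])) {
    const next = { ...binding };
    let consistent = true;
    bound[best].forEach((term, i) => {
      if (!isVar(term)) return;
      if (term in next && next[term] !== triple[i]) consistent = false;
      next[term] = triple[i];
    });
    if (consistent) results.push(...solve(rest, next));
  }
  return results;
}

const answers = solve([
  ['?person', 'rdf:type', 'dbo:Writer'],
  ['?person', 'rdfs:label', '?name'],
  ['?person', 'dbo:almaMater', 'ex:Trinity'],
]);
console.log(answers);
```

Starting from the most selective pattern keeps the number of follow-up lookups small, which is why a total count per fragment is useful to the client.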

data (paged)
controls (other fragments)
metadata (total count)
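The three parts of a fragment can be pictured with a small sketch. The function `buildFragment`, the field names, the page size, and the `page` URL parameter are all assumptions for illustration; the plain objects stand in for whatever representation the server actually returns, and only a next-page link is modelled among the controls.

```javascript
// Hypothetical page size; real servers choose their own.
const PAGE_SIZE = 2;

// Build one page of a fragment from all triples that match a pattern.
function buildFragment(matches, page, fragmentUrl) {
  const start = (page - 1) * PAGE_SIZE;
  const hasNext = start + PAGE_SIZE < matches.length;
  return {
    // data: only the triples on this page
    data: matches.slice(start, start + PAGE_SIZE),
    // metadata: the total number of matches, not just this page
    metadata: { totalCount: matches.length },
    // controls: links that lead to other fragments (here: the next page)
    controls: { nextPage: hasNext ? `${fragmentUrl}&page=${page + 1}` : null },
  };
}

const matches = [
  ['ex:a', 'rdf:type', 'dbo:Writer'],
  ['ex:b', 'rdf:type', 'dbo:Writer'],
  ['ex:c', 'rdf:type', 'dbo:Writer'],
];
const first = buildFragment(matches, 1, 'http://example.org/data?object=dbo%3AWriter');
const second = buildFragment(matches, 2, 'http://example.org/data?object=dbo%3AWriter');
console.log(first, second);
```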

Pingdom: 155,702
Chrome: 285,385
GoogleBot: 332,177
TPF Client: 3,219,756
The Client for Node.js is leading, 
but other bots (and humans) follow.

JSON
2,867
283,035
HTML
316,183
TriG
833,341
Turtle
2,810,532
(anything)
Turtle has been most popular, 
but TriG will lead soon.

Expired: 164,105
Hit: 1,249,571
Miss: 2,838,720
More than a fourth of all requests 
were served directly from the cache.
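The "more than a fourth" figure can be checked from the three counts on this slide. This assumes that expired, hit, and miss together cover all requests, which the slide does not state explicitly.

```javascript
// Cache status counts from the slide.
const expired = 164105;
const hit = 1249571;
const miss = 2838720;

// Assumption: these three categories add up to all requests.
const total = expired + hit + miss;

// Share of requests answered straight from the cache, as a percentage.
const hitRatio = (hit / total) * 100;

console.log(`${hitRatio.toFixed(1)}% of ${total} requests were cache hits`);
```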

Oct 2014 Nov 2014 Dec 2014 Jan 2015 Feb 2015
267,196
1,038,867
357,305
2,433,045
157,805 in 4 days
17 Oct 2014 – 4 Feb 2015
November was a very busy month, 
but February might top that!

“Type” fragments were most common, 
then “all”, and specific subclasses.
155,706  ?s rdf:type ?o
148,411  ?s ?p ?o
111,773  <s> rdfs:subClassOf ?o

DBpedia fragments had 99.99% uptime, 
with less than 5 minutes of downtime per month.
Pingdom sent 1 request every minute:
20 did not return,
5 of which due to planned maintenance.
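As a rough check of the uptime figure: the sketch below takes the Pingdom request count from the user-agent slide (155,702) as the number of uptime checks. That pairing is an assumption; the slides do not say the two numbers are the same thing.

```javascript
// Assumption: the Pingdom request count equals the number of uptime checks.
const checks = 155702;
const failed = 20;   // checks that did not return
const planned = 5;   // failures caused by planned maintenance

// Share of checks that returned, as a percentage.
const uptime = (1 - failed / checks) * 100;

// Same figure when planned maintenance is not counted as downtime.
const uptimeUnplanned = (1 - (failed - planned) / checks) * 100;

console.log(uptime.toFixed(2), uptimeUnplanned.toFixed(2));
```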

We plan to extend the interface 
with features that improve querying.
data dump: high client effort
Triple Pattern Fragments: in between
SPARQL endpoint: high server effort

No more excuses: start building! 
DBpedia is queryable and 99.9% up.
It’s time to make APPS
on top of live DBpedia data.

Instead of making apps with Web APIs 
we’ll build apps from Linked Data.
WEB APPS on top of live DBpedia data.

The classical approach: 
we ask, we wait, we act.

The Web approach: 
we ask—and act as results arrive.
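The difference between the two approaches can be sketched with a generator that produces results one at a time. Everything here is illustrative: `findWriters` is a hypothetical stand-in for a query, and the log only records the order in which things happen.

```javascript
// Hypothetical result source: yields one result at a time and
// records the moment each result becomes available.
function* findWriters(log) {
  for (const name of ['Oscar Wilde', 'Bram Stoker', 'Samuel Beckett']) {
    log.push(`found ${name}`);
    yield name;
  }
}

// Classical approach: wait for the complete result set, then act.
function classical() {
  const log = [];
  const all = Array.from(findWriters(log));
  all.forEach(name => log.push(`show ${name}`));
  return log;
}

// Web approach: act on each result as soon as it arrives.
function incremental() {
  const log = [];
  for (const name of findWriters(log)) log.push(`show ${name}`);
  return log;
}

console.log(classical());
console.log(incremental());
```

In the incremental version the first result is shown before the second one has even been found, which is what makes an application feel responsive on slow or large queries.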

by iMinds – Ghent University

your app?
var client = new ldf.FragmentsClient('http://fragments.dbpedia.org/2014/en');
var stream = new ldf.SparqlIterator(query, { fragmentsClient: client });
stream.on('data', doSomethingNice);
3 lines of JavaScript:
easy & realistic & fun
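The slide leaves `doSomethingNice` open. Below is one possible handler, sketched so that it runs without the library or a network connection: Node's `EventEmitter` stands in for the result stream, and the shape of each result (an object keyed by variable name) is an assumption rather than something the slides specify.

```javascript
const { EventEmitter } = require('events');

const names = [];

// Hypothetical handler: collect the ?name binding of each result.
// The result shape used here is an assumption.
function doSomethingNice(result) {
  names.push(result['?name']);
}

// Stand-in for the stream on the slide, so this sketch runs offline.
const stream = new EventEmitter();
stream.on('data', doSomethingNice);
stream.on('end', () => console.log(`${names.length} results:`, names));

// Simulate two results arriving, followed by the end of the stream.
stream.emit('data', { '?name': '"Oscar Wilde"@en' });
stream.emit('data', { '?name': '"Bram Stoker"@en' });
stream.emit('end');
```

Swapping the stand-in for the real stream would leave the handler unchanged, as long as the results have the assumed shape.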

fragments.dbpedia.org
@RubenVerborgh
