BrainSpa Paper

BrainSpa – A Web Application for Exploring
Knowledge using SPARQL

Eugen Ignat, Sabin Pochiscan, Radu Simionescu, Simona-Adina Toderas
Artificial Intelligence and Computational Linguistics Department, Faculty of Computer
Science, Iasi, Romania
eugen.ignat@info.uaic.ro
sabin.pochiscan@info.uaic.ro
radu.simionescu@info.uaic.ro
simona.toderas@info.uaic.ro

Abstract. The World Wide Web is a dynamic environment that excites everyone, from
casual user to expert. Now entering its third stage of life, it faces a new
challenge: finding good ways to model knowledge about things by attaching
metadata to the data itself. This paper looks at how the WWW is becoming a
less ambiguous space and at the ways to take advantage of new features that
cost nothing but one's interest. As a case study we present BrainSpa, our own
web application for exploring various SPARQL endpoints and sharing queries
with other members of the semantic web community.

Keywords: RDF, SPARQL, web application, PHP, concepts, modeling, RAP,
OAuth, query, tag, endpoint, prefix, semantic, WWW.

1 Introduction

Some well-known web jokes flirt with the idea of Googling your car keys or
mismatched socks. Fortunately for the on-line user, that era may not be so far away,
given the recent effort put into moving the WWW into a semantic-driven zone for
modeling concepts and linking information.
In order to satisfy the need for organized, unambiguous information, certain
organizations like W3C have taken the initiative to develop specifications, languages
and technologies that are free to use and more than appealing for the task of
annotating knowledge from any domain. Making use of such innovative technologies
allows users to develop all kinds of interesting and creative applications that certainly
prove useful in a wide range of domains (given today's demand for automating as
many processes as possible).
BrainSpa belongs to this group of applications: it is a web application that
lets its users explore knowledge available on the World Wide Web (in the form of
RDF files) using SPARQL without having to know the query language. A query is
built by filling in an on-line form, either anonymously or after logging in to
one's account. Having an account provides more benefits than anonymous use:
registered users can save or share their queries on the server side, and even store
queries and results on their local computers. Registering for the service is also
easy and does not require memorizing another user-name / password pair;
anyone can log in using an existing Twitter, Gmail, Yahoo! or YouTube account,
thanks to OAuth [1], an open protocol that allows secure API authorization in a
simple manner. This way, BrainSpa is not just a client-server solution for browsing
the annotated data available online, but the foundation for a community-driven
environment.
More about the technologies involved in the project, along with other
information, is presented in the remainder of this paper, as follows: section 2
surveys the semantic resources and applications available so far, while section 3
covers everything involved in the actual development of BrainSpa; section 4
describes a few use-cases for the application, together with the relevant diagrams;
the paper concludes with section 5, followed by a list of references.

2 Overview

The World Wide Web is slowly moving into a semantic-driven zone, as shown by the
new technologies and services freely available for modeling knowledge, annotating
data and exploring concepts. Specifications for semantic markup / modeling like RDF
or the lightweight micro-formats are gaining popularity as more and more tools that
try to improve the “internet surfing” experience make their way onto the market.
For example, Firefox extensions (Tails, Operator) have been developed for exploring
or operating with micro-formats. Similarly to its HTML Validator, W3C offers
validation and visualization services for RDF documents. Also, there is a large
number of recently developed semantic frameworks available for many different
programming languages:
• D2R Server, Joseki, Sesame and Mulgara for Java,
• RAP and ARC for PHP,
• 4store, OpenLink Virtuoso and Oracle Spatial 11g for C/C++,
• RDFStore for C and Perl,
and many more, all having specific methods for reading / parsing data from semantic
formats, storing RDF triples in a database, creating queries or accessing endpoints.
Just as traditional databases come with a query language, RDF knowledge comes with
a querying solution: the SPARQL query language. SPARQL (a recursive acronym that
stands for SPARQL Protocol and RDF Query Language) can be tried out at the various
endpoints that offer a user interface for this purpose. One of the largest projects
offering a query solution is DBpedia, a project aimed at extracting structured
information from Wikipedia. W3C offers an up-to-date list of SPARQL endpoints for
exploring content from a wide range of domains.
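
As a rough illustration of what querying an endpoint involves at the protocol level, the sketch below builds the URL for a SPARQL Protocol request sent over HTTP GET, where the query text travels in the "query" parameter. The endpoint address http://dbpedia.org/sparql and the helper name build_sparql_get_url are assumptions made for this example; this paper does not state which addresses BrainSpa uses, and the sketch performs no network access.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Return the URL for a SPARQL Protocol query sent via HTTP GET.

    The query text is URL-encoded and passed in the "query" parameter.
    """
    # Append to an existing query string if the endpoint already has one.
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode({"query": query})


# Assumed endpoint address, used here only as an example.
query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"
url = build_sparql_get_url("http://dbpedia.org/sparql", query)

# Decoding the URL again gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
print(decoded == query)
```

The resulting URL could then be fetched with any HTTP client; the response format depends on the endpoint and on the Accept header sent with the request.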

All these are just a few basic examples of the technological advances achieved in
the semantic web area. Of course, more tools are available and can be created by
developers willing to contribute to the progress of the WWW, and BrainSpa tries to
be such a tool.

3 Architecture

BrainSpa was created by following the general guidelines of software engineering.
An in-depth analysis of the requirements was followed by the architectural and
detailed design of the software product. Coding proceeded in a modular style,
followed by an incremental integration of the system. In the validation step, two
questions were asked and answered, “Are we building the right product?” and
“Are we building the product right?”, in order to test whether the initial
requirements were fully respected and implemented in a functional manner. The last
step, maintenance, has the longest lifespan: it starts after the product is deployed
and ends when the authors lose interest in it.
The following subsections provide more information regarding everything
involved in the actual development of BrainSpa.

3.1 Technologies

This section provides an overview of all technologies involved in the creation of
BrainSpa. They were chosen based on accessibility, price (or the lack of it),
interoperability and openness (freeware / open-source).

3.1.1 Dropbox

In the planning stage, one of the first issues we had to deal with was how to share
files (source code, scripts, images, documentation, etc.) between the developers in
a versatile yet time-saving manner. Bearing in mind that BrainSpa is a relatively
small project compared to the mammoths of the IT industry, we considered that
adopting SVN-style version control would more likely separate the members of the
team than make them work together. We found what is, at least in our opinion, a
better solution in Dropbox [2], a free service for sharing files among users.

Since we were already using Dropbox for personal projects, the existing accounts
needed only some new shared folders to store files for specific tasks (images,
actual projects with libraries, or documentation). Because Dropbox keeps file
history and versions and supports operations like restore, there is little risk of
accidentally deleting or changing vital information.

3.1.2 Creately

Another issue encountered in the planning stage was finding the best tool for tasks
like visually modeling the database schema or creating use-case diagrams. We had
experience with ArgoUML and Creately [3], from which we chose the latter because
(unlike ArgoUML) it is available as a web application, it provides visual elements for
creating a wide range of diagrams (that can be exported as images or PDFs) and it
makes it possible to share either files or entire projects among Creately users.

Taking advantage of this service lets developers save time by focusing on how to
model information visually, without having to worry about versioning, sharing or
letting someone know that a diagram has been or needs to be changed somewhere on
the trunk of the project. Last but not least, Creately is appealing due to its
modern, eye-catching design of layout and components (visible in the diagrams
included in the following sections of this paper).

3.1.3 280 Slides

Another extremely useful tool for on-line, collaborative work is 280 Slides [4], a
free web application for creating, saving, editing and sharing presentations.

280 Slides proved useful in creating a quality presentation for the BrainSpa
project. Every member of the team could contribute to it easily, since it was
always backed up and available online.

3.1.4 OpenOffice

A considerable rival for Microsoft Office, OpenOffice [5] is the “free and open
productivity suite” available for just a download and the execution of a clean installer.
Because it is free of charge and open-source, this office solution is not only very
popular, but also available on all major operating systems and suited to most
office tasks.

Among the available applications of OpenOffice, Writer was used for the creation
of the present document that conforms to LNCS standards.

3.1.5 CodeIgniter

Because the server side of BrainSpa, developed in PHP, must deal with database
operations, session / cookie management, and keeping the views and the data
separated through controllers, the project needs a powerful PHP framework for web
applications.

CodeIgniter [6] was the best candidate for us, as it provides a simple yet powerful
environment with minimal configuration and a large number of libraries. The
framework not only performs well, but also provides the resources needed for tasks
like database administration and session management. It is also open-source and
based on the Model-View-Controller design pattern, which is very popular for
building web applications.

3.1.6 Zend Framework

The Zend Framework [7] is another powerful solution for building PHP web
applications. It also provides significant resources for common database or session
management tasks in the OOP and MVC fashion, but it is best known for its
“use-at-will” architecture.

Because Zend Framework is friendly towards modern Web 2.0 applications and web
services (it provides ways of consuming widely available APIs from leading vendors
like Google, Amazon, Yahoo! or Flickr), our project makes use of it, together with
OAuth, to let users log in to BrainSpa using an existing account from Yahoo!,
Google, Twitter or YouTube.

3.1.7 OAuth

OAuth is an open protocol that allows secure API authorization in a simple and
standard way from desktop and web applications. It allows the user to grant one
site (the Consumer) access to private resources located on another site (the
Service Provider) without sharing the user's credentials.

The work-flow of OAuth implementations is consistent across most service providers
and follows these steps:
• the developer signs up with the service provider in order to get a consumer
key and a shared secret;
• the provider gives the application a request token;
• the application redirects the user to the service provider web site in order to
obtain user authorization;
• given the user authorization, the service provider redirects back to the
application;
• upon receiving the request token and the OAuth verifier, the service provider
grants an access token and a token secret that can be used until they expire.
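
The steps above can be sketched with a toy, in-memory service provider. This is only an illustration of the order of the exchanges, not a real OAuth implementation: there are no HTTP redirects and no request signing, and the class and method names (ToyServiceProvider, register_consumer, issue_request_token, authorize, exchange) are hypothetical rather than taken from BrainSpa, Zend Framework or any provider's API.

```python
import secrets


class ToyServiceProvider:
    """In-memory stand-in for a service provider (not a real OAuth server)."""

    def __init__(self):
        self.consumers = {}       # consumer key -> shared secret
        self.request_tokens = {}  # request token -> verifier (None until authorized)

    def register_consumer(self):
        # Step 1: the developer signs up and gets a consumer key and shared secret.
        key, secret = secrets.token_hex(8), secrets.token_hex(16)
        self.consumers[key] = secret
        return key, secret

    def issue_request_token(self, consumer_key):
        # Step 2: a temporary request token is handed to the application.
        if consumer_key not in self.consumers:
            raise PermissionError("unknown consumer")
        token = secrets.token_hex(8)
        self.request_tokens[token] = None
        return token

    def authorize(self, request_token):
        # Steps 3-4: the user approves access on the provider's site, which
        # then sends the user back to the application with a verifier.
        if request_token not in self.request_tokens:
            raise PermissionError("unknown request token")
        verifier = secrets.token_hex(4)
        self.request_tokens[request_token] = verifier
        return verifier

    def exchange(self, request_token, verifier):
        # Step 5: the request token and verifier are traded for an access
        # token and a token secret. The request token is single-use.
        expected = self.request_tokens.pop(request_token, None)
        if expected is None or expected != verifier:
            raise PermissionError("request token not authorized")
        return secrets.token_hex(8), secrets.token_hex(16)


provider = ToyServiceProvider()
consumer_key, consumer_secret = provider.register_consumer()
request_token = provider.issue_request_token(consumer_key)
verifier = provider.authorize(request_token)
access_token, token_secret = provider.exchange(request_token, verifier)
```

In a real deployment each of these calls is a signed HTTP request or a browser redirect, and the application only ever sees tokens, never the user's password.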

3.1.8 RAP

Because BrainSpa is a semantic-oriented web application, the system involves some
extra operations: sending a SPARQL query to an endpoint and receiving an RDF result
that is transformed into a visually appealing format. This is where RAP [8] plays
its role as a powerful RDF API for PHP with some interesting features like:
• methods for manipulating RDF models as a set of RDF triples or resources
or through vocabulary specific methods,
• integrated RDF/XML, N3, N-TRIPLE, TriX parsers and serializers,
• in-memory / database storage,
• SPARQL query engine and client library,
• integrated RDF server (similar to the Joseki RDF server),
these being just a few.
We found RAP to be the most suitable software package for parsing, querying,
manipulating, serializing and saving RDF models.
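
To make the idea of an RDF model as a set of triples concrete, here is a minimal sketch in Python. It is not RAP and does not reproduce its API; the TripleModel class and its add and find methods are hypothetical, and the literal values are placeholders rather than real DBpedia data.

```python
class TripleModel:
    """A tiny in-memory RDF model: a set of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        # A set ignores duplicates, so adding the same triple twice is harmless.
        self.triples.add((subject, predicate, obj))

    def find(self, subject=None, predicate=None, obj=None):
        """Return the triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
ROMANIA = "http://dbpedia.org/resource/Romania"

model = TripleModel()
model.add(ROMANIA, RDFS_LABEL, "Romania")
model.add(ROMANIA, RDFS_COMMENT, "placeholder comment text")  # placeholder literal

# Pattern matching with wildcards is the basic operation behind querying a model.
comments = model.find(subject=ROMANIA, predicate=RDFS_COMMENT)
```

A full RDF library adds what this sketch leaves out: typed literals, blank nodes, parsers and serializers for the different RDF syntaxes, and persistent storage.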

3.2 Development

Regarding the development model, the team adopted a predominantly XP (Extreme
Programming) approach. There was no hierarchy among the members, a collaborative
working style was encouraged, and each of us was able to contribute to the project
by putting personal skills to good use. The initial task was divided into a number
of issues that we could work on alone or in pairs, and we met regularly (both
on-line and in person) to discuss progress so far and future directions.

3.2.1 Responsibilities

In order to adhere to IT standards and survive on the market, the project, initially
called WebSpa, needed a strong identity. All marketing aspects (name change, logo,
diagrams, documentation, presentation, speeches), together with some architectural
responsibilities, database design, testing and research, were handled by Adina
Toderas.

The User Interface (developed using HTML + CSS, JavaScript and jQuery) and
client-side aspects were Sabin Pochiscan's responsibilities.
Last but not least, server-side aspects (querying, saving, OAuth, RAP, etc.) were
handled jointly by the remaining two members of the team, Eugen Ignat and Radu
Simionescu, with occasional help from Adina Toderas (for testing).
Each of the four authors had the opportunity to make use of their personal skills
and work on what they enjoyed most / were good at. This is the biggest immaterial
reward one can ask for when it comes to school or career.

3.2.2 Coding

In order to benefit from modularity, re-usability, polymorphism, inheritance and
abstraction, the well-known object-oriented coding style was adopted. Also, because
BrainSpa is a web application that relies on data persistence (saving user
information, queries, tags and descriptions to the database on the server side) and
has a rather complex user interface, it is implemented following the
Model-View-Controller design pattern. MVC states that data and view should be
separated within a software entity, and should only communicate with each other
through a controller developed specifically for that data and that view. In other
words, while the Model and the View are quite often reusable, the Controller is not.
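
The separation described here can be shown with a minimal sketch. It is written in Python for brevity, whereas BrainSpa itself is built in PHP, and the class names (QueryModel, QueryListView, QueryController) are hypothetical; a plain list stands in for the database.

```python
class QueryModel:
    """Model: holds saved queries (a list stands in for the database)."""

    def __init__(self):
        self._queries = []

    def save(self, text):
        self._queries.append(text)

    def all(self):
        return list(self._queries)


class QueryListView:
    """View: turns data into text for display; knows nothing about storage."""

    def render(self, queries):
        if not queries:
            return "No saved queries."
        return "\n".join(f"{i}. {q}" for i, q in enumerate(queries, start=1))


class QueryController:
    """Controller: the only piece that talks to both the model and the view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def save_query(self, text):
        # Clean the input, update the model, then ask the view to re-render.
        self.model.save(text.strip())
        return self.view.render(self.model.all())


controller = QueryController(QueryModel(), QueryListView())
page = controller.save_query("  SELECT ?s WHERE { ?s ?p ?o }  ")
print(page)
```

The model and the view could each be reused with a different counterpart, while the controller is written for this particular pairing, which is the point made above about reusability.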
In our project, information regarding users and their queries is saved in a database
storage system (the Model of MVC). Figure 1 depicts the schema for the mentioned
database.
The View of the MVC is the user interface itself: a visual, interactive space
through which the user communicates with BrainSpa and takes advantage of its
features and capabilities. The UI (Fig. 2, 3 and 4) is composed of a web page which
handles tasks like logging in, registering, querying an endpoint or browsing
through existing public queries. The project interface is developed as a RIA (Rich
Internet Application): similarly to a desktop application, it provides as much
functionality as possible within the same window of interaction. Also, the main
module of BrainSpa, which handles the construction of SPARQL queries, is inspired
by the “Views” module in Drupal (Fig. 5), which has similar functionality:
generating a MySQL query without requiring the user to know SQL.
Development for BrainSpa was done mostly in NetBeans, the Java-based integrated
development environment that can be used for coding in many languages besides
Java, such as JavaScript, PHP, Python, Ruby, C, C++, Scala or Clojure. Because the
IDE runs wherever a Java Virtual Machine is installed, it is a platform-independent
working environment. A screen-shot of the project in NetBeans is shown in Figure 6.

Fig. 1. Diagram for the database schema (done using Creately service) of the BrainSpa project.

Fig. 2. BrainSpa user interface – query builder.

Fig. 3. BrainSpa user interface – query builder in action.

Fig. 4. BrainSpa user interface – results.

Fig. 5. Drupal “Views” module for generating MySQL queries.

Fig. 6. BrainSpa source files as seen in NetBeans.

As mentioned in a previous section, the sharing and versioning of BrainSpa source
files was handled by Dropbox, a free, lightweight service for on-line backup and
file sync. Because of this, the team adopted a collaborative working style:
frequent meetings (both on-line and in person), working in pairs, etc.
The last diagram of this subsection, Figure 7, presents a detailed deconstruction
of a regular SPARQL query; all aspects involved in the query are shown in a
tree-like structure in order to reflect and justify the layout of the user interface
elements involved in generating a query.

Fig. 7. Detailed deconstruction of a SPARQL query.
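
One way to picture how form fields map onto the parts of a query is the sketch below, which assembles a SELECT query from prefixes, variables and triple patterns. It is an illustration under assumptions, not BrainSpa's query builder: the function name build_select_query and its parameters are hypothetical, only a small subset of SPARQL is covered, and the inputs are neither validated nor escaped.

```python
def build_select_query(prefixes, variables, patterns, limit=None):
    """Assemble a simple SPARQL SELECT query from its parts.

    prefixes:  dict mapping a prefix name to its namespace IRI
    variables: list of variable names, without the leading "?"
    patterns:  list of (subject, predicate, object) strings, already written
               in SPARQL syntax (e.g. "<iri>", "rdfs:comment", "?info")
    limit:     optional maximum number of results
    """
    lines = [f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()]
    lines.append("SELECT " + " ".join(f"?{v}" for v in variables) + " WHERE {")
    lines.extend(f"  {s} {p} {o} ." for s, p, o in patterns)
    lines.append("}")
    if limit is not None:
        lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)


# The same DBpedia query used in the registered-use scenario of section 4.2.
query = build_select_query(
    prefixes={"rdfs": "http://www.w3.org/2000/01/rdf-schema#"},
    variables=["info"],
    patterns=[("<http://dbpedia.org/resource/Romania>", "rdfs:comment", "?info")],
)
print(query)
```

Each argument corresponds to one branch of the query structure: the prefix list, the selected variables, the graph patterns in the WHERE clause, and the solution modifiers.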

4 Use-cases

No matter how efficient and functional a software package is, it must prove to be of
some meaningful use to its target audience: it must answer, in practice, a problem
that is of interest to a certain group of people. The presented project aims to
offer solutions for exploring knowledge modeled using the RDF specifications,
knowledge available at endpoints that the query reaches and interrogates, obtaining
a result to give back to the user. The target audience is composed of users who
have a small amount of technical knowledge in IT and are fond of web and semantic
technologies, but it can be extended to a wider class of users with no IT
background who want to find reliable knowledge.
BrainSpa can be used either anonymously or with an account, the latter being
preferred since it provides more features that are community oriented.

4.1 Anonymous use

Users can access the BrainSpa web application anonymously (without logging in with
an existing account), but the functionality is limited, as the project aims to be
community oriented. The only available option is filling in the query form in
order to compose and send a SPARQL request to an endpoint and receive the results
(displayed to the user as a table). A relevant use-case diagram is presented in
Figure 8, showing how the actors involved in the scenario (the User, the System,
i.e. BrainSpa, and a SPARQL query endpoint) interact with each other.

Fig. 8. Use-case diagram for an anonymous connection to BrainSpa web application.

4.2 Registered use

One can make full use of the BrainSpa services by using an account. One of the
most interesting parts of the project is the fact that a user does not need to
actually register with BrainSpa and memorize another pair of user-name / password
credentials. The project takes advantage of the OAuth protocol, which means that
anyone having a Yahoo!, Google, YouTube or Twitter account can log in to the web
application using that account. Once logged in, the number of options available
increases.
A possible use-case scenario is the following: somebody wants to find accurate
information regarding a certain subject (for example, comments about Romania) and,
after obtaining the results, store them together with the query on the local computer.
All one needs to do is complete the form available online in order to generate a query
like the following:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?info WHERE
{ <http://dbpedia.org/resource/Romania> rdfs:comment ?info . }

After the endpoint processes the received query, it sends a result back to the web
application, which displays the received information as a table. Both the results
(as an RDF file) and the query can be saved to the local file system with only a
few clicks.
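
Turning an endpoint's answer into table rows can be sketched as follows. The example assumes the endpoint returns the W3C SPARQL Query Results JSON format, which is an assumption made for illustration (the text above mentions results saved as an RDF file); the function name bindings_to_rows and the sample value are hypothetical placeholders, not real DBpedia output.

```python
import json


def bindings_to_rows(results_json):
    """Convert a SPARQL JSON results document into (columns, rows).

    Variables that are unbound in a given solution become empty strings.
    """
    data = json.loads(results_json)
    columns = data["head"]["vars"]
    rows = [
        [binding.get(col, {}).get("value", "") for col in columns]
        for binding in data["results"]["bindings"]
    ]
    return columns, rows


# A hand-written sample response with a placeholder value.
sample = json.dumps({
    "head": {"vars": ["info"]},
    "results": {"bindings": [
        {"info": {"type": "literal", "xml:lang": "en",
                  "value": "placeholder comment text"}}
    ]},
})

columns, rows = bindings_to_rows(sample)
print(columns)
print(rows)
```

The column list gives the table header and each row holds one solution, which is all a table view needs in order to render the result.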
Another use-case scenario is the following: a registered user wants to save his
queries in order to use them in the future. (Since the information available
on-line and modeled as RDF will continue to change and hopefully grow, today's
result for a query will likely look different than the result of the same query
executed a month from now.) Also, the user wants to attach both tags and a
description to his query in order to distinguish his saved information more easily.
This is also possible at the cost of just a few clicks in the user interface.
Moreover, the user is presented with the option of saving his query as either
private or public, which brings us to the next use-case scenario.
A user wants to browse through the existing shared queries. This is made possible
by a search form that takes as input the tags and / or description keywords one
wants to filter by. Upon searching, the application searches the public queries
stored in the database and displays the results in the user interface. After
reviewing them, the user can choose a query to execute and see its results.
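
The save-and-search scenario can be sketched with an in-memory SQLite database. The table and column names below are hypothetical and are not taken from the schema in Figure 1; tags are stored as a space-separated string only to keep the example short, and the user names and queries are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE saved_query (
        id INTEGER PRIMARY KEY,
        owner TEXT NOT NULL,
        sparql TEXT NOT NULL,
        description TEXT NOT NULL DEFAULT '',
        tags TEXT NOT NULL DEFAULT '',          -- space-separated, for brevity
        is_public INTEGER NOT NULL DEFAULT 0    -- 0 = private, 1 = public
    )
""")


def save_query(owner, sparql, description="", tags=(), is_public=False):
    """Store a query together with its description, tags and visibility."""
    conn.execute(
        "INSERT INTO saved_query (owner, sparql, description, tags, is_public)"
        " VALUES (?, ?, ?, ?, ?)",
        (owner, sparql, description, " ".join(tags), int(is_public)),
    )


def search_public(keyword):
    """Return descriptions of public queries whose tags or description match."""
    pattern = f"%{keyword}%"
    cursor = conn.execute(
        "SELECT description FROM saved_query"
        " WHERE is_public = 1 AND (tags LIKE ? OR description LIKE ?)"
        " ORDER BY id",
        (pattern, pattern),
    )
    return [row[0] for row in cursor]


sample_sparql = "SELECT ?s WHERE { ?s ?p ?o }"
save_query("alice", sample_sparql, "Comments about Romania",
           ["romania", "dbpedia"], is_public=True)
save_query("bob", sample_sparql, "Private notes on Romania",
           ["romania"], is_public=False)

print(search_public("romania"))
```

Private queries never appear in the search because the visibility check is part of the WHERE clause rather than a filter applied afterwards in application code.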
But this is not all that BrainSpa has to offer. Another important feature is the
possibility of marking specific query information, like endpoints and prefixes, as
favorites: each user is provided with lists of his favorite endpoints and prefixes,
and can add or remove entries from those lists.
Figure 9 presents the complete use-case diagram. Each major operation possible
within BrainSpa is represented by an oval use-case element, while the arrows
indicate the direction followed by each operation.

Fig. 9. Use-case diagram for a user logging in with an account to BrainSpa web application.

5 Conclusions

The idea of browsing the World Wide Web in an intelligent, concept-driven manner is
extremely appealing but seems to be far from happening in the next few years.
However, important progress has been made in domains closely related to the
problem at hand, and we are not that far from using a search engine that knows how
to distinguish between the Java programming language, the Java island and the Java
coffee.
Certain organizations like W3C have taken the initiative to develop specifications,
languages and technologies that are free to use and more than appealing for the task
of annotating knowledge from any domain. Making use of such innovative
technologies allows users to develop all kinds of interesting and creative applications
that certainly prove useful in a wide range of domains (given today's demand for
automating as many processes as possible).
BrainSpa adheres to the above mentioned group of applications – it is an
interesting tool in the form of a web application that allows its consumer to explore
knowledge available in the World Wide Web (in the form of RDF files) using
SPARQL without explicitly having to know the query language. The authors meant to
develop a tool that improves the on-line experience of users fond of web and
semantic technologies. The development process of the project was complex and
helped every team member enrich their knowledge and technical experience.
Research was done on a large number of technologies (besides the ones
mentioned in the present paper), so that the most suitable of them could be chosen
to help obtain good functionality and performance within the software application.
As the use-cases demonstrated, BrainSpa represents a first step in building a
community of users that are interested in innovation and technological advancement.
Hopefully, with time, it will gain more features and a large number of
members. Even at the current stage, the authors believe it to be an interesting and
useful tool that can later be integrated in solving more complex problems encountered
in the semantic web area of research.

References

1. OAuth, http://oauth.net/
2. Dropbox, https://www.dropbox.com
3. Creately, http://creately.com/
4. 280 Slides, http://280slides.com/
5. OpenOffice, http://www.openoffice.org/
6. CodeIgniter, http://codeigniter.com/
7. Zend Framework, http://zendframework.com/
8. RAP, an RDF API for PHP, http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/
9. Twitter API Wiki, http://apiwiki.twitter.com
10. Authentication and Authorization for Google APIs,
http://code.google.com/apis/accounts/docs/OAuth.html
11. Programmer's Reference Guide to Zend Framework and OAuth,
http://framework.zend.com/manual/en/zend.oauth.introduction.html
12. Yahoo! OAuth authorization model, http://developer.yahoo.com/oauth/
13. Developer's guide to YouTube Data API,
http://code.google.com/apis/youtube/2.0/developers_guide_protocol.html
14. Code Recipes, http://code.activestate.com/recipes/
15. JSON in JavaScript, http://www.json.org/js.html
