Deliverable 4.6 Contextualisation solution and implementation
Matei Mancas, Fabien Grisard, François Rocca (UMONS)
Dorothea Tsatsou, Georgios Lazaridis, Pantelis Ieronimakis, Vasileios Mezaris (CERTH)
Tomáš Kliegr, Jaroslav Kuchař, Milan Šimůnek, Stanislav Vojíř (UEP)
Werner Halft, Aya Kamel, Daniel Stein, Jingquan Xie (FRAUNHOFER)
Lotte Belice Baltussen (SOUND AND VISION)
Nico Patz (RBB)
31.10.2014
Work Package 4: Contextualisation and Personalization
LinkedTV
Television Linked To The Web
Integrated Project (IP)
FP7-ICT-2011-7. Information and Communication Technologies
Grant Agreement Number 287911
Contextualisation solution and implementation D4.6
© LinkedTV Consortium, 2014 2/96
Dissemination level1 PU
Contractual date of delivery 30th September 2014
Actual date of delivery 31st October 2014
Deliverable number D4.6
Deliverable name Contextualisation solution and implementation
File LinkedTV_D4.6.docx
Nature Report
Status & version v1.0
Number of pages 96
WP contributing to the deliverable WP 4
Task responsible UMONS
Other contributors UEP
CERTH
FRAUNHOFER IAIS
SOUND AND VISION
RBB
Author(s) Matei Mancas, Fabien Grisard, François Rocca (UMONS)
Dorothea Tsatsou, Georgios Lazaridis, Pantelis Ieronimakis,
Vasileios Mezaris (CERTH)
Tomáš Kliegr, Jaroslav Kuchař, Milan Šimůnek, Stanislav Vojíř
(UEP)
Werner Halft, Aya Kamel, Daniel Stein, Jingquan Xie
(FRAUNHOFER)
Lotte Belice Baltussen (SOUND AND VISION)
1 • PU = Public
• PP = Restricted to other programme participants (including the Commission Services)
• RE = Restricted to a group specified by the consortium (including the Commission Services)
• CO = Confidential, only for members of the consortium (including the Commission Services)
Nico Patz (RBB)
Reviewer Jan Thomsen, CONDAT
EC Project Officer Thomas Küpper
Keywords Contextualization, Implementation, Ontology, semantic user
model, interest tracking, context tracking, preference learning,
association rules, user modelling, context
Abstract (for dissemination) This deliverable presents the final implementation of the WP4 contextualisation. As contextualization has a high impact on all the other modules of WP4 (especially personalization and recommendation), the deliverable intends to provide a picture of the final WP4 workflow implementation.
Table of contents
1 Contextualisation overview ........................................................... 10
1.1 History of the document ...................................................................................... 12
1.2 List of related deliverables................................................................................... 12
2 Contextualisation and LinkedTV Scenarios ................................. 13
2.1 Personalization-aware scenarios......................................................................... 13
2.1.1 TKK Scenarios ...................................................................................... 13
2.1.2 RBB Scenarios ...................................................................................... 17
2.1.2.1 Nina, 33, urban mom............................................................................. 17
2.1.2.2 Peter, 65, retired.................................................................................... 20
2.2 Context-aware scenarios..................................................................................... 22
3 The Core Technology..................................................................... 24
4 Core Reference Knowledge ........................................................... 25
4.1 LUMO v2............................................................................................................. 25
4.1.1 LUMO-arts............................................................................................. 27
5 Implicit user interactions ............................................................... 29
5.1 Behavioural features extraction........................................................................... 29
5.1.1 Head direction validation: the setup............................................................ 34
5.1.2 Head direction validation: some results ...................................................... 35
5.2 Communication of behavioural features with the LinkedTV player....................... 39
5.3 Communication of LinkedTV player with GAIN/InBeat module ............................ 41
5.3.1 API description ........................................................................................... 41
5.3.1.1 Player Actions ......................................................................................... 42
5.3.1.2 User Actions............................................................................................ 43
5.3.1.3 Application specific actions...................................................................... 43
5.3.1.4 Contextual Features ................................................................................ 43
5.4 InBeat ................................................................................................................. 44
5.4.1 Import of annotations for media content................................................. 45
5.4.2 Support of contextualization .................................................................. 46
5.4.3 InBeat Recommender System............................................................... 47
5.4.3.1 Components.......................................................................................... 47
5.4.3.2 Recommender Algorithms ..................................................................... 48
5.4.3.3 In-Beat: Matching Preference Rules with Content.................................. 48
5.4.3.4 InBeat: Ensemble as combination of multiple recommenders................ 48
6 User model...................................................................................... 50
6.1 Linked Profiler contextual adaptation................................................................... 50
6.2 Scenario-based user models............................................................................... 52
6.2.1 TKK scenario user profiles..................................................................... 53
6.2.2 RBB scenario ........................................................................................ 54
7 Core Recommendation .................................................................. 58
7.1 LiFR reasoner-based recommendation and evaluation ....................................... 58
7.1.1 LiFR performance evaluation................................................................. 62
7.1.2 Bringing recommendations to the general workflow............................... 64
8 The Experimental Technology....................................................... 65
9 Experimental Reference Knowledge............................................. 66
9.1 LUMOPedia ........................................................................................................ 66
9.1.1 Design considerations ........................................................................... 66
9.1.1.1 Dedicated Temporal-aware Relational Schema..................................... 66
9.1.1.2 Reasoning with Open World and Closed World Assumptions................ 67
9.1.1.3 Unified Ontology for User and Content Modelling .................................. 69
9.1.2 Statistics of the LUMOPedia knowledge base ....................................... 69
9.1.3 LUMOPedia Browser as the frontend .................................................... 70
9.1.3.1 The class taxonomy............................................................................... 71
9.1.3.2 The schema and property definitions..................................................... 71
9.1.3.3 The instances with temporal constraints................................................ 72
9.1.4 Backend with JavaEE and PostgreSQL................................................. 72
9.1.5 Summary............................................................................................... 74
10 Experimental Explicit User Interaction ......................................... 75
10.1 LUME: the user profile editor............................................................................... 75
10.1.1 System requirements............................................................................. 75
10.1.2 HTML5-based frontend design .............................................................. 76
10.1.2.1 Manage user models............................................................................. 77
10.1.3 NodeJS powered service layer .............................................................. 81
10.1.4 Data Management with PostgreSQL...................................................... 82
10.1.5 Summary............................................................................................... 83
11 Experimental Recommendation .................................................... 84
11.1 Personal Recommender...................................................................................... 84
11.1.1 System design....................................................................................... 84
11.1.1.1 Incrementally Building the Knowledge Base .......................................... 84
11.1.1.2 Materialise the Semantics via Enrichment ............................................. 85
11.1.1.3 Semantic Recommendation Generation ................................................ 86
11.1.1.4 The sample video base.......................................................................... 87
11.1.1.5 Related web contents............................................................................ 87
11.1.2 Personal Recommender – the prototype frontend.................................. 88
11.1.3 The RESTful web services .................................................................... 90
11.1.4 Summary............................................................................................... 93
12 Conclusions & Future Work .......................................................... 94
13 Bibliography ................................................................................... 95
List of Figures
Figure 1: The WP4 core implicit personalization and contextualization workflow which is
under implementation and will be a part of the final demonstrator.................................11
Figure 2: The WP4 extended workflow containing both the core and experimental
(LUMOPedia, LUME, LSF Recommender) modules.....................................................11
Figure 3: Literary work products (article, novel, etc) were moved under the Intangible > Work
category in v2, as opposed to under the Topic > Literature category in v1. They were
related to Literature via the hasSubtopic property in a corresponding axiom ................26
Figure 4: Object properties in LUMO v2 and the semantics, domain and range of property
“authorOf“.....................................................................................................................27
Figure 5: Extract of the LUMO-arts expansion......................................................................28
Figure 6: User tracking for default mode (left) and seated mode (right). ................................30
Figure 7: Seated tracking with face tracking. ........................................................................31
Figure 8: The different action units given by the Microsoft SDK and their position on the face
[WYR]...........................................................................................................................32
Figure 9: Action units on the left and expression discrimination on the right .........................32
Figure 10: Three different degrees of freedom: pitch, roll and yaw [FAC]. ............................33
Figure 11: User face windows with head pose estimation and age estimation......................33
Figure 12: The user is placed in front of the TV, and covers his head with a hat with infrared
reflectors for the Qualisys system.................................................................................34
Figure 13: Setup for facial tracking recording. The Kinect for the head tracking algorithm is
marked in green. We can also see the infrared reflectors for the Qualisys on the TV
corners. ........................................................................................................................35
Figure 14: Mean correlation with the reference for the pitch depending on the distance from
TV.................................................................................................................................36
Figure 15: Mean correlation with the reference for the yaw depending on the distance from
TV.................................................................................................................................36
Figure 16: Mean correlation with the reference for the roll depending on the distance from TV
.....................................................................................................................................37
Figure 17: Mean RMSE (in degrees) for the pitch depending on the distance to TV.............37
Figure 18: Mean RMSE (in degrees) for the yaw depending on the distance to TV. .............38
Figure 19: Mean RMSE (in degrees) for the roll depending on the distance. ........................38
Figure 20: pause action performed by Rita at 32s of video...................................................42
Figure 21: bookmark of specific chapter performed by Rita ..................................................42
Figure 22: view action of presented enrichment performed by Rita ......................................42
Figure 23: User Action Example: user Rita logged in............................................................43
Figure 24: Application Specific Actions Example: user Rita opens a new screen (TV, second
screen ...) .....................................................................................................................43
Figure 25: Rita started looking at second screen device at 15th second of video .................44
Figure 26: Example of annotation sent from the player along with an event ...........................46
Figure 27: Example of "keepalive" event for propagation of context .....................................47
Figure 28: LiFR’s time performance for topic detection. Points denote each content item’s
processing time. The line shows the polynomial trendline (order of 6) of the data points.
.....................................................................................................................................63
Figure 29: Relational database schema hosting the LUMOPedia knowledge base...............68
Figure 30: Histogram of the curated instance relations.........................................................70
Figure 31: LUMOPedia Browser - the web-based frontend of the LUMOPedia knowledge
base .............................................................................................................................71
Figure 32: The defined and inherited properties for the class "movie" ..................................72
Figure 33: Architecture of the LUMOPedia Browser application ...........................................73
Figure 34: The revised architecture of LUME .......................................................................76
Figure 35: A screenshot of the LUME user profile editor.......................................................78
Figure 36: Add an instance as a UME in LUME....................................................................78
Figure 37: Add a class with constraints as a UME in LUME .................................................79
Figure 38: Natural Language interface in LUME...................................................................79
Figure 39: Eliminate the semantic misunderstanding by selecting the structured information
.....................................................................................................................................80
Figure 40: The confirmation popup dialog for the deletion of a UME.....................................80
Figure 41: Fast filtering and ranking the UME list .................................................................81
Figure 42: The database schema for storing the user models ..............................................83
Figure 43: UML deployment diagram of the Personal Recommender...................................85
Figure 44: The illustration of the enrichment process for an instance ...................................86
Figure 45: Top 30 LUMOPedia instances used in the video annotations..............................87
Figure 46: One screenshot of the video base displayed in the Personal Recommender
frontend ........................................................................................................................88
Figure 47: One screen shot of the Personal Recommender .................................................89
Figure 48: The enrichments of the entity "Berlin"..................................................................90
List of Tables
Table 1: History of the document..........................................................................................12
Table 2: OSC messages from Interest Tracker to the player and the produced action .........23
Table 3: List of behavioural and contextual features available from Interest Tracker. The
features can be sent through HTTP or websocket protocols.........................................39
Table 4: Description of REST service used for tracking of interactions.................................41
Table 5: GAIN output example. The prefix d_r_ indicates that the feature corresponds to a
DBpedia resource from the English DBpedia, the prefix d_o_ to a concept from the
DBpedia Ontology. For DBpedia in other languages, the prefix is d_r_lang_. ...............44
Table 6: GAIN interaction for Nina: bookmarking a media item while in the company of her
kids...............................................................................................................................50
Table 7: Nina’s interaction serialized in her (previously empty) user profile ..........................51
Table 8: An example of a user profile of the first use case....................................................59
Table 9: Use case 1: Precision, recall, f-measure for the recommendation of the 73 manually
annotated content items, over the 7 manual user profiles .............................................60
Table 10: Use case 2: Precision, recall, f-measure for the recommendation of the 50
automatically annotated RBB content items, over the 5 manual user profiles ...............60
Table 11: Average precision, recall, f-measure of the automatic annotation of 50 RBB videos
in comparison with the ground truth annotations...........................................................60
Table 12: Use case 3/RBB scenario: Precision, recall, f-measure for the recommendation of
the 4 automatically annotated RBB chapters, over the Nina and Peter manual user
profiles..........................................................................................................................61
Table 13: Use case 3/TKK scenario: Precision, recall, f-measure for the recommendation of
the 9 automatically annotated RBB chapters, over the Anne, Bert, Michael and Rita
manual user profiles .....................................................................................................62
Table 14: Time performance and memory consumption of LiFR, FiRE and FuzzyDL on global
GLB calculation ............................................................................................................63
Table 15: Statistics of the LUMOPedia knowledge base.......................................................70
Table 16: The list of all RESTful services implemented in LUME service layer.....................82
1 Contextualisation overview
This deliverable deals with contextualizing content information, a process used for more efficient user profile personalization. As contextualization impacts the entire workflow of WP4, the advances in the final personalization and contextualization workflow implementation are detailed in two steps.
The first step is the core workflow, which is already implemented or in the process of implementation within the LinkedTV workflow (Figure 1). The core workflow comprises implicit personalization and contextualization, and subsequent concept and content recommendation, and will be demonstrated by the LinkedTV partners.
The second step is an extended experimental branch, consisting of an optional explicit personalization and contextualization approach that can be tested through the available REST services (Figure 2).
This deliverable is structured around those two steps to describe the different blocks visible in Figure 1 and Figure 2.
Chapter 2 illustrates personalization and contextualization within the 3 LinkedTV scenarios. To this end, the 3 scenarios are summarized, and their link with contextualisation and personalization (for the first two) or with contextualization only (for the third) is shown.
Chapter 3 introduces the core personalization and contextualization workflow and details the
chapters that deal with this workflow (chapters 4, 5, 6, 7).
Chapter 4 presents updates on the core background knowledge and to this end it describes
the LUMO v2 ontology and its arts and artefacts oriented expansion, namely LUMO-arts.
Chapter 5 focuses on the implicit contextualized user tracking and preference extraction, which comprises the attention/context tracker and InBeat, mainly through its GAIN and PL modules.
Chapter 6 describes the process of setting up a contextualized user model using information from the core LUMO ontology (Chapter 4) and the implicit contextualization (Chapter 5). It also presents the final user profiles of the personas introduced in Chapter 2.
Chapter 7 deals with providing and evaluating content recommendations based on the user models presented in Chapter 6. This process is conducted via the core recommender, which is based on the LiFR reasoner. In addition, it presents evaluations of the reasoner's algorithmic efficiency.
Chapter 8 introduces the optional experimental branch and details the chapters that deal with
this workflow (chapters 9, 10, 11).
Chapter 9 deals with the optional knowledge base called LUMOPedia.
Chapter 10 talks about the optional explicit preference induction of the LUME module.
Chapter 11 details the optional Personal Recommender module (LSF).
While the present deliverable describes how the WP4 workflow is implemented, tests on real data going through the entire pipeline will be detailed in the next deliverable (D4.7 about Validation).
Compared to previous deliverables exposing the contextualization and personalization ideas at a conceptual level, in the present deliverable the final pipeline is set up with all the necessary technical details needed for implementation (Figure 1).
Figure 1: The WP4 core implicit personalization and contextualization workflow which is under implementa-
tion and will be a part of the final demonstrator.
Figure 2: The WP4 extended workflow containing both the core and experimental (LUMOPedia, LUME, LSF
Recommender) modules.
1.1 History of the document
Table 1: History of the document
Date Version Name Comment
2014/05/21 V0.1 Matei Mancas Empty document with initial ToC to be discussed
2014/08/8 V0.2 Tomas Kliegr UEP sections 5.3, 5.4
2014/08/14 V0.3 Daniel Stein FhG sections
2014/08/15 V0.4 Lotte Baltussen Added section on contextualisation and scenarios from the Sound and Vision use case perspective
2014/08/27 V0.4.1 Daniel Stein Minor update FhG sections
2014/09/12 V0.5 Matei Mancas Adding attention tracker validation
2014/09/22 V0.6 Nico Patz Adding RBB scenario
2014/10/15 V0.7 Dorothea Tsatsou Adding CERTH chapters
2014/10/22 V0.8 Tomas Kliegr 1st QA addressed for UEP sections
2014/10/24 V0.9 Dorothea Tsatsou 1st QA addressed for CERTH sections
2014/10/27 V1.0 Matei Mancas Fusion/Formatting/Ready for final QA
2014/10/29 V1.0.1 Dorothea Tsatsou Final QA addressed for CERTH sections, formatting and spell check.
2014/10/30 V1.0.3 Jaroslav Kuchař Final QA addressed for UEP sections
2014/10/30 V1.0.4 Matei Mancas Final QA addressed
1.2 List of related deliverables
This deliverable is related to the previous ones which focus on contextualization and personalization (D4.2, D4.4 and D4.5). There are also links, in terms of personalization and contextualization information, with the proposed scenarios in deliverable D6.4. In addition, there is a connection with D2.6 regarding the topic detection module, which utilizes components of the WP4 tools. Finally, the communication between the final results of the WP4 workflow, namely the recommendations, and the platform is described in D5.6.
2 Contextualisation and LinkedTV Scenarios
LinkedTV proposes 3 different scenarios, which are detailed in deliverable D6.4. Two of them use the entire WP4 pipeline and are referred to in the following sections as “personalization-aware” scenarios. The third one does not use personalization, but it does use the Interest and Context trackers, which are the first brick of the WP4 pipeline (Figure 1). This scenario, which is not personalization-aware but only context-aware, is referred to as the “context-aware” scenario.
In the following sections, the scenarios are summarized and their interaction with WP4 in terms of context and/or personalization is shown.
2.1 Personalization-aware scenarios
2.1.1 TKK Scenarios
The Sound and Vision scenarios are based on the programme Tussen Kunst & Kitsch
(henceforth: TKK) by Dutch public broadcaster AVRO. In the show, people can bring in art
objects, which are then appraised by experts, who give information about e.g. the object’s
creator, creation period, art style and value. The general aim of the scenarios is to describe
how the information need of the Antiques Roadshow viewers can be satisfied from both their
couch and on-the-go, supporting both passive and more active needs. Linking to external
information and content, such as Europeana [EUR], museum collections but also auction
information has been incorporated. These scenarios (three in total) can be found in full in
D6.4 Scenario demonstrator v2. The personas and scenario summaries are provided below,
after which the specific personalization issues and an example of a personalised profile will
be provided.
Rita: Tussen Kunst & Kitsch lover (young, medium media literacy)
• Name and occupation: Rita, administrative assistant at Art History department of the
University of Amsterdam
• Age: 34
• Nationality / place of residence: Dutch / Amsterdam
• Search behaviour: Explorative
• Digital literacy: Medium
1. Rita logs in to the LinkedTV application, so she can bookmark chapters that interest
her.
2. Rita is interested to find out more about the host Nelleke van der Krogt.
3. Rita wants more information on the location of the programme, the Museum Martena
and the concept of period rooms.
4. Rita wants more information on an object, the Frisian silver tea jar, and Frisian silver
in particular.
5. Rita wants to bookmark this information to look at more in-depth later.
6. Rita wants to learn more about painter Jan Sluijters and the art styles he and his contemporaries represent.
7. Rita wants to plan a visit to the Museum Martena.
8. Rita invites her sister to join her when she visits the Museum Martena.
9. Rita checks the resources she’s added to her favourites.
10. Rita sends a link to all chapters with expert Emiel Aardewerk to her sister.
11. Rita switches off.
Bert and Anne: Antiques Dealer, volunteer (older, high + low media literacy)
• Name and occupation: Bert, antiques dealer. Anne, volunteer at retirement home.
• Age: Bert - 61, Anne - 59
• Nationality / place of residence: Dutch / Leiden
• Search behaviour: Bert - Focused. Anne - Explorative
• Digital literacy: Bert - High. Anne – Low
1. Bert sees a chapter about a statuette from the late 17th century, worth 12.5K, which is similar to a statuette he recently bought.
2. Bert bookmarks this chapter, so he can view it and the information sources related to
it later on.
3. Bert immediately gets the chance to do so, because a chapter about a watch is next,
something he doesn’t really care for.
4. Anne, however, is very interested in the watch chapter: it depicts gods from Greek mythology and she wants to brush up on her knowledge. She asks Bert to bookmark the information about the Greek gods to read later.
5. Anne would like to know more about why tea was so valuable (more than its silver container!) in the 18th century. Bert bookmarks the silver tea jar chapter for her.
6. Bert and Anne read and watch the additional information related to the wooden statue
chapter and the Greek mythology after the show. Bert has sent the latter to Anne
through email, so she can read it on her own device.
Michael: library manager (middle-aged, high media literacy)
• Name and occupation: Michael, library manager at a public library.
• Age: 55
• Nationality / place of residence: Dutch / Laren
• Search behaviour: Explorative and focussed
• Digital literacy: High
1. Michael comes home late and has missed the latest Tussen Kunst & Kitsch episode.
He logs in to the LinkedTV application and starts watching it from there.
2. He skips the first chapter, which doesn’t interest him, and then starts watching one about a Delftware plate.
3. He likes Delftware, and sends the chapter to the main screen to explore more information about the plate on his tablet. It turns out he doesn’t like this specific plate much.
4. He selects a related chapter filmed at De Porceleyne Fles, a renowned Delftware factory in Delft. This one is about a plate he does like.
5. He adds relevant Delftware chapters to his “Delftware” playlist.
6. After this, there’s a chapter on a silver box, which reminds him of a silver box he inherited from his grandparents.
7. Michael sees a link to similar content related to the chapter and finds another box
similar to the one he owns. He bookmarks the chapter and shares the link via Twitter.
Personalization in the S&V scenarios
In order to make clear how personalization appears in these user scenarios, the Michael scenario summary is expanded below to indicate 1) which concepts the persona finds interesting (or not) and 2) how the persona acts in various situations.
CONCEPTS:
• Michael is interested in [boxes] [made out of silver] that were made in [Europe].
o Short term preference: 100%
o Middle term preference: 90%
o Long term preference: 85%
• Michael would like to learn more about the [Jewish] [Sukkot] Festival, in which the
spice [etrog] plays an important role.
o Short term preference: 60%
o Middle term preference: 20%
o Long term preference: 5%
• When Michael really likes a type of art object, like [Delftware plates] made at [De
Porceleyne Fles], he wants to see [all related chapters] from TKK.
o Short term preference: 100%
o Middle term preference: 80%
o Long term preference: 60%
• Michael is not interested in [Delftware plates] with [oriental depictions].
o Short term preference: -100%
o Middle term preference: -50%
o Long term preference: -50%
• Michael wants to learn more about the [designer] [Jan Eisenlöffel], who made objects
in the [art nouveau]-related style [New Art], since Michael really loves art nouveau.
o Short term preference: 90%
o Middle term preference: 80%
o Long term preference: 70%
• Michael bookmarks the top three recommended TKK chapters related to [art nou-
veau] to watch later.
o Short term preference: 90%
o Middle term preference: 60%
o Long term preference: 40%
• Michael is not interested in [African] [masks].
o Short term preference: -85%
o Middle term preference: -85%
o Long term preference: -85%
• Michael is interested in [paintings] with value [over 10,000 euros]
o Short term preference: 90%
o Middle term preference: 90%
o Long term preference: 90%
SITUATIONAL CONTEXT
Rita
When Rita is not very interested in a chapter, she likes to use that time to [pick up her
weights] and do [some weight-lifting] until the chapter is over.
Bert and Anne
• When Anne likes a chapter, but Bert doesn't, he will [look away from the main screen
and browse the web on his tablet], whereas Anne will keep watching the main
screen.
• When Anne doesn't like a chapter, but Bert does, she will [get up and make a coffee],
whereas Bert will keep watching the main screen.
Michael
• When Michael [views TKK with his wife] they specifically like to plan [visits to the mu-
seum] in which the episode is recorded.
• When Michael has [missed an episode], and a chapter comes up that he doesn't find
interesting (e.g. one on an [African mask]), he will [skip to the next chapter].
• However, when he watches an episode [together with his wife], he will [not skip the
chapter], because she likes to see the whole show; he will then not [watch the tele-
vision screen] but [use his tablet to surf or check his mail].
• Sometimes Michael uses the TKK Linked Culture app to [browse through the show's
archive] based on his interests (e.g. 'art nouveau'), [bookmark chapters related to his
interest] and then [watch the bookmarked chapters one after the other].
• When Michael is watching a chapter while [browsing the TKK archive], and he sees
related information he likes on the second screen, he will [click the related infor-
mation], [pause the episode] and [resume it when he’s checked out the information].
2.1.2 RBB Scenarios
The RBB scenarios are based on the daily news show RBB AKTUELL, which is enriched
with the help of the LinkedTV process. A combination of devices (HbbTV set or set-top
box plus an optional tablet) allows different depths of (extra) information to cater for varying
information needs. Levels of information can be chosen step by step:
1. Watching the “pure” news show, users will see notifications in the top right corner
whenever extra information is available.
2. If a user is interested in this extra information, s/he can easily get an introductory text
about this person, location or topic on the TV screen with just the push of a button.
3. Whenever a user feels that this introduction was not enough, s/he will pick up the tab-
let and find more information and links to further resources in the second screen app.
The following describes how different users apply these steps differently according to their
interests and information needs as well as their device preferences. [Entities] will be
followed by an interest value (XY%).
2.1.2.1 Nina, 33, urban mom
Nina, now 33, is a young, urban mom. Her baby is growing and getting more active so Nina
has to be even more flexible, also with respect to where she is watching the news and when
she is consuming additional information; e.g. the baby is sleeping or playing in her room, but
Nina has to keep an eye on her and be able to pause the interactive LinkedNews at any time.
The tablet is her main screen, not only because she is young and innovative and keeps play-
ing around with the tablet any free minute to escape from her daily responsibilities, but also
because it makes her more mobile.
Nina's show always lists first the chapter which was detected as the most relevant
according to her profile settings - but switching back to the default view is very simple.
According to her preferences, Nina will receive the topics in the following order: #1, #2,
#4, #5, #8, #9, #7, #3, #10, #6, #11.
How Nina is watching the show of 02 June 2014
Nina is generally not interested in the [host] (0%), and this guy, [Arndt Breitfeld] (0%),
doesn't change her mind.
1. Chapter #1: New Reproach against BER Management
Nina is generally interested in Berlin politics ([Berlin] AND [politics]: 90%) and has been
watching the developments around [BER airport] (80%) closely. She found it especially inter-
esting that [Wowereit] (90%) and the BER holding [Flughafengesellschaft Berlin-
Brandenburg BER] (60%) invited [Hartmut Mehdorn] to become BER top manager, although
under his lead [Deutsche Bahn] (30%) had almost gone bankrupt.
She is very interested in [Federal politics] (90%), but only to a limited extent when it comes to
the [Ministry of Transport] and its Minister [Alexander Dobrindt] (40%).
She doesn't like the Pirates party [Die Piraten] (-40%), incl. [Martin Delius] (30%), but they
have started playing an interesting role in German politics, so she can't afford to miss what
they are saying - in effect, she would not be interested in reading background information,
but she would not want to miss news items where they give their comments!
2. Chapter #2: Danger of a Blackout?
Would police, fire and other rescue services still work if all electricity went off? [Emergency
Management] (70%) This is definitely a matter of importance for an urban mother! She con-
sumes all the information about [Energy] (80%) / [Energy AND Security] (80%), the Berlin
[Fire Dept.] (50%), Berlin [police] (50%), and Berlin's public transport service providers
[Berliner Verkehrsbetriebe BVG] (75%) and [S-Bahn Berlin GmbH] (85%).
Nina really can't stand Berlin's Senator of the Interior, [Frank Henkel] (-70%), but as she is
very interested in such security issues she fights the wish to skip and listens to what he has
to say.
Listening to [Christopher Lauer] (-40%), another member of the Pirates party [Die Piraten]
(-40%), is equally hard for her to bear, so, as this news item seems to be almost over
anyway, she eventually skips to the next chapter.
Then there is this expert interview: As an ecology-minded person, Nina is interested in hear-
ing about how the much discussed [Energy Transition] (90%) can even serve her need for
energy security. She picks up her tablet again to check what it might hold for her and takes
the time to check the enrichments on the German Institute for Economic Research [Deutsch-
es Institut für Wirtschaftsforschung DIW] (50%) which is here represented by the expert in-
terviewee, [Claudia Kemfert] (0%). When the discussion turns to [Renewable Energies]
(80%) and the expert defends these with strong and convincing arguments, Nina's interest
unexpectedly rises and she picks up the app's enrichments on [Claudia Kemfert] (50%).
3. Chapter 4: Brown Coal in Brandenburg
Nina is wondering a little why a news item on [Brandenburg] (20%) should be ranked so high
on her preference list, but soon she realises that this is about [Renewable Energy] (80%),
[Greenpeace] (90%) and [people's rights] (60%), so she listens intensely. Seeing that politi-
cians of [SPD] (60%) and [Die Linke] (65%) act against their own promises really makes her
angry, but the fact that people from the area will be relocated [Umsiedlung] (50%) from their
homes to other places is even more annoying. Nina is interested in the mentioned plans both
on the pullout from fossil energies and the relocation of whole villages, so she checks the
enrichments to learn more.
The [Renewable Energy] (80%) expert again. Nina liked her arguments in the other interview
so she stays interested.
4. Chapter 5: Refugee Camps in Berlin
Nina has followed the story of the refugees [Flüchtlinge] (70%) on Oranienplatz and the de-
velopment of the discussions closely. [Human Rights] (70%) and the stories of refugees and
how they are treated are always interesting for her. Usually she likes to look at the situation
in other countries; now she is very interested in seeing what is going on in Berlin and Germany.
5. Chapter 8: New RBB Smart Apps
Here is a new app!? Of course, Nina is interested in [Smartphone]s (65%), [Tablet]s (70%)
and other [New Media] apps and devices, so she listens carefully to how the new apps
intend to enable user participation.
6. Chapter 9: Arthaus Festival celebrates 20th anniversary of Arthaus Films
[Die Blechtrommel] (70%) always used to be one of her favourite movies and Nina loves
going to the [Cinema] (80%).
[Günter Grass] (65%) has been discussed a lot in recent years for his chequered history: he
seems to have been in the [SS] (50%) and thus a servant to [Nationalsozialismus] (National
Socialism) (65%), but in the 1970s and 1980s he was famous for his left-wing activi-
ties.
7. Chapter 7: Short News Block 2
[Charity] (70%) is always a nice topic, so Nina keeps her attention high while watching this
short news item on a campaign where people leave their change behind at the supermarket
cash desk, so it can be transferred to children's foster homes.
And here is another heart-melting activity: someone supported the building of a hospice for
end-of-life care for [children] (95%) and [youths] (75%). How could Nina not support this!?
[Science] (50%) is generally a topic which needs to be handled carefully, but Nina is definite-
ly not interested in huge telescopes in Arizona's desert.
8. Chapter 3: Short News Block 1
Nina is shocked that Berlin's [police] (50%) apparently keep records of mental illnesses and
even communicable diseases like HIV. What about the [German Constitution]'s (70%) [first arti-
cle] (90%) ("Human dignity shall be inviolable. To respect and protect it shall be the duty of
all state authority.")??? This is an outrageous provocation of this basic law!
Another car accident; bitter, but no reason to look behind the scenes. Before she could even
consider skipping, the spot was over.
Oh, but this accident between a bicycle and a car happened just around the corner, in her
neighborhood in [Prenzlauer Berg] (90%)! Maybe she even knew this guy? She quickly
thinks about who she might have to call, but of course, the news doesn't mention names in
such events.
9. Chapter 10: Medien Morgen
[Glienicker Brücke] (70%) is a beautiful spot between [Potsdam] (20%) and [Berlin] (70%),
but Nina likes neither [Steven Spielberg] (-50%) nor [Tom Hanks] (-30%) very much.
Hearing about the [Geisterbahnhof] (disused station) underneath [Kreuzberg]'s (60%)
Dresdner Straße really makes Nina curious and she is very interested in checking the
available enrichments.
10. Chapter 6: Public Viewing on the Sofa
Nina loves [Brazil] (75%), but she is not interested in [Soccer] (-40%) and absolutely detests
[FIFA] (-95%) and what they did to take as much as possible out of the [FIFA World Cup]
(-100%). Therefore, Nina ignores this news chapter, which was automatically sorted to the
end of her list.
2.1.2.2 Peter, 65, retired
Since Peter retired he is mainly interested in culture and sports and everything that happens
around him, in Potsdam and the region of western Brandenburg. Peter knows what is going
on, but he is always interested in taking a closer look.
When it comes to personalization, Peter is rather conservative. He trusts the editors: when
they deem something very important, it will be very important. To him an editor is al-
most like a director: he (or she) has a dramatic arc in mind and Peter wouldn't want to de-
stroy it, so his settings read: “Editor's order”.
How Peter is watching the show of 02 June 2014
Peter generally likes looking at the Information cards for speakers/anchormen. This young
man [Arndt Breitfeld] (50%) seems to be new in the anchorman position, but Peter thinks that
he may have seen him before - so he checks the tablet for background information on the
young man.
1. Chapter 1: New Reproach against BER Management
Peter is generally interested in regional topics [Brandenburg] (90%) and especially in [BER]
(75%), as it is the nearest airport. Furthermore, this is about corruption - that is to say,
people like [Jochen Großmann] (0%) waste our public money, and that is absolutely
unacceptable! Who is this guy, anyway? Peter had never heard of him before, so it is time to
check this new rbb Tablet Service!
Even Federal Minister [Alexander Dobrindt] (-20%) [Federal politics] (40%) now joins the dis-
cussion and announces action in this tragedy.
Peter is not particularly fond of the representatives of the Pirates party [Die Piraten] (-70%),
but this guy [Martin Delius] (30%) surprisingly voices Peter's own thoughts!
2. Chapter 2: Danger of a Blackout?
Would police, fire and other rescue services still work if all electricity went off? [Emergency
Management] (90%) is definitely an issue for everyone! Peter listens closely and meanwhile
bookmarks the additional information on his favourite topics: [Fire Rescue] (90%), [Police]
(80%) and [Technology] (95%).
[Berlin] (0%)! It's always Berlin! Does anyone care about the weak infrastructure in [Bran-
denburg] (95%)? They always talk about [Berliner Verkehrsbetriebe BVG] (-80%) and [S-
Bahn Berlin GmbH] (30%), which at least operates trains to Potsdam, too. The distances
between the small townships in the countryside are much longer, and the network of buses
and trains is far weaker.
Peter switches to the next spot to see if it brings anything about [Potsdam] (90%).
3. Chapter 3: Short News Block 1
[Berlin] (0%) again! But the Short News are usually too short to skip, so Peter stays with
it. Hm, so Berlin's [Police] (80%) keep records of people with mental illnesses and communi-
cable diseases? Yeah, so what!? That is absolutely logical and fair, because they have to
know about these special dangers, don't they?
Oh, a car accident on the Autobahn A10 near [Ludwigsfelde] (80%)! That is actually quite
nearby! [Brandenburg] (95%).
Oh, a cyclist got killed in an accident!? No one knows why this man, 42, unexpectedly
swerved from the bike lane onto the road, but these cyclists are crazy, anyway!
4. Chapter 4: Brown Coal in Brandenburg
This next chapter is about the [Social Democrats] (60%) and the [Socialists] (85%) who rule
in [Brandenburg] (95%) and how they lied to get elected! Peter is truly disappointed that even
his preferred party cannot be trusted!
While Peter is still checking the tablet for information about what the people of the region
think, a new spot about refugees in [Berlin] (0%), [Kreuzberg] (-70%), starts.
5. Chapter 5: Refugee Camps in Berlin
As Peter is not at all interested in what the Hippies do in Berlin's streets, he quickly pushes
the Arrow Up and skips to the next spot by pushing the Arrow Left.
6. Chapter 6: Public Viewing on the Sofa
Peter is not so much into [sports] (40%), let alone into [soccer] (20%), but with the [FIFA
World Cup] (55%) coming, it may be worth listening and indeed...
This looks like a lot of fun: People can bring their sofas into the football stadium and meet
there for public viewings! Sitting on the sofa and not being alone – how could he not love the
idea!? But, unfortunately, the stadium at [Alte Försterei] (0%) in [Berlin] (40%) [Köpenick]
(20%) is much too far away, and he has no idea how he would get his sofa onto the pitch!
Still, he likes the idea.
7. Chapter 7: Short News Block 2
Peter had seen this [charity] (65%) campaign at the supermarket and he likes this grey-
haired guy, but somehow he still didn't get how he could do any good, i.e. how he could help
in this campaign. The tablet certainly has links to further information, so Peter quickly grabs it
and pushes the “Charity” box with the image of this famous guy to the bookmarks section at
the top, to check it later.
[Science] AND [Technology] (80%) has always been a favourite topic for Peter, so he is es-
pecially proud that scientists from [Potsdam] (90%) now send a huge telescope or something
to [America] (-40%). This may help these Ami guys see that Potsdam is much bigger than
they thought!
8. Chapter 8: New rbb Smart Apps
There is the nice young man again, announcing that rbb's news shows, both the one for [Ber-
lin] (0%) and the one for [Brandenburg] (85%), have now launched apps for tablets [Technology]
(80%). Peter listens closely, trying to understand what makes these better than the one he is
using just now - probably it's the option to send comments and even photos or videos if you
happen to witness an accident or so. Now that sounds nice, so Peter quickly bookmarks this
spot for the download information, so he may try them later. ...and it is also nice to see the
speakers and moderators of RBB like [Tatjana Jury] (80%), [Dirk Platt] (60%), [Cathrin Böh-
me] (70%) and [Sascha Hingst] (90%) and even some people from behind the scenes, like
[Christoph Singelnstein] (0%), the editor in chief.
9. Chapter 9: Arthaus Festival celebrates 20th anniversary of Arthaus Films
“[Die Blechtrommel]”? (30%) by [Günter Grass] (-65%). Yes, Peter had heard this book title
numerous times, but he doesn't know much about it as he preferred reading East-German
books at the time. So, he calls up the Information Cards on the TV screen again to get a first
impression and see if he should explore further. After the first bits of information he decides
he has seen enough. Eventually, Peter closes the service and turns off the TV altogether to
go and check the bookmarks he made during the show.
2.2 Context-aware scenarios
The artistic scenarios managed by the UMONS partner explore various opportunities
arising from the merger of LinkedTV technologies and media arts. They aim to demonstrate
current achievements and to trigger conceptual ideas from several media artists. Artists who
participated in the call for projects were assisted by UMONS in defining the outline, and then
in refining their projects to make the most relevant use of the technologies developed for
LinkedTV. Of the three retained scenarios, one had a specific interest in identifying context
and behaviours. This scenario does not make use of personalization techniques, but it uses
contextual features (number of people, looking at the main/second screen, joint attention,
viewing time) describing viewers' reactions, provided by the interest and context tracker
developed in WP4.
This scenario is called Social Documentary and is detailed in deliverable D6.4, section
4.2. In short, it consists of an interactive and evolving artistic installation created to navigate
through a collection of multimedia content. The project emerged after the social events
that happened in Gezi Park in Istanbul (Turkey, June 2013), during which a lot of content
was produced both by TV channels and by the protestors themselves. The artists, four
former students of Istanbul Technical University, wanted to re-use some of this content
and present it “as-is” in their installation. They also wanted to use the visitors' behaviour as
an input to their system, to compare it to the behaviour of the people in the images and the
violence of the videos. The attention of the visitors is used as a Facebook “like”: each time a
visitor watches a video for at least 5.5 seconds, the rating of this video is increased. The
rating is directly linked to the probability of displaying this video later to other visitors. In case
of joint attention - two visitors looking at the same screen - the player adds a Tweet from our
selection (tagged with #GeziPark, #occupyGezi, etc.) on the main screen.
The installation software is divided into three parts communicating through the OSC2 protocol.
This UDP-based protocol is easily accessible from much of the software used by media art-
ists. The player part, programmed in Processing (www.processing.org), a simplified Java plat-
form extensively used by media artists, receives messages from the Reactable software and
from the Interest Tracker developed in WP4, and manages the video feedback (through video
projection and effects). We use a MS Kinect sensor and a modified version of the Interest
Tracker to send specific messages using the OSC protocol. These messages (see Table 2)
inform the player about the number of visitors in the room, whether they look at the main
screen (projection wall) or the second screen (projected on a table in front of them), and the
time they have spent looking at it, related to the attention level [HAW05]. Only the two visitors
closest to the Kinect sensor are taken into account.
Table 2: OSC messages from the Interest Tracker to the player and the produced action

OSC message               | Values                                               | Description                                                                          | Player effect
/context/nbusers          | [0..6]                                               | Number of users tracked in the room                                                  | Stop the video if 0
/context/facetrackedusers | userID [0..5], (userID [0..5])                       | IDs of the users for whom we have face tracking                                      | Updates internal state
/context/jointattention   | 0, 1                                                 | 1 if the face-tracked users are watching the same screen, 0 else                     | Add tweet on the main screen
/context/user/attention   | userID [0..5], screenID [0, 1], attention [0..3]     | Attention level of a user when it changes                                            | Update currently played video rating
/context/user/coordinates | userID [0..5], screenID [0, 1], x [-1..1], y [-1..1] | Approximate gaze coordinates on the screen for a single user, not used by the player | none
2 http://opensoundcontrol.org/
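The OSC messages in Table 2 have a simple binary wire format: a null-terminated address string padded to a 4-byte boundary, a type tag string, and big-endian arguments. As a purely illustrative sketch (the actual installation uses Processing and the Interest Tracker, not this code), an encoder for the integer-only messages above could look like this:

```python
import socket
import struct

def _osc_string(s):
    """OSC strings are null-terminated and padded to a 4-byte boundary."""
    b = s.encode("ascii") + b"\x00"
    return b + b"\x00" * (-len(b) % 4)

def osc_message(address, *args):
    """Encode an OSC message with int32 arguments (enough for Table 2)."""
    typetag = "," + "i" * len(args)       # e.g. ",ii" for two int arguments
    payload = b"".join(struct.pack(">i", a) for a in args)
    return _osc_string(address) + _osc_string(typetag) + payload

def send_osc(sock, host, port, address, *args):
    """Send one OSC message over UDP, as the Interest Tracker does."""
    sock.sendto(osc_message(address, *args), (host, port))

# e.g. signal joint attention (both visitors watching the same screen):
msg = osc_message("/context/jointattention", 1)
```

Because OSC rides on UDP, the player simply listens on a known port and dispatches on the address pattern; lost packets are tolerable for this kind of continuous behavioural signal.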
3 The Core Technology
The implicit personalization and contextualization workflow is displayed in Figure 1. It
constitutes the core WP4 technology, which will be fully implemented and tested with
LinkedTV data in the next deliverable (D4.7).
Concerning the implementation of the communication between modules (Figure 1), the
Kinect-based behavioural Interest/Context Tracker sends events (over HTTP) to the player.
The player enriches the events with the video ID and the time at which the events occurred
and passes them to the GAIN module using the GAIN API (which is also HTTP-based), to
enable retrieval of the specific media fragment for which an Interest/Context event was
manifested. In addition to the behavioural Interest/Context events, the player also sends
player interaction events (like pause, play, bookmark, etc.) to GAIN through the same channel.
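For illustration, the kind of enriched event the player could forward to GAIN might look as follows; the field names here are assumptions made for the sketch, not the documented GAIN API:

```python
import json

def make_gain_event(video_id, media_time, event_type, source):
    """Wrap a tracker or player event with the video ID and playback time,
    as the player does before forwarding the event to GAIN over HTTP.
    All field names are hypothetical, for illustration only."""
    return {
        "object": {"id": video_id},   # identifies the media fragment
        "time": media_time,           # seconds into the video
        "type": event_type,           # e.g. "looking_away", "pause", "bookmark"
        "source": source,             # behavioural tracker vs. player interaction
    }

# A player interaction event, serialized as it would be for an HTTP POST:
event = make_gain_event("rbb-aktuell-2014-06-02", 421.5, "pause", "player")
body = json.dumps(event)
```

The key point is that the tracker itself knows nothing about the video; the player is the component that ties a raw behavioural or interaction event to a concrete media fragment and time.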
The GAIN module fuses this data and provides a single measure of user interest for all the
entities describing a given media fragment in a given context (alone, with other people,
etc.). In addition, the PL module detects associations between entities for a given user,
which it formulates as association rules. The communication between these modules is de-
tailed in Chapter 5.
This information is sent using a RESTful service to the model building step, namely the
Linked Profiler. This step comprises mapping entities into the LUMO “world” via the LUMO
Wrapper utility and using this data to progressively learn user preferences with the
Simple Learner component. Communication between the implicit tracking and the Linked
Profiler module is detailed in Chapter 6.
Finally, the user models created by the Linked Profiler are passed on to the LiFR-based rec-
ommender, which matches user preferences to candidate concepts and content and, as a
result, provides recommendations over this data. Recommendation results are then provided
to the LinkedTV platform as described in Chapter 7.
The core pipeline provides all the functionality needed by the three LinkedTV scenarios
described in Chapter 2. Additional experimental modules can be used, for example for
explicit model management; these are detailed in Chapters 8 to 11.
4 Core Reference Knowledge
In year 3 of LinkedTV, the ontologies engineered to constitute the background knowledge for
the personalization and contextualization services were revised and updated. To this end, a
new version of LUMO [TSA14a] (v2)3 was released, along with an arts and artefacts expan-
sion, namely LUMO-arts.
4.1 LUMO v2
As described in deliverable D4.4, ch. 2, LUMO serves as a uniform, lightweight schema with
well-formed semantics and advanced concept interrelations which models the networked
media superdomain from an end-user's perspective. It strikes a balance between being too
abstract and too specific, in order to scale well and maintain the decidability of formal reasoning
algorithms. These traits might make LUMO useful to a plurality of semantic services de-
signed to manage/deliver content to an end user, even past its use within LinkedTV and
besides or alongside personalization. Semantic search, semantic categorization, semantic
profiling, semantic matchmaking and recommendation technologies that are hindered by the
volume and inconsistency of other vocabularies and/or need to take advantage of advanced
inferencing algorithms might benefit from reusing LUMO as their background ontology. Its
reusability by semantic technologies is further strengthened by its connection to the most
prominent LOD vocabularies, as described in D2.6, ch. 5, which prevail in Semantic Web
applications.
In the scope of personalisation and contextualisation in LinkedTV, LUMO aims to (a) homog-
enize implicitly tracked user interests under a uniform user-pertinent vocabulary, (b) express
and combine content information with contextual features under this common vocabulary and
(c) provide hierarchical and non-taxonomical concept connections at the schema level that
enable advanced semantic matchmaking between user profiles and candidate content,
at both the concept and the content filtering layer.
LUMO is primarily the schema behind the implicit tracking core workflow, where it is used
both to formulate user preferences and to homogenize information about the content, but it
can also be used in the explicit tracking branch (ch. 8-11) to express user interests.
In comparison to v1, v2 of LUMO has been updated and extended on four levels:
1) New concepts
New concepts (classes) were added at the schema level to enhance the completeness of the
ontology. This provides greater coverage of (a) the relevant concept space and (b) the new-
est versions of the vocabularies that WP2 uses to annotate content and that LUMO maps to
(refer to D2.6, ch. 5 for LUMO mappings to other vocabularies).
3 For LUMO engineering principles, design decisions, core ontology/v1 presentation, refer to D4.4, ch. 2.
Covering these vocabularies is important, since WP2 annotations are the information from
which implicit user preferences are built and which is used to determine the relevance of
content to the user profiles. However, this extension and coverage was not exhaustive, in
order to stay in line with the primary LUMO design decision: keep the ontology lightweight
yet at the same time meaningful from a user perspective.
Therefore, over 100 new classes were added, mostly under the “Agent“, “Tangible“ and “In-
tangible“ categories. These were based mostly on the updated schema of the DBPedia on-
tology (version 2014) [LEH14] and in part on concepts from YAGO2 [BIE13] relevant to the
news scenario.
2) New axioms
New concepts brought along the need to model new non-taxonomical concept relations. To
this end, several new universal quantification4 axioms were added, in order to maintain con-
nections of the “Agent“, “Tangible“ and “Intangible“ categories with related “Topics“,
based on the “hasTopic“ and “hasSubtopic“ LUMO properties.
3) Revised semantics
The semantics of existing v1 concepts have been revised and updated to better reflect their
hierarchical and non-taxonomical relations in the ontology. E.g., some concepts formerly in
the “Topics” subhierarchy were deemed to belong under the “Tangible” or “Intangible”
subhierarchies instead, but remain connected to their related topics via the “hasTopic” and
“hasSubtopic” relations. An example can be seen in Figure 3. In addition, some concepts of
v1 that were too specific were omitted, in the interest of keeping the ontology lightweight.
Figure 3: Literary work products (article, novel, etc.) were moved under the Intangible > Work category in v2,
as opposed to under the Topic > Literature category in v1. They are related to Literature via the hasSubtopic
property in a corresponding axiom
4 I.e., relations of the form: entity ⊑ ∀has(Sub)Topic.topic, where entity is subsumed by the Agent, Tangible and
Intangible concepts/categories of LUMO and topic is subsumed by the Topics concept/category of LUMO. Cf.
D4.4, ch. 2.1.2 for more details.
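For instance, the Figure 3 relation between literary works and Literature instantiates this axiom pattern; the class and property names below are illustrative renderings of the LUMO concepts, not verbatim ontology identifiers:

```latex
% A novel is an Intangible work whose (sub)topic can only be Literature:
\mathit{Novel} \sqsubseteq \forall \mathit{hasSubtopic}.\mathit{Literature}
```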
4) New object properties
In the interest of accommodating the needs of the explicit profiling branch (ch. 8-11) of WP4,
which demands extensive concept interconnections at the schema level, almost 30 new
object properties were added to the ontology, with corresponding semantics and domain/
range attributes assigned to them. Figure 4 illustrates the object properties in v2, and an ex-
ample of the semantics, domain and range of the property “authorOf”.
Figure 4: Object properties in LUMO v2 and the semantics, domain and range of property “authorOf“
The update in version 2 brings LUMO to 929 classes, 38 object properties and more than
130 universal quantification axioms.
4.1.1 LUMO-arts
In the interest of accommodating the LinkedTV cultural scenario (the TKK scenario), an expan-
sion of LUMO was engineered to provide more detailed coverage of the arts and artefacts
domain. In order to maintain a reduced concept space, this expansion was modeled sepa-
rately from the core, more generic, LUMO v2, but was built as an extension of the core hier-
archy.
This expansion was heavily based on the Art & Architecture Thesaurus (AAT)5. The recent
release of AAT as LOD enabled us to devise a well-formed hierarchy, while we adjusted the
semantics to the core LUMO v2 schema.
5 http://www.getty.edu/research/tools/vocabularies/aat/about.html
The TKK scenario partners (S&V) have carefully examined the AAT and, out of the vast
amount of information in the vocabulary, defined several facets that were deemed the most
relevant for describing TKK content. To this end, LUMO-arts models details on materials,
clothing, furnishings, art styles, etc. that outline the contents of the TKK scenario. An extract
of the ontology can be seen in Figure 5. The next version of LUMO-arts will delve deeper into
the TKK scenario requirements and, based on AAT, will also expand on the requirements of
the context-aware artistic scenario.
Figure 5: Extract of the LUMO-arts expansion
5 Implicit user interactions
User interactions can be implicitly captured from the user behaviour. The captured features
come either from a sensor watching the users (such as the Microsoft Kinect sensor
[KIN10]) or from the player (logging the user actions). All these interactions are linked to a
video shot and sent to the GAIN module, which processes them into a value of “interest”
quantifying how interested (or not) the user is in the current video shot.
5.1 Behavioural features extraction
As already stated in previous deliverables, implicit and contextual features that can be
extracted by analysing the behaviour of the people watching the TV, such as the number of
people watching, their engagement inferred from body language, their emotions or their
viewing direction, bring crucial additional information for content personalization and adaptation.
One issue in using cameras to watch people while they view TV is of course the degree of
acceptance and the ethical issues raised by this situation. Nevertheless, neither the camera
nor the depth sensor stream is recorded by any means, and all the processing is done in
real time. Only precise features which can be controlled are sent to the system.
Moreover, the acceptance of cameras watching people keeps growing since Xbox systems
using the Kinect sensor entered many homes for gaming purposes. The extension from
games to TV is very natural and has already happened, since Microsoft proposed an
add-on for the Xbox One to watch TV and use the Kinect gesture recognition capabilities6. Google
also acquired Flutter7, a company which provides webcam-based gesture recognition, a
feature which could be added to Chromecast. Apple bought PrimeSense8, the company which
built the first Kinect version, and this technology could be used as a new feature for Apple TV.
Finally, classical TV manufacturers like Samsung already propose cameras for communication or gesture control. This trend might be a first step towards the use of cameras and depth
sensors for TV, with further steps going beyond gestural controls (already
available on the market), where specific features will be acquired to enhance and personalize
the TV experience. In this context, the work on the Interest/Context tracker based on the first
version of the Kinect sensor is very important, as new contextual features will need to be used by
the WP4 pipeline to enhance personalization. WP4 will test the information brought by such
future TV sensors and see how much it can enhance the personalization of the TV experience.
6 http://www.polygon.com/2014/8/7/5979055/xbox-one-digital-tv-tuner-europe
7 http://www.bbc.com/news/technology-24380202
8 http://www.forbes.com/sites/anthonykosner/2013/11/26/apple-buys-primesense-for-radical-refresh-of-apple-tv-
as-gaming-console/
More details about the Kinect sensor used for the Interest/Context tracker are available in
previous deliverables of WP4 (e.g. D4.4). The Kinect sensor is a low-cost depth and RGB
camera. It contains two CMOS sensors, one for the RGB image (640 x 480 pixels at 30
frames per second) and another for the infrared images from which the depth map is calculated, based on the deformation of a projected infrared pattern. The depth sensor is optimally
used in a range from 1.2 m (precision better than 10 mm) to 3.5 m (precision better than
30 mm) [GON13].
The main use of the Kinect is user tracking. In the Microsoft SDK 1.8, which we used, it
allows tracking up to 6 users and provides skeletal tracking for up to two of them in order to
follow their actions. It is possible to detect and track several points of the human body and
reconstruct a “skeleton” of the user (see Figure 6). Skeletal tracking is able to recognize users
standing or sitting (Figure 7), and it is optimized for users facing the Kinect; sideways poses
imply higher chances of tracking losses or errors.
Figure 6: User tracking for default mode (left) and seated mode (right).
To be correctly tracked, the users need to be in front of the sensor, so that the sensor
can see their head and upper body. The tracking remains possible when the users are seated.
The seated tracking mode is designed to track people sitting on a chair or couch;
only the upper body is tracked (arms, neck and head). The default tracking mode, in contrast,
is optimized to recognize and track people who are standing and fully visible to the sensor,
and it also tracks the legs. For the interest/context tracking we thus used the seated version
of the skeleton since, in a TV configuration, the legs are very likely to be hidden.
We can estimate the user's engagement from his sitting position. From the skeleton, we extract
the torso orientation to determine whether the user is leaning forward or backward.
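As a minimal sketch (the joint names, coordinate convention and threshold are illustrative assumptions, not the tracker's actual parameters), the lean can be derived from the angle between the hip-to-shoulder vector and the vertical:

```python
import math

def lean_direction(hip_center, shoulder_center, threshold_deg=10.0):
    """Classify lean from the torso vector.

    Points are (x, y, z) in meters; y is up, z grows away from the sensor.
    Returns 'forward', 'backward' or 'neutral'."""
    dy = shoulder_center[1] - hip_center[1]
    dz = shoulder_center[2] - hip_center[2]
    # Signed inclination of the torso with respect to the vertical, in degrees.
    angle = math.degrees(math.atan2(dz, dy))
    if angle < -threshold_deg:
        return "forward"   # shoulders closer to the sensor than the hips
    if angle > threshold_deg:
        return "backward"
    return "neutral"
```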
Other features are computed based on the skeleton features: child/adult discrimination, and
the attention and interest of the user. Regarding age, the distance between the two
shoulders is measured and compared to a statistical set of child body anthropometry.
The development of the body, and in particular the shoulder width, indicates whether the
person is closer to the body of a child or of an adult (Figure 11).
In addition to the skeleton features, Microsoft has provided a Face Tracking module with the
Kinect SDK since version 1.5. These SDKs can be used together to “create applications
that can track human faces in real time”. To achieve face tracking, at least the upper part of
the user's Kinect skeleton has to be tracked in order to identify the position of the head (Figure 8).
Figure 7: Seated tracking with face tracking.
Based on the face tracking, it is also possible to extract facial features to obtain more information about the users' faces. The Microsoft SDK gives 6 facial features called
“animation units” (AUs). The tracking quality may be affected by the image quality of the RGB input
frames (that is, darker or fuzzier frames track worse than brighter or sharper frames). Also,
larger or closer faces are tracked better than smaller ones. The system estimates the basic
configuration of the user's head: the neutral position of the mouth, brows, eyes, and so on.
The animation units represent the difference between the actual face and the neutral face.
Each AU is expressed as a numeric weight varying between -1 and +1.
Figure 8: The different action units given by the Microsoft SDK
and their position on the face [WYR]
Based on these animation units, it is possible to compute and discriminate basic expressions.
Nevertheless, in real-life TV conditions, the sensor is not precise enough to provide usable
information about precise emotions. We managed to provide relatively reliable information
about the discrimination between the neutral pose and a “non-neutral” pose (Figure 9).
The system can thus provide events on emotion changes without being precise enough to provide
the exact emotion of the viewer.
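The neutral/non-neutral discrimination described above can be sketched as a simple threshold on the animation-unit weights (the threshold value and function name are illustrative assumptions, not the tracker's actual logic):

```python
def is_non_neutral(animation_units, threshold=0.25):
    """Return True when the face deviates from the neutral pose.

    animation_units: iterable of the 6 Kinect AU weights, each in [-1, +1];
    the neutral face corresponds to all weights being close to 0."""
    return any(abs(au) > threshold for au in animation_units)
```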
Figure 9: Action units on the left and expression discrimination on the right
Finally, the last and most important extracted feature is the head direction, which is a close
approximation of the eye gaze (see the previous deliverable of WP4, D4.4, for more details).
The Get3DPose() method returns two arrays of three floats. The first one contains
the Euler rotation angles in degrees for the pitch, roll and yaw as described in Figure 10, and
the second contains the head position in meters. All values are expressed relative to the
sensor, which is the origin of the coordinate system.
Figure 10: Three different degrees of freedom: pitch, roll and yaw [FAC].
The technique used by Microsoft to estimate the head rotations and track the facial features is
not publicly described, but the method uses both the RGB image and the depth map. The head
position is located using the 3D skeleton on the depth map only, while the head pose estimation
itself is mainly achieved on the RGB images. Consequently, the face tracking hardly works in bad
lighting conditions (shadows, too much contrast, etc.). This drawback will be solved in Kinect
version 2, where the head tracking is performed using the depth map and the infrared image,
which are much less sensitive to illumination changes.
Based on the head pose estimation, it is possible to know where the user is looking (main
screen, second screen or elsewhere). Based on how long the user watches a screen, a
measure of attention is given [HAW05]:
• Gaze not taken into account if shorter than 1.5 seconds
• Orienting attention (1.5 s to 5 s)
• Engaged attention (5 s to 15 s)
• Staring (more than 15 s)
In addition to those single-user features, if two viewers look at the same screen, the attention
mode becomes “joint attention”, which might indicate mutual interest in the content displayed on
the screen.
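The duration thresholds above can be sketched as a simple classifier (the function name and the string labels are ours, for illustration):

```python
def attention_level(gaze_duration_s):
    """Map the duration of a look at a screen to an attention level [HAW05]."""
    if gaze_duration_s < 1.5:
        return None          # too short: not taken into account
    if gaze_duration_s < 5:
        return "orienting"
    if gaze_duration_s < 15:
        return "engaged"
    return "staring"
```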
Figure 11: User face windows with head pose estimation and age estimation.
The features described here are summarized in the table of Section 5.2. A first validation test
was conducted on the head direction feature, and it shows good robustness compared to other
available state-of-the-art methods. This validation is detailed in the next section.
5.1.1 Head direction validation: the setup
For the validation of the head direction feature, the results obtained by the Kinect sensor are
compared with an accurate measurement of the head movements (Figure 12). This ground
truth was obtained with an optical motion capture system from Qualisys [QUA]. The
setup consists of eight cameras which emit infrared light and track the position
of reflective markers placed on the head. The Qualisys Track Manager (QTM) software provides
the possibility to define a rigid body and to characterize the movement of this body with six
degrees of freedom (6DOF: three Cartesian coordinates for its position and three Euler angles - roll, pitch and yaw - for its orientation).
Figure 12: The user is placed in front of the TV, and covers his head with a hat with infrared reflectors for the
Qualisys system.
The Qualisys system produces accurate marker-based data in real time for object tracking at
about 150 frames per second. The infrared light and markers do not interfere with the RGB image
or with the infrared pattern of the Kinect. Qualisys was chosen as the reference precisely
to allow comparing markerless methods without interference. This positioning is
shown in Figure 13. The angles computed by the different methods are Euler angles.
Figure 13: Setup for facial tracking recording. The Kinect for the head tracking algorithm is marked in green.
We can also see the infrared reflectors for the Qualisys on the TV corners.
We made several recordings with 10 different candidates. Each one performed the same head
movement sequence (verticals, horizontals, diagonals and rotations) at 5 different distances
from the screen: 1.20 m, 1.50 m, 2 m, 2.5 m and 3 m. The movements performed are conventional
movements that people make when facing a TV screen (pitch, roll and yaw; combinations of
these movements; slow and fast rotations). Six of the candidates had very light skin, the others
had darker skin colour. Some of the candidates had beards and others did not.
5.1.2 Head direction validation: some results
After synchronizing the results obtained by the Kinect SDK and the reference, and as the
sampling frequencies differ, we interpolated the reference values to obtain points at
the same instants for the two systems. To compare with the reference computed
by the Qualisys, we use two metrics: the Root Mean Square Error (RMSE) and the
correlation (CC).
The correlation is a good indicator to establish the link between a set of given values
and its reference. It is interesting to analyse the average correlation value over candidates
obtained for each distance from the TV screen. If the correlation value is equal to 1, the two
signals are identical. If the correlation is between 0.5 and 1, we consider the dependence
strong. A value of 0 shows that the two signals are independent, and a value of -1 corresponds
to the opposite of the signal. Figure 14 shows the correlation for the pitch, Figure 15 for the yaw
and Figure 16 for the roll. The curve from the Kinect SDK is compared with the reference obtained
with the Qualisys system. The pitch, roll and yaw are described in Figure 10.
Figure 14: Mean correlation with the reference for the pitch depending
on the distance from TV.
In Figure 14, we observe that the pitch (up-down movements) of the Kinect SDK has a good
correlation (0.84) at a distance of 1.20 m. It decreases with the distance, dropping below the
correlation value of 0.5 from 2 meters on.
Figure 15: Mean correlation with the reference for the yaw depending
on the distance from TV.
For the second angle, the yaw (right-left movement), Figure 15 shows good results
for the Kinect SDK, with values above 0.9 at 1.20 m, 1.50 m and 2 m. The values then decrease from 0.85 at 2.50 m to 0.76 at 3 m. We can consider the correlation values
for the Kinect SDK to be quite good; the yaw measure is thus much more reliable than the
pitch measure.
Figure 16: Mean correlation with the reference for the roll depending on the distance from TV
The Kinect SDK also shows good correlation for the roll (from 0.93 down to 0.7), as shown in
Figure 16. After examining the correlation values, it is also interesting to look at the mean error
made by the system. Indeed, a method with a high correlation and a low RMSE is considered
very good for head pose estimation. Figure 17 shows the RMSE for the pitch, Figure 18 for the
yaw and Figure 19 for the roll.
Figure 17: Mean RMSE (in degrees) for the pitch depending on the distance to TV.
We observe in Figures 17 to 19 that the RMSE obviously increases with the distance to the TV
(more precisely, to the Kinect sensor located on the TV). For the pitch (Figure 17), the
Kinect SDK is good at 1.20 m with an RMSE of 5.9 degrees. The error grows with the distance,
up to 12 degrees beyond 2.5 m.
Figure 18: Mean RMSE (in degrees) for the yaw depending on the distance to TV.
For the yaw, we observe in Figure 18 a higher mean error (from 10 to 12 degrees), but the
growth of this error with the distance is very small.
Figure 19: Mean RMSE (in degrees) for the roll depending on the distance.
In the case of roll (Figure 19), the RMSE of the Kinect SDK is around 10 degrees, with a
smaller error at 2 m.
While it is possible to extract it, the roll is of less interest in the LinkedTV project, where mainly
yaw and pitch are used. The correlation is good and places the LinkedTV interest tracker at
the top of the state of the art in the field. Moreover, all these values should become even better
when using the Kinect One, the second version of the Kinect sensor.
5.2 Communication of behavioural features with the LinkedTV
player
The Interest Tracker software offers several measures of contextual and behavioural features, such as the number of people in the room, their position or a basic expression analysis.
All these features, which are computed in real time without any need for data recording, can
be sent to the LinkedTV player. The features are computed at an average frequency of 30
per second. To avoid sending too many messages to the player, only feature variations
(events) are sent through the network.
The player receives the messages, adds the current video ID and time, and forwards the event
to InBeat via the GAIN API for user profiling and personalization. Several network protocols have
been implemented in the software: HTTP (POST and GET) and websocket
communication, using the cURL9 and easywebsocket10 libraries respectively. Another version of
the Interest Tracker has been developed for artistic scenarios and uses the OSC
protocol (see Section 2.2 and D6.4 for more details). The list of features and their format for
HTTP (GET) and the websocket protocol are detailed in Table 3.
Table 3: List of behavioural and contextual features available from Interest Tracker. The features can be sent
through HTTP or websocket protocols.
Feature Name Value
Number of detected people in
front of the TV
Recognized_Viewers_NB 0, 1, 2
Websocket:
{"interaction":{"type":"context"},"attributes":{"action":"Recognized_Viewers_NB","value":[0,1,2]}}
HTTP Get:
http://baseUrl?Recognized_Viewers_NB=[0,1,2]
The screen the user is currently
watching (for each user)
[HAW05]
Viewer_Looking 0 = viewer does not look at any screen
(maybe someone called him or he is
simply doing something else in front of
the TV)
1 = viewer has looked at the main screen
for more than 1.5 seconds (if less than
1.5 s, nothing is sent, as this corresponds
to random glances or monitoring looks).
From 1.5 seconds we have "orienting" looks.
2 = main screen for more than 5
seconds. From 5 seconds we have
"engaged" looks.
3 = main screen for more than 15
seconds. From 15 seconds we have
"staring" looks.
4 = second screen for more than 1.5
seconds
5 = second screen for more than 5
seconds
6 = second screen for more than 15
seconds
9 http://curl.haxx.se/
10 https://github.com/dhbaird/easywsclient
Websocket:
{"interaction":{"type":"context"},"attributes":{"action":"Viewer_Looking","value":[0,1,2,3,4,5,6],
"confidence":[0..1]},"user":{"id":[1,2,3,4,5,6]}}
HTTP Get:
http://baseUrl?UserID=[userID]&Viewer_Looking=[0,1,2,3,4,5,6]
There are two people in front of
the TV and both are looking at
the same screen
Viewer_Joint_Looking 1 if true, 0 else
Websocket:
{"interaction":{"type":"context"},"attributes":{"action":"Viewer_Joint_Looking","value":[0,1], "confidence":[0..1]}}
HTTP Get:
http://baseUrl?Viewer_Joint_Looking=[0,1]
There are only adults in front of
the TV
Viewer_adults 1 if true, 0 else
Websocket:
{"interaction":{"type":"context"},"attributes":{"action":"Viewer_adults","value":[0,1],"confidence":[0..1]}}
HTTP Get:
http://baseUrl?Viewer_Adults=[0,1]
Basic emotion analysis (for each
user)
Viewer_emotion 0 if neutral, 1 else
Websocket:
{"interaction":{"type":"context"},"attributes":{"action":"Viewer_emotion","value":[0,1],"confidence":[0..1]}}
HTTP Get:
http://baseUrl?UserID =[userID]&Viewer_Emotion[0,1]
User lean forward/backward Viewer_engagement 0 = lean backward
1 = lean forward
2 = unknown (HTTP only)
Websocket:
{"interaction":{"type":"context"},"attributes":{"action":"Viewer_engagement","value":[0,1],"confidence":[0..1]}}
HTTP Get:
http://baseUrl?UserID=[userID]&Viewer_Engagement[0,1,2 ]
In the examples, the value of baseUrl can be any valid URL which will handle such messages
in the player interface. For testing and debugging, we used http://httpbin.org/get, an address
which returns exactly the data it receives.
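As an illustration, the Viewer_Looking message from Table 3 can be assembled as follows (the helper function names are ours; the JSON layout and the GET parameters are those listed in the table):

```python
import json
from urllib.parse import urlencode

def viewer_looking_ws(user_id, value, confidence):
    """Websocket message for the Viewer_Looking feature (format of Table 3)."""
    return json.dumps({
        "interaction": {"type": "context"},
        "attributes": {"action": "Viewer_Looking", "value": value,
                       "confidence": confidence},
        "user": {"id": user_id},
    })

def viewer_looking_get(base_url, user_id, value):
    """HTTP GET URL for the same feature."""
    return base_url + "?" + urlencode({"UserID": user_id, "Viewer_Looking": value})

# Example: user 1 has been looking at the main screen for more than 5 seconds.
msg = viewer_looking_ws(1, 2, 0.9)
url = viewer_looking_get("http://httpbin.org/get", 1, 2)
```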
5.3 Communication of LinkedTV player with GAIN/InBeat module
Communication between the LinkedTV player and the GAIN module of InBeat is performed
using REST API calls from the player to the GAIN module. The API is designed to handle
multiple types of interactions, including standard player actions (e.g. play, pause, bookmark,
view of enrichment …), user actions (login, bookmark ...), platform-specific actions (add or
remove second screen …) and contextual features (e.g. viewer looking, number of persons
…). In this section, we describe the new version of the API and the communication format for
all previously described actions.
The communication was tested using a Noterik player simulator, which emulates user actions
in the player, generating the respective calls to the GAIN API.
5.3.1 API description
The first version of the GAIN API was described in D4.2 – User profile schema and profile
capturing. Since GAIN became part of InBeat [KUC13], there has been a change in the base
URL of the API, and we also updated the exchange format in order to support more types of
interaction.
Table 4: Description of REST service used for tracking of interactions
Track interaction
Description POST /gain/listener
HTTP Method POST
Content-Type application/json
URI http://inbeat.eu/gain/listener
cURL curl -v --data @data.json http://inbeat.eu/gain/listener
data.json Format is described in following section.
Status codes 201 – Created
400 – Bad request
5.3.1.1 Player Actions
There is no change in the format for actions generated by the user's operation of the remote
control (or player control buttons or gestures in general). All such events should be assigned the
value player in the category attribute, and the action attribute is set to one value from
the enumeration of possible actions. The location attribute specifies the time passed since
the start of the video.
Figure 20: pause action performed by Rita at 32s of video
Figure 20 presents an example of the action “pause” performed by Rita at the 32nd second of
the video. Examples of the action “bookmark” of a specific chapter and the “view” of an enrichment
presented by the player to the user are described in Figures 21 and 22 respectively.
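A hedged sketch of building such a payload (the exact schema is defined in D4.2; here we only reproduce the attributes named above — category, action, location — plus user and object identifiers, and the interaction type value "event" is an assumption):

```python
import json

def player_event(user_id, object_id, action, location_s):
    """Build a GAIN player-interaction payload (illustrative field layout)."""
    return {
        "interaction": {"type": "event"},  # type value assumed here
        "attributes": {"category": "player", "action": action,
                       "location": location_s},
        "object": {"id": object_id},
        "user": {"id": user_id},
    }

# Serialize, then send with: curl -v --data @data.json http://inbeat.eu/gain/listener
data_json = json.dumps(player_event("Rita", "urn:mediafragment:1", "pause", 32))
```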
Figure 21: bookmark of specific chapter performed by Rita
Figure 22: view action of presented enrichment performed by Rita
5.3.1.2 User Actions
User actions are actions performed by a user that are not connected to any multimedia content
(in contrast to player actions). For user actions, the objectId attribute is set to an empty
string. Each type of interaction is specified by the category and action attributes.
Figure 23: User Action Example: user Rita logged in
5.3.1.3 Application specific actions
Application-specific actions are actions invoked by the player. Figure 24 presents a typical
example of an application-specific action: the opening of a new screen by the user. The
format is similar to user actions; the only difference is in the category and action
attributes. These actions are also not connected to specific multimedia content.
Figure 24: Application Specific Actions Example:
user Rita opens a new screen (TV, second screen ...)
5.3.1.4 Contextual Features
Contextual features are a completely new type of interaction. This type is identified by the value
context in the type attribute of the communication format. The combination of the action and
value attributes specifies the type of the contextual feature. The example depicted in Figure
25 describes user Rita, who started looking (“Viewer_looking”) at a second screen device
(“value=2”) at the 15th second of the video. Other contextual features and their possible values
are described in D4.4.
Figure 25: Rita started looking at second screen device at 15th second of video
5.4 InBeat
The InBeat platform is composed of three main modules: GAIN (General Analytics INterceptor), a module for tracking and aggregating the user interactions; the Preference Learning module,
for analyzing user preferences; and the Recommender System module, providing on-demand
recommendations.
All components expose independent RESTful APIs, which allow creating custom workflows.
Within the scope of this deliverable we focus on the GAIN module, which processes the interactions sent by the LinkedTV player for a given user and generates an aggregated output,
which is consumed by further WP4 components to build user profiles. The GAIN logic combines
the multiple interest clues it derives from the interactions into a single scalar interest attribute.
GAIN aggregates all types of interactions, including player actions (play, bookmark, view of
enrichment …) and contextual features, mainly provided by the Kinect-based Interest/Context
tracker. Similarly, GAIN also aggregates the content of the shot, based on
the entity annotation it receives either along with the interactions from the player or from the
platform, into a feature vector usually corresponding to DBpedia or NERD concepts.
Table 5: GAIN output example. The prefix d_r_ indicates that the feature corresponds to a DBpedia resource from
the English DBpedia, the prefix d_o_ to a concept from the DBpedia Ontology. For DBpedia in other languages, the prefix is d_r_lang_.
User_id | d_r_North_Korea | … | d_o_SoccerPlayer | … | c_userlooking | … | interest
1       | 0               | … | 0.9              | … | 1             | … | 0.3
2       | 1               | … | 1                | … | 0             | … | 1
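A toy sketch of the kind of aggregation that produces such a row (the clue weights and the combination rule are illustrative assumptions, not GAIN's actual logic):

```python
def aggregate_shot(interactions, entity_vector):
    """Combine interest clues from a shot's interactions into one output row.

    interactions: list of (action, value) pairs observed during the shot;
    entity_vector: dict of entity features (e.g. d_o_SoccerPlayer) from the
    shot's annotation."""
    # Illustrative clue weights: positive actions raise interest, skips lower it.
    weights = {"play": 0.3, "bookmark": 0.5, "skip": -0.5, "Viewer_Looking": 0.2}
    interest = sum(weights.get(action, 0.0) * value
                   for action, value in interactions)
    row = dict(entity_vector)
    row["interest"] = max(0.0, min(1.0, interest))  # clip to [0, 1]
    return row
```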
The GAIN module became part of the InBeat service (see the next section for more details).
However, GAIN has kept the same purpose and goals – tracking and aggregating user interactions.
GAIN uses a specific JSON format (detailed in the previous section) as input; the main output
is a tabular form of the aggregated data. In this section we describe the new features supported by
the latest release: support for contextualization, import of annotations for media content and
tabular format serialization. The developments in the GAIN module are reported in Sections
5.4.1 and 5.4.2.
The InBeat Preference Learner module is a standalone component which wraps EasyMiner
and LISp-Miner as the underlying learning stack. We have also experimented with alternative
learning backends, but the EasyMiner/LISp-Miner stack seems to provide the best experience in terms of web-based user interface, while preserving compatibility with existing
LinkedTV components.
The InBeat Recommender System module (InBeat RS) has been developed alongside
the other components of InBeat (GAIN as the component for collecting and aggregating user
feedback, and the Preference Learning component that learns user preferences). This component is
not part of the main workflow of the LinkedTV platform, but it was used as a development tool to test
the GAIN and PL modules, and to participate in benchmarking contests. The developments in
the InBeat RS module are reported in Section 5.4.3.
5.4.1 Import of annotations for media content
In order to reduce communication between GAIN and the LinkedTV platform, and to overcome issues with updating annotations of media on the GAIN side, we designed an approach for sending annotations along with interactions. Figure 26 demonstrates the format
for sending the description of the object (chapter) the user interacted with. The LinkedTV player
can provide the annotation of the played content with entities, since the entity information is
available in the player. The GAIN module supports the attachment of entities to the interactions
that are sent from the player.
This approach should solve issues with updating annotations in the GAIN module. GAIN needs
annotations on its input to perform the aggregations. If there is no annotation for the played
media content in the internal storage of GAIN, this leads to incorrect aggregations or to delays
caused by on-demand fetching of data from the LinkedTV platform. Each annotation needs to
be sent only once during the viewer session, since it is cached in the GAIN module. This approach reduces the amount of data communicated between the player and GAIN.
Another advantage lies in the temporal aspects of annotations: for the next session, the platform
can provide updated annotations for specific media content. This approach allows active adaptation to new or updated content.
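The send-once-per-session behaviour can be sketched as a small session-scoped cache on the sending side (class and method names are ours, for illustration):

```python
class AnnotationCache:
    """Session-scoped cache so each object's annotation is sent only once."""

    def __init__(self):
        self._seen = {}

    def attach_if_new(self, interaction, object_id, annotation):
        """Attach the annotation to the interaction only the first time the
        object is encountered in this viewer session."""
        if object_id not in self._seen:
            self._seen[object_id] = annotation
            interaction["attributes"]["annotation"] = annotation
        return interaction
```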
Figure 26: Example of an annotation sent from the player along with an event
5.4.2 Support of contextualization
In this section, we describe progress on the support for contextualization. The contextual features supported in GAIN were introduced in Deliverable D4.4. However, the communication format did not account for the following situation: the viewer is watching the screen without
any interactions or changes in context.
In this case, the tracking module does not have any information about the content that was
on the screen, since no events that would have this information attached are raised by the
player, and it cannot provide the correct output. For this specific situation we designed the
“keepalive” interaction type, which provides data and descriptions for each shot. This interaction is raised by the player even if there is no explicit user action or change in context, to notify GAIN about the content being played. GAIN interprets this type of interaction as a simple
“copy previous state” command. Figure 27 provides a description of the data format implemented
in GAIN.
Example: Viewer Rita would like to watch media content with shots 1…N. She presses the “play” button and the attention tracker recognizes that she is watching the screen. Both interactions are
sent to GAIN as an interaction with event “Play” and context “Viewer_looking=1”. She then
watches the screen carefully, without any interactions, for all the remaining shots. Without
support for the “keepalive” interaction, GAIN would be able to derive interest clues only from the
first shot and its annotations. On the other hand, when the player sends a “keepalive” for each
remaining shot, GAIN propagates the “Viewer_looking=1” context to all these shots. Each of
these “notified” shots is afterwards included in the final output, and the interest value can be calculated based on the propagated values.
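The “copy previous state” interpretation can be sketched as follows (the data structures are illustrative; GAIN's internal representation differs):

```python
def propagate_context(shots):
    """Fill in missing context for 'keepalive' shots by copying the last
    explicitly reported context ('copy previous state')."""
    last_context = {}
    for shot in shots:
        if "context" in shot:
            last_context = shot["context"]
        else:  # keepalive: no explicit user action or change in context
            shot["context"] = dict(last_context)
    return shots
```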
Figure 27: Example of "keepalive" event for propagation of context
5.4.3 InBeat Recommender System
Recommendations are one of the key personalization features provided by the LinkedTV platform.
In this section we briefly introduce and describe the InBeat Recommender System (InBeat
RS) that is available in the platform. InBeat RS consumes inputs from both the GAIN and Preference Learning modules and provides recommendations as its output. The InBeat Recommender System participated in the RecSys’13 News Recommender Challenge (2nd place)
and in the CLEF NewsReel Challenge’14 (3rd place).
5.4.3.1 Components
The Interest Beat (InBeat) recommender consists of the components described below.
The Recommendation Interface module receives requests for recommendation, each com-
prising the user identification, the identifier of the currently playing media fragment (the seed
media fragment), and a description of the user context. As its response, the module returns a
ranked list of enrichment content.
The Recommender Algorithms module covers the set of algorithms that can be used in the
LinkedTV platform.
The BR Engine module finds the rules matching the seed content vector, aggregates their
conclusions, and returns a single predicted interest value.
The BR Ranking module combines the estimated user interest in an individual enrichment
content item, as produced by the BR Engine, with the importance of the entity for which the
enrichment item was found.
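To make the Recommendation Interface exchange concrete, the following sketch shows a plausible request/response shape. All field names and URLs are illustrative assumptions, not the actual LinkedTV API:

```python
# Illustrative request/response for the Recommendation Interface
# (field names and URLs are assumptions, not the actual LinkedTV API).
import json

request = {
    "user_id": "rita",
    "seed_mediafragment": "http://example.org/video42#t=120,180",
    "context": {"Viewer_looking": 1, "daypart": "evening"},
}

# The response is a ranked list of enrichment content.
response = {
    "recommendations": [
        {"item": "http://example.org/enrichment/7", "score": 0.91},
        {"item": "http://example.org/enrichment/3", "score": 0.64},
    ]
}
print(json.dumps(response["recommendations"][0]))
```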
5.4.3.2 Recommender Algorithms
InBeat RS contains implementations of several baseline algorithms as well as experimental
implementations of algorithms tailored to the LinkedTV workflow. InBeat RS can provide
recommendations based on the following algorithms:
• Most recent – a simple heuristic that selects a set of the newest items from all availa-
ble candidates.
• Most interacted – only the top “viewed“ items are selected.
• Content-based similarity – a set of the items most similar to the item the user is cur-
rently viewing.
• Collaborative filtering – both user-based and item-based versions are available.
• Matching Preference Rules with Content.
• Rule-based similarity of users and their contexts.
• Ensemble – a combination of the above algorithms. See the next sections for more
details.
The most recent and most interacted methods are described in [KUC13]; the rule-based simi-
larity algorithm is described in [KUC14]. The details of the “Matching Preference Rules with
Content” algorithm are given in Subs. 5.4.3.3, and the details of the ensemble method in
Subs. 5.4.3.4.
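As an illustration of the content-based similarity variant, items can be represented as sparse entity-weight vectors (in the spirit of the DBpedia-derived features used elsewhere in the workflow) and ranked by cosine similarity to the seed item. This is a generic sketch with invented data, not the InBeat implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse feature dicts."""
    common = set(a) & set(b)
    num = sum(a[k] * b[k] for k in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def most_similar(seed, candidates, k=2):
    """Rank candidate items by similarity to the currently viewed item."""
    ranked = sorted(candidates.items(),
                    key=lambda kv: cosine(seed, kv[1]), reverse=True)
    return [item_id for item_id, _ in ranked[:k]]

seed = {"d_r_Berlin": 1.0, "d_o_City": 0.5}
candidates = {
    "itemA": {"d_r_Berlin": 0.9, "d_o_City": 0.4},
    "itemB": {"d_r_Prague": 1.0},
}
print(most_similar(seed, candidates, k=1))  # itemA shares entities with the seed
```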
5.4.3.3 In-Beat: Matching Preference Rules with Content
Preference rules learned with EasyMiner via the In-Beat Preference Learner (see D4.4, Sec-
tion 4.2.2) can be used directly to rank relevant enrichment content according to user prefer-
ences, in addition to the further processing of these rules by SimpleLearner. In this section,
we introduce the In-Beat Recommender, an experimental recommender that provides direct
recommendations based on preference rules learnt with EasyMiner and stored in the Rule
Store.
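A direct rule-based ranking of this kind can be sketched as follows. The rule representation and the aggregation step are simplifying assumptions, since the actual rules are mined by EasyMiner and stored in the Rule Store:

```python
# Simplified sketch of scoring enrichment content with preference rules
# (rule representation is an assumption; real rules come from EasyMiner).
rules = [
    # antecedent: features the content must have; consequent: interest value
    {"if": {"d_o_City": True}, "then_interest": 0.8, "confidence": 0.9},
    {"if": {"d_o_Sport": True}, "then_interest": 0.1, "confidence": 0.7},
]

def score(content_features, rules):
    """Aggregate the conclusions of all rules matching the content vector."""
    matched = [r for r in rules
               if all(content_features.get(f) for f in r["if"])]
    if not matched:
        return 0.5  # neutral prior when no rule fires
    # confidence-weighted average of the matched rules' interest values
    total = sum(r["confidence"] for r in matched)
    return sum(r["then_interest"] * r["confidence"] for r in matched) / total

print(score({"d_o_City": True}, rules))  # close to 0.8: the city rule fires
```

Candidate enrichment items can then simply be sorted by this score.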
5.4.3.4 InBeat: Ensemble as a combination of multiple recommenders
Since different recommender algorithms can perform differently in different situations, com-
bining them can improve overall quality. For example, one algorithm may produce better
recommendations in the morning, when users tend to be interested in the newest content,
while another may provide better recommendations for young viewers in the evening.
InBeat RS combines algorithms in an ensemble based on the Multi-Armed Bandit algorithm
[KUL00]. The core of the ensemble uses probability distributions to decide which algorithm is
most likely the best in a specific situation. At the beginning, all algorithms have the same
probability of being selected. One of them is chosen at random and its recommendations are
presented to the user. If the user chooses one of the recommended items, this is interpreted
as a positive case: the probability associated with the selected algorithm is increased, while
the probabilities of all the others are decreased. The user can also provide negative feed-
back, in which case the probability of the algorithm that provided the recommendation is de-
creased.
The selection of the recommendation algorithm is driven by these updated probabilities: a
more successful algorithm has a higher probability of being selected. Since an algorithm can
be successful at the beginning and its quality can degrade later, the ensemble also manages
its level of conservativeness. It supports different strategies for tuning the speed of adapta-
tion to new situations, allowing previous states to be forgotten and the ensemble to evolve
over time.
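One common way to realise such a bandit-based ensemble with forgetting is Thompson sampling over per-algorithm Beta distributions. The sketch below is a generic illustration under that assumption, not the implementation from [KUL00]; the algorithm names and success rates are invented:

```python
import random

class BanditEnsemble:
    """Thompson-sampling sketch of an algorithm-selection ensemble."""

    def __init__(self, algorithms, decay=1.0):
        # One Beta(successes + 1, failures + 1) per algorithm.
        self.stats = {name: [1.0, 1.0] for name in algorithms}
        self.decay = decay  # < 1.0 forgets old feedback, adapting faster

    def select(self):
        """Pick the algorithm whose sampled success probability is highest."""
        samples = {name: random.betavariate(a, b)
                   for name, (a, b) in self.stats.items()}
        return max(samples, key=samples.get)

    def feedback(self, name, positive):
        # Forget a little of the past, then record the new outcome.
        for s in self.stats.values():
            s[0] *= self.decay
            s[1] *= self.decay
        self.stats[name][0 if positive else 1] += 1.0

ensemble = BanditEnsemble(["most_recent", "collaborative", "rule_based"],
                          decay=0.99)
success_rate = {"most_recent": 0.2, "collaborative": 0.2, "rule_based": 0.8}
for _ in range(200):
    chosen = ensemble.select()
    ensemble.feedback(chosen, positive=random.random() < success_rate[chosen])
```

Over many rounds the sampling concentrates on the algorithm with the best observed feedback, while the decay factor lets the ensemble drift away from an algorithm whose quality degrades.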
More Related Content

LinkedTV Deliverable D4.6 Contextualisation solution and implementation

  • 1. Deliverable 4.6 Contextualisation solution and implementation Matei Mancas, Fabien Grisard, François Rocca (UMONS) Dorothea Tsatsou, Georgios Lazaridis, Pantelis Ieronimakis, Vasileios Mezaris (CERTH) Tomáš Kliegr, Jaroslav Kuchař, Milan Šimůnek, Stanislav Vojíř (UEP) Werner Halft, Aya Kamel, Daniel Stein, Jingquan Xie (FRAUNHOFER) Lotte Belice Baltussen (SOUND AND VISION) Nico Patz (RBB) 31.10.2014 Work Package 4: Contextualisation and Personalization LinkedTV Television Linked To The Web Integrated Project (IP) FP7-ICT-2011-7. Information and Communication Technologies Grant Agreement Number 287911
  • 2. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 2/96 Dissemination level1 PU Contractual date of delivery 30th September 2014 Actual date of delivery 31st October 2014 Deliverable number D4.6 Deliverable name Contextualisation solution and implementation File LinkedTV_D4.6.docx Nature Report Status & version v1.0 Number of pages 96 WP contributing to the de- liverable WP 4 Task responsible UMONS Other contributors UEP CERTH FRAUNHOFER IAIS SOUND AND VISION RBB Author(s) Matei Mancas, Fabien Grisard, François Rocca (UMONS) Dorothea Tsatsou, Georgios Lazaridis, Pantelis Ieronimakis, Vasileios Mezaris (CERTH) Tomáš Kliegr, Jaroslav Kuchař, Milan Šimůnek, Stanislav Vojíř (UEP) Werner Halft, Aya Kamel, Daniel Stein, Jingquan Xie (FRAUNHOFER) Lotte Belice Baltussen (SOUND AND VISION) 1 • PU = Public • PP = Restricted to other programme participants (including the Commission Services) • RE = Restricted to a group specified by the consortium (including the Commission Services) • CO = Confidential, only for members of the consortium (including the Commission Services)
  • 3. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 3/96 Nico Patz (RBB) Reviewer Jan Thomsen, CONDAT EC Project Officer Thomas Küpper Keywords Contextualization, Implementation, Ontology, semantic user model, interest tracking, context tracking, preference learning, association rules, user modelling, context Abstract (for dissemination) This deliverable presents the WP4 contextualisation final im- plementation. As contextualization has a high impact on all the other modules of WP4 (especially personalization and recom- mendation), the deliverable intends to provide a picture of the final WP4 workflow implementation.
  • 4. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 4/96 Table of contents 1 Contextualisation overview ........................................................... 10 1.1 History of the document ...................................................................................... 12 1.2 List of related deliverables................................................................................... 12 2 Contextualisation and LinkedTV Scenarios ................................. 13 2.1 Personalization-aware scenarios......................................................................... 13 2.1.1 TKK Scenarios ...................................................................................... 13 2.1.2 RBB Scenarios ...................................................................................... 17 2.1.2.1 Nina, 33, urban mom............................................................................. 17 2.1.2.2 Peter, 65, retired.................................................................................... 20 2.2 Context-aware scenarios..................................................................................... 22 3 The Core Technology..................................................................... 24 4 Core Reference Knowledge ........................................................... 25 4.1 LUMO v2............................................................................................................. 25 4.1.1 LUMO-arts............................................................................................. 27 5 Implicit user interactions ............................................................... 29 5.1 Behavioural features extraction........................................................................... 29 5.1.1 Head direction validation: the setup............................................................ 
34 5.1.2 Head direction validation: some results ...................................................... 35 5.2 Communication of behavioural features with the LinkedTV player....................... 39 5.3 Communication of LinkedTV player with GAIN/InBeat module ............................ 41 5.3.1 API description ........................................................................................... 41 5.3.1.1 Player Actions ......................................................................................... 42 5.3.1.2 User Actions............................................................................................ 43 5.3.1.3 Application specific actions...................................................................... 43 5.3.1.4 Contextual Features ................................................................................ 43 5.4 InBeat ................................................................................................................. 44 5.4.1 Import of annotations for media content................................................. 45 5.4.2 Support of contextualization .................................................................. 46 5.4.3 InBeat Recommender System............................................................... 47
  • 5. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 5/96 5.4.3.1 Components.......................................................................................... 47 5.4.3.2 Recommender Algorithms ..................................................................... 48 5.4.3.3 In-Beat: Matching Preference Rules with Content.................................. 48 5.4.3.4 InBeat: Ensemble as combination of multiple recommenders................ 48 6 User model...................................................................................... 50 6.1 Linked Profiler contextual adaptation................................................................... 50 6.2 Scenario-based user models............................................................................... 52 6.2.1 TKK scenario user profiles..................................................................... 53 6.2.2 RBB scenario ........................................................................................ 54 7 Core Recommendation .................................................................. 58 7.1 LiFR reasoner-based recommendation and evaluation ....................................... 58 7.1.1 LiFR performance evaluation................................................................. 62 7.1.2 Bringing recommendations to the general workflow............................... 64 8 The Experimental Technology....................................................... 65 9 Experimental Reference Knowledge............................................. 66 9.1 LUMOPedia ........................................................................................................ 66 9.1.1 Design considerations ........................................................................... 66 9.1.1.1 Dedicated Temporal-aware Relational Schema..................................... 66 9.1.1.2 Reasoning with Open World and Closed World Assumptions................ 
67 9.1.1.3 Unified Ontology for User and Content Modelling .................................. 69 9.1.2 Statistics of the LUMOPedia knowledge base ....................................... 69 9.1.3 LUMOPedia Browser as the frontend .................................................... 70 9.1.3.1 The class taxonomy............................................................................... 71 9.1.3.2 The schema and property definitions..................................................... 71 9.1.3.3 The instances with temporal constraints................................................ 72 9.1.4 Backend with JavaEE and PostgreSQL................................................. 72 9.1.5 Summary............................................................................................... 74 10 Experimental Explicit User Interaction ......................................... 75 10.1 LUME: the user profile editor............................................................................... 75 10.1.1 System requirements............................................................................. 75 10.1.2 HTML5-based frontend design .............................................................. 76 10.1.2.1 Manage user models............................................................................. 77
  • 6. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 6/96 10.1.3 NodeJS powered service layer .............................................................. 81 10.1.4 Data Management with PostgreSQL...................................................... 82 10.1.5 Summary............................................................................................... 83 11 Experimental Recommendation .................................................... 84 11.1 Personal Recommender...................................................................................... 84 11.1.1 System design....................................................................................... 84 11.1.1.1 Incrementally Building the Knowledge Base .......................................... 84 11.1.1.2 Materialise the Semantics via Enrichment ............................................. 85 11.1.1.3 Semantic Recommendation Generation ................................................ 86 11.1.1.4 The sample video base.......................................................................... 87 11.1.1.5 Related web contents............................................................................ 87 11.1.2 Personal Recommender – the prototype frontend.................................. 88 11.1.3 The RESTful web services .................................................................... 90 11.1.4 Summary............................................................................................... 93 12 Conclusions & Future Work .......................................................... 94 13 Bibliography ................................................................................... 95
  • 7. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 7/96 List of Figures Figure 1: The WP4 core implicit personalization and contextualization workflow which is under implementation and will be a part of the final demonstrator.................................11 Figure 2: The WP4 extended workflow containing both the core and experimental (LUMOPedia, LUME, LSF Recommender) modules.....................................................11 Figure 3: Literary work products (article, novel, etc) were moved under the Intangible > Work category in v2, as opposed to under the Topic > Literature category in v1. They were related to Literature via the hasSubtopic property in a corresponding axiom ................26 Figure 4: Object properties in LUMO v2 and the semantics, domain and range of property “authorOf“.....................................................................................................................27 Figure 5: Extract of the LUMO-arts expansion......................................................................28 Figure 6: User racking for default mode (left) and seated mode (right). ................................30 Figure 7: Seated tracking with face tracking. ........................................................................31 Figure 8: The different action units given by the Microsoft SDK and their position on the face [WYR]...........................................................................................................................32 Figure 9: Action units on the left and expression discrimination on the right .........................32 Figure 10: Three different degrees of freedom: pitch, roll and yaw [FAC]. 
............................33 Figure 11: User face windows with head pose estimation and age estimation......................33 Figure 12: The user is placed in front of the TV, and covers his head with a hat with infrared reflectors for the Qualisys system.................................................................................34 Figure 13: Setup for facial tracking recording. The Kinect for the head tracking algorithm is marked in green. We can also see the infrared reflectors for the Qualisys on the TV corners. ........................................................................................................................35 Figure 14: Mean correlation with the reference for the pitch depending on the distance from TV.................................................................................................................................36 Figure 15: Mean correlation with the reference for the yaw depending on the distance from TV.................................................................................................................................36 Figure 16: Mean correlation with the reference for the roll depending on the distance from TV .....................................................................................................................................37 Figure 17: Mean RMSE (in degrees) for the pitch depending on the distance to TV.............37 Figure 18: Mean RMSE (in degrees) for the yaw depending on the distance to TV. .............38 Figure 19: Mean RMSE (in degrees) for the roll depending on the distance. 
........................38 Figure 20: pause action performed by Rita at 32s of video...................................................42 Figure 21: bookmark of specific chapter performed by Rita ..................................................42 Figure 22: view action of presented enrichment performed by Rita ......................................42 Figure 23: User Action Example: user Rita logged in............................................................43 Figure 24: Application Specific Actions Example: user Rita opens a new screen (TV, second screen ...) .....................................................................................................................43 Figure 25: Rita started looking at second screen device at 15th second of video .................44 Figure 26: Example of annotation send from player along with event ...................................46 Figure 27: Example of "keepalive" event for propagation of context .....................................47
  • 8. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 8/96 Figure 28: LiFR’s time performance for topic detection. Points denote each content item’s processing time. The line shows the polynomial trendline (order of 6) of the data points. .....................................................................................................................................63 Figure 29: Relational database schema hosting the LUMOPedia knowledge base...............68 Figure 30: Histogram of the curated instance relations.........................................................70 Figure 31: LUMOPedia Browser - the web-based frontend of the LUMOPedia knowledge base .............................................................................................................................71 Figure 32: The defined and inherited properties for the class "movie" ..................................72 Figure 33: Architecture of the LUMOPedia Browser application ...........................................73 Figure 34: The revised architecture of LUME .......................................................................76 Figure 35: A screenshot of the LUME user profile editor.......................................................78 Figure 36: Add an instance as a UME in LUME....................................................................78 Figure 37: Add a class with constraints as a UME in LUME .................................................79 Figure 38: Natural Language interface in LUME...................................................................79 Figure 39: Eliminate the semantic misunderstanding by selecting the structured information .....................................................................................................................................80 Figure 40: The confirmation popup dialog for the deletion of a UME.....................................80 Figure 41: Fast filtering and ranking the UME list 
.................................................................81 Figure 42: The database schema for storing the user models ..............................................83 Figure 43: UML deployment diagram of the Personal Recommender...................................85 Figure 44: The illustration of the enrichment process for an instance ...................................86 Figure 45: Top 30 LUMOPedia instances used in the video annotations..............................87 Figure 46: One screenshot of the video base displayed in the Personal Recommender frontend ........................................................................................................................88 Figure 47: One screen shot of the Personal Recommender .................................................89 Figure 48: The enrichments of the entity "Berlin"..................................................................90 List of Tables Table 1: History of the document..........................................................................................12 Table 2: OSC messages from Interest Tracker to the player and the produced action .........23 Table 3: List of behavioural and contextual features available from Interest Tracker. The features can be sent through HTTP or websocket protocols.........................................39 Table 4: Description of REST service used for tracking of interactions.................................41 Table 5: GAIN output example. prefix d_r_ indicates that the feature corresponds to a DBpedia resource from the English DBpedia, the prefix d_o_ to a concept from the DBpedia Ontology. For DBpedia in other languages, the prefix is d_r_lang_. 
...............44 Table 6: GAIN interaction for Nina: bookmarking a media item while in the company of her kids...............................................................................................................................50 Table 7: Nina’s interaction serialized in her (previously empty) user profile ..........................51 Table 8: An example of a user profile of the first use case....................................................59 Table 9: Use case 1: Precision, recall, f-measure for the recommendation of the 73 manually annotated content items, over the 7 manual user profiles .............................................60
  • 9. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 9/96 Table 10: Use case 2: Precision, recall, f-measure for the recommendation of the 50 automatically annotated RBB content items, over the 5 manual user profiles ...............60 Table 11: Average precision, recall, f-measure of the automatic annotation of 50 RBB videos in comparison with the ground truth annotations...........................................................60 Table 12: Use case 3/RBB scenario: Precision, recall, f-measure for the recommendation of the 4 automatically annotated RBB chapters, over the Nina and Peter manual user profiles..........................................................................................................................61 Table 13: Use case 3/TKK scenario: Precision, recall, f-measure for the recommendation of the 9 automatically annotated RBB chapters, over the Anne, Bert, Michael and Rita manual user profiles .....................................................................................................62 Table 14: Time performance and memory consumption of LiFR, FiRE and FuzzyDL on global GLB calculation ............................................................................................................63 Table 15: Statistics of the LUMOPedia knowledge base.......................................................70 Table 16: The list of all RESTful services implemented in LUME servíce layer.....................82
  • 10. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 10/96 1 Contextualisation overview This deliverable deals with contextualizing content information, a process which is used for more efficient user profile personalization. As contextualization impacts the entire workflow of WP4, in this deliverable, advances in the final personalization and contextualization workflow implementation is detailed in two steps. The first step is the core workflow which is already implemented or in the process of imple- mentation (Figure 1) within the LinkedTV workflow. The core workflow comprises of implicit personalization and contextualization, and subsequent concept and content recommenda- tion, and will be demonstrated by LinkedTV partners. The second step is an extended experimental branch, consisting of an optional explicit per- sonalization and contextualization approach which is optimized and that can be used for test- ing with available REST services (Figure 2). This deliverable is structured around those two steps to describe the different blocks visible in Figure 2 and Figure 1. Chapter 2 illustrates personalization and contextualization within the 3 LinkedTV scenarios. To this end, the 3 scenarios are summarized and their link with contextualisation and per- sonalization (for the two first) and contextualization (only for the third one) is shown. Chapter 3 introduces the core personalization and contextualization workflow and details the chapters that deal with this workflow (chapters 4, 5, 6, 7). Chapter 4 presents updates on the core background knowledge and to this end it describes the LUMO v2 ontology and its arts and artefacts oriented expansion, namely LUMO-arts. Chapter 5 focuses on the implicit contextualized user tracking and preference extraction, which comprises of the attention/context tracker and Inbeat mainly through its GAIN and PL module. 
Chapter 6 describes the process of setting up of a contextualized user model using infor- mation from the core LUMO ontology (Chapter 4) and the implicit contextualization (Chapter 5). It also presents the final user profiles of the personas presented in Chapter 2. Chapter 7 deals with providing and evaluating content recommendations based on the user models presented in Chapter 6. This process if conducted via the core recommender which is based on the LiFR reasoner. In addition, it presents evaluations on the reasoner’s algo- rithmic efficiency. Chapter 8 introduces the optional experimental branch and details the chapters that deal with this workflow (chapters 9, 10, 11). Chapter 9 deals with the optional knowledge base called LUMOPedia. Chapter 10 talks about the optional explicit preference induction of the LUME module. Chapter 11 details the optional Personal Recommender module (LSF).
  • 11. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 11/96 While the present deliverable describes how the WP4 workflow is implemented, tests on real data going through the entire pipeline will be detailed in the next deliverable (D4.7 about Val- idation). Compared to previous deliverables exposing the contextualization and personalization ideas at a conceptual level, in the present deliverable the final pipeline is set up with all the neces- sary technical details needed for implementation (Figure 1). Figure 1: The WP4 core implicit personalization and contextualization workflow which is under implementa- tion and will be a part of the final demonstrator. Figure 2: The WP4 extended workflow containing both the core and experimental (LUMOPedia, LUME, LSF Recommender) modules.
  • 12. Contextualisation solution and implementation D4.6 © LinkedTV Consortium, 2014 12/96 1.1 History of the document Table 1: History of the document Date Version Name Comment 2014/05/21 V0.1 Matei Mancas Empty document with initial ToC to be discussed 2014/08/8 V0.2 Tomas Kliegr UEP sections 5.3, 5.4 2014/08/14 V0.3 Daniel Stein FhG sections 2014/08/15 V0.4 Lotte Baltus- sen Added section on contextualisation and scenari- os from Sound and Vision use case perspective 2014/08/27 V0.4.1 Daniel Stein Minor update FhG sections 2014/09/12 V0.5 Matei Mancas Adding attention tracker validation 2014/09/22 V0.6 Nico Patz Adding RBB scenario 2014/10/15 V0.7 Dorothea Tsatsou Adding CERTH chapters 2014/10/22 V0.8 Tomas Kliegr 1st QA addressed for UEP sections 2014/10/24 V0.9 Dorothea Tsatsou 1st QA addressed for CERTH sections 2014/10/27 V1.0 Matei Mancas Fusion/Formatting/Ready for final QA 2014/10/29 V1.0.1 Dorothea Tsatsou Final QA addressed for CERTH sections, format- ting and spell check. 2014/10/30 V1.0.3 Jaroslav Ku- car Final QA addressed for UEP sections 2014/10/30 V1.0.4 Matei Mancas Final QA addressed 1.2 List of related deliverables This deliverable is related to the previous ones which focus on contextualization and person- alization (D4.2, D4.4 and D4.5). There are also links with the proposed scenarios in terms of personalization and contextualization information with deliverable D6.4. Also, there is a con- nection with D2.6 regarding the topic detection module which utilizes components inherent of WP4 tools. Finally, the communication between the final results of the WP4 workflow, namely the recommendations, and the platform are described in D5.6.
2 Contextualisation and LinkedTV Scenarios
LinkedTV proposes three different scenarios, which are detailed in deliverable D6.4. Two of them use the entire WP4 pipeline and are referred to in the following sections as “personalization-aware” scenarios. The third one does not use personalization, but it does use the Interest and Context trackers, which form the first brick of the WP4 pipeline (Figure 1). This scenario, which is not personalization-aware but only context-aware, is referred to as the “context-aware” scenario. In the following sections, the scenarios are summarized and their interaction with WP4 in terms of context and/or personalization is shown.
2.1 Personalization-aware scenarios
2.1.1 TKK Scenarios
The Sound and Vision scenarios are based on the programme Tussen Kunst & Kitsch (henceforth: TKK) by Dutch public broadcaster AVRO. In the show, people can bring in art objects, which are then appraised by experts, who give information about e.g. the object’s creator, creation period, art style and value. The general aim of the scenarios is to describe how the information need of the Antiques Roadshow viewers can be satisfied both from their couch and on the go, supporting both passive and more active needs. Linking to external information and content, such as Europeana [EUR], museum collections but also auction information, has been incorporated. These scenarios (three in total) can be found in full in D6.4 Scenario demonstrator v2. The personas and scenario summaries are provided below, after which the specific personalization issues and an example of a personalised profile are provided.
Rita: Tussen Kunst & Kitsch lover (young, medium media literacy) • Name and occupation: Rita, administrative assistant at Art History department of the University of Amsterdam • Age: 34 • Nationality / place of residence: Dutch / Amsterdam • Search behaviour: Explorative • Digital literacy: Medium
1. Rita logs in to the LinkedTV application, so she can bookmark chapters that interest her.
2. Rita is interested to find out more about the host Nelleke van der Krogt.
3. Rita wants more information on the location of the programme, the Museum Martena, and the concept of period rooms.
4. Rita wants more information on an object, the Frisian silver tea jar, and on Frisian silver in particular.
5. Rita wants to bookmark this information to look at in more depth later.
6. Rita wants to learn more about painter Jan Sluijters and the art styles he and his contemporaries represent.
7. Rita wants to plan a visit to the Museum Martena.
8. Rita invites her sister to join her when she visits the Museum Martena.
9. Rita checks the resources she’s added to her favourites.
10. Rita sends a link to all chapters with expert Emiel Aardewerk to her sister.
11. Rita switches off.
Bert and Anne: Antiques Dealer, volunteer (older, high + low media literacy)
• Name and occupation: Bert, antiques dealer. Anne, volunteer at retirement home.
• Age: Bert - 61, Anne - 59
• Nationality / place of residence: Dutch / Leiden
• Search behaviour: Bert - Focused. Anne - Explorative
• Digital literacy: Bert - High. Anne - Low
1. Bert sees a chapter about a statuette from the late 17th century, worth 12.5K euros, which is similar to a statuette he recently bought.
2. Bert bookmarks this chapter, so he can view it and the information sources related to it later on.
3. Bert immediately gets the chance to do so, because a chapter about a watch is next, something he doesn’t really care for.
4. Anne, however, is very interested in the watch chapter: it depicts gods from Greek mythology and she wants to brush up on her knowledge. She asks Bert to bookmark the information on the Greek gods to read later.
5. Anne would like to know more about why tea was so valuable (more than its silver container!) in the 18th century. Bert bookmarks the silver tea jar chapter for her.
6. Bert and Anne read and watch the additional information related to the wooden statue chapter and the Greek mythology after the show. Bert has sent the latter to Anne by email, so she can read it on her own device.
Michael: library manager (middle-aged, high media literacy)
• Name and occupation: Michael, library manager at a public library.
• Age: 55
• Nationality / place of residence: Dutch / Laren
• Search behaviour: Explorative and focused
• Digital literacy: High
1. Michael comes home late and has missed the latest Tussen Kunst & Kitsch episode. He logs in to the LinkedTV application and starts watching it from there.
2. He skips the first chapter, which doesn’t interest him, and then starts watching one about a Delftware plate.
3. He likes Delftware, and sends the chapter to the main screen to explore more information about the plate on his tablet. It turns out he doesn’t like this specific plate much.
4. He selects a related chapter filmed at De Porceleyne Fles, a renowned Delftware factory in Delft. This one is about a plate he does like.
5. He adds relevant Delftware chapters to his “Delftware” playlist.
6. After this, there’s a chapter on a silver box, which reminds him of a silver box he inherited from his grandparents.
7. Michael sees a link to similar content related to the chapter and finds another box similar to the one he owns. He bookmarks the chapter and shares the link via Twitter.
Personalization in the S&V scenarios
In order to make clear how personalization appears in these user scenarios, the Michael scenario summary is expanded below to indicate 1) which concepts the persona finds interesting (or not) and 2) how the persona acts in various situations.
CONCEPTS:
• Michael is interested in [boxes] [made out of silver] that were made in [Europe].
o Short term preference: 100%
o Middle term preference: 90%
o Long term preference: 85%
• Michael would like to learn more about the [Jewish] [Sukkot] festival, in which the citrus fruit [etrog] plays an important role.
o Short term preference: 60%
o Middle term preference: 20%
o Long term preference: 5%
• When Michael really likes a type of art object, like [Delftware plates] made at [De Porceleyne Fles], he wants to see [all related chapters] from TKK.
o Short term preference: 100%
o Middle term preference: 80%
o Long term preference: 60%
• Michael is not interested in [Delftware plates] with [oriental depictions].
o Short term preference: -100%
o Middle term preference: -50%
o Long term preference: -50%
• Michael wants to learn more about the [designer] [Jan Eisenlöffel], who made objects in the [art nouveau]-related style [New Art], since Michael really loves art nouveau.
o Short term preference: 90%
o Middle term preference: 80%
o Long term preference: 70%
• Michael bookmarks the top three recommended TKK chapters related to [art nouveau] to watch later.
o Short term preference: 90%
o Middle term preference: 60%
o Long term preference: 40%
• Michael is not interested in [African] [masks].
o Short term preference: -85%
o Middle term preference: -85%
o Long term preference: -85%
• Michael is interested in [paintings] with value [over 10,000 euros].
o Short term preference: 90%
o Middle term preference: 90%
o Long term preference: 90%
SITUATIONAL CONTEXT
Rita
When Rita is not very interested in a chapter, she likes to use that time to [pick up her weights] and does [some weight-lifting] until the chapter is over.
Bert and Anne
• When Anne likes a chapter, but Bert doesn’t, he will [look away from the main screen and browse the web on his tablet], whereas Anne will keep watching the main screen.
• When Anne doesn’t like a chapter, but Bert does, she will [get up and make a coffee], whereas Bert will keep watching the main screen.
Michael
• When Michael [views TKK with his wife], they specifically like to plan [visits to the museum] in which the episode is recorded.
• When Michael has [missed an episode], and a chapter comes up that he doesn’t find interesting (e.g. one on an [African mask]), he will [skip to the next chapter].
• However, when he watches an episode [together with his wife], he will [not skip the chapter], because she likes to see the whole show; he will then not [watch the television screen] but [use his tablet to surf or check his mail].
• Sometimes Michael uses the TKK Linked Culture app to [browse through the show’s archive] based on his interests (e.g. ‘art nouveau’), [bookmark chapters related to his interest] and then [watch the bookmarked chapters one after the other].
• When Michael is watching a chapter while [browsing the TKK archive], and he sees related information he likes on the second screen, he will [click the related information], [pause the episode] and [resume it when he’s checked out the information].
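The short, middle and long term preference values above can be thought of as the same interest history viewed through decay windows of different lengths: recent events dominate the short-term value, while the long-term value averages over months. A minimal sketch of this idea (purely illustrative; it is not the actual WP4 preference-learning algorithm, which is described in later chapters) could weight each tracked interest event by its age:

```python
def decayed_interest(events, now, half_life):
    """Weighted average of interest events (timestamp, weight in [-1, 1]),
    where an event's influence halves every `half_life` time units.
    A short half-life yields a short-term preference, a long half-life a
    long-term one. Illustrative only, not the LinkedTV model."""
    num = den = 0.0
    for t, w in events:
        d = 0.5 ** ((now - t) / half_life)  # exponential decay factor
        num += w * d
        den += d
    return num / den if den else 0.0

# Hypothetical history: Michael showed strong interest in silver boxes
# three times over two months (day, interest strength).
history = [(0, 0.9), (30, 0.8), (60, 1.0)]
short_term = decayed_interest(history, now=60, half_life=7)    # ~0.99
long_term = decayed_interest(history, now=60, half_life=365)   # ~0.90
```

With a consistent positive history, both values stay high but the short-term one tracks the most recent event more closely, matching the 100%/90%/85% pattern above.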
2.1.2 RBB Scenarios
The RBB scenarios are based on the daily news show RBB AKTUELL, which is being enriched with the help of the LinkedTV process. A combination of devices (HbbTV set or set-top box plus optional tablet) allows different depths of (extra) information to cater for varying information needs. Levels of information can be chosen step by step:
1. Watching the “pure” news show, users will see notifications in the top right corner whenever extra information is available
2. If a user is interested in this extra information, s/he can get an introductory text about this person, location or topic easily on the TV screen with just a push of a button
3. Whenever a user feels that this introduction was not enough, s/he will pick up the tablet and find more information and links to further resources in the second screen app
The following describes how different users apply these steps differently, according to different interests and information needs as well as different device preferences. [Entities] will be followed by an interest value in (XY %).
2.1.2.1 Nina, 33, urban mom
Nina, now 33, is a young, urban mom. Her baby is growing and getting more active, so Nina has to be even more flexible, also with respect to where she is watching the news and when she is consuming additional information; e.g. the baby is sleeping or playing in her room, but Nina has to keep an eye on her and be able to pause the interactive LinkedNews at any time. The tablet is her main screen, not only because she is young and innovative and keeps playing around with the tablet any free minute to escape from her daily responsibilities, but also because it makes her more mobile. Nina's show always has the chapter first on the list which was detected as the most relevant according to her profile settings - but switching back to the default view is very simple.
According to her preferences, Nina will receive the topics in the order described in the following (#1, #2, #4, #5, #8, #9, #7, #3, #10, #6, #11).
How Nina is watching the show of 02 June 2014
Nina is generally not interested in the [host] (0%), and this guy, [Arndt Breitfeld] (0%), doesn't change her mind.
1. Chapter #1: New Reproach against BER Management
Nina is generally interested in Berlin politics ([Berlin] AND [politics]: 90%) and has been watching the developments around [BER airport] (80%) closely. She found it especially interesting that [Wowereit] (90%) and the BER holding [Flughafengesellschaft Berlin-Brandenburg BER] (60%) invited [Hartmut Mehdorn] to become BER top manager, although under his lead [Deutsche Bahn] (30%) had almost gone bankrupt. She is very interested in [Federal politics] (90%), but only to a limited extent when it comes to the [Ministry of Transport] and its Minister [Alexander Dobrindt] (40%). She doesn't like the Pirates party [Die Piraten] (-40%), incl. [Martin Delius] (30%), but they started playing an interesting role in German politics, so she couldn't afford to miss what they are saying - in effect, that means she would not be interested in reading background information, but she would not want to miss news items where they give their comments!
2. Chapter #2: Danger of a Blackout?
Would police, fire and other rescue services still work if all electricity went off? [Emergency Management] (70%) This is definitely a matter of importance for an urban mother! She consumes all the information about [Energy] (80%) / [Energy AND Security] (80%), the Berlin [Fire Dept.] (50%), the Berlin [police] (50%), and Berlin's public transport service providers [Berliner Verkehrsbetriebe BVG] (75%) and [S-Bahn Berlin GmbH] (85%). Nina really can't stand Berlin's Senator of the Interior, [Frank Henkel] (-70%), but as she is very interested in such security issues she fights the wish to skip and listens to what he has to say. Listening to [Christopher Lauer] (-40%), another member of the Pirates party [Die Piraten] (-40%), is equally hard for her to bear, so, as she thinks that this news item seems to be done anyway, she eventually skips to the next chapter.
Then there is this expert interview: As an ecology-minded person, Nina is interested in hearing about how the much discussed [Energy Transition] (90%) can even foster her need for energy security. She picks up her tablet again to check what it might hold for her and takes the time to check the enrichments on the German Institute for Economic Research [Deutsches Institut für Wirtschaftsforschung DIW] (50%), which is here represented by the expert interviewee, [Claudia Kemfert] (0%). When the discussion turns to [Renewable Energies] (80%) and the expert defends these with strong and convincing arguments, Nina's interest unexpectedly rises and she picks up the app's enrichments on [Claudia Kemfert] (50%).
3. Chapter 4: Brown Coal in Brandenburg
Nina is wondering a little why a news item on [Brandenburg] (20%) should be ranked so high on her preference list, but soon she realises that this is about [Renewable Energy] (80%), [Greenpeace] (90%) and [people's rights] (60%), so she listens intently. Seeing that politicians of the [SPD] (60%) and [Die Linke] (65%) act against their own promises really makes her angry, but the fact that people from the area will be relocated [Umsiedlung] (50%) from their homes to other places is even more annoying. Nina is interested in the mentioned plans, both on the pullout from fossil energies and on the relocation of whole villages, so she checks the enrichments to learn more. The [Renewable Energy] (80%) expert again. Nina liked her arguments in the other interview, so she stays interested.
4. Chapter 5: Refugee Camps in Berlin
Nina has followed the story of the refugees [Flüchtlinge] (70%) on Oranienplatz and the development of the discussions closely. [Human Rights] (70%) and the stories of refugees and how they are treated are always interesting for her. Usually she likes to look at the situation in other countries; now she is very interested in seeing what is going on in Berlin and Germany.
5. Chapter 8: New RBB Smart Apps
Here is a new app!? Of course, Nina is interested in [Smartphone]s (65%), [Tablet]s (70%) and other [New Media] apps and devices, so she listens carefully to how the new apps intend to enable user participation.
6. Chapter 9: Arthaus Festival celebrates 20th anniversary of Arthaus Films
[Die Blechtrommel] (70%) always used to be one of her favourite movies, and Nina loves going to the [Cinema] (80%). [Günter Grass] (65%) has been discussed a lot in past years for his diverse history: he seems to have been in the [SS] (50%) and thus a servant to [Nationalsozialismus] (National Socialism) (65%), but in the 1970s and 1980s he used to be famous for his left-wing activities.
7. Chapter 7: Short News Block 2
[Charity] (70%) is always a nice topic, so Nina keeps her attention high while watching this short news item on a campaign where people leave their change behind at the supermarket cash desk, so it can be transferred to children’s foster homes. And here is another heart-melting activity: someone supported the building of a hospice for end-of-life care for [children] (95%) and [youths] (75%). How could Nina not support this!?
[Science] (50%) is generally a topic which needs to be handled carefully, but Nina is definitely not interested in huge telescopes in Arizona's deadlands.
8. Chapter 3: Short News Block 1
Nina is shocked that Berlin's [police] (50%) apparently keep records of mental illnesses and even transferable diseases like HIV. What about the [German Constitution]'s (70%) [first article] (90%) ("Human dignity shall be inviolable. To respect and protect it shall be the duty of all state authority.")??? This is an outrageous provocation of this basic law!
Another car accident; bitter, but nothing to look at behind the scenes. Before she could even consider skipping, the spot was over.
Oh, but this accident between a bicycle and a car happened just around the corner, in her neighbourhood in [Prenzlauer Berg] (90%)! Maybe she even knew this guy? She quickly thinks about who she might have to call, but of course, the news doesn't mention names in such events.
9. Chapter 10: Medien Morgen
[Glienicker Brücke] (70%) is a beautiful spot between [Potsdam] (20%) and [Berlin] (70%), but Nina likes neither [Steven Spielberg] (-50%) nor [Tom Hanks] (-30%) too much. Hearing about the [Geisterbahnhof] (unused station) underneath [Kreuzberg]'s (60%) Dresdner Straße really makes Nina curious, and she is very interested in checking the available enrichments.
10. Chapter 6: Public Viewing on the Sofa
Nina loves [Brazil] (75%) but is not interested in [Soccer] (-40%), and she absolutely detests [FIFA] (-95%) and what they did to take as much as possible out of the [FIFA World Cup] (-100%). Therefore, Nina ignores this news chapter, which was automatically sorted to the end of her list.
2.1.2.2 Peter, 65, retired
Since Peter retired, he is mainly interested in culture and sports and everything that happens around him, in Potsdam and the region of western Brandenburg. Peter knows what is going on, but he is always interested in taking a closer look.
When it comes to personalization, Peter is rather conservative. He trusts the editors: when they deem something very important, it will be very important. To him an editor is almost like a director: he (or she) has a tension curve in mind and Peter wouldn't want to destroy this, so his settings read: “Editor's order”.
How Peter is watching the show of 02 June 2014
Peter generally likes looking at the information cards for speakers/anchormen. This young man [Arndt Breitfeld] (50%) seems to be new in the anchorman position, but Peter thinks that he may have seen him before - so he checks the tablet for background information on the young man.
1. Chapter 1: New Reproach against BER Management
Peter is generally interested in regional topics [Brandenburg] (90%) and especially in [BER] (75%), as it is the nearest airport. Furthermore, this is about corruption, that is to say that these people like [Jochen Großmann] (0%) waste our public money, and that is absolutely unacceptable! Who is this guy, anyway? Peter had never heard of him before, so it is time to check this new rbb tablet service! Even Federal Minister [Alexander Dobrindt] (-20%) [Federal politics] (40%) now joins the discussion and announces action in this tragedy.
Peter is not particularly fond of the representatives of the Pirates party [Die Piraten] (-70%), but this guy [Martin Delius] (30%) surprisingly voices Peter's own thoughts!
2. Chapter 2: Danger of a Blackout?
Would police, fire and other rescue services still work if all electricity went off? [Emergency Management] (90%) is definitely an issue for everyone! Peter listens closely and meanwhile bookmarks the additional information on his favourite topics: [Fire Rescue] (90%), [Police] (80%) and [Technology] (95%).
[Berlin] (0%)! It's always Berlin! Does anyone care about the weak infrastructure in [Brandenburg] (95%)? They always talk about [Berliner Verkehrsbetriebe BVG] (-80%) and [S-Bahn Berlin GmbH] (30%), which at least operates trains to Potsdam, too. The ways are much longer between the small townships in the country, and the network of buses and trains is by far weaker. Peter switches to the next spot to see if it brings anything about [Potsdam] (90%).
3. Chapter 3: Short News Block 1
[Berlin] (0%) again! But the short news items are usually too short to skip, so Peter stays with it. Hm, so Berlin's [Police] (80%) keep records of people with mental illnesses and transferable diseases? Yeah, so what!? That is absolutely logical and fair, because they have to know about these special dangers, or not?
Oh, a car accident on the Autobahn A10 near [Ludwigsfelde] (80%)! That is actually quite nearby! [Brandenburg] (95%). Oh, a biker got killed in an accident!? No one knows why this man, 42, unexpectedly changed from the bike track to the road, but these bikers are crazy, anyway!
4. Chapter 4: Brown Coal in Brandenburg
This next chapter is about the [Social Democrats] (60%) and the [Socialists] (85%) who rule in [Brandenburg] (95%) and how they lied to get voted in! Peter is truly disappointed that even his preferred party cannot be trusted! While Peter is still checking the tablet for information about what the people of the region think, a new spot about refugees in [Berlin] (0%), [Kreuzberg] (-70%), starts.
5. Chapter 5: Refugee Camps in Berlin
As Peter is not at all interested in what the hippies do in Berlin's streets, he quickly pushes the Arrow Up and skips to the next spot by pushing the Arrow Left.
6. Chapter 6: Public Viewing on the Sofa
Peter is not so much into [sports] (40%), let alone [soccer] (20%), but with the [FIFA World Cup] (55%) coming, it may be worth listening, and indeed... This looks like a lot of fun: people can bring their sofas into the football stadium and meet there for public viewings! Sitting on the sofa and not being alone - how could he not love the idea!? But, unfortunately, the stadium at [Alte Försterei] (0%) in [Berlin] (40%) [Köpenick] (20%) is much too far away, and he has no idea how to get his sofa onto the green! But he likes the idea.
7. Chapter 7: Short News Block 2
Peter had seen this [charity] (65%) campaign at the supermarket and he likes this grey-haired guy, but somehow he still didn't get how he could do any good, i.e. how he could help in this campaign. The tablet certainly has links to further information, so Peter quickly grabs it and pushes the „Charity“ box with the image of this famous guy to the bookmarks section at the top, to check it later.
[Science] AND [Technology] (80%) has always been a favourite topic for Peter, so he is especially proud that scientists from [Potsdam] (90%) now send a huge telescope or something to [America] (-40%). This may help these Ami guys see that Potsdam is much bigger than they thought!
8. Chapter 8: New rbb Smart Apps
There is the nice young man again, announcing that rbb's news shows, both the one for [Berlin] (0%) and the one for [Brandenburg] (85%), have now launched apps for tablets [Technology] (80%). Peter listens closely, trying to understand what makes these better than the one he is using just now - probably it's the option to send comments and even photos or videos if you happen to witness any accident or so. Now that sounds nice, so Peter quickly bookmarks this spot for the download information, so he may try them later. ...and it is also nice to see the speakers and moderators of RBB, like [Tatjana Jury] (80%), [Dirk Platt] (60%), [Cathrin Böhme] (70%) and [Sascha Hingst] (90%), and even some people from behind the scenes, like [Christoph Singelnstein] (0%), the editor-in-chief.
9. Chapter 9: Arthaus Festival celebrates 20th anniversary of Arthaus Films
“[Die Blechtrommel]”? (30%) by [Günter Grass] (-65%). Yes, Peter had heard this book title numerous times, but he doesn't know much about it, as he preferred reading East German books at the time. So, he calls up the information cards on the TV screen again to get a first notion and see if he should explore further. After the first bits of information he decides he has seen enough.
Eventually, Peter closes the service and the TV in general to go and check the bookmarks he had made during the show.
2.2 Context-aware scenarios
The proposed artistic scenarios managed by the UMONS partner explore various opportunities arising from the merging of LinkedTV technologies and media arts. They aim to present demonstrations of current achievements and to trigger conceptual ideas from several media artists. Artists who participated in the call for projects were assisted by UMONS to define the outline, and then to refine their projects to use the technologies developed for LinkedTV in the most relevant way.
Of the three retained scenarios, one had a specific interest in identifying context and behaviours. This scenario does not make use of personalization techniques, but it uses contextual features (number of people, looking at the main/second screen, joint attention, viewing time) on viewers’ reactions provided by the interest and context tracker developed in WP4.
This scenario is called Social Documentary and is detailed in deliverable D6.4, section 4.2. In short, it consists of an interactive and evolving artistic installation created to navigate through a collection of multimedia content. The project emerged after the social events that happened in Gezi Park in Istanbul (Turkey, June 2013), during which a lot of content was produced both by TV channels and by the protestors themselves. The artists, four former students from Istanbul Technical University, wanted to re-use some of this content and present it “as is” in their installation. They also wanted to use the visitors’ behaviour as an input to their system, to compare it to the behaviour of the people in the images and to the violence of the videos. The attention of the visitors is used as a Facebook “like”: each time a visitor watches a video for at least 5.5 seconds, the rating of this video is increased. The rating is directly linked to the probability of displaying this video later to other visitors. In case of joint attention - two visitors looking at the same screen - the player adds a Tweet from our selection (tagged with #GeziPark, #occupyGezi, etc.) on the main screen.
The installation software is divided into three parts communicating through the OSC2 protocol. This protocol, which is UDP-based, is easily accessible from much of the software used by media artists. The player part, programmed in Processing (www.processing.org), a simplified Java platform extensively used by media artists, receives messages from a Reactable software and from the Interest Tracker developed in WP4, and manages the video feedback (through video projection and effects). We use a MS Kinect sensor and a modified version of the Interest Tracker to send specific messages using the OSC protocol. These messages (see Table 2) inform the player about the number of visitors in the room, whether they look at the main screen (projection wall) or the second screen (projected on a table in front of them), and the time they have spent looking at it, related to the attention level [HAW05]. Only the two visitors closest to the Kinect sensor are taken into account.
Table 2: OSC messages from the Interest Tracker to the player and the produced action
OSC message | Values | Description | Player effect
/context/nbusers | [0..6] | Number of users tracked in the room | Stop the video if 0
/context/facetrackedusers | userID [0..5], (userID [0..5]) | IDs of the users for whom face tracking is available | Updates internal state
/context/jointattention | 0, 1 | 1 if the face-tracked users are watching the same screen, 0 otherwise | Add tweet on the main screen
/context/user/attention | userID [0..5], screenID [0, 1], attention [0..3] | Attention level of a user, sent when it changes | Update the rating of the currently played video
/context/user/coordinates | userID [0..5], screenID [0, 1], x [-1..1], y [-1..1] | Approximate gaze coordinates on the screen for a single user; not used by the player | None
2 http://opensoundcontrol.org/
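To make the wire format of the messages in Table 2 concrete, the following is a minimal pure-Python sketch of how an OSC packet is laid out: a null-terminated address pattern padded to a multiple of 4 bytes, a type tag string (","  followed by one letter per argument), then the arguments as big-endian int32 values. The installation itself uses Processing and existing OSC libraries; this encoder handles only int32 arguments, which is enough for the Table 2 messages.

```python
import struct

def osc_pad(data: bytes) -> bytes:
    # OSC strings are null-terminated and zero-padded to a multiple of 4 bytes
    data += b"\x00"
    while len(data) % 4:
        data += b"\x00"
    return data

def osc_message(address: str, *args: int) -> bytes:
    # Pack an OSC message carrying only int32 arguments
    packet = osc_pad(address.encode("ascii"))
    packet += osc_pad(("," + "i" * len(args)).encode("ascii"))
    for value in args:
        packet += struct.pack(">i", value)  # int32, big-endian
    return packet

# e.g. the tracker reporting that user 0 looks at screen 1 with attention level 2;
# the resulting bytes would be sent in a single UDP datagram to the player.
packet = osc_message("/context/user/attention", 0, 1, 2)
```

On the receiving side, the player only has to split the datagram at the padded boundaries to recover the address and the integer arguments.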
3 The Core Technology
The implicit personalization and contextualization workflow is displayed in Figure 1, which is also available below. It consists of the core WP4 technology, which will be fully implemented and tested throughout with LinkedTV data in the next deliverable (D4.7).
Concerning the implementation of the communication between modules (Figure 1), the Kinect-based behavioural Interest/Context Tracker sends events (over HTTP) to the player. The player enriches the events with the video ID and the time at which the events occurred and passes them to the GAIN module using the GAIN API (which is also HTTP-based), to enable retrieval of the specific media fragment for which an Interest/Context event was manifested. In addition to the behavioural Interest/Context events, the player also sends player interaction events (like pause, play, bookmark, etc.) to GAIN using the same channel.
The GAIN module fuses this data and provides a single measure of user interest for all the entities describing a given media fragment in a given context (alone, with other people, etc.). In addition, the PL module detects associations between entities for a given user, which it formulates as association rules. The communication between these modules is detailed in Chapter 5.
This information is sent using a RESTful service to the model-building step, namely the Linked Profiler. This step comprises conveying entities into the LUMO “world” via the LUMO Wrapper utility and using this data to progressively learn user preferences based on the Simple Learner component. The communication between the implicit tracking and the Linked Profiler module is detailed in Chapter 6.
Finally, the user models created by the Linked Profiler are passed on to the LiFR-based recommender, which matches user preferences to candidate concepts and content and as a result provides recommendations over this data.
Recommendation results are finally provided to the LinkedTV platform, as described in Chapter 7. The core pipeline provides all the functionalities needed by the three LinkedTV scenarios described in Chapter 2. Additional experimental modules can be used, for example for explicit model management; they are detailed in Chapters 8 to 11.
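The enrichment step performed by the player can be sketched as follows: the tracker's raw event is wrapped with the video ID and the media time before being posted to GAIN. The field names used here are illustrative assumptions, not the actual GAIN API schema (the real interface is described in Chapter 5).

```python
import json
import time

def make_event(user_id: str, video_id: str, event_type: str,
               media_time: float) -> str:
    """Sketch of a player-to-GAIN interaction event. Field names are
    hypothetical; the actual GAIN API may use different ones."""
    event = {
        "user": user_id,          # identifies the profile being learned
        "object": video_id,       # video ID added by the player
        "type": event_type,       # e.g. "play", "pause", "bookmark", or a
                                  # behavioural event from the tracker
        "mediaTime": media_time,  # position in the video when the event
                                  # occurred, so GAIN can resolve the
                                  # specific media fragment
        "sentAt": time.time(),    # wall-clock timestamp
    }
    return json.dumps(event)

payload = make_event("rita", "tkk-2014-06-02", "bookmark", 127.5)
# An HTTP POST of `payload` to the GAIN endpoint would follow here.
```

Because behavioural and player-interaction events share this channel, GAIN receives one uniform stream it can fuse into a single interest measure per media fragment.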
4 Core Reference Knowledge
In year 3 of LinkedTV, the ontologies engineered to constitute the background knowledge for the personalization and contextualization services were revised and updated. To this end, a new version of LUMO [TSA14a] (v2)3 was released, along with an arts and artefacts expansion, namely LUMO-arts.
4.1 LUMO v2
As described in deliverable D4.4, ch. 2, LUMO serves as a uniform, lightweight schema with well-formed semantics and advanced concept interrelations which models the networked media superdomain from an end-user’s perspective. It balances between being not too abstract and not too specific, in order to scale well and maintain the decidability of formal reasoning algorithms. These traits might make LUMO useful to a plurality of semantic services designed to manage/deliver content to an end user, even past its use within LinkedTV and besides or alongside personalization. Semantic search, semantic categorization, semantic profiling, semantic matchmaking and recommendation technologies that are hindered by the volume and inconsistency of other vocabularies and/or need to take advantage of advanced inferencing algorithms might benefit from reusing LUMO as their background ontology. Its reusability by semantic technologies is further strengthened by its connection to the most prominent LOD vocabularies, as described in D2.6, ch. 5, which prevail in Semantic Web applications.
In the scope of personalisation and contextualisation in LinkedTV, LUMO aims to (a) homogenize implicitly tracked user interests under a uniform user-pertinent vocabulary, (b) express and combine content information with contextual features under this common vocabulary and (c) provide hierarchical and non-taxonomical concept connections at the schema level that enable advanced semantic matchmaking between user profiles and candidate content, at both the concept and the content filtering layer. LUMO is primarily the schema behind the implicit tracking core workflow, where it is used both to formulate user preferences and to homogenize information about the content, but it can also be used in the explicit tracking branch (ch. 8-11) to express user interests.

In comparison to v1, v2 of LUMO has been updated and extended on 4 levels:

1) New concepts

New concepts (classes) were added at the schema level to enhance the completeness of the ontology. This provides greater coverage of (a) the relevant concept space and (b) the newest versions of the vocabularies that WP2 uses to annotate content and that LUMO maps to (refer to D2.6, ch. 5 for LUMO mappings to other vocabularies).

3 For LUMO engineering principles, design decisions and the core ontology/v1 presentation, refer to D4.4, ch. 2.
Covering these vocabularies is important, since WP2 annotations are the information from which implicit user preferences are built and which is used to determine the relevance of content to the user profiles. However, this extension and coverage was not exhaustive, in order to stay in line with the primary LUMO design decision: keep the ontology lightweight yet at the same time meaningful from a user perspective.

Therefore, over 100 new classes were added, mostly under the “Agent“, “Tangible“ and “Intangible“ categories. These were based mostly on the updated schema of the DBpedia ontology (version 2014) [LEH14] and in part on some concepts within YAGO2 [BIE13] relevant to the news scenario.

2) New axioms

The new concepts brought along the need to model new non-taxonomical concept relations. To this end, several new universal quantification4 axioms were added, in order to maintain connections of the “Agent“, “Tangible“ and “Intangible“ categories with related “Topics“, based on the “hasTopic“ and “hasSubtopic“ LUMO properties.

3) Revised semantics

The semantics of existing v1 concepts have been revised and updated to better reflect their hierarchical and non-taxonomical relations in the ontology. E.g. some concepts previously in the “Topics” subhierarchy were deemed as belonging under the “Tangible” or “Intangible” subhierarchies, but remain connected to their related topics with the “hasTopic” and “hasSubtopic” relations. An example can be seen in Figure 3. In addition, some concepts of v1 were omitted as too specific, in the interest of keeping the ontology lightweight.

Figure 3: Literary work products (article, novel, etc.) were moved under the Intangible > Work category in v2, as opposed to under the Topic > Literature category in v1.
They were related to Literature via the hasSubtopic property in a corresponding axiom.

4 I.e., relations of the form: entity ⊑ ∀has(Sub)Topic.topic, where entity is subsumed by the Agent, Tangible and Intangible concepts/categories of LUMO and topic is subsumed by the Topics concept/category of LUMO. Cf. D4.4, ch. 2.1.2 for more details.
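The axiom pattern of footnote 4 can be made concrete with the literary-work example of Figure 3. The axiom below is an illustrative reconstruction following that pattern, not a verbatim excerpt from the ontology file:

```latex
% Illustrative reconstruction (not copied from LUMO v2):
% a Novel, subsumed by Intangible > Work, may relate via hasSubtopic
% only to the Literature topic, subsumed by Topics.
\mathit{Novel} \sqsubseteq \forall \mathit{hasSubtopic}.\mathit{Literature}
```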
4) New object properties

In the interest of accommodating the needs of the explicit profiling branch (ch. 8-11) of WP4, which demands extensive concept interconnections at the schema level, almost 30 new object properties were added to the ontology, with corresponding semantics and domain/range attributes assigned to them. Figure 4 illustrates the object properties in v2, and an example of the semantics, domain and range of the property “authorOf”.

Figure 4: Object properties in LUMO v2 and the semantics, domain and range of property “authorOf“

The update in version 2 brings LUMO to 929 classes, 38 object properties and more than 130 universal quantification axioms.

4.1.1 LUMO-arts

In the interest of accommodating the LinkedTV cultural scenario (TKK scenario), an expansion of LUMO was engineered to provide more detailed coverage of the arts and artefacts domain. In order to maintain a reduced concept space, this expansion was modeled separately from the core, more generic, LUMO v2, but was built as an extension of the core hierarchy. This expansion was heavily based on the Arts & Architecture Thesaurus (AAT)5. The recent release of AAT as LOD enabled us to devise a well-formed hierarchy, while we adjusted the semantics to the core LUMO v2 schema.

5 http://www.getty.edu/research/tools/vocabularies/aat/about.html
The TKK scenario partners (S&V) have carefully examined the AAT and, out of the vast information in the vocabulary, defined several facets that were deemed the most relevant to describe TKK content. To this end, LUMO-arts models details on materials, clothing, furnishing, art styles, etc., which outline the contents of the TKK scenario. An extract of the ontology can be seen in Figure 5. The next version of LUMO-arts will delve deeper into the TKK scenario requirements and, based on AAT, will expand further on the requirements of the context-aware artistic scenario.

Figure 5: Extract of the LUMO-arts expansion
5 Implicit user interactions

User interactions can be captured implicitly from the user's behaviour. The captured features come either from a sensor watching the users (such as the Microsoft Kinect sensor [KIN10]) or from the player (logging the user's actions). All these interactions are linked to a video shot and sent to the GAIN module, which processes them into a value of “interest” quantifying how interested (or not) the user is in the current video shot.

5.1 Behavioural features extraction

As already stated in previous deliverables, implicit and contextual features which can be extracted by analysing the behaviour of the people watching the TV, such as the number of people watching, their engagement as expressed by their body language, their emotions or their viewing direction, bring crucial additional information for content personalization and adaptation.

One issue in using cameras to watch people while they view TV is of course the degree of acceptance and the ethical issues this situation raises. Nevertheless, the camera and depth sensor streams are never recorded and all the processing is done in real time. Only specific, controllable features are sent to the system. Moreover, the degree of acceptance of cameras watching people keeps growing since Xbox systems using the Kinect sensor entered many homes for gaming purposes. The extension from games to TV is very natural and has already happened, since Microsoft proposed an add-on for the Xbox One to watch TV and use the Kinect gesture recognition capabilities6. Google also acquired Flutter7, a company which provides webcam-based gestures, and this feature could be added to Chromecast. Apple bought PrimeSense8, the company which built the first Kinect version, and this technology could be used as a new feature for Apple TV.
Finally, classical TV manufacturers such as Samsung already offer cameras for communication or gesture control. This trend might be a first step towards the use of cameras and depth sensors for TV, with further steps going beyond gestural controls (already available on the market), where specific features will be acquired to enhance and personalize the TV experience. Viewer acceptance of cameras inside homes is also growing, from simple games to TVs. In this context, the work on the Interest/Context Tracker based on the first version of the Kinect sensor is very important, as new contextual features will need to be used by the WP4 pipeline to enhance personalization. WP4 will test the information brought by such future TV sensors and see how much it can enhance the personalization of the TV experience.

6 http://www.polygon.com/2014/8/7/5979055/xbox-one-digital-tv-tuner-europe
7 http://www.bbc.com/news/technology-24380202
8 http://www.forbes.com/sites/anthonykosner/2013/11/26/apple-buys-primesense-for-radical-refresh-of-apple-tv-as-gaming-console/
More details about the Kinect sensor used for the Interest/Context Tracker are available in previous WP4 deliverables (e.g. D4.4). The Kinect sensor is a low-cost depth and RGB camera. It contains two CMOS sensors, one for the RGB image (640 x 480 pixels at 30 frames per second) and another for the infrared images from which the depth map is calculated, based on the deformation of a projected infrared pattern. The depth sensor operates optimally in a range of 1.2 m (precision better than 10 mm) to 3.5 m (precision better than 30 mm) [GON13].

The main use of the Kinect is user tracking. The Microsoft SDK 1.8, which we used, allows tracking up to 6 users and provides skeletal tracking for up to two of them in order to follow their actions. It is possible to detect and track several points of the human body and reconstruct a “skeleton” of the user (see Figure 6). Skeletal tracking is able to recognize users standing or sitting (Figure 7), and it is optimized for users facing the Kinect; sideways poses imply higher chances of tracking loss or errors.

Figure 6: User tracking for default mode (left) and seated mode (right).

To be correctly tracked, the users need to be in front of the sensor, making sure the sensor can see their head and upper body. Tracking is also possible when the users are seated. The seated tracking mode is designed to track people who are seated on a chair or couch; only the upper body (arms, neck and head) is tracked. The default tracking mode, in contrast, is optimized to recognize and track people who are standing and fully visible to the sensor; this mode also tracks the legs. For the interest/context tracking we thus used the seated version of the skeleton since, in a TV configuration, there is a high chance of the legs being hidden. We can estimate the user's engagement from their sitting position.
From the skeleton we extract the torso orientation to determine whether the user is leaning forward or backward.
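As a minimal sketch of this lean estimation, one can compare the depth of two torso joints; the joint choice and the 5 cm threshold below are our assumptions for illustration, not the shipped implementation:

```python
# Illustrative sketch (assumption, not the LinkedTV code): classify
# forward/backward lean from the depth (z, in meters from the sensor)
# of two skeleton joints. The joint pair and the 5 cm threshold are
# example values.

def lean_direction(shoulder_center_z, hip_center_z, threshold_m=0.05):
    """Return 1 (forward), 0 (backward) or 2 (unknown/upright),
    mirroring the Viewer_engagement codes listed later in Table 3."""
    delta = hip_center_z - shoulder_center_z  # > 0: shoulders closer to sensor
    if delta > threshold_m:
        return 1   # leaning forward
    if delta < -threshold_m:
        return 0   # leaning backward
    return 2       # roughly upright / unknown
```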
Other features are computed based on the skeleton features: child/adult discrimination, and the user's attention and interest. Regarding age, the distance between the two shoulders is used and compared to a statistical set of child body anthropometry. The development of the body, and in particular the shoulder width, indicates whether the person is closer in build to a child or to an adult (Figure 11).

In addition to the skeleton features, Microsoft has provided a Face Tracking module with the Kinect SDK since version 1.5. These SDKs can be used together to “create applications that can track human faces in real time”. To achieve face tracking, at least the upper part of the user's Kinect skeleton has to be tracked in order to identify the position of the head (Figure 8).

Figure 7: Seated tracking with face tracking.

Based on the face tracking, it is also possible to extract facial features to obtain more information about the users' faces. The Microsoft SDK provides 6 facial features, called “animation units”. The tracking quality may be affected by the image quality of the RGB input frames (that is, darker or fuzzier frames track worse than brighter or sharper frames). Also, larger or closer faces are tracked better than smaller ones. The system estimates the basic information of the user's head: the neutral position of their mouth, brows, eyes, and so on. The Animation Units (AUs) represent the difference between the actual face and the neutral face. Each AU is expressed as a numeric weight varying between -1 and +1.
Figure 8: The different animation units given by the Microsoft SDK and their position on the face [WYR]

Based on these animation units, it is possible to compute and discriminate basic expressions. Nevertheless, in real-life TV conditions, the sensor is not precise enough to provide usable information about precise emotions. We managed to provide relatively reliable discrimination between the neutral pose and “non-neutral” poses (Figure 9). The system can thus provide events on emotion changes without being precise enough to identify the viewer's exact emotion.

Figure 9: Animation units on the left and expression discrimination on the right

Finally, the last and most important extracted feature is the head direction, which is a close approximation of the eye gaze (see the previous WP4 deliverable, D4.4, for more details). The Get3DPose() method returns two arrays of three floats each. The first one contains the Euler rotation angles in degrees for the pitch, roll and yaw, as described in Figure 10, and the second contains the head position in meters. All values are calculated relative to the sensor, which is the origin of the coordinate system.
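The neutral vs. non-neutral discrimination described above can be sketched as a simple threshold on the AU weights; the 0.3 threshold is an arbitrary example value and not the tuned value of the actual tracker:

```python
# Illustrative sketch (our assumption, not the shipped code): flag a
# "non-neutral" expression when any Animation Unit weight deviates
# sufficiently from 0 (the neutral face). Threshold 0.3 is an example.

def is_non_neutral(au_weights, threshold=0.3):
    """au_weights: the 6 AU weights in [-1, +1] from the Kinect SDK.
    Returns 1 for a non-neutral expression, 0 for neutral, mirroring
    the Viewer_emotion codes listed later in Table 3."""
    return 1 if any(abs(w) > threshold for w in au_weights) else 0
```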
Figure 10: Three different degrees of freedom: pitch, roll and yaw [FAC].

The technique used to estimate the head rotations and track the facial features is not described by Microsoft, but the method uses the RGB image and the depth map. The head position is located using the 3D skeleton on the depth map only. The head pose estimation itself is mainly performed on the RGB images. Consequently, the face tracking hardly works in bad lighting conditions (shadows, too much contrast, etc.). This drawback is solved in Kinect version 2, where the head tracking is done using the depth map and the infrared image, which are much less sensitive to illumination changes.

Based on the head pose estimation, it is possible to know where the user is looking (main screen, second screen or elsewhere). Based on how long the user watches a screen, a measure of attention is given [HAW05]:

• Gaze not taken into account if shorter than 1.5 seconds
• Orienting attention (1.5 s to 5 s)
• Engaged attention (5 s to 15 s)
• Staring (more than 15 s)

In addition to these single-user features, if two viewers look at the same screen, the attention mode becomes “joint attention”, which might show mutual interest in the content displayed on the screen.

Figure 11: User face windows with head pose estimation and age estimation.
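The attention thresholds above can be sketched as a small classifier; the function name and labels are ours, while the 1.5 s / 5 s / 15 s boundaries come from the text:

```python
# Sketch of the [HAW05] attention levels used above. Boundaries are
# taken from the text; naming is ours.

def attention_level(gaze_seconds):
    """Map the duration of a continuous look at a screen to an
    attention level, or None for looks too short to count."""
    if gaze_seconds < 1.5:
        return None          # hazard peak / monitoring look, ignored
    if gaze_seconds < 5:
        return "orienting"
    if gaze_seconds < 15:
        return "engaged"
    return "staring"
```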
The features described here are summarized in the table of section 5.2. A first validation test was conducted on the head direction feature; it showed the best robustness compared to other available state-of-the-art methods. This validation is detailed in the next section.

5.1.1 Head direction validation: the setup

For the validation of the head direction feature, the results obtained by the Kinect sensor are compared with an accurate measurement of the head movements (Figure 12). This ground truth was obtained with an optical motion capture system from Qualisys [QUA]. The setup consists of eight cameras, which emit infrared light and track the position of reflective markers placed on the head. The Qualisys Track Manager Software (QTM) provides the possibility to define a rigid body and to characterize the movement of this body with six degrees of freedom (6DOF: three Cartesian coordinates for its position and three Euler angles - roll, pitch and yaw - for its orientation).

Figure 12: The user is placed in front of the TV, and covers his head with a hat with infrared reflectors for the Qualisys system.

The Qualisys system produces accurate marker-based data in real time for object tracking at about 150 frames per second. The infrared light and markers do not interfere with the RGB image or with the infrared pattern from the Kinect. Qualisys was chosen as the reference specifically in order to compare markerless methods without interference. This positioning is shown in Figure 13. The angles computed by the different methods are Euler angles.
Figure 13: Setup for facial tracking recording. The Kinect for the head tracking algorithm is marked in green. We can also see the infrared reflectors for the Qualisys system on the TV corners.

We made several recordings with 10 different candidates. Each candidate performed the same head movement sequence (verticals, horizontals, diagonals and rotations) at 5 different distances from the screen: 1.20 m, 1.50 m, 2 m, 2.5 m and 3 m. The movements performed are conventional movements that people make when facing a TV screen (pitch, roll and yaw; combinations of these movements; slow and fast rotations). Six of the candidates had very light skin, the others had a darker skin color. Some of the candidates had beards and others did not.

5.1.2 Head direction validation: some results

After synchronizing the results obtained by the Kinect SDK and the reference, and since the sampling frequencies differ, we interpolated the reference values to obtain points at the same moments for the two systems. To make the comparison with the reference computed by the Qualisys, we use two metrics: the Root Mean Square Error (RMSE) and the correlation (CC).

The correlation is a good indicator for establishing the link between a set of given values and its reference. It is interesting to analyse the average candidates' correlation value obtained for each distance from the TV screen. If the correlation value is equal to 1, the two signals are the same. If the correlation is between 0.5 and 1, we consider the dependence strong. A value of 0 shows that the two signals are independent, and a value of -1 corresponds to the opposite of the signal. Figure 14 shows the correlation for the pitch, Figure 15 for the yaw and Figure 16 for the roll. The curve from the Kinect SDK is compared with the reference obtained with the Qualisys system. The pitch, roll and yaw are described in Figure 10.
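The two metrics used in this comparison can be sketched as follows; this is a generic definition of RMSE and Pearson correlation on two already-resampled angle signals, not the project's analysis script:

```python
# Sketch of the validation metrics: RMSE and Pearson correlation (CC)
# between a measured angle signal and the Qualisys reference, assumed
# already interpolated to common timestamps as described above.
import math

def rmse(measured, reference):
    n = len(measured)
    return math.sqrt(sum((m - r) ** 2 for m, r in zip(measured, reference)) / n)

def correlation(measured, reference):
    n = len(measured)
    mx = sum(measured) / n
    my = sum(reference) / n
    cov = sum((m - mx) * (r - my) for m, r in zip(measured, reference))
    sx = math.sqrt(sum((m - mx) ** 2 for m in measured))
    sy = math.sqrt(sum((r - my) ** 2 for r in reference))
    return cov / (sx * sy)
```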
Figure 14: Mean correlation with the reference for the pitch depending on the distance from the TV.

In Figure 14, we observe that the pitch (up-down movements) of the Kinect SDK has a good correlation (0.84) at a distance of 1.20 m. It decreases with the distance, dropping below the correlation value of 0.5 from 2 meters onward.

Figure 15: Mean correlation with the reference for the yaw depending on the distance from the TV.

For the second angle, the yaw (right-left movements), Figure 15 shows good results for the Kinect SDK, with values above 0.9 at 1.20 m, 1.50 m and 2 m. The values then decrease from 0.85 at 2.50 m to 0.76 at 3 m. We can consider the correlation values for the Kinect SDK to be quite good. The yaw measure is thus much more reliable than the pitch measure.
Figure 16: Mean correlation with the reference for the roll depending on the distance from the TV.

The Kinect SDK also has a good correlation on the roll curve (0.93 down to 0.7), as shown in Figure 16.

After examining the correlation values, it is also interesting to look at the mean error made by each system. Indeed, a method with a high correlation and a low RMSE is considered very good for head pose estimation. Figure 17 shows the RMSE for the pitch, Figure 18 for the yaw and Figure 19 for the roll.

Figure 17: Mean RMSE (in degrees) for the pitch depending on the distance to the TV.

We observe in Figures 17 to 19 that the RMSE obviously increases with the distance to the TV (more precisely, to the Kinect sensor located on the TV). For the pitch (Figure 17), the Kinect SDK is good at 1.20 m with an RMSE of 5.9 degrees. The error grows with the distance, up to 12 degrees beyond 2.5 m.
Figure 18: Mean RMSE (in degrees) for the yaw depending on the distance to the TV.

For the yaw, we observe in Figure 18 a higher mean error (from 10 to 12 degrees), but the growth of this error with the distance is very small.

Figure 19: Mean RMSE (in degrees) for the roll depending on the distance.

In the case of the roll (Figure 19), the Kinect SDK gives RMSE values around 10 degrees, with a smaller error at 2 m. While it is possible to extract it, the roll is of less interest in the LinkedTV project, where mainly the yaw and pitch are used. The correlation is good and places the LinkedTV interest tracker at the top of the state of the art in the field. Moreover, all these values will become even better when using the Kinect One, the second version of the Kinect sensor.
5.2 Communication of behavioural features with the LinkedTV player

The Interest Tracker software offers several measures of contextual and behavioural features, such as the number of people in the room, their position or a basic expression analysis. All these features, which are computed in real time without any need for data recording, can be sent to the LinkedTV player. The features are computed at an average frequency of 30 times per second. To avoid sending too many messages to the player, only feature variations (events) are sent over the network. The player receives the messages, adds the current video ID and time, and forwards the event to InBeat via the GAIN API for user profiling and personalization.

Several network protocols have been implemented in the software. We implemented HTTP (POST and GET) and websocket communication with the cURL9 and easywebsocket10 libraries respectively. Another version of the Interest Tracker has been developed for artistic scenarios; this one uses the OSC protocol (see section 2.2 and D6.4 for more details). The list of features and their format for the HTTP (GET) and websocket protocols are detailed in Table 3.

Table 3: List of behavioural and contextual features available from the Interest Tracker. The features can be sent through the HTTP or websocket protocols.
Feature Name Value

Number of detected people in front of the TV Recognized_Viewers_NB 0, 1, 2
Websocket: {"interaction":{"type":"context"},"attributes":{"action":"Recognized_Viewers_NB","value":[0,1,2]}}
HTTP GET: http://baseUrl?Recognized_Viewers_NB=[0,1,2]

The screen the user is currently watching (for each user) [HAW05] Viewer_Looking
0 = viewer does not look at any screen (maybe someone called him or he is simply doing something else in front of the TV)
1 = viewer has looked at the main screen for more than 1.5 seconds (if less than 1.5 s, nothing is sent, as this corresponds to hazard peaks or monitoring looks). From 1.5 seconds we have "orienting" looks.

9 http://curl.haxx.se/
10 https://github.com/dhbaird/easywsclient
2 = main screen for more than 5 seconds. From 5 seconds we have "engaged" looks.
3 = main screen for more than 15 seconds. From 15 seconds we have "staring" looks.
4 = second screen for more than 1.5 seconds
5 = second screen for more than 5 seconds
6 = second screen for more than 15 seconds
Websocket: {"interaction":{"type":"context"},"attributes":{"action":"Viewer_Looking","value":[0,1,2,3,4,5,6], "confidence":[0..1]},"user":{"id":[1,2,3,4,5,6]}}
HTTP GET: http://baseUrl?UserID=[userID]&Viewer_Looking=[0,1,2,3,4,5,6]

There are two people in front of the TV and both are looking at the same screen Viewer_Joint_Looking 1 if true, 0 otherwise
Websocket: {"interaction":{"type":"context"},"attributes":{"action":"Viewer_Joint_Looking","value":[0,1], "confidence":[0..1]}}
HTTP GET: http://baseUrl?Viewer_Joint_Looking=[0,1]

There are only adults in front of the TV Viewer_adults 1 if true, 0 otherwise
Websocket: {"interaction":{"type":"context"},"attributes":{"action":"Viewer_adults","value":[0,1],"confidence":[0..1]}}
HTTP GET: http://baseUrl?Viewer_Adults=[0,1]

Basic emotion analysis (for each user) Viewer_emotion 0 if neutral, 1 otherwise
Websocket: {"interaction":{"type":"context"},"attributes":{"action":"Viewer_emotion","value":[0,1],"confidence":[0..1]}}
HTTP GET: http://baseUrl?UserID=[userID]&Viewer_Emotion=[0,1]

User leans forward/backward Viewer_engagement
0 = lean backward
1 = lean forward
2 = unknown (HTTP only)
Websocket: {"interaction":{"type":"context"},"attributes":{"action":"Viewer_engagement","value":[0,1],"confidence":[0..1]}}
HTTP GET: http://baseUrl?UserID=[userID]&Viewer_Engagement=[0,1,2]

In the examples, the value of baseUrl can be any valid URL which will handle such messages in the player interface. For testing and debugging, we used http://httpbin.org/get, an address which returns exactly the data it receives.

5.3 Communication of the LinkedTV player with the GAIN/InBeat module

Communication between the LinkedTV player and the GAIN module of InBeat is performed using REST API calls from the player to the GAIN module. The API is designed to handle multiple types of interactions, including standard player actions (e.g. play, pause, bookmark, view of enrichment …), user actions (login, bookmark ...), platform-specific actions (add or remove second screen …) and contextual features (e.g. viewer looking, number of persons …). In this section, we describe the new version of the API and the communication format for all previously described actions. The communication was tested using a Noterik player simulator, which emulates user actions in the player, generating the respective calls to the GAIN API.

5.3.1 API description

The first version of the GAIN API was described in D4.2 – User profile schema and profile capturing. Since GAIN became part of InBeat [KUC13], there has been a change in the base URL of the API, and we also updated the exchange format in order to support more types of interaction.

Table 4: Description of the REST service used for tracking interactions

Track interaction
Description POST /gain/listener
HTTP Method POST
Content-Type application/json
URI http://inbeat.eu/gain/listener
cURL curl -v --data @data.json http://inbeat.eu/gain/listener
data.json Format is described in the following section.
Status codes 201 – Created; 400 – Bad request
5.3.1.1 Player Actions

There is no change in the format for actions generated by the user's operation of the remote control (or player control buttons or gestures in general). All such events should be assigned the value player in the category attribute, and the action attribute is set to one value from the enumeration of possible actions. The location attribute specifies the time passed since the start of the video.

Figure 20: pause action performed by Rita at 32 s of video

Figure 20 presents an example of the action “pause” performed by Rita at 32 s of the video. Examples of the action “bookmark“ of a specific chapter and of a “view” of an enrichment presented by the player to the user are described in Figures 21 and 22 respectively.

Figure 21: bookmark of a specific chapter performed by Rita

Figure 22: view action of a presented enrichment performed by Rita
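Since Figures 20–22 are images, the sketch below reconstructs a plausible payload for the "pause" example from the attributes named in the text (type, category, action, location, objectId, user id). The exact field nesting and the objectId value are our assumptions, not the authoritative GAIN format:

```python
# Hedged reconstruction of a GAIN player-action interaction
# (cf. Figure 20). Attribute names come from the text; the nesting
# and the example objectId are assumptions for illustration.
import json

def make_pause_event(user_id, object_id, seconds):
    return {
        "type": "event",
        "interaction": {
            "attributes": {
                "category": "player",  # event generated by player controls
                "action": "pause",
                "location": seconds,   # seconds since the start of the video
            }
        },
        "object": {"objectId": object_id},  # the media fragment interacted with
        "user": {"id": user_id},
    }

event = make_pause_event("Rita", "http://example.org/mediaFragment", 32)
payload = json.dumps(event)
# The payload would then be POSTed to http://inbeat.eu/gain/listener
# with Content-Type: application/json, as listed in Table 4.
```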
5.3.1.2 User Actions

User actions are actions performed by a user that are not connected to any multimedia content (in contrast to player actions). For user actions, the objectId attribute is set to an empty string value. Each type of interaction is specified by the category and action attributes.

Figure 23: User Action Example: user Rita logged in

5.3.1.3 Application specific actions

Application specific actions are actions invoked by the player. Figure 24 presents a typical example of an application specific action: the opening of a new screen by the user. The format is similar to that of user actions; the only difference is in the category and the action attributes. These actions are also not connected to specific multimedia content.

Figure 24: Application Specific Actions Example: user Rita opens a new screen (TV, second screen ...)

5.3.1.4 Contextual Features

Contextual features are a completely new type of interaction. This type is identified by the value context in the type attribute of the communication format. The combination of the action and value attributes specifies the type of the contextual feature. The example depicted in Figure 25 describes user Rita, who started looking (“Viewer_looking”) at a second screen device
(“value=2”) at the 15th second of the video. Other contextual features and their possible values are described in D4.4.

Figure 25: Rita started looking at a second screen device at the 15th second of the video

5.4 InBeat

The InBeat platform is composed of three main modules: GAIN (General Analytics INterceptor), the module for tracking and aggregating the user interactions; the Preference Learning module for analyzing user preferences; and the Recommender System module providing on-demand recommendations. All components expose independent RESTful APIs, which allow the creation of custom workflows.

Within the scope of this deliverable we focus on the GAIN module, which processes the interactions sent by the LinkedTV player for a given user and generates aggregated output, which is consumed by further WP4 components to build user profiles. The GAIN logic combines the multiple interest clues it derives from the interactions into a single scalar Interest attribute. GAIN aggregates all types of interactions, including player actions (play, bookmark, view of enrichment …) and contextual features mainly provided by the Kinect-based Interest/Context Tracker. Similarly, GAIN also aggregates the content of the shot, based on the entity annotations it receives either along with the interactions from the player or from the platform, into a feature vector usually corresponding to DBpedia or NERD concepts.

Table 5: GAIN output example. The prefix d_r_ indicates that the feature corresponds to a DBpedia resource from the English DBpedia, the prefix d_o_ to a concept from the DBpedia Ontology. For DBpedia in other languages, the prefix is d_r_lang_.

User_id | d_r_North_Korea | … | d_o_SoccerPlayer | … | c_userlooking | … | interest
1 | 0 | … | 0.9 | … | 1 | … | 0.3
2 | 1 | … | 1 | … | 0 | … | 1
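The fusion of interest clues into a single scalar can be illustrated as follows. The text does not give GAIN's actual formula, so the clue names and weights below are invented for illustration; only the idea of combining several clues into one Interest value in [0, 1] comes from the description above:

```python
# Illustrative sketch only: GAIN's real interest formula is not stated
# here. The clue names and weights are hypothetical example values.

CLUE_WEIGHTS = {
    "play": 0.2,
    "bookmark": 0.5,
    "view_enrichment": 0.3,
    "viewer_looking_main": 0.3,
}

def interest_score(clues):
    """clues: iterable of interest-clue names observed for one shot.
    Returns a scalar clipped to [0, 1], analogous to the 'interest'
    column of Table 5."""
    score = sum(CLUE_WEIGHTS.get(c, 0.0) for c in clues)
    return min(1.0, max(0.0, score))
```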
The GAIN module became part of the InBeat service (see the next section for more details). However, GAIN has kept the same purpose and goals – tracking and aggregating user interactions. GAIN uses a specific JSON format (detailed in the previous section) as input; the main output is a tabular form of the aggregated data. In this section we describe the new features supported by the latest release – support for contextualization, import of annotations for media content and tabular-format serialization. The developments in the GAIN module are reported in Sections 5.4.1 and 5.4.2.

The InBeat Preference Learner module is a standalone component which wraps EasyMiner and LISp-Miner as the underlying learning stack. We have also experimented with alternative learning backends, but the EasyMiner/LISp-Miner stack seems to provide the best experience in terms of its web-based user interface and of preserving compatibility with existing LinkedTV components.

The InBeat Recommender System module (InBeat RS) has been developed simultaneously with the other components of InBeat (GAIN as the component for collecting and aggregating user feedback, and the Preference Learning component that learns user preferences). This component is not part of the main workflow of the LinkedTV platform, but it was used as a development tool to test the GAIN and PL modules, and to participate in benchmarking contests. The developments in the InBeat RS module are reported in Section 5.4.3.

5.4.1 Import of annotations for media content

In order to reduce the communication between GAIN and the LinkedTV platform and to overcome issues with updating the annotations of media on the GAIN side, we designed an approach for sending annotations along with interactions. Figure 26 demonstrates the format for sending the description of the object (chapter) the user interacted with.
The LinkedTV player can annotate the played content with entities, since the entity information is available in the player, and the GAIN module supports attaching entities to the interactions sent from the player. This approach solves the issues with updating annotations in the GAIN module. GAIN needs annotations on its input to perform the aggregations: if there is no annotation for the played media content in GAIN's internal storage, the result is either incorrect aggregations or delays caused by on-demand fetching of the data from the LinkedTV platform. Each annotation needs to be sent only once per viewer session, since it is cached in the GAIN module. This reduces the amount of data communicated between the player and GAIN. Another advantage lies in the temporal aspects of annotations: for the next session, the platform can provide updated annotations for a specific media content. This approach allows active adaptation to new or updated content.
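The once-per-session caching behaviour described above can be sketched as follows; the class and method names are illustrative and do not correspond to the actual GAIN API.

```python
# Sketch of the once-per-session annotation cache: the player attaches
# the entity annotation only to the first interaction for a given
# media object; GAIN caches it and reuses it for later interactions
# within the same session.

class AnnotationCache:
    def __init__(self):
        self._store = {}  # (session_id, object_id) -> annotation

    def resolve(self, session_id, interaction):
        key = (session_id, interaction["object_id"])
        if "annotation" in interaction:
            # The first interaction of the session carries the annotation.
            self._store[key] = interaction["annotation"]
        # Later interactions fall back to the cached copy (None if the
        # player never sent one -- GAIN would then have to fetch the
        # annotation from the LinkedTV platform on demand).
        return self._store.get(key)

cache = AnnotationCache()
first = cache.resolve("s1", {"object_id": "ch3",
                             "annotation": {"entities": ["d_r_North_Korea"]}})
later = cache.resolve("s1", {"object_id": "ch3"})
```

A new session (a new `session_id`) starts with an empty cache entry, which is what allows the platform to supply updated annotations between sessions.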
Figure 26: Example of an annotation sent from the player along with an event

5.4.2 Support of contextualization

In this section, we describe the progress on the support for contextualization. The contextual features supported in GAIN were introduced in Deliverable D4.4. However, the communication format did not account for the following situation: the viewer is watching the screen without any interactions or changes in context. In this case, the tracking module has no information about the content that was on the screen, since no events that would carry this information are raised by the player, and it cannot provide the correct output.

For this specific situation we designed a “keepalive“ interaction type that provides data and descriptions for each shot. This interaction is raised by the player even if there is no explicit user action or change in context, in order to notify GAIN about the content being played. GAIN interprets this type of interaction as a simple “copy previous state” command. Figure 27 describes the data format implemented in GAIN.

Example: Viewer Rita would like to watch media content consisting of shots 1…N. She presses the “play” button and the attention tracker recognizes that she is watching the screen. Both interactions are sent to GAIN as an interaction with the event “Play” and the context “Viewer_looking=1”. She then watches the screen carefully, without any interactions, for all the remaining shots. Without support for the “keepalive” interaction, GAIN would be able to derive interest clues only from the first shot and its annotations. On the other hand, when the player sends a “keepalive” for each remaining shot, GAIN propagates the “Viewer_looking=1” context to all these shots. Each of these “notified” shots is afterwards included in the final output, and the interest value can be calculated based on the propagated values.
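The “copy previous state” semantics of the keepalive interaction, as in the Rita example above, can be illustrated with a minimal sketch. The event structure here is deliberately simplified relative to the actual GAIN JSON format.

```python
# Sketch of keepalive handling: a "keepalive" event carries no new
# user action or context, so the last known context is propagated to
# the current shot; any other event updates the stored context.

def propagate_context(events):
    """Return one (shot, context) record per event, copying the last
    explicit context into every keepalive."""
    context, rows = {}, []
    for event in events:
        if event["type"] != "keepalive":
            context = {**context, **event.get("context", {})}
        rows.append({"shot": event["shot"], "context": dict(context)})
    return rows

rows = propagate_context([
    {"shot": 1, "type": "play", "context": {"Viewer_looking": 1}},
    {"shot": 2, "type": "keepalive"},
    {"shot": 3, "type": "keepalive"},
])
```

Every shot now carries the `Viewer_looking=1` context, so each can contribute to the final aggregated interest value, exactly as in the example.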
Figure 27: Example of a "keepalive" event for propagation of context

5.4.3 InBeat Recommender System

Recommendations are one of the key personalization features provided by the LinkedTV platform. In this section we briefly introduce and describe the InBeat Recommender System (InBeat RS) that is available in the platform. InBeat RS consumes inputs from both the GAIN and Preference Learning modules and provides recommendations as its output. The InBeat Recommender System participated in the RecSys’13 News Recommender Challenge (2nd place) and the CLEF NewsReel Challenge’14 (3rd place).

5.4.3.1 Components

The Interest Beat (InBeat) recommender consists of several components, described below.

The Recommendation Interface module receives requests for recommendation, which comprise the user identification, the identification of the currently playing media fragment (the seed media fragment), and a description of the user context. As a response, the module returns a ranked list of enrichment content.

The Recommender Algorithms module covers the set of algorithms that can be used in the LinkedTV platform.

The BR Engine module finds the rules matching the seed content vector, aggregates their conclusions, and returns a single predicted value of interest.

The BR Ranking module combines the estimated user interest in the individual enrichment content items, as produced by the BR Engine, with the importance of the entity for which the enrichment item was found.
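The rule matching and conclusion aggregation performed by the BR Engine, which is also the basis of the direct preference-rule recommendation in Subsection 5.4.3.3, can be sketched as follows. The rule representation and the averaging of conclusions are simplifying assumptions of this sketch; the actual rules are produced by EasyMiner and stored in the Rule Store.

```python
# Sketch of rule matching: each preference rule says "if the content
# vector contains these features, predict this interest"; an item's
# score averages the conclusions of all rules whose antecedent it
# satisfies.

RULES = [  # (antecedent features, predicted interest) -- illustrative
    ({"d_o_SoccerPlayer"}, 0.9),
    ({"d_r_North_Korea"}, 0.2),
]

def score_item(features, rules=RULES):
    """Aggregate the conclusions of all matching rules."""
    matched = [interest for antecedent, interest in rules
               if antecedent <= features]  # subset test on feature sets
    return sum(matched) / len(matched) if matched else 0.0

def rank(candidates):
    """Rank enrichment items by their rule-predicted interest."""
    return sorted(candidates, key=lambda c: score_item(c["features"]),
                  reverse=True)

ranked = rank([
    {"id": "news-nk", "features": {"d_r_North_Korea"}},
    {"id": "match-report", "features": {"d_o_SoccerPlayer"}},
])
```

In the full pipeline, the per-item score would additionally be weighted by the importance of the seed entity, which is the role of the BR Ranking module.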
5.4.3.2 Recommender Algorithms

InBeat RS contains implementations of several baseline algorithms, as well as experimental implementations of specific algorithms that fit the LinkedTV workflow. InBeat RS can provide recommendations based on the following algorithms:
• Most recent – a simple heuristic that selects a set of the newest items from all available candidates.
• Most interacted – only the top “viewed“ items are selected.
• Content-based similarity – a set of the items most similar to the item the user is currently viewing.
• Collaborative filtering – both user-based and item-based versions are available.
• Matching preference rules with content.
• Rule-based similarity of users and their contextual features.
• Ensemble – a combination of algorithms; see the following subsections for more details.

The most recent and most interacted methods are described in [KUC13]; the rule-based similarity algorithm is described in [KUC14]. The details of the “Matching preference rules with content” algorithm are given in Subsection 5.4.3.3, and the details of the ensemble method in Subsection 5.4.3.4.

5.4.3.3 InBeat: Matching Preference Rules with Content

Preference rules learned with EasyMiner via the InBeat Preference Learner (see D4.4, Section 4.2.2) can be used directly to rank relevant enrichment content according to user preferences, in addition to the further processing of these rules by SimpleLearner. In this section, we introduce the InBeat Recommender, an experimental recommender which serves for direct recommendation based on preference rules learnt with EasyMiner and stored in the Rule Store.

5.4.3.4 InBeat: Ensemble as a combination of multiple recommenders

The existence of different recommender algorithms, whose quality varies across situations, leads to the idea of combining algorithms in order to obtain better overall quality.
For example, one algorithm can produce better recommendations in the morning, since users may then be interested in the newest content, while a second algorithm may provide better recommendations for young viewers in the evening. InBeat RS handles this combination using an ensemble based on a Multi-Armed Bandit algorithm [KUL00]. The core of the ensemble uses probability distributions to decide which algorithm is likely to be the best in the specific situation. At the beginning, all algorithms have the same probability of selection. One of them is randomly selected and its recommendations are presented to the user. If the user chooses one of the recommended items, this is interpreted as a positive case: the probability associated with the selected algorithm is increased and, for all the others, it is decreased. The user can also provide negative feedback, in which case the probability of the algorithm that provided the recommendation is decreased.
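A minimal sketch of this bandit-style selection and feedback update follows. It assumes a simple additive weight update with weight-proportional sampling; the actual InBeat ensemble and the [KUL00] variant may differ in the update rule.

```python
import random

# Sketch of a multi-armed-bandit ensemble: each recommender starts
# with an equal selection weight; positive feedback raises the weight
# of the algorithm that produced the recommendation, negative feedback
# lowers it. Because selection probabilities are weights divided by
# their total, raising one algorithm's weight implicitly lowers the
# selection probability of all the others.

class BanditEnsemble:
    def __init__(self, algorithms, step=0.1):
        self.weights = {name: 1.0 for name in algorithms}
        self.step = step

    def select(self, rng=random.random):
        """Pick an algorithm with probability proportional to weight."""
        total = sum(self.weights.values())
        threshold, acc = rng() * total, 0.0
        for name, weight in self.weights.items():
            acc += weight
            if threshold < acc:
                return name
        return name  # guard against floating-point rounding

    def feedback(self, name, positive):
        """Reward or penalise the algorithm that recommended."""
        delta = self.step if positive else -self.step
        self.weights[name] = max(self.weights[name] + delta, 0.01)

ensemble = BanditEnsemble(["most_recent", "collaborative", "rule_based"])
ensemble.feedback("collaborative", positive=True)
ensemble.feedback("most_recent", positive=False)
```

The floor of 0.01 keeps every algorithm selectable with a small probability, which is one simple way to let a currently unsuccessful algorithm recover later.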
The selection of the recommendation algorithm is driven by these modified probabilities: a more successful algorithm has a higher probability of being selected. Since an algorithm can be successful at the beginning while its quality drops rapidly later, the ensemble also deals with the level of conservativeness. The ensemble supports different strategies for changing the speed of adaptation to a new situation, allowing it to forget previous states and to evolve over time.