master_thesis.pdf

MOHAMED V UNIVERSITY – RABAT
NATIONAL ADVANCED SCHOOL OF COMPUTER SCIENCE AND SYSTEM ANALYSIS
Master Thesis
Data science and Big Data
Designing and Developing a Personalized Country Recommender System
Presented by:
EL MAJJODI Ayoub
October 2019
Supervised by:
Pr. Lamia BENHIBA, ENSIAS
Pr. Nabil EL IONI, UNIBZ
Pr. Mehdi ELAHI, UNIBZ

Outline
I. Introduction
II. State of the Art
III. Methodology
IV. Results
V. Conclusion

Introduction
Motivation
Quality of Life, UK
Education quality, USA
Health Care, UAE

Ranking lists: best place to live

Recommender systems have found success in many domains, such as:
E-commerce
Education
Entertainment
Travel

Proposed Solution
A personalized system that recommends countries

Project Proposal
To achieve our objective, we formulated the following research questions:
1. Which recommender algorithms can be adopted, based on users' preferences, to generate personalized country rankings?
2. What are the most important features that users consider when deciding to move to another country?
3. Do recommender algorithm preferences depend on personality types?
4. Will the system for generating personalized country rankings be usable according to the users' assessment?

State of the Art
Definition
Recommender systems:
"People provide recommendations as inputs, which the system then aggregates and directs to appropriate recipients." (Resnick, 1997)
"Any system that produces individualized recommendations as output or has the effect of guiding the user in a personalized way to interesting or useful objects in a large space of possible options." (Burke, 2002)

Recommender system formulation:
● U: the set of all users
● I: the set of all possible items
● f: the utility function that measures the suitability of item i to the needs of user u
● A recommender system tries to choose, for each user u, the item i' in I that maximizes the user's utility function:
$$ i'_u = \arg\max_{i \in I} f(u, i), \quad \forall u \in U $$
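As a minimal sketch of this formulation, the snippet below picks the item with the highest utility for one user. The helper name `recommend`, the user name, and the utility values are hypothetical illustrations, not data or code from the study.

```python
from typing import Callable, Iterable


def recommend(user: str, items: Iterable[str], utility: Callable[[str, str], float]) -> str:
    """Return the item i' in I that maximizes f(user, i)."""
    return max(items, key=lambda item: utility(user, item))


# Hypothetical utility values f(u, i) for one user (illustrative only).
toy_utility = {
    ("alice", "Canada"): 4.5,
    ("alice", "Germany"): 3.8,
    ("alice", "Japan"): 4.1,
}

items = ["Canada", "Germany", "Japan"]
best = recommend("alice", items, lambda u, i: toy_utility[(u, i)])
print(best)  # Canada
```

In practice f is unknown for most (user, item) pairs, which is why the algorithms presented later estimate it from observed ratings.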

The data used by recommender systems can be categorized into:
Items: the objects that are recommended (goods, movies, books, courses, ...).
Transactions: recorded interactions between the users and the system.
Users: the users of the recommender system.

Recommender system approaches:
Collaborative Filtering
Content-Based Filtering
Hybrid Filtering

Collaborative filtering: generates rating predictions for a new user based on the ratings of people with similar interests.
Example: collaborative filtering

Content-based filtering: recommends an item to a user based on the description of the item's characteristics and on the user's profile expressed in terms of those characteristics.
Example: content-based filtering

Hybrid filtering: combines two or more recommendation techniques, most often collaborative and content-based filtering, to make recommendations.
Example: hybrid filtering

Evaluation: statistical measures
Mean Absolute Error (MAE): measures the average absolute deviation between the predicted ratings and the users' true ratings.
$$ \mathrm{MAE} = \frac{1}{|R|} \sum_{(u,i) \in R} \left| \hat{r}_{ui} - r_{ui} \right| $$
where $R$ is the ratings set, $\hat{r}_{ui}$ is the predicted rating of item i for user u, and $r_{ui}$ is the true rating.

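Evaluation: statistical measures
Root Mean Squared Error (RMSE): the square root of the mean squared difference between the predicted ratings $\hat{r}_{ui}$ and the actual ratings $r_{ui}$ over the ratings set $R$.
$$ \mathrm{RMSE} = \sqrt{\frac{1}{|R|} \sum_{(u,i) \in R} \left( \hat{r}_{ui} - r_{ui} \right)^2} $$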
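Both statistical measures can be computed in a few lines. The sketch below assumes the evaluation data is available as (predicted, true) rating pairs; the function names and the sample values are hypothetical and do not come from the study.

```python
import math


def mae(pairs):
    """Mean absolute error over (predicted, true) rating pairs."""
    return sum(abs(pred - true) for pred, true in pairs) / len(pairs)


def rmse(pairs):
    """Root mean squared error over (predicted, true) rating pairs."""
    return math.sqrt(sum((pred - true) ** 2 for pred, true in pairs) / len(pairs))


# Hypothetical (predicted, true) ratings on a 5-star scale.
pairs = [(4.0, 5.0), (3.0, 3.0), (2.0, 4.0), (5.0, 4.0)]
print(mae(pairs))   # 1.0
print(rmse(pairs))  # sqrt(1.5), about 1.2247
```

RMSE is never smaller than MAE on the same data, because squaring gives large errors more weight.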

Usefulness:
★ Novelty
★ Diversity
★ Understand Me
★ Satisfaction
★ Accuracy

Personality:
"An individual's characteristic pattern of thinking, feeling, and psychological mechanisms, which influences how people make their decisions." (The Personality Puzzle, 1997)

Big Five personality traits:
➢ Openness: reflects a person's tendency toward intellectual curiosity, creativity, and a preference for novelty and variety of experience.
➢ Conscientiousness: reflects a person's tendency to show self-discipline, aim for personal achievement, and behave in an organized and dependable way.
➢ Neuroticism: reflects a person's tendency to experience unpleasant emotions.
➢ Extraversion: reflects a person's tendency to be sociable, talkative, and assertive.
➢ Agreeableness: reflects a person's tendency to be kind, concerned, truthful, and cooperative toward others.

Methodology
Data description: form used to collect the training dataset

Training dataset:
- 136 users
- 25 countries
- 3400 rows
Ratings matrix

Recommender algorithms: cross-validation
Adopted algorithms:
SVD
KNN-B
Co-clustering
Cross-validation results

SVD
➢ Singular Value Decomposition (SVD): factorizes the original ratings matrix into two lower-dimensional matrices, from which ratings are predicted.
R: ratings matrix (m users × n items)
P: user matrix (m users × f features)
Q: item matrix (n items × f features)
$$ R \approx P Q^{T} $$
A rating $r_{ui}$ can be estimated as the dot product of the user vector $p_u$ and the item vector $q_i$: $\hat{r}_{ui} = q_i^{T} p_u$.
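The prediction step described above reduces to a dot product. The sketch below assumes the factor matrices have already been learned; the factor values, user names, and country names are hypothetical, and the training of P and Q is omitted.

```python
def predict_rating(p_u, q_i):
    """Estimate r_ui as the dot product of user vector p_u and item vector q_i."""
    return sum(p * q for p, q in zip(p_u, q_i))


# Hypothetical learned factors with f = 2 latent features.
P = {"u1": [1.0, 2.0], "u2": [0.5, 1.5]}
Q = {"Canada": [2.0, 1.0], "Japan": [1.0, 1.0]}

print(predict_rating(P["u1"], Q["Canada"]))  # 4.0
print(predict_rating(P["u2"], Q["Japan"]))   # 2.0
```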

KNN-B
➢ K-Nearest Neighbor Baseline (KNN-B): finds like-minded users (or similar items) for a given user, based on:
➔ A similarity measure
➔ A function that fetches the neighborhood using the similarity measure
➔ A rating prediction function based on the neighbors' ratings
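The three components listed above can be sketched as a plain user-based nearest-neighbor predictor. This is a simplified illustration, not the KNN-B variant used in the study: it assumes cosine similarity and a similarity-weighted average, and it leaves out the baseline correction. All names and ratings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def nearest_neighbors(ratings, user, item, k):
    """Return the k users most similar to `user` who rated `item`."""
    candidates = [
        (cosine_similarity(ratings[user], ratings[other]), other)
        for other in ratings
        if other != user and item in ratings[other]
    ]
    return sorted(candidates, reverse=True)[:k]


def predict(ratings, user, item, k=2):
    """Similarity-weighted average of the neighbors' ratings for `item`."""
    neighbors = nearest_neighbors(ratings, user, item, k)
    total_sim = sum(sim for sim, _ in neighbors)
    if total_sim == 0:
        return None  # no usable neighbors
    return sum(sim * ratings[other][item] for sim, other in neighbors) / total_sim


# Hypothetical 5-star ratings (strictly positive, so the norms are never zero).
ratings = {
    "u1": {"Canada": 5, "Japan": 3},
    "u2": {"Canada": 5, "Japan": 3, "Germany": 4},
    "u3": {"Canada": 1, "Japan": 5, "Germany": 2},
}
print(predict(ratings, "u1", "Germany"))
```

Because u2 rated the shared countries exactly like u1, its rating of Germany dominates the weighted average, so the prediction lands closer to 4 than to 2.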

Co-clustering
➢ Co-clustering: groups similar users and similar items into clusters simultaneously.
Example: co-clustering

System Architecture

User flow:
● Registration step: username, email, password
● Personality survey (five-factor model): openness, conscientiousness, extraversion, agreeableness, neuroticism

User flow:
● Select features (at least 3 out of 12)
1. Education quality
2. Political insecurity
3. Social conflict
4. Work opportunities
5. Health care
6. Income difference
7. Wars and dictatorship
8. Family member abroad
9. Cultural and linguistic similarities
10. Working atmosphere
11. Shorter distance
12. Crime rate

User flow:
● Rate countries (at least 5) using a 5-star rating scale

User flow:
● Results (3 lists)
● List 1: SVD
● List 2: KNN-B
● List 3: Co-clustering
● Evaluation survey covering 5 metrics: Accuracy, Diversity, Understand Me, Satisfaction, Novelty

User flow:
● Usability survey (System Usability Scale, SUS: a 10-item questionnaire scored on a 5-point Likert scale)
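For reference, the standard SUS scoring rule can be sketched as follows. The function name `sus_score` and the sample answers are hypothetical; the slides do not show how the study computed its scores, so this only illustrates the usual procedure.

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten answers on a 1-5 scale.

    Standard SUS scoring: odd-numbered items contribute (answer - 1),
    even-numbered items contribute (5 - answer); the sum is scaled by 2.5.
    """
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("expected ten answers between 1 and 5")
    total = 0
    for index, answer in enumerate(responses, start=1):
        total += (answer - 1) if index % 2 == 1 else (5 - answer)
    return total * 2.5


print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
print(sus_score([3] * 10))                        # 50.0
```

Each respondent's score is a multiple of 2.5, and the system-level score is the mean over all respondents.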

Results
➢ Online evaluation with real users
➢ 281 new users attempted the experiment; 109 completed all the steps
➢ The collected data was analysed to find possible patterns
Step | Users | Share of registrations
Registration | 281 | 100%
Personality | 241 | 85%
Features | 226 | 80%
Ratings | 193 | 69%
Evaluate | 189 | 67%
Usability | 109 | 38%

Age:
Under 18 | 12 (5%)
18-24 | 100 (42%)
25-35 | 93 (39%)
35-45 | 23 (9%)
45-55 | 10 (4%)
Over 55 | 2 (1%)
Gender:
Females | 65 (27%)
Males | 170 (71%)
25-35 | 5 (25%)
Origin country:
➢ USA (20%)
➢ Morocco (11%)
➢ Egypt (5%)
➢ Various other countries (64%)

Algorithm Comparison
Metric | Question | Co-clustering | KNN-B | SVD
Accuracy | 1. Which list has more selections that you find appealing? | 33% | 29% | 38%
Accuracy | 2. Which list has more obviously bad suggestions for you? | 59% | 31% | 10%
Diversity | 3. Which list has more countries that are similar to each other? | 26% | 26% | 48%
Diversity | 4. Which list has a more varied selection of countries? | 32% | 40% | 18%
Diversity | 5. Which list has countries that match a wider variety of preferences? | 24% | 74% | 29%
State of the Art
Algorithm Comparison
Personality & algorithm preferences
35
Feature Preferences
Understand ME 6. Which list better reﬂects your preferences in countries ? 18% 26% 56%
Understand ME 7. Which list seems more personalized to your countries ratings ? 21% 24% 55%
Understand ME 8. Which list represents more mainstream ratings instead of your
own ?
15% 24% 61%
Satisfaction 9. Which list would better help you ﬁnd countries to consider ? 14% 40% 46%
Satisfaction 10. Which list would you be more likely to recommend to your
friends ?
19% 19% 62%
System Usability

Metric | Question | Co-clustering | KNN-B | SVD
Novelty | 11. Which list has more countries you did not expect? | 55% | 33% | 12%
Novelty | 12. Which list has more countries that are familiar to you? | 23% | 29% | 48%
Novelty | 13. Which list has more pleasantly surprising countries? | 25% | 49% | 26%
Novelty | 14. Which list provides fewer new suggestions? | 29% | 17% | 54%

RQ1: Which recommender algorithms can be adopted, based on users' preferences, to generate personalized country rankings?
SVD: rated best in terms of Accuracy, Understand Me, and Satisfaction
SVD: produced many mainstream suggestions
KNN-B: rated best in terms of Diversity and Novelty
Co-clustering: judged to underperform by the majority of users across most metric categories

Feature Preferences
Overall (226)
Work Opportunities 161 (72%)
Education Quality 105 (42%)
Working Atmosphere 100 (44%)
Health Care 97 (43%)
Income Difference 84 (37%)
Political Insecurity 59 (26%)
Crime Rate 58 (26%)
Social Conflict 49 (22%)
Cultural & Linguistic Similarities 41 (21%)
Wars & Dictatorship 37 (16%)
Family Member Abroad 20 (8%)
Shorter Distance 15 (6%)
Males (170)
Work Opportunities 108 (70%)
Education Quality 81 (48%)
Females (65)
Work Opportunities 40 (61%)
Working Atmosphere 31 (48%)

RQ2: What are the most important features that users consider when deciding to move to another country?
Top 4 features:
➢ Work Opportunities
➢ Education Quality
➢ Working Atmosphere
➢ Health Care

Personality and Algorithm Preferences
Trait | Accuracy | Diversity | Understand Me | Satisfaction | Novelty
Openness | SVD | Co-clustering | SVD | SVD | KNN-B
Conscientiousness | SVD | Co-clustering | SVD | SVD | KNN-B
Extraversion | SVD | KNN-B | SVD | SVD | KNN-B
Agreeableness | SVD | KNN-B | SVD | SVD | KNN-B
Neuroticism | SVD | Co-clustering | SVD | SVD | KNN-B

RQ3: Do recommender algorithm preferences depend on personality types?
➢ People with different personality types may tend to prefer results generated by different algorithms.

SUS score interpretation:
Score | Grade | Rating
> 80 | A | Excellent
68 - 80 | B | Good
68 | C | Okay
51 - 68 | D | Poor
< 51 | F | Awful
Final score: 60.82
Lowest: 22.5
Highest: 100

RQ4: Will the system for generating personalized country rankings be usable according to the users' assessment?
➢ The system scored lower than the well-accepted benchmark of 68
➢ The system did not pass the usability test

Conclusion
➔ We surveyed people to gather explicit ratings of a set of countries.
➔ The proposed system was evaluated according to real users' assessments.
➔ A country recommender system was designed and deployed.
➔ We cross-validated several collaborative filtering algorithms.

Future work
➔ Investigate whether a recommender system based on deep learning would improve the quality of recommendations in this domain.
➔ Investigate the usefulness of making recommendations related to immigration factors.
➔ Incorporate personality information into the prediction model.
➔ Extend the experiment to gather more data.
