Music Recommendation and Discovery in
            the Long Tail


                    Òscar Celma
              Doctoral Thesis Defense
    (Music Technology Group ~ Universitat Pompeu Fabra)
PhD defense // UPF // Feb 16th 2009



Music
     Recommendation
(personalized)

               and Discovery
(explore large music collections)

                        in the Long Tail
(non-obvious, novel, relevant music)
PhD defense // UPF // Feb 16th 2009
“The Paradox of Choice: Why More Is Less”, Barry Schwartz (2004)

               The problem
 Paradox of choice
PhD defense // UPF // Feb 16th 2009


music overload
• Today (August 2007)
       iTunes: 6M tracks
       P2P: 15B tracks
       53% buy music online
• Finding unknown, relevant music is hard!
       Awareness vs. access to content
PhD defense // UPF // Feb 16th 2009


music overload?
• Digital Tracks – Sales data for 2007
       Nearly 1 billion sold in 2007
       1% of tracks account for 80% of sales
       3.6 million tracks sold less than 100 copies, and
       1 million tracks sold exactly 1 copy
• Data from Nielsen Soundscan 'State of the (US) industry' 2007 report
PhD defense // UPF // Feb 16th 2009


the Long Tail of popularity
• Help me find it! [Anderson, 2006]
PhD defense // UPF // Feb 16th 2009


research questions
• 1) How can we evaluate/compare different music
  recommendation approaches?

• 2) How far into the Long Tail do music
  recommenders reach?

• 3) How do users perceive novel (unknown to
  them), non-obvious recommendations?
PhD defense // UPF // Feb 16th 2009




If you like
  The Beatles
    you might like ...
PhD defense // UPF // Feb 16th 2009

                                      • popularity bias
                                      • low novelty
                                        ratio
PhD defense // UPF // Feb 16th 2009




    FACTORS AFFECTING RECOMMENDATIONS:

    Novelty
    Relevance
    Diversity
    Cold start
    Coverage
    Explainability
    Temporal effects
PhD defense // UPF // Feb 16th 2009




    FACTORS AFFECTING RECOMMENDATIONS:

    Novelty
    Relevance
    Diversity
    Cold start
    Coverage
    Explainability
    Temporal effects
PhD defense // UPF // Feb 16th 2009


novelty vs. relevance
PhD defense // UPF // Feb 16th 2009


how can we measure novelty?
• predictive accuracy vs. perceived quality
• metrics
       MAE, RMSE, P/R/F-measure, ...
   



       (figure: Train / Test split)

       Can't measure novelty
PhD defense // UPF // Feb 16th 2009


how can we measure novelty?
• predictive accuracy vs. perceived quality
• metrics
       MAE, RMSE, P/R/F-measure, ...
   




       Can measure novelty
   
PhD defense // UPF // Feb 16th 2009
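
The thesis measures novelty through the popularity of what gets recommended rather than through accuracy metrics. A minimal sketch of one popularity-based novelty score (average self-information of the recommended items); the playcount dictionary and the exact scoring formula are illustrative assumptions, not the thesis' own metric:

import math

def novelty_score(recommended_ids, playcounts):
    # playcounts: dict item -> total plays (popularity proxy); unseen items count as 1 play.
    total_plays = sum(playcounts.values())
    info = 0.0
    for item in recommended_ids:
        p = playcounts.get(item, 1) / total_plays   # relative popularity
        info += -math.log2(p)                       # rare (Long Tail) items score high
    return info / len(recommended_ids)

plays = {"the beatles": 50_422_827, "mike shupp": 577}
print(novelty_score(["mike shupp"], plays))         # high novelty
print(novelty_score(["the beatles"], plays))        # low novelty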


how can we measure relevance?

   "The key utility measure is user happiness. It
     seems reasonable to assume that relevance of
     the results is the most important factor:
     blindingly fast, useless answers do not make a
     user happy."

          "Introduction to Information Retrieval"
         (Manning, Raghavan, and Schutze, 2008)
PhD defense // UPF // Feb 16th 2009


research in music recommendation
• Google Scholar




   Papers that contain “music recommendation” or “music recommender”
   in the title (Accessed October 1st, 2008)
PhD defense // UPF // Feb 16th 2009


research in music recommendation
• ISMIR community
PhD defense // UPF // Feb 16th 2009


music recommendation approaches
• Expert-based
• Collaborative filtering
• Context-based
• Content-based
• Hybrid (combination)
PhD defense // UPF // Feb 16th 2009


music recommendation approaches
• Expert-based
       AllMusicGuide
   

       Pandora
   

• Collaborative filtering
• Context-based
• Content-based
• Hybrid (combination)
PhD defense // UPF // Feb 16th 2009


music recommendation approaches
• Expert-based
• Collaborative filtering
       User-Item matrix
                                     [Resnick, 1994], [Shardanand, 1995], [Sarwar, 2001]




• Context-based
• Content-based
PhD defense // UPF // Feb 16th 2009


music recommendation approaches
• Expert-based
• Collaborative filtering
       User-Item matrix
                                     [Resnick, 1994], [Shardanand, 1995], [Sarwar, 2001]

       Similarity
   

         Cosine

         Adj. cosine

         Pearson

         SVD / NMF: matrix factorization
• Context-based
• Content-based
PhD defense // UPF // Feb 16th 2009


music recommendation approaches
• Expert-based
• Collaborative filtering
       User-Item matrix
                                     [Resnick, 1994], [Shardanand, 1995], [Sarwar, 2001]

       Similarity
   

         Cosine

         Adj. cosine

         Pearson

         SVD / NMF: matrix factorization
       Prediction (user-based)
   

         Avg. weighted
PhD defense // UPF // Feb 16th 2009
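
A minimal sketch of the user-based CF pipeline listed above (cosine similarity over co-rated items, weighted-average prediction); the tiny ratings matrix is made up:

import numpy as np

R = np.array([          # rows = users, columns = items, 0 = unrated
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

def cosine(u, v):
    mask = (u > 0) & (v > 0)                 # co-rated items only
    if not mask.any():
        return 0.0
    return float(u[mask] @ v[mask] / (np.linalg.norm(u[mask]) * np.linalg.norm(v[mask])))

def predict(user, item):
    # Weighted average of the other users' ratings for this item.
    sims = np.array([cosine(R[user], R[v]) if v != user else 0.0 for v in range(len(R))])
    rated = R[:, item] > 0
    denom = np.abs(sims[rated]).sum()
    return float(sims[rated] @ R[rated, item] / denom) if denom else 0.0

print(predict(user=0, item=2))               # predicted rating for an item user 0 has not rated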


music recommendation approaches
• Expert-based
• Collaborative filtering
• Context-based
       WebMIR [Schedl, 2008]
       Sources: content, reviews, lyrics, blogs, tags, bios, playlists, social data
         [Hu & Downie, 2006] [Celma et al., 2006] [Levy & Sandler, 2007]
         [Baccigalupo, 2008] [Symeonidis, 2008]
       (tag cloud: thrash, heavy metal, edgy, weird, concert, 90s, loud, rock, ...)
• Content-based
• Hybrid (combination)
PhD defense // UPF // Feb 16th 2009


music recommendation approaches
• Expert-based
• Collaborative filtering
• Context-based
• Content-based
       Audio features
         Bag-of-frames (MFCC) [Aucouturier, 2004], Rhythm [Gouyon, 2005],
          Harmony [Gomez, 2006], ...
       Similarity
         KL-divergence: GMM [Aucouturier, 2002]
         EMD [Logan, 2001]
         Euclidean: PCA [Cano, 2005]
         Cosine: mean/var (feature vectors)
         Ad-hoc
• Hybrid (combination)
PhD defense // UPF // Feb 16th 2009
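
A minimal sketch of the "cosine over mean/var feature vectors" similarity option listed above; the MFCC frames are random stand-ins for real audio analysis output:

import numpy as np

def track_descriptor(mfcc_frames):
    # Summarize a bag-of-frames track by the mean and variance of each coefficient.
    return np.concatenate([mfcc_frames.mean(axis=0), mfcc_frames.var(axis=0)])

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
track_a = track_descriptor(rng.normal(size=(500, 13)))   # 500 frames x 13 MFCCs
track_b = track_descriptor(rng.normal(size=(480, 13)))
print(cosine_similarity(track_a, track_b))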


music recommendation approaches
• Expert-based
• Collaborative filtering
• Context-based
• Content-based
• Hybrid (combination)
       Weighted
   

       Cascade
   

       Switching
   
PhD defense // UPF // Feb 16th 2009
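
A minimal sketch of the "weighted" hybrid option: a linear combination of CF and content-based scores. The scores and the alpha weight are illustrative, not values from the thesis:

def weighted_hybrid(cf_scores, cb_scores, alpha=0.7):
    # alpha weighs the collaborative score; (1 - alpha) weighs the content-based score.
    items = set(cf_scores) | set(cb_scores)
    combined = {i: alpha * cf_scores.get(i, 0.0) + (1 - alpha) * cb_scores.get(i, 0.0)
                for i in items}
    return sorted(combined, key=combined.get, reverse=True)

cf = {"artist_x": 0.9, "artist_y": 0.2}
cb = {"artist_y": 0.8, "artist_z": 0.6}
print(weighted_hybrid(cf, cb))               # ranked list combining both sources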




                                      Work done
PhD defense // UPF // Feb 16th 2009


contributions
PhD defense // UPF // Feb 16th 2009


contributions

           1) Network-based evaluation
                Item Popularity + Complex networks
PhD defense // UPF // Feb 16th 2009


contributions

           1) Network-based evaluation
                Item Popularity + Complex networks




                                      2) User-based evaluation
PhD defense // UPF // Feb 16th 2009


contributions

           1) Network-based evaluation
                Item Popularity + Complex networks




                                      2) User-based evaluation
           3) Systems
PhD defense // UPF // Feb 16th 2009


contributions
PhD defense // UPF // Feb 16th 2009


contributions
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• 3 Artist similarity (directed) networks
       CF*: Social-based, incl. item-based CF (Last.fm)
   

         “people who listen to X also listen to Y”
       CB: Content-based Audio similarity
   

         “X and Y sound similar”
       EX: Human expert-based (AllMusicGuide)
   

         “X similar to (or influenced by) Y”
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• 3 Artist similarity (directed) networks
       CF*: Social-based, incl. item-based CF (Last.fm)
   

         “people who listen to X also listen to Y”
       CB: Content-based Audio similarity
   

         “X and Y sound similar”
       EX: Human expert-based (AllMusicGuide)
   

         “X similar to (or influenced by) Y”
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• Small-world networks [Watts & Strogatz, 1998]




       The network can be traversed in a few clicks
   
PhD defense // UPF // Feb 16th 2009
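
A minimal sketch, using networkx, of the small-world check in the spirit of [Watts & Strogatz, 1998]: compare clustering and average path length against a random graph of the same size. A synthetic graph stands in for the real artist networks:

import networkx as nx

def avg_path_length(graph):
    # Use the largest connected component, in case the graph is not fully connected.
    comp = max(nx.connected_components(graph), key=len)
    return nx.average_shortest_path_length(graph.subgraph(comp))

G = nx.watts_strogatz_graph(n=1000, k=10, p=0.1, seed=1)   # stand-in artist graph
R = nx.gnm_random_graph(G.number_of_nodes(), G.number_of_edges(), seed=1)

# Small world: clustering much higher than random, path length comparably short.
print("clustering:", nx.average_clustering(G), "random:", nx.average_clustering(R))
print("avg path  :", avg_path_length(G), "random:", avg_path_length(R))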


complex network analysis :: artists
• Indegree – avg. neighbor indegree correlation
       r = Pearson correlation
                                     [Newman, 2002]
PhD defense // UPF // Feb 16th 2009
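
A minimal sketch of the indegree vs. average-neighbor-indegree analysis [Newman, 2002] on a directed similarity graph; a synthetic scale-free graph stands in for the real networks:

import networkx as nx
from scipy.stats import pearsonr

G = nx.DiGraph(nx.scale_free_graph(2000, seed=1))     # toy directed similarity graph

kin = dict(G.in_degree())
x, y = [], []
for artist in G:
    similar = list(G.successors(artist))              # its "similar artists"
    if similar:
        x.append(kin[artist])                                      # Kin(artist)
        y.append(sum(kin[s] for s in similar) / len(similar))      # avg Kin of its neighbors

r, _ = pearsonr(x, y)
print("r =", r)    # r > 0: assortative mixing (homophily); r ~ 0: no correlation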


complex network analysis :: artists
• Indegree – avg. neighbor indegree correlation
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• Indegree – avg. neighbor indegree correlation
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• Indegree – avg. neighbor indegree correlation


Kin(Bruce Springsteen)=534
=>
avg(Kin(sim(Bruce Springsteen)))=463
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• Indegree – avg. neighbor indegree correlation


Kin(Bruce Springsteen)=534
=>
avg(Kin(sim(Bruce Springsteen)))=463




Kin(Mike Shupp)=14
=>
avg(Kin(sim(Mike Shupp)))=15
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• Indegree – avg. neighbor indegree correlation


Kin(Bruce Springsteen)=534
=>
avg(Kin(sim(Bruce Springsteen)))=463




Kin(Mike Shupp)=14
=>
avg(Kin(sim(Mike Shupp)))=15




Homophily effect!
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• Indegree – avg. neighbor indegree correlation
       Last.fm presents assortative mixing (homophily)
   

         Artists with high indegree are connected together,
          and similarly for low indegree artists
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• Last.fm is a scale-free network [Barabasi, 2000]
       power-law exponent for the cumulative indegree distribution [Clauset, 2007]
       A few artists (hubs) control the network
PhD defense // UPF // Feb 16th 2009
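
A minimal sketch of estimating the power-law exponent of the indegree distribution with the [Clauset, 2007] method, here via the Python powerlaw package (the package choice is an assumption; the slide only cites the method):

import networkx as nx
import powerlaw

G = nx.DiGraph(nx.scale_free_graph(5000, seed=1))      # toy graph; use the real network here
indegrees = [d for _, d in G.in_degree() if d > 0]

fit = powerlaw.Fit(indegrees, discrete=True)            # Clauset-style MLE fit
print("alpha =", fit.power_law.alpha, "xmin =", fit.power_law.xmin)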


complex network analysis :: artists
• Summary: artist similarity networks
|-------------|---------|-----|-----------|
|             | Last.fm | CB  | Exp (AMG) |
|-------------|---------|-----|-----------|
| Small World | yes     | yes | yes       |
| Ass. mixing | yes     | no  | no        |
| Scale-free  | yes     | no  | no        |
|-------------|---------|-----|-----------|

        The Last.fm artist similarity network resembles a social
        network (e.g. Facebook)
PhD defense // UPF // Feb 16th 2009


complex network analysis :: artists
• But, still some remaining questions...

       Are the hubs the most popular artists?
   




       How can we navigate along the Long Tail, using
   

       the artist similarity network?
PhD defense // UPF // Feb 16th 2009


contributions


                Long Tail analysis
PhD defense // UPF // Feb 16th 2009


the Long Tail in music
• last.fm dataset (~260K artists)
PhD defense // UPF // Feb 16th 2009


the Long Tail in music
• last.fm dataset (~260K artists)
           the beatles (50,422,827)




               radiohead (40,762,895)
                 red hot chili peppers (37,564,100)


                   muse (30,548,064)
                    death cab for cutie (29,335,085)
                      pink floyd (28,081,366)
                       coldplay (27,120,352)
                        metallica (25,749,442)
PhD defense // UPF // Feb 16th 2009


the Long Tail model                   [Kilkki, 2007]

• F(x) = Cumulative distribution up to x
PhD defense // UPF // Feb 16th 2009
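
A minimal sketch of fitting the Long Tail model of [Kilkki, 2007], F(x) = beta / ((N50/x)^alpha + 1), where F(x) is the cumulative share of plays of the top-x artists, N50 the rank covering 50% of the volume and beta the share covered by the whole catalogue. The play counts below are synthetic; the thesis webpage provides the original model code in R:

import numpy as np
from scipy.optimize import curve_fit

def kilkki(x, alpha, beta, n50):
    return beta / ((n50 / x) ** alpha + 1.0)

plays = np.sort(np.random.default_rng(0).pareto(1.2, size=50_000))[::-1]   # toy Long Tail
ranks = np.arange(1.0, plays.size + 1)
F = np.cumsum(plays) / plays.sum()          # cumulative share of plays up to rank x

(alpha, beta, n50), _ = curve_fit(kilkki, ranks, F, p0=[0.7, 1.0, 1000.0])
print("alpha =", alpha, "beta =", beta, "N50 =", n50)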


the Long Tail model                      [Kilkki, 2007]

• Top-8 artists: F(8)~ 3.5% of total plays




             50,422,827     the beatles
             40,762,895     radiohead
             37,564,100     red hot chili peppers
             30,548,064     muse
             29,335,085     death cab for cutie
             28,081,366     pink floyd
             27,120,352     coldplay
             25,749,442     metallica
PhD defense // UPF // Feb 16th 2009


the Long Tail model                      [Kilkki, 2007]

• Split the curve in three parts




                 (82 artists)         (6,573 artists)     (~254K artists)
PhD defense // UPF // Feb 16th 2009


contributions


                                      +
             Long Tail analysis
PhD defense // UPF // Feb 16th 2009


artist indegree vs. artist popularity
• Are the network hubs the most popular artists?


                                      ???
PhD defense // UPF // Feb 16th 2009


artist indegree vs. artist popularity
       Last.fm: correlation between Kin and playcounts
   

         r = 0.621
PhD defense // UPF // Feb 16th 2009


artist indegree vs. artist popularity
       Audio CB similarity: no correlation
   

         r = 0.032
PhD defense // UPF // Feb 16th 2009


artist indegree vs. artist popularity
       Expert: correlation between Kin and playcounts
   

         r = 0.475
PhD defense // UPF // Feb 16th 2009


navigation along the Long Tail
• “From Hits to Niches”
       # clicks to reach a Tail artist, starting in the Head
   




                                      how many clicks?
PhD defense // UPF // Feb 16th 2009


navigation along the Long Tail
• “From Hits to Niches”
       Audio CB similarity example (VIDEO)
   
PhD defense // UPF // Feb 16th 2009


navigation along the Long Tail
• “From Hits to Niches”
       Audio CB similarity example
   

         Bruce Springsteen (14,433,411 plays)
PhD defense // UPF // Feb 16th 2009


navigation along the Long Tail
• “From Hits to Niches”
       Audio CB similarity example
   

         Bruce Springsteen (14,433,411 plays)
         The Rolling Stones (27,720,169 plays)
PhD defense // UPF // Feb 16th 2009


navigation along the Long Tail
• “From Hits to Niches”
       Audio CB similarity example
   

         Bruce Springsteen (14,433,411 plays)
         The Rolling Stones (27,720,169 plays)
         Mike Shupp (577 plays)
PhD defense // UPF // Feb 16th 2009


artist similarity vs. artist popularity
• navigation in the Long Tail
       Similar artists, given an artist in the HEAD part:
   


       |     | Head   | Mid    | Tail   |
       |-----|--------|--------|--------|
       | CF  | 45.32% | 54.68% | 0%     |
       | CB  | 6.46%  | 64.74% | 28.80% |
       | EXP | 5.82%  | 60.92% | 33.26% |

       This can also be seen as a Markov stochastic
       process...
PhD defense // UPF // Feb 16th 2009


artist similarity vs. artist popularity
• navigation in the Long Tail
       Markov transition matrix
   
PhD defense // UPF // Feb 16th 2009


artist similarity vs. artist popularity
• navigation in the Long Tail
       Markov transition matrix
   
PhD defense // UPF // Feb 16th 2009


artist similarity vs. artist popularity
• navigation in the Long Tail
       Last.fm Markov transition matrix
   
PhD defense // UPF // Feb 16th 2009


artist similarity vs. artist popularity
• navigation in the Long Tail
       From Head to Tail, with P(T|H) > 0.4
   

       Number of clicks needed
   

         CF : 5
         CB : 2
         EXP: 2

       (diagram: HEAD → how many clicks? → TAIL)
PhD defense // UPF // Feb 16th 2009
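
A minimal sketch of the "number of clicks" computation: raise the 3x3 Head/Mid/Tail transition matrix to successive powers until the Head-to-Tail probability exceeds 0.4. The Head row below is taken from the content-based percentages above; the Mid and Tail rows are illustrative placeholders, not the thesis' values:

import numpy as np

P_cb = np.array([
    [0.0646, 0.6474, 0.2880],   # from Head: P(Head), P(Mid), P(Tail)
    [0.01,   0.55,   0.44  ],   # from Mid (illustrative)
    [0.005,  0.30,   0.695 ],   # from Tail (illustrative)
])

def clicks_to_tail(P, threshold=0.4, head=0, tail=2, max_steps=50):
    Pn = np.eye(P.shape[0])
    for n in range(1, max_steps + 1):
        Pn = Pn @ P                      # n-step transition probabilities
        if Pn[head, tail] > threshold:
            return n
    return None

print(clicks_to_tail(P_cb))              # 2 clicks for this toy content-based matrix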


artist popularity
Summary
|-------------------------|---------|-----|-----------|
|                         | Last.fm | CB  | Exp (AMG) |
|-------------------------|---------|-----|-----------|
| Indegree / popularity   | yes     | no  | yes       |
| Similarity / popularity | yes     | no  | no        |
|-------------------------|---------|-----|-----------|
PhD defense // UPF // Feb 16th 2009


summary: complex networks+popularity
|-------------------------|---------|-----|-----------|
|                         | Last.fm | CB  | Exp (AMG) |
|-------------------------|---------|-----|-----------|
| Small World             | yes     | yes | yes       |
| Scale-free              | yes     | no  | no        |
| Ass. mixing             | yes     | no  | no        |
|-------------------------|---------|-----|-----------|
| Indegree / popularity   | yes     | no  | yes       |
| Similarity / popularity | yes     | no  | no        |
|-------------------------|---------|-----|-----------|
| POPULARITY BIAS         | YES     | NO  | FAIRLY    |
|-------------------------|---------|-----|-----------|
PhD defense // UPF // Feb 16th 2009


contributions

           1) Network-based evaluation
                Item Popularity + Complex networks




                                      2) User-based evaluation
           3) Systems
PhD defense // UPF // Feb 16th 2009


contribution #2: User-based evaluation
• How do users perceive novel, non-obvious
  recommendations?
       Survey
   

         288 participants
       Method: blind music recommendation
   

         no metadata (artist name, song title)
         only 30 sec. audio excerpt
PhD defense // UPF // Feb 16th 2009


music recommendation survey
• 3 approaches:
       CF: Social-based Last.fm similar tracks
   

       CB: Pure audio content-based similarity
   

       HYbrid: AMG experts + audio CB to rerank songs
   

         (Not a combination of the two previous approaches)
• User profile:
       last.fm, top-10 artists
   

• Procedure
       Do you recognize the song?
   

         Yes, Only Artist, Both Artist and Song title
       Do you like the song?
   

         Rating: [1..5]
PhD defense // UPF // Feb 16th 2009


music recommendation survey: results
• Overall results
PhD defense // UPF // Feb 16th 2009


music recommendation survey: results
• Overall results
PhD defense // UPF // Feb 16th 2009


music recommendation survey: results
• Familiar recommendations (Artist & Song)
PhD defense // UPF // Feb 16th 2009


music recommendation survey: results
• Ratings for novel recommendations
PhD defense // UPF // Feb 16th 2009


music recommendation survey: results
• Ratings for novel recommendations




       one-way ANOVA within subjects (F=29.13,   p<0.05)
   

       Tukey's test (pairwise comparison)
   
PhD defense // UPF // Feb 16th 2009
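
A minimal sketch of the statistical analysis named above (one-way within-subjects ANOVA plus Tukey's pairwise comparison), using statsmodels on synthetic ratings; the column names and rating distributions are illustrative:

import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "subject":  np.repeat(np.arange(288), 3),
    "approach": np.tile(["CF", "CB", "HY"], 288),
    "rating":   np.clip(rng.normal(np.tile([3.4, 2.8, 3.0], 288), 0.8), 1, 5),
})

res = AnovaRM(df, depvar="rating", subject="subject", within=["approach"]).fit()
print(res.anova_table)                                   # F statistic and p-value
print(pairwise_tukeyhsd(df["rating"], df["approach"]))   # pairwise comparison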


music recommendation survey: results
• % of novel recommendations
PhD defense // UPF // Feb 16th 2009


music recommendation survey: results
• % of novel recommendations




       one-way ANOVA within subjects (F=7.57,   p<0.05)
   

       Tukey's test (pairwise comparison)
   
PhD defense // UPF // Feb 16th 2009


music recommendation survey: results
• Novel recommendations




       Last.fm provides a lower percentage of novel songs,
       but of higher quality
PhD defense // UPF // Feb 16th 2009


contributions

           1) Network-based evaluation
                Item Popularity + Complex networks




                                      2) User-based evaluation
           3) Systems
PhD defense // UPF // Feb 16th 2009




Why?
besides better understanding of music recommendation...
Open questions in the State of the Art in music discovery &
  recommendation:

   Is it possible to create a music discovery engine exploiting the
      music content in the WWW? How to build it? How can we
      describe the available music content?
   => SearchSounds


   Is it possible to recommend, filter and personalize music
      content available on the WWW? How to describe a user
      profile? What can we recommend beyond similar artists?
   => FOAFing the Music
PhD defense // UPF // Feb 16th 2009


contribution #3: two complete systems
• Searchsounds
       Music search engine
   

         keyword based search
         “More like this” (audio CB)
PhD defense // UPF // Feb 16th 2009


contribution #3: two complete systems
• Searchsounds




       Crawl MP3 blogs
   

       > 400K songs analyzed
   
PhD defense // UPF // Feb 16th 2009


contribution #3: two complete systems
• Searchsounds
       Further work: improve song descriptions using
   

         Auto-tagging           [Lamere, 2008] [Turnbull, 2007]
             audio CB similarity [Sordo et al., 2007]
             tags from the text (music dictionary)
         Feedback from the users
             thumbs-up/down
             tag audio content
PhD defense // UPF // Feb 16th 2009


contribution #3: two complete systems
• FOAFing the music
       Music recommendation
   

         constantly gathering music related info via RSS feeds
         It offers:
             artist recommendation
             new music releases (iTunes, Amazon, eMusic, Rhapsody, Yahoo! Shopping)
             album reviews
             concerts close to user's locations
             related mp3 blogs and podcasts
PhD defense // UPF // Feb 16th 2009


contribution #3: two complete systems
• FOAFing the music
       Integrates different user accounts (circa 2005!)
   




       Semantic Web (FOAF, OWL/RDF) + Web 2.0
   

       2nd prize Semantic Web Challenge (ISWC 2006)
   
PhD defense // UPF // Feb 16th 2009
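
A minimal sketch, with rdflib, of reading a FOAF profile to extract a user's name and interests, in the spirit of FOAFing the Music; the profile URL is a hypothetical placeholder, and the real system's ingestion pipeline is more involved:

from rdflib import Graph, Namespace

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

g = Graph()
g.parse("http://example.org/users/alice/foaf.rdf")     # hypothetical FOAF profile

for person in g.subjects(predicate=FOAF.name):
    name = g.value(person, FOAF.name)
    interests = [str(i) for i in g.objects(person, FOAF.interest)]
    print(name, interests)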


contribution #3: two complete systems
• FOAFing the music
       Further work:
   

         Follow Linking Open Data best practices
         Link our music recommendation ontology with
          Music Ontology [Raimond et al., 2007]
         (Automatically) add external information from:
             Myspace
             Jamendo
             Garageband
             ...
PhD defense // UPF // Feb 16th 2009


summary of contributions :: research questions
• 1) How can we evaluate/compare different music
  recommendation approaches?

• 2) How far into the Long Tail do music
  recommenders reach?

• 3) How do users perceive novel (unknown to
  them), non-obvious recommendations?
PhD defense // UPF // Feb 16th 2009


summary of contributions :: research questions
• 1) How can we evaluate/compare different music
  recommendation approaches?
       Objective framework for comparing music recommendation
       approaches (CF, CB, EX) using Complex Network analysis
       Highlights fundamental differences among the approaches


• 2) How far into the Long Tail do music
  recommenders reach?

• 3) How do users perceive novel (unknown to
  them), non-obvious recommendations?
PhD defense // UPF // Feb 16th 2009


summary of contributions :: research questions
• 1) How can we evaluate/compare different music
  recommendation approaches?

• 2) How far into the Long Tail do music
  recommenders reach?
       Combine 1) with the Long Tail model and Markov
       model theory
       Highlights differences in terms of discovery and
       navigation


• 3) How do users perceive novel (unknown to
  them), non-obvious recommendations?
PhD defense // UPF // Feb 16th 2009


summary of contributions :: research questions
• 1) How can we evaluate/compare different music
  recommendation approaches?

• 2) How far into the Long Tail do music
  recommenders reach?

• 3) How do users perceive novel (unknown to
  them), non-obvious recommendations?
       Survey with 288 participants
   

       Still room to improve novelty (3/5 or less...)
   

         To appreciate novelty users need to understand the
          recommendations
PhD defense // UPF // Feb 16th 2009


summary of contributions :: research questions
• 1) How can we evaluate/compare different music
  recommendation approaches?
• 2) How far into the Long Tail do music
  recommenders reach?
• 3) How do users perceive novel (unknown to
  them), non-obvious recommendations?
=>
       Systems that perform best (CF) do not exploit the
       Long Tail, and
       systems that can ease Long Tail navigation (CB) do
       not perform well enough
       Combine (hybrid) different approaches!
PhD defense // UPF // Feb 16th 2009




       Systems that perform best (CF) do not exploit the Long Tail, and
       systems that can ease Long Tail navigation (CB) do not perform well enough
       Combine different approaches!
PhD defense // UPF // Feb 16th 2009


summary of contributions :: systems
• Furthermore...
       2 web systems that improved existing State of the Art
       work in music discovery and recommendation
         Searchsounds: music search engine exploiting music
          related content in the WWW
         FOAFing the Music: music recommender based on a
          FOAF user profile, also offering a number of extra
          features to complement the recommendations
PhD defense // UPF // Feb 16th 2009


further work :: limitations
• 1) How can we evaluate/compare different
  recommendations approaches?
       Dynamic networks
                                     [Leskovec, 2008]

         track item similarity over time
         track user's taste over time
         trend and hype detection
PhD defense // UPF // Feb 16th 2009


further work :: limitations
• 2) How far into the Long Tail do recommendation
  algorithms reach?
       Intercollections
   




       how to detect bad quality music in the tail?
   
PhD defense // UPF // Feb 16th 2009


further work :: limitations
• 3) How do users perceive novel, non-obvious
  recommendations?
       User understanding [Jennings, 2007]
         savant, enthusiast, casual, indifferent
       Transparent, steerable recommendations [Lamere & Maillet, 2008]
         Why? as important as What?
PhD defense // UPF // Feb 16th 2009


summary: articles
• #1) Network-based evaluation for RS
         O. Celma and P. Cano. “From hits to niches? or how
          popular artists can bias music recommendation and
          discovery”. ACM KDD, 2008.
         J. Park, O. Celma, M. Koppenberger, P. Cano, and J. M.
          Buldu. “The social network of contemporary popular
          musicians”. Journal of Bifurcation and Chaos (IJBC),
          17:2281–2288, 2007.
         M. Zanin, P. Cano, J. M. Buldu, and O. Celma. “Complex
          networks in recommendation systems”. WSEAS, 2008
         P. Cano, O. Celma, M. Koppenberger, and J. M. Buldu
          “Topology of music recommendation networks”. Journal
          Chaos (16), 2006.
• #2) User-based evaluation for RS
         O. Celma and P. Herrera. “A new approach to
          evaluating novel recommendations”. ACM RecSys, 2008.
PhD defense // UPF // Feb 16th 2009


summary: articles
• #3) Prototypes
       FOAFing the Music
   

         O. Celma and X. Serra. “FOAFing the music: Bridging
          the semantic gap in music recommendation”. Journal of
          Web Semantics, 6(4):250–256, 2008.
         O. Celma. “FOAFing the music”. 2nd Prize Semantic Web
          Challenge ISWC, 2006.
         O. Celma, M. Ramirez, and P. Herrera. “FOAFing the
          music: A music recommendation system based on rss
          feeds and user preferences”. ISMIR, 2005.
         O. Celma, M. Ramirez, and P. Herrera. “Getting music
          recommendations and filtering newsfeeds from foaf
          descriptions”. Scripting for the Semantic Web, ESWC,
          2005.
PhD defense // UPF // Feb 16th 2009


summary: articles
• #3) Prototypes
       Searchsounds
   

         O. Celma, P. Cano, and P. Herrera. “Search sounds: An
          audio crawler focused on weblogs”. ISMIR, 2006.
         V. Sandvold, T. Aussenac, O. Celma, and P. Herrera.
          “Good vibrations: Music discovery through personal
          musical concepts”. ISMIR, 2006.
         M. Sordo, C. Laurier, and O. Celma. “Annotating music
          collections: how content-based similarity helps to
          propagate labels”. ISMIR, 2007.
PhD defense // UPF // Feb 16th 2009


summary: articles
• Misc. (mainly MM semantics)
         R. Garcia, C. Tsinaraki, O. Celma, and S. Christodoulakis.
          “Multimedia Content Description using Semantic Web
          Languages” book, Chapter 2. Springer–Verlag, 2008.
         O. Celma and Y. Raimond. “Zempod: A semantic web
          approach to podcasting”. Journal of Web Semantics,
          6(2):162–169, 2008.
         S. Boll, T. Burger, O. Celma, C. Halaschek-Wiener, E.
          Mannens. “Multimedia vocabularies on the Semantic
          Web”. W3C Technical report, 2007.
         O. Celma, P. Herrera, and X. Serra. “Bridging the music
          semantic gap”. SAMT, 2006.
         R. Garcia and O. Celma. “Semantic integration and
          retrieval of multimedia metadata”. ESWC, 2005
PhD defense // UPF // Feb 16th 2009


summary: articles
         R. Troncy, O. Celma, S. Little, R. Garcia and C. Tsinaraki.
          “MPEG-7 based multimedia ontologies: Interoperability
          support or interoperability issue?” MARESO, 2007.
         M. Sordo, O. Celma, M. Blech, and E. Guaus. “The quest
          for musical genres: Do the experts and the wisdom of
          crowds agree?”. ISMIR, 2008.
• Music Recommendation Tutorials -- with Paul Lamere
         ACM MM, 2008 (Vancouver, Canada)
         ISMIR, 2007 (Vienna, Austria)
         MICAI, 2007 (Aguascalientes, Mexico)
PhD defense // UPF // Feb 16th 2009


summary: dissemination
• PhD Webpage
       http://mtg.upf.edu/~ocelma/PhD
   

         PDF
         Source code
             Long Tail Model in R
         References
             Citeulike
         Related links
             delicious
PhD defense // UPF // Feb 16th 2009


acknowledgments




     NB: The complete list of acknowledgments can be found in the document
Music Recommendation and Discovery in
            the Long Tail


                    Òscar Celma
              Doctoral Thesis Defense
    (Music Technology Group ~ Universitat Pompeu Fabra)
PICA-PICA
UPF-Tanger, 3rd floor
