The right path to relevant search
Charlie Hull, http://www.opensourceconnections.com
Who are we?
● Leading experts on open source search (Solr, Elasticsearch…)
● We empower search teams with training, guidance & team augmentation
● We wrote the book on relevance
● Based in the USA with a new UK office (me!)
● We run the Haystack conference www.haystackconf.com
○ Haystack EU - Berlin Oct 28th - SOLD OUT
○ Haystack US - Charlottesville Apr 29th-30th
We’re hiring! Talk to me later...
Plug alert!
Relevance Cornucopia🦃 Training Event: http://o19s.com/blog/2019/09/11/announcing-relevance-cornucopia/
● Week of November 10, Charlottesville, VA, USA
● "Think Like a Relevance Engineer" for Solr or Elasticsearch
● "Learning to Rank" & "Natural Language Search" training
● Delivered by our crack team of expert relevance consultants
Outline
1. Three aspects of search quality
2. Focusing on Relevance
3. Not just a technology problem
4. Measure, rinse, repeat
“Our search should work like Google!”
“We’ve spent all this money on a new
search engine, why do people still hate
search?”
“We should be doing AI...”
Three aspects of search quality
● Relevance (the right results in the right order)
● Performance (quick & fresh)
● Experience (design & interaction)
https://opensourceconnections.com/blog/2018/11/19/an-introduction-to-search-quality/
Focusing on Relevance
Query: “2019 sales forecast”
Search Engine Results Page:
● 2019 forecast template
○ Ourintranet.somewhere.templates.776h.tmp
○ “Use this template for all 2019 forecasts…”
○ .doc - 2267kb
● Actual Sales in 2019
○ Ourintranet.somewhere.salesfigures.xls
○ “...sales 2019…”
○ .xls - 3348kb
● 2019 forecast for East of England
○ Ourintranet.somewhere.stuff.forecasting2019.htm
○ “I’ve written a weather forecast for 2019 here…”
○ .htm - 1310kb
Focusing on Relevance
Query: “White iPad”
SERP:
● iPad headphones white
○ Ourshop.com/headphones/ipad/47264872786.htm
○ “Great headphones for your iPad in white”
○ $12.99
● iPad 19
○ Ourshop.com/tablets/ipad/47264872786.htm
○ “iPad in white with black checks”
○ $207.99
● iPad case
○ Ourshop.com/cases/ipad/47313.htm
○ “iPad case”
○ $27.99
Not just a technology problem
● SMEs / Marketing know what’s wrong with relevance
- but don’t usually know how search actually works
● I.T. know how to tweak the search engine - but don’t
know how to judge what’s ‘good’ for business
● Management don’t understand why you can’t work
together to solve the problem but think AI will help
...it’s a people problem.
Building your search team
● Cross functional
● Educate & empower
● Build skills internally (recruiting is hard)
● Use external resources:
○ Conferences, Meetups
○ Books & blogs
○ Training
○ Experts - with a Proven Process
A Proven Process
[Diagram labels: Discovery; Roadmap & Architecture to plan 90 days; 90 Day Journey; Training for your team; Team Maturity Reassessment; Trusted Advisor; Accelerator]
The relevance-centered enterprise
● Relevance Feedback loops
○ Lost sales & complaints
○ Business awareness
○ Content curation
○ Pairing
○ Test-driven relevance tuning
See https://www.manning.com/books/relevant-search
Measure, rinse, repeat
● Without measurement how do you know you’re
getting any better?
● Measure everything!
Measure your search maturity
Maturity levels (Baseline, Practitioner, Advanced) for each dimension:
● Business
○ Advanced: Business stakeholders use real-time KPIs
○ Practitioner: Occasional reporting
○ Baseline: Business impact not measured
● Understand User Needs
○ Advanced: Producing quality data from analytics
○ Practitioner: Some user testing / basic analytics
○ Baseline: No query logs or user testing
● Search / Discovery Tech
○ Advanced: Develops custom plugins
○ Practitioner: Complex relevance config; uses plugins
○ Baseline: Stock or moderately tweak config
● Experiment Driven
○ Advanced: Ops supports A/B testing & offline tests
○ Practitioner: Available, but complex experiments
○ Baseline: Test discovery manually, deployed rarely
● UX
○ Advanced: Innovative Discovery (chatbots, etc)
○ Practitioner: UI supports findability
○ Baseline: 10 search links on page
● Enrichment
○ Advanced: NLP & Data science team
○ Practitioner: Taxonomies / Ontologies
○ Baseline: Minor enrichment (synonyms)
● Data Inventory
○ Advanced: Varied, complex, large-scale data
○ Practitioner: Moderate data complexity
○ Baseline: Very simple data model
Measure your relevance
Query: “White iPad”
SERP with ratings:
● iPad headphones white - Rating: 3/10
○ Ourshop.com/headphones/ipad/47264872786.htm
○ “Great headphones for your iPad in white”
○ $12.99
● iPad 19 - Rating: 9/10
○ Ourshop.com/tablets/ipad/47264872786.htm
○ “iPad in white with black checks”
○ $207.99
● iPad case - Rating: 1/10
○ Ourshop.com/cases/ipad/47313.htm
○ “iPad case”
○ $27.99
Better ways to measure relevance
● Measure search & engagement
○ Define ‘success’
○ Instrumentation & Logs
○ Human judgements
○ As much data as you can get
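As a rough illustration of what "Instrumentation & Logs" can feed, here is a minimal sketch that summarises a batch of logged searches. The `SearchEvent` record and its fields are hypothetical, not taken from any particular search engine or analytics product, and what counts as 'success' is an assumption you would replace with your own definition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SearchEvent:
    """One logged search. All field names here are hypothetical."""
    query: str
    result_count: int             # how many results the engine returned
    clicked_rank: Optional[int]   # 1-based rank of the first click, None if no click


def summarize(events: List[SearchEvent]) -> dict:
    """Compute a few simple engagement metrics over a batch of logged searches."""
    total = len(events)
    if total == 0:
        return {"searches": 0, "zero_result_rate": 0.0,
                "click_through_rate": 0.0, "mean_clicked_rank": None}
    zero_results = sum(1 for e in events if e.result_count == 0)
    clicked = [e.clicked_rank for e in events if e.clicked_rank is not None]
    return {
        "searches": total,
        "zero_result_rate": zero_results / total,
        "click_through_rate": len(clicked) / total,
        "mean_clicked_rank": sum(clicked) / len(clicked) if clicked else None,
    }


if __name__ == "__main__":
    # Made-up log entries, purely for illustration.
    log = [
        SearchEvent("white ipad", 3, 2),
        SearchEvent("ipad 19", 0, None),
        SearchEvent("ipad case", 5, 1),
        SearchEvent("headphones", 4, None),
    ]
    print(summarize(log))
```

Clicks are only a proxy for success, which is why the slides pair logs with human judgements.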
● Define metrics that matter e.g.
○ Discounted Cumulative Gain (DCG)
○ Expected Reciprocal Rank (ERR): “the expected reciprocal length of time that the user will take to find a relevant document”
● But make sure to also consider business metrics!
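To make the two metrics concrete, here is a minimal sketch of DCG, a normalised DCG, and ERR applied to a ranked list of ratings such as the 3/10, 9/10, 1/10 from the “White iPad” example. Several choices are assumptions rather than anything stated in the slides: ratings are used directly as gain (linear gain), the maximum grade is 10, nDCG is normalised against the best ordering of the same returned list only, and ERR maps a grade to a satisfaction probability with the commonly used (2^grade - 1) / 2^max_grade form.

```python
import math
from typing import Sequence

MAX_GRADE = 10  # assumption: ratings are on a 0-10 scale


def dcg(grades: Sequence[float]) -> float:
    """Discounted Cumulative Gain with linear gain and a log2 rank discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(grades, start=1))


def ndcg(grades: Sequence[float]) -> float:
    """DCG divided by the DCG of the same grades in their best possible order."""
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0


def err(grades: Sequence[int], max_grade: int = MAX_GRADE) -> float:
    """Expected Reciprocal Rank under a cascade model of user behaviour."""
    not_satisfied_yet = 1.0  # probability the user is still looking at this rank
    total = 0.0
    for rank, g in enumerate(grades, start=1):
        p_satisfied = (2 ** g - 1) / (2 ** max_grade)
        total += not_satisfied_yet * p_satisfied / rank
        not_satisfied_yet *= 1.0 - p_satisfied
    return total


if __name__ == "__main__":
    as_ranked = [3, 9, 1]   # ratings in the order the results were shown
    reordered = [9, 3, 1]   # the same results with the best one first
    for name, grades in (("as ranked", as_ranked), ("reordered", reordered)):
        print(name, round(dcg(grades), 3), round(ndcg(grades), 3), round(err(grades), 3))
```

Both metrics reward putting the highly rated iPad first; ERR penalises a buried good result more sharply because it models the user stopping once satisfied.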
● Build a relevance testing framework
○ Test UIs
○ Open source tools
Quepid
● Relevancy workbench for
Elasticsearch & Solr
● Allows SMEs to rate results
● Allows relevance engineers to
tweak Solr or Elasticsearch settings
and see how the scores change
● Free hosted version at quepid.com
● Open source at
https://github.com/o19s/quepid
Rated Ranking Evaluator (RRE)
● Search quality evaluation tool
for Elasticsearch & Solr
● Given rated documents and a set
of queries, runs these and shows
many metrics
● From Sease Ltd.
● Open source at https://github.com/SeaseLtd/rated-ranking-evaluator
● Build a relevance testing framework
○ Test UIs
○ Open source tools
■ Quepid
■ Rated Ranking Evaluator
■ More are being built!
○ Open source your own?
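If you do build your own, the core of a relevance testing framework can be very small. The sketch below is not how Quepid or RRE work internally; it is a hypothetical harness that takes human ratings per query, runs each query through a search function you supply, and reports the queries whose top results fall short. The names `precision_at_k`, `failing_queries`, the document ids, and the "rating of 7 or more counts as relevant" threshold are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Ratings = Dict[str, int]                 # document id -> human rating (0-10)
Judgments = Dict[str, Ratings]           # query -> ratings for that query
SearchFn = Callable[[str], List[str]]    # query -> ranked list of document ids


def precision_at_k(ranked_ids: List[str], ratings: Ratings,
                   k: int, relevant_from: int) -> float:
    """Fraction of the top k results rated at least `relevant_from`."""
    top = ranked_ids[:k]
    hits = sum(1 for doc_id in top if ratings.get(doc_id, 0) >= relevant_from)
    return hits / k


def failing_queries(search: SearchFn, judgments: Judgments, k: int = 1,
                    relevant_from: int = 7, min_precision: float = 1.0) -> Dict[str, float]:
    """Run every judged query and return those scoring below `min_precision`."""
    failures = {}
    for query, ratings in judgments.items():
        score = precision_at_k(search(query), ratings, k, relevant_from)
        if score < min_precision:
            failures[query] = score
    return failures


# Made-up judgments and two stand-in search functions, for illustration only.
JUDGMENTS: Judgments = {
    "white ipad": {"headphones-white": 3, "ipad-white": 9, "ipad-case": 1},
}


def baseline_search(query: str) -> List[str]:
    return ["headphones-white", "ipad-white", "ipad-case"]


def tuned_search(query: str) -> List[str]:
    return ["ipad-white", "headphones-white", "ipad-case"]


if __name__ == "__main__":
    print("baseline failures:", failing_queries(baseline_search, JUDGMENTS))
    print("tuned failures:", failing_queries(tuned_search, JUDGMENTS))
```

In practice the search function would call your Solr or Elasticsearch instance, and a check like this could run on every configuration change, which is the idea behind test-driven relevance tuning.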
● Go Open:
○ Most of the interesting work in relevance
measurement & testing is in the open
source world
○ Think about what data, examples, code you
can open source
○ Helps attract talent and encourages
collaboration!
Recap & Takeaways
1. Three aspects of search quality
a. Relevance, Performance, Experience
2. Focusing on Relevance
3. Not just a technology problem
a. Build & Empower your search team
4. Measure, rinse, repeat
a. Measure everything
b. Search Maturity
c. Go open source & join the community!
● Join Relevance Slack at
https://opensourceconnections.com/slack
● We’re hiring!
● Follow me on Twitter at @FlaxSearch
Any questions?

The right path to making search relevant - Taxonomy Bootcamp London 2019
