Charlie Hull - Managing Director
20th October 2015
Search Solutions
charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch
Towards a new model of test-based relevance tuning
Building open source search applications since 2001
Independent, honest advice and analysis
Expert design & development, Apache Solr committers
Test-driven relevancy and performance tuning
Custom training & mentoring for your staff
Flexible support up to 24/7/365 with SLAs
Come and join the open source search community (tonight?)
Search Solutions 2015: Towards a new model of search relevance testing
Why bother testing?
Throwing it over the fence
Some (slightly) better methods
A collaborative model
Quepid & other tools
A better way to test
Search is Magic
Search doesn't affect the bottom line
The new search engine is better than the old one
We can just fix this one problem here...
Why bother testing?
Business users / content creators know search is broken
Managers tell search developers to 'fix it'
Search developers don't understand why it needs fixing
Business users / content creators don't understand the side effects of a fix
Bad communication, internal politics, search gets worse!
Throwing it over the fence
Avoiding the HiPPO
© Copyright William J Bagshaw and licensed for reuse under this Creative Commons Licence
Identify what to test
– Query logs
– 'Most valuable' queries
– Languages/markets
– Segmented query types
Keep proper records
– Manual query testing
– Record relevance judgements
• Per page or per result?
– Say why
– Have an overall score
Use the same test system
Some (slightly) better methods
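The record-keeping advice above could be sketched in Python roughly as follows. This is only an illustration, not the method used in the talk: the `Judgement` fields, the 0 to 3 rating scale, the document identifiers and the mean-based overall score are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Judgement:
    query: str    # the query that was tested
    doc_id: str   # the result being judged (hypothetical identifier)
    rating: int   # assumed scale: 0 = irrelevant ... 3 = ideal
    reason: str   # "say why": free-text justification from the tester


def overall_score(judgements, max_rating=3):
    """Mean rating across all judgements, scaled to the range 0..1."""
    if not judgements:
        return 0.0
    return mean(j.rating for j in judgements) / max_rating


# Per-result judgements for one manually tested query (made-up data).
judgements = [
    Judgement("red shoes", "doc-1", 3, "exact product match"),
    Judgement("red shoes", "doc-2", 0, "blue trousers, wrong category"),
    Judgement("red shoes", "doc-3", 3, "red trainers"),
]
score = overall_score(judgements)
```

Keeping the reason next to each rating means a later tester can see why a result was marked down, not just that it was.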
Problems:
– Slow iterations
– Lots of error-prone copy-and-paste
– Unwieldy
– Not really collaborative
Improvements:
– Build test UIs
– Use a better scoring algorithm
• e.g. average discounted distance
– Bring in other data e.g. web analytics
• But remember that clicks are only a 45% predictor of relevance (75% is achievable) (Susan Dumais, Microsoft)
Some (slightly) better methods
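The slide names "average discounted distance" without defining it, so the sketch below does not attempt to reproduce it. Instead it shows discounted cumulative gain (DCG), a different but widely used position-discounted score, as one example of what a "better scoring algorithm" might look like. The 0 to 3 rating scale and the sample rankings are assumptions.

```python
import math


def dcg(ratings):
    """Discounted cumulative gain: each rating is divided by log2(rank + 1)."""
    return sum(r / math.log2(rank + 1) for rank, r in enumerate(ratings, start=1))


def ndcg(ratings):
    """DCG normalised by the DCG of the ideal (descending) ordering, in 0..1."""
    ideal = dcg(sorted(ratings, reverse=True))
    return dcg(ratings) / ideal if ideal > 0 else 0.0


# Ratings listed in the order the search engine returned the results.
good_order = [3, 2, 0]   # best result first
bad_order = [0, 2, 3]    # best result last

good_score = ndcg(good_order)
bad_score = ndcg(bad_order)
```

The same three judgements score lower when the best result is ranked last, which a plain average of ratings would not show.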
In software we use tests to collaborate
1. “What should happen in this case?”
2. Write test code to check
TDD can lead to improved software quality
Why not with search relevancy?
Test-based relevancy
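The two steps above might translate to search roughly as in this sketch. It is a minimal illustration under stated assumptions: `search` is a hypothetical stand-in for a real search client, and the queries, document identifiers and "must appear in the top N" rule are invented for the example.

```python
def search(query):
    """Hypothetical stand-in for a real search client; returns ranked doc ids."""
    fake_index = {
        "red shoes": ["doc-3", "doc-1", "doc-2"],
        "returns policy": ["doc-9", "doc-7"],
    }
    return fake_index.get(query, [])


# Step 1: "What should happen in this case?", agreed with the business.
# Each entry is (query, expected doc id, must appear within the top N results).
EXPECTATIONS = [
    ("red shoes", "doc-1", 3),
    ("returns policy", "doc-7", 1),
]


def run_relevance_tests(expectations, search_fn):
    """Step 2: check each expectation and return the list of failures."""
    failures = []
    for query, doc_id, top_n in expectations:
        if doc_id not in search_fn(query)[:top_n]:
            failures.append((query, doc_id, top_n))
    return failures


failures = run_relevance_tests(EXPECTATIONS, search)
```

Re-running the full set of expectations after every tuning change is what exposes the side effects of a fix: here the second expectation fails because `doc-7` is ranked second, not first.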
A client approached Flax:
– “What's the current state of the art / academically proven way to test relevance?”
– Er.....
I remembered something called Quepid...
So this happened....
Built by Doug Turnbull of OpenSource Connections
A browser-based tool for tuning relevance
Needed some development for enterprise use; we did this working with our client
Let's take a look...
Quepid
You should test your searches in a methodical way
Collaboration between 'the business' and developers is vital
Some tools now exist to help
Hopefully this is the first step to better relevance tuning
Conclusions
Thank you!
Any questions?
charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch
