Side by Side with Elasticsearch and Solr
- 1. Side by Side with Solr and Elasticsearch
Radu Gheorghe, Rafał Kuć
- 4. Let’s Index Videos
{
  "id": "4",
  "url": "https://www.youtube.com/watch?v=IutoHcJT61k",
  "title": "#bbuzz: Rafał Kuć: Battle of the Giants: Solr vs ElasticSearch, Round 2",
  "uploaded_by": "newthinking communications",
  "upload_date": "2013-06-19",
  "views": 380,
  "likes": 1,
  "tags": ["elasticsearch", "solr", "lucene", "comparison"]
}
Examples available at:
https://github.com/sematext/berlin-buzzwords-samples/
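As a rough illustration of how the sample video could be sent to each engine, here is a minimal Python sketch that only builds the requests and does not send them. The hosts and ports (localhost:8983 and localhost:9200) and the helper names are assumptions for illustration; whether a field such as upload_date is accepted as-is depends on the schema or mapping in use.

```python
import json

# Assumed local endpoints (default ports); adjust for a real cluster.
SOLR = "http://localhost:8983"
ES = "http://localhost:9200"

video = {
    "id": "4",
    "url": "https://www.youtube.com/watch?v=IutoHcJT61k",
    "title": "#bbuzz: Rafał Kuć: Battle of the Giants: Solr vs ElasticSearch, Round 2",
    "uploaded_by": "newthinking communications",
    "upload_date": "2013-06-19",
    "views": 380,
    "likes": 1,
    "tags": ["elasticsearch", "solr", "lucene", "comparison"],
}


def solr_index_request(doc, collection="bbuzz"):
    """Hypothetical helper: Solr's JSON update handler takes a list of docs."""
    url = f"{SOLR}/solr/{collection}/update?commit=true"
    return "POST", url, json.dumps([doc])


def es_index_request(doc, index="bbuzz", doc_type="videos"):
    """Hypothetical helper: index one document under /index/type/id."""
    url = f"{ES}/{index}/{doc_type}/{doc['id']}"
    return "PUT", url, json.dumps(doc)


if __name__ == "__main__":
    for method, url, body in (solr_index_request(video), es_index_request(video)):
        print(method, url, f"({len(body)} bytes of JSON)")
```

Both requests would be sent with a Content-Type of application/json.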
- 6. Schema / Mapping
schema.xml+... -> ZooKeeper
<schema name="BerlinBuzzwords2014" version="1.5">
<fields>
<field name="id" type="string"
indexed="true" stored="true"
required="true"
multiValued="false" />
...
<field name="tags" type="string"
indexed="true" stored="true"
multiValued="true"/>
</fields>
...
</schema>
PUT -> /bbuzz/videos/_mapping
{
"videos": {
"_id": {
"path": "id"
},
"properties": {
...
"tags": {
"type": "string",
"index": "not_analyzed"
},
...
}
}
}
- 7. “q” Parameter / URI Request
GET -> /solr/bbuzz/select
params -> q=elasticsearch
fl=*,score
...
<result name="response"
numFound="7" start="0">
<doc>
<float name="score">0.44896343</float>
<str name="id">2</str>
<str name="url">/watch?v=6QX5hXf_e7c</str>
<str name="title">Introduction to Elasticsearch by Radu</str>
...
</doc>
...
GET -> /bbuzz/videos/_search
params -> q=elasticsearch
...
"hits" : [ {
"_index" : "bbuzz",
"_type" : "videos",
"_id" : "2",
"_score" : 0.26516503,
"_source" : {
"url": "/watch?v=6QX5hXf_e7c",
"title": "Introduction to Elasticsearch by Radu",
...
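The two URI searches above differ only in path and parameters. A small Python sketch of how the URLs might be assembled; the base URLs and function names are assumptions for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def solr_search_url(q, base="http://localhost:8983", collection="bbuzz"):
    """Solr: /select with q, plus fl=*,score to return the score as a field."""
    params = {"q": q, "fl": "*,score"}
    return f"{base}/solr/{collection}/select?{urlencode(params)}"


def es_search_url(q, base="http://localhost:9200", index="bbuzz", doc_type="videos"):
    """Elasticsearch: /index/type/_search with q; _score comes back by default."""
    return f"{base}/{index}/{doc_type}/_search?{urlencode({'q': q})}"


if __name__ == "__main__":
    print(solr_search_url("elasticsearch"))
    print(es_search_url("elasticsearch"))
```

Note that the scores in the two responses are not comparable with each other; each is only meaningful for ranking within its own result set.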
- 8. Bool Query
GET -> /solr/bbuzz/select
q=title:elasticsearch OR tags:logs
q=title:elasticsearch tags:logs
q.op=OR
GET -> /bbuzz/videos/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"title": "elasticsearch"
}
},
{
"term": {
"tags": "logs"
...
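The Elasticsearch request above is cut off on the slide. A hedged Python sketch of what a complete body of that shape could look like, next to the equivalent Solr parameters; the function names are hypothetical, and the closing structure is the minimal one that makes the JSON valid rather than the slide's exact text.

```python
import json


def es_bool_should(title_text, tag):
    """Body for POST/GET /bbuzz/videos/_search: either clause may match."""
    return {
        "query": {
            "bool": {
                "should": [
                    # title is analyzed text, so use a match query
                    {"match": {"title": title_text}},
                    # tags is not_analyzed, so an exact term query fits
                    {"term": {"tags": tag}},
                ]
            }
        }
    }


def solr_or_params(title_text, tag):
    """Solr equivalent: two clauses with OR as the default operator."""
    return {"q": f"title:{title_text} tags:{tag}", "q.op": "OR"}


if __name__ == "__main__":
    print(json.dumps(es_bool_should("elasticsearch", "logs"), indent=2))
    print(solr_or_params("elasticsearch", "logs"))
```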
- 10. Hierarchies
names:
-> first: Rafał, last: Kuć
-> first: Radu, last: Gheorghe
nested (block join)
parent-child (query time join)
"names": [
{ "first": "Rafał", "last": "Kuć" },
{ "first": "Radu", "last": "Gheorghe" }
]
nested (block join)
parent-child
[Diagram, shown once for nested and once for parent-child: the 2 names (Rafał Kuć, Radu Gheorghe) ⇐ names]
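To make the difference between a block join and a query-time join concrete, here is a toy model in plain Python. It does not use either engine's API; the data layout, the _kind and _parent markers, and the function names are all assumptions made for illustration.

```python
# Block join (nested): children are stored in the same block as the parent,
# immediately before it, so the join is resolved by position.
block = [
    {"_kind": "child", "first": "Rafał", "last": "Kuć"},
    {"_kind": "child", "first": "Radu", "last": "Gheorghe"},
    {"_kind": "parent", "id": "1"},
]

# Query-time join (parent-child): children are separate documents that
# point at their parent by id, so they can be updated independently.
children = [
    {"_parent": "1", "first": "Rafał", "last": "Kuć"},
    {"_parent": "1", "first": "Radu", "last": "Gheorghe"},
]


def block_join(docs, predicate):
    """Return ids of parents with at least one matching child in their block."""
    hits, matched = [], False
    for doc in docs:
        if doc["_kind"] == "child":
            matched = matched or predicate(doc)
        else:
            if matched:
                hits.append(doc["id"])
            matched = False  # a new block starts after each parent
    return hits


def query_time_join(child_docs, predicate):
    """Return ids of parents referenced by at least one matching child."""
    return sorted({c["_parent"] for c in child_docs if predicate(c)})


def is_rafal_kuc(name):
    return name["first"] == "Rafał" and name["last"] == "Kuć"


def is_rafal_gheorghe(name):
    return name["first"] == "Rafał" and name["last"] == "Gheorghe"


if __name__ == "__main__":
    print(block_join(block, is_rafal_kuc), query_time_join(children, is_rafal_kuc))
    print(block_join(block, is_rafal_gheorghe), query_time_join(children, is_rafal_gheorghe))
```

In both models each name is matched as a unit, so a query mixing one person's first name with the other's last name finds nothing. The trade-off in this sketch is that the block has to be rewritten as a whole when a child changes, while the query-time join pays for its lookup on every search.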
- 13. Demo time: Graph all the things!
http://f1.thejournal.ie/media/2013/05/meatloaf-2.jpg
- 14. Stats / Stats APIs
JMX / Solr admin / clusterstate
GET -> /_stats
"index_total" : 15118403,
"index_time" : "4.2h",
...
"query_total" : 41092,
"query_time" : "57.2m",
GET -> /_cluster/stats
"heap_used_in_bytes" : 83960392,
...
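The totals and times from /_stats can be turned into per-operation averages, which are often easier to graph. A small sketch using the figures shown above; the helper names are hypothetical, and because the human-readable times are rounded the results are approximate (the API also reports the same values in milliseconds).

```python
# Milliseconds per unit for the human-readable time suffixes used above.
UNITS_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}


def to_millis(text):
    """Convert strings such as '4.2h' or '57.2m' to milliseconds."""
    # "ms" must be checked before "s" and "m", since "12ms" ends with "s".
    for suffix in ("ms", "s", "m", "h"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNITS_MS[suffix]
    raise ValueError(f"unrecognized time value: {text!r}")


def average_ms(total_ops, time_text):
    """Average time per operation in milliseconds."""
    return to_millis(time_text) / total_ops


if __name__ == "__main__":
    print(f"indexing: {average_ms(15118403, '4.2h'):.2f} ms per document")
    print(f"querying: {average_ms(41092, '57.2m'):.2f} ms per query")
```

With the numbers on the slide this works out to roughly 1 ms per indexing operation and about 84 ms per query.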
- 18. New juicy things to come
facet by function
https://issues.apache.org/jira/browse/SOLR-1581
analytics component
https://issues.apache.org/jira/browse/SOLR-5302
Solr as standalone application
5.0 - no general issue yet
top_hits aggregation
https://github.com/elasticsearch/elasticsearch/pull/6124
minimum_should_match on has_child
https://github.com/elasticsearch/elasticsearch/issues/6019
filters aggregation
https://github.com/elasticsearch/elasticsearch/issues/6118
- 19. most projects work well with either
many small differences, few show-stoppers
choose the best for your use case
- 20. Want to work with both?
We’re hiring!
Worldwide
http://www.staff.amu.edu.pl/~zbzw/glob/glob.gif