Side by Side with
Solr and Elasticsearch
Radu Gheorghe, Rafał Kuć
Agenda / Overview
documents
queries
mapping
index&store
aggregations
percolations
scale out
searches
tools ecosystem
documents
schema
index&store
facets
scale out
searches
tools ecosystem
backup, replicate
{
"id": "4",
"url": "https://www.youtube.com/watch?v=IutoHcJT61k",
"title": "#bbuzz: Rafał Kuć: Battle of the Giants: Solr vs ElasticSearch, Round 2",
"uploaded_by": "newthinking communications",
"upload_date": "2013-06-19",
"views": 380,
"likes": 1,
"tags": ["elasticsearch", "solr", "lucene", "comparison"]
}
Let’s Index Videos
Examples available at:
https://github.com/sematext/berlin-buzzwords-samples/
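A minimal sketch of how the sample document above might be prepared for indexing in each engine. It assumes a Solr collection and an Elasticsearch index both named bbuzz (as in the later slides); `solr_update_request` and `es_index_request` are hypothetical helper names, and nothing is actually sent over the network.

```python
import json

# The sample video document from the slide above.
video = {
    "id": "4",
    "url": "https://www.youtube.com/watch?v=IutoHcJT61k",
    "title": "#bbuzz: Rafał Kuć: Battle of the Giants: Solr vs ElasticSearch, Round 2",
    "uploaded_by": "newthinking communications",
    "upload_date": "2013-06-19",
    "views": 380,
    "likes": 1,
    "tags": ["elasticsearch", "solr", "lucene", "comparison"],
}


def solr_update_request(doc):
    """Hypothetical helper: Solr's JSON update handler accepts a list of documents."""
    return "POST", "/solr/bbuzz/update", json.dumps([doc], ensure_ascii=False)


def es_index_request(doc):
    """Hypothetical helper: index a single document, with its ID in the path."""
    return "PUT", "/bbuzz/videos/" + doc["id"], json.dumps(doc, ensure_ascii=False)


for method, path, body in (solr_update_request(video), es_index_request(video)):
    print(method, path, len(body), "characters of JSON")
```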
Demo time: Start your engines!
Schema (Solr) / Mapping (Elasticsearch)
schema.xml+... -> ZooKeeper
<schema name="BerlinBuzzwords2014" version="1.5">
<fields>
<field name="id" type="string"
indexed="true" stored="true"
required="true"
multiValued="false" />
...
<field name="tags" type="string"
indexed="true" stored="true"
multiValued="true"/>
</fields>
...
</schema>
PUT -> /bbuzz/videos/_mapping
{
"videos": {
"_id": {
"path": "id"
},
"properties": {
...
"tags": {
"type": "string",
"index": "not_analyzed"
},
...
}
}
}
“q” Parameter (Solr) / URI Request (Elasticsearch)
GET -> /solr/bbuzz/select
params -> q=elasticsearch
fl=*,score
...
<result name="response" numFound="7" start="0">
<doc>
<float name="score">0.44896343</float>
<str name="id">2</str>
<str name="url">/watch?v=6QX5hXf_e7c</str>
<str name="title">Introduction to Elasticsearch by Radu</str>
...
</doc>
...
GET -> /bbuzz/videos/_search
params -> q=elasticsearch
...
"hits" : [ {
"_index" : "bbuzz",
"_type" : "videos",
"_id" : "2",
"_score" : 0.26516503,
"_source" : {
"url": "/watch?v=6QX5hXf_e7c",
"title": "Introduction to Elasticsearch by Radu",
...
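The two URI searches above differ only in path and parameters. A small sketch of building both URLs, assuming each server runs locally on its default port (8983 for Solr, 9200 for Elasticsearch); the base URLs and function names are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumed local servers on default ports; adjust to your setup.
SOLR_BASE = "http://localhost:8983"
ES_BASE = "http://localhost:9200"


def solr_search_url(query, fields="*,score"):
    """Build a Solr select URL with the q and fl parameters."""
    params = {"q": query, "fl": fields}
    return SOLR_BASE + "/solr/bbuzz/select?" + urlencode(params)


def es_search_url(query):
    """Build an Elasticsearch URI search with the q parameter."""
    return ES_BASE + "/bbuzz/videos/_search?" + urlencode({"q": query})


print(solr_search_url("elasticsearch"))
print(es_search_url("elasticsearch"))
```

Note that `urlencode` percent-encodes the `*` and `,` in `fl`, which both servers decode back to the original value.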
Bool Query
GET -> /solr/bbuzz/select
q=title:elasticsearch OR tags:logs
q=title:elasticsearch tags:logs
q.op=OR
GET -> /bbuzz/videos/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"title": "elasticsearch"
}
},
{
"term": {
"tags": "logs"
...
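The Elasticsearch request body above is cut off on the slide. The sketch below assumes the elided remainder only closes the open brackets, and builds both forms of the same OR query; the function names are hypothetical.

```python
import json


def es_bool_should(title_term, tag):
    """A bool query whose two should clauses are combined with OR semantics."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"title": title_term}},  # analyzed full-text match
                    {"term": {"tags": tag}},           # exact, un-analyzed term
                ]
            }
        }
    }


def solr_bool_params(title_term, tag):
    """The Solr form: two clauses, with OR set as the default operator via q.op."""
    return {"q": "title:%s tags:%s" % (title_term, tag), "q.op": "OR"}


print(json.dumps(es_bool_should("elasticsearch", "logs"), indent=2))
print(solr_bool_params("elasticsearch", "logs"))
```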
Grouping (Solr) / Percolator (Elasticsearch)
GET -> /solr/bbuzz/select
q=elasticsearch
group=true
group.field=uploaded_by
PUT -> /bbuzz/.percolator/1
{
"query" : {
"term" : { "tags" : "elasticsearch" }
}
}
GET -> /bbuzz/videos/_percolate
{
"doc": {
"title": "Scaling Massive ES Clusters",
"tags": [ "elasticsearch", "scaling"]
}
}
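Percolation inverts search: queries are stored first, and each incoming document is checked against them. The toy below emulates that idea in memory for single-field term queries only; it is a conceptual sketch, not how Elasticsearch implements the percolator, and `register` and `percolate` are hypothetical names.

```python
# Registered queries, keyed by ID: each is a single term query on one field.
registered = {}


def register(query_id, field, value):
    """Store a term query, like the PUT to .percolator above."""
    registered[query_id] = (field, value)


def percolate(doc):
    """Return the IDs of all registered term queries that match the document."""
    matches = []
    for query_id, (field, value) in registered.items():
        field_value = doc.get(field)
        values = field_value if isinstance(field_value, list) else [field_value]
        if value in values:
            matches.append(query_id)
    return sorted(matches)


register("1", "tags", "elasticsearch")
doc = {"title": "Scaling Massive ES Clusters", "tags": ["elasticsearch", "scaling"]}
print(percolate(doc))
```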
Hierarchies
names:
-> first: Rafał, last: Kuć
-> first: Radu, last: Gheorghe
nested (block join)
parent-child (query time join)
"names": [
{ "first": "Rafał", "last": "Kuć" },
{ "first": "Radu", "last": "Gheorghe" }
]
nested (block join)
parent-child
[Diagram, shown once per engine: a parent document with 2 names ⇐ its name entries (Rafał Kuć, Radu Gheorghe)]
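Why nesting matters: when an array of objects is indexed as plain fields, the link between each first and last name is lost, so a query can match values taken from different objects. The sketch below emulates both behaviours in plain Python; it illustrates the concept only and does not reproduce either engine's internals.

```python
names = [
    {"first": "Rafał", "last": "Kuć"},
    {"first": "Radu", "last": "Gheorghe"},
]


def flattened_match(objects, first, last):
    """Plain object arrays: each field becomes one bag of values, so pairs are lost."""
    firsts = {o["first"] for o in objects}
    lasts = {o["last"] for o in objects}
    return first in firsts and last in lasts


def nested_match(objects, first, last):
    """Nested objects: both conditions must hold within the same object."""
    return any(o["first"] == first and o["last"] == last for o in objects)


print(flattened_match(names, "Rafał", "Gheorghe"))  # True: a cross-object false positive
print(nested_match(names, "Rafał", "Gheorghe"))     # False
print(nested_match(names, "Radu", "Gheorghe"))      # True
```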
Facets (Solr) / Aggregations (Elasticsearch)
facet=true
facet.field=tags
facet=true
facet.query=uploaded_by:LuceneSolrRevolution
facet.query=uploaded_by:"newthinking communications"
"aggregations" : {
"tags" : {
"terms" : { "field" : "tags" }
}
}
"aggregations": {
"uploader_count": {
"cardinality": {
"field": "uploaded_by"
}
}
}
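What a field facet or terms aggregation computes can be sketched in a few lines: one bucket per distinct value, with a document count. The three documents below are made up for illustration, and the helper names are hypothetical. Note that Elasticsearch's cardinality aggregation returns an approximate count, while this sketch counts exactly.

```python
from collections import Counter

# Illustrative documents only; the field values are made up for this sketch.
videos = [
    {"uploaded_by": "newthinking communications", "tags": ["elasticsearch", "solr"]},
    {"uploaded_by": "newthinking communications", "tags": ["elasticsearch", "logs"]},
    {"uploaded_by": "LuceneSolrRevolution", "tags": ["solr"]},
]


def terms_counts(docs, field):
    """Count documents per distinct value, like facet.field or a terms aggregation."""
    counts = Counter()
    for doc in docs:
        value = doc[field]
        # A multi-valued field contributes each distinct value once per document.
        for v in set(value if isinstance(value, list) else [value]):
            counts[v] += 1
    return counts


def cardinality(docs, field):
    """Exact number of distinct values in the field."""
    return len(terms_counts(docs, field))


print(terms_counts(videos, "tags"))
print(cardinality(videos, "uploaded_by"))
```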
Pivot Facets (Solr) / Nesting Aggs (Elasticsearch)
facet=true
facet.pivot=tags,views
"aggregations" : {
"tags" : {
"terms" : { "field" : "tags" },
"aggregations": {
"dates": {
"date_histogram": {
"field": "upload_date",
"interval": "month",
"format" : "yyyy-MM"
}
}
}
}
}
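The nested aggregation above buckets by tag first, then splits each tag bucket by upload month. A conceptual sketch of that two-level bucketing, using made-up documents and a hypothetical `tags_by_month` helper:

```python
from collections import defaultdict

# Illustrative documents only; tags and dates are made up for this sketch.
videos = [
    {"tags": ["elasticsearch"], "upload_date": "2013-06-19"},
    {"tags": ["elasticsearch", "solr"], "upload_date": "2013-06-25"},
    {"tags": ["solr"], "upload_date": "2014-05-27"},
]


def tags_by_month(docs):
    """Bucket by tag, then sub-bucket each tag's documents by yyyy-MM."""
    buckets = defaultdict(lambda: defaultdict(int))
    for doc in docs:
        month = doc["upload_date"][:7]  # "2013-06-19" -> "2013-06"
        for tag in set(doc["tags"]):
            buckets[tag][month] += 1
    return {tag: dict(months) for tag, months in buckets.items()}


print(tags_by_month(videos))
```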
Demo time: Graph all the things!
Stats (Solr) / Stats APIs (Elasticsearch)
JMX / Solr admin / clusterstate
GET -> /_stats
"index_total" : 15118403,
"index_time" : "4.2h",
...
"query_total" : 41092,
"query_time" : "57.2m",
GET -> /_cluster/stats
"heap_used_in_bytes" : 83960392,
...
Backup
PUT -> /_snapshot/bbuzz
{
"type": "fs",
"settings": {
"location": "/mnt/bbuzz_backup"
}
}
PUT -> /_snapshot/bbuzz/1
{
"indices": "bbuzz"
}
POST -> /_snapshot/bbuzz/1/_restore
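The backup flow above is three requests in order: register a repository, take a snapshot, restore it. A sketch that assembles them as (method, path, body) triples without sending anything; `snapshot_requests` is a hypothetical helper.

```python
import json


def snapshot_requests(repo, location, snapshot, indices):
    """Return the (method, path, body) triples for register, snapshot and restore."""
    base = "/_snapshot/" + repo
    return [
        # 1. Register a filesystem repository at the given location.
        ("PUT", base, json.dumps({"type": "fs", "settings": {"location": location}})),
        # 2. Snapshot the listed indices into that repository.
        ("PUT", base + "/" + snapshot, json.dumps({"indices": indices})),
        # 3. Restore the snapshot (no body needed).
        ("POST", base + "/" + snapshot + "/_restore", ""),
    ]


for method, path, body in snapshot_requests("bbuzz", "/mnt/bbuzz_backup", "1", "bbuzz"):
    print(method, path, body)
```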
Demo time: Scaling out
Apache Software Foundation
Contributors
Code
Mailing list
Elasticsearch
Contributors
Code
Mailing list
New juicy things to come
facet by function
https://issues.apache.org/jira/browse/SOLR-1581
analytics component
https://issues.apache.org/jira/browse/SOLR-5302
Solr as standalone application
5.0 - no general issue yet
top_hits aggregation
https://github.com/elasticsearch/elasticsearch/pull/6124
minimum_should_match on has_child
https://github.com/elasticsearch/elasticsearch/issues/6019
filters aggregation
https://github.com/elasticsearch/elasticsearch/issues/6118
most projects work well with either
many small differences, few show-stoppers
choose the best for your use case
Want to work with both?
We’re hiring!
Worldwide
Thank you!
Radu Gheorghe
@radu0gheorghe
Rafał Kuć
@kucrafal
Examples available at:
https://github.com/sematext/berlin-buzzwords-samples/
@sematext
