ElasticSearch
- 2. Links
• https://bitbucket.org/lsdr/es/overview
• https://confluence.abril.com.br/x/J5I_AQ
Friday, March 8, 13
- 3. Installation
• Mac OS X:
  • brew install elasticsearch
• CentOS 6.x:
  • no official RPM :-(
  • https://gist.github.com/lsdr/5117589
- 4. Setup
cluster.name: buffalo
node.name: "Bruce Smith"
path.data: /usr/local/var/elasticsearch/
path.logs: /usr/local/var/log/elasticsearch/
path.plugins: /usr/local/var/lib/elasticsearch/plugins
network.host: 127.0.0.1
This is enough to bring up a local server!
- 5. Setup++
• Configuring a node:
              | Master: TRUE | Master: FALSE
  Data: TRUE  | Development  | Workhorse
  Data: FALSE | Coordinator  | Load Balancer
http://elasticsearch.org/guide/reference/modules/node.html
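
The master/data combinations can be sketched as a small lookup. This is only an illustration: the role nicknames are the ones used on the slide, the function names are hypothetical, and the settings are assumed to be the `node.master` and `node.data` keys of elasticsearch.yml.

```python
# Sketch of the node-role table. Role names come from the slide;
# node_role and node_settings are illustrative helper names.
ROLES = {
    # (master, data): nickname
    (True, True): "Development",
    (False, True): "Workhorse",
    (True, False): "Coordinator",
    (False, False): "Load Balancer",
}


def node_role(master: bool, data: bool) -> str:
    """Return the slide's nickname for a master/data combination."""
    return ROLES[(master, data)]


def node_settings(master: bool, data: bool) -> str:
    """Render the two elasticsearch.yml lines for the combination."""
    return (
        f"node.master: {str(master).lower()}\n"
        f"node.data: {str(data).lower()}"
    )


if __name__ == "__main__":
    print(node_role(master=False, data=True))
    print(node_settings(master=False, data=True))
```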
- 6. Setup++
• Number of shards and number of replicas
  • replicas can be increased at runtime, shards cannot
• “Mandatory” plugins
  • the node only starts if they are present
• JVM tuning
• Other modules: Thrift, JMX
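
As an illustration of changing replicas at runtime, the sketch below builds (without sending) an update-settings request. The index name `world` and the helper name `replicas_request` are assumptions made for the example; the request targets the `_settings` endpoint of the index.

```python
import json
import urllib.request


def replicas_request(index: str, replicas: int,
                     base_url: str = "http://localhost:9200") -> urllib.request.Request:
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical usage: raise the replica count of the "world" index to 2.
req = replicas_request("world", 2)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

The number of shards has no equivalent runtime setting, which is why it has to be chosen when the index is created.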
- 7. “Hello World!”
$ curl -XGET 'localhost:9200/world'
No handler found for uri [/world] and method [GET]
- 8. “Hello World!”
$ curl -XPOST "localhost:9200/world/hello" -d '{
"text": "hello world",
"lang": "en"
}'
- 9. “Hello World!”
{
"ok": true,
"_index": "world",
"_type": "hello",
"_id": "A5HX8IhTR0CzMNWHBPhEqA",
"_version": 1
}
POST: auto-generated id | PUT: “manual” id
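
The POST/PUT distinction can be sketched as follows. The helper name `index_request` and the id `1` are illustrative assumptions; the request is only built, not sent.

```python
import json
import urllib.request


def index_request(index, doc_type, doc, doc_id=None,
                  base_url="http://localhost:9200"):
    """Build an index request: POST when ES should pick the id, PUT when we do."""
    url = f"{base_url}/{index}/{doc_type}"
    method = "POST"
    if doc_id is not None:
        # An explicit id becomes part of the URL and the verb changes to PUT.
        url = f"{url}/{doc_id}"
        method = "PUT"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )


doc = {"text": "hello world", "lang": "en"}
auto = index_request("world", "hello", doc)               # id chosen by ES
manual = index_request("world", "hello", doc, doc_id="1")  # id chosen by the caller
print(auto.get_method(), auto.full_url)
print(manual.get_method(), manual.full_url)
```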
- 10. “Hello World!”
$ curl -XGET 'localhost:9200/world/_count'
{
"count":1,
"_shards":
{
"total": 3,
"successful": 3,
"failed":0
}
}
- 11. “Hello World!”
$ curl -XGET 'localhost:9200/world/_search'
"hits": [
{
"_index": "world",
"_type": "hello",
"_id": "A5HX8IhTR0CzMNWHBPhEqA",
"_score": 1.0,
"_source": {"text": "hello world", "lang": "en"}
}
]
- 12. “Hello World!”
$ curl -XGET 'localhost:9200/world/hello/_mapping'
{
"hello": {
"properties": {
"lang": {
"type": "string"
},
"text": {
"type": "string"
}
}
}
}
- 13. Mapping
Mapping is the process of defining how a document
should be mapped to the Search Engine, including
its searchable characteristics such as which fields
are searchable and if/how they are tokenized.
http://elasticsearch.org/guide/reference/mapping/
- 14. Querying
• URI Request
  • Does not expose every ES feature
  • /guide/reference/api/search/uri-request.html
• Query DSL
  • POST-based (no cache!)
  • /guide/reference/query-dsl/
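
To make the two styles concrete, here is a minimal sketch that builds the same search both ways: as a `q` parameter on the URL and as a Query DSL body using a `query_string` query. The index name `world`, the query text and the helper names are assumptions for the example.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:9200"  # assumed local node


def uri_search(index: str, query: str) -> str:
    """URI Request: the whole query travels in the 'q' parameter."""
    return f"{BASE_URL}/{index}/_search?{urlencode({'q': query})}"


def dsl_search_body(query: str) -> str:
    """Query DSL: the query travels as a JSON body, here a query_string query."""
    return json.dumps({"query": {"query_string": {"query": query}}})


print(uri_search("world", "text:hello"))
print(dsl_search_body("text:hello"))
```

The DSL body can grow to hold filters, facets and nested queries, which is what the URI form cannot express.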
- 15. Querying
• Play around with running queries against articles!
• Simple queries work, but...
• facets break?
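
For reference, a terms facet request body in the facets API of ES versions from this period might be built as below. The field name `lang` is borrowed from the Hello World document and the helper name is hypothetical. One plausible reason facets seem to break is that string fields are analyzed by default, so a terms facet counts individual tokens rather than whole values.

```python
import json


def terms_facet_body(field: str, size: int = 10) -> dict:
    """Build a search body that asks for a terms facet over one field."""
    return {
        "query": {"match_all": {}},
        "facets": {
            # The facet is named after the field it counts.
            field: {"terms": {"field": field, "size": size}},
        },
    }


print(json.dumps(terms_facet_body("lang"), indent=2))
```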
- 16. Mapping
By default, there isn’t a need to define an explicit
mapping, (...) Only when the defaults need to be
overridden must a mapping definition be provided.
http://elasticsearch.org/guide/reference/mapping/
- 17. Mapping
• Overriding is not trivial
  • Possibly involves reindexing
• This is the team's job
  • “data masseurs”
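
As a sketch of what an override could look like, the code below builds a put-mapping request that marks a string field as `not_analyzed`, using the mapping syntax of ES versions from this period. The index, type and field names are taken from the Hello World example; the helper name is hypothetical. Changing this setting on a field that already holds data is typically rejected as a conflict, which is where the reindexing comes in.

```python
import json
import urllib.request


def not_analyzed_mapping(doc_type: str, field: str) -> dict:
    """Mapping body that keeps a string field as a single, untokenized term."""
    return {
        doc_type: {
            "properties": {
                field: {"type": "string", "index": "not_analyzed"},
            }
        }
    }


def put_mapping_request(index: str, doc_type: str, mapping: dict,
                        base_url: str = "http://localhost:9200") -> urllib.request.Request:
    """Build (but do not send) PUT /<index>/<type>/_mapping."""
    return urllib.request.Request(
        f"{base_url}/{index}/{doc_type}/_mapping",
        data=json.dumps(mapping).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


mapping = not_analyzed_mapping("hello", "lang")
mapping_req = put_mapping_request("world", "hello", mapping)
print(mapping_req.get_method(), mapping_req.full_url)
print(json.dumps(mapping, indent=2))
```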
- 18. Analyzer
curl -XGET 'localhost:9200/_analyze?analyzer=standard' -d 'Esporte::Futebol'
curl -XGET 'localhost:9200/_analyze?analyzer=keyword' -d 'Esporte::Futebol'
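
The difference between the two analyzers can be approximated offline. The functions below are a rough stand-in, not the Lucene implementation: the "standard" approximation simply splits on non-word characters and lowercases, while the "keyword" approximation returns the input as one token.

```python
import re


def approx_standard(text: str) -> list:
    """Rough approximation of the standard analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def approx_keyword(text: str) -> list:
    """The keyword analyzer keeps the whole input as a single token."""
    return [text]


value = "Esporte::Futebol"
print(approx_standard(value))
print(approx_keyword(value))
```

With the split form, a category such as `Esporte::Futebol` is no longer searchable or countable as one value, which matters for exact-match filters and facets.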
- 20. River
• Creates indices/mappings if they do not exist
  • keep the limitations in mind
• Pulling
• elasticsearch-river-mongodb
  • Blows up if MongoDB is down
  • Slow to catch up (gets lost?) when it comes back
- 21. River
• Install the River (plugin)
• Create the River
• mongorestore
- 22. River
$ES_HOME/bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/1.6.0
$ES_HOME/bin/plugin \
  -url https://github.com/downloads/richardwilly98/elasticsearch-river-mongodb/elasticsearch-river-mongodb-1.6.1.zip \
  -install river-mongodb
- 23. River
curl -XPUT "localhost:9200/_river/v/_meta" -d '{
"type": "mongodb",
"mongodb": {
"db": "alx_midia",
"collection": "videos",
"servers": [
{ "host": "localhost", "port": "27017" }
]
},
"index": {
"name": "videos",
"type": "documents"
}
}'
source (mongodb) - destination (index)
- 24. River
curl -XGET "localhost:9200/_river/v/_meta"
{
"_index": "_river",
"_id": "_meta",
"exists": true,
"_source": {
"type": "mongodb",
"mongodb": {
"db": "alx_midia",
"collection": "videos",
"servers": [
{ "host": "localhost", "port": "27017" }
]
},
"index": {
"name": "videos",
"type": "documents"
}
}
}
- 25. River
$ mongorestore --host localhost --port 27017 --noIndexRestore alx_midia
- 26. River
[videos] creating index, cause [api], shards [3]/[1], mappings []
[_river] update_mapping [v] (dynamic)
[mongodb][v] No known previous slurping time for this collection
[_river] update_mapping [v] (dynamic)
[videos] update_mapping [documents] (dynamic)
Indexed 100 insertions 0, updates, 0 deletions, 100 documents per second
Indexed 100 insertions 0, updates, 0 deletions, 100 documents per second
Indexed 81 insertions 0, updates, 0 deletions, 81 documents per second
Indexed 100 insertions 0, updates, 0 deletions, 100 documents per second
Indexed 15 insertions 0, updates, 0 deletions, 15 documents per second
- 27. River
• In normal operation there will be no “restore” after a failure
• A “recrawling” solution needs to be thought through
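
One possible shape for a recrawl, sketched under assumptions: documents are read again from the source by some other means (not shown) and turned into the newline-delimited body of a `_bulk` request. The helper names, the `_id` field convention and the sample documents are hypothetical; the index and type names follow the river example.

```python
import json


def bulk_index_lines(docs, index, doc_type, id_field="_id"):
    """Yield the lines of a _bulk body that re-indexes every document."""
    for doc in docs:
        source = dict(doc)                   # copy so the caller's doc is untouched
        doc_id = str(source.pop(id_field))   # the id goes into the action line
        yield json.dumps(
            {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        )
        yield json.dumps(source)


def bulk_body(docs, index, doc_type):
    """Join the lines; the bulk API expects a trailing newline."""
    return "\n".join(bulk_index_lines(docs, index, doc_type)) + "\n"


# Hypothetical documents standing in for a fresh read of the MongoDB collection.
sample = [{"_id": 1, "title": "a"}, {"_id": 2, "title": "b"}]
print(bulk_body(sample, "videos", "documents"), end="")
```

Because each action carries an explicit id, running the recrawl twice overwrites documents instead of duplicating them.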