ElasticSearch Learning Notes
2018.4.17
#2 Slide/Conf
link
Easy to Scale
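RESTful API: can be used from any programming language
Per-operation Persistence: low risk of data loss
Excellent Query DSL: requests are expressed in JSON, and the DSL is well designed
Multi-tenancy (not yet understood)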
Support for advanced search features (Full Text)
..
Basic concepts
Cluster
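Node: ideally one node per server, though this no longer seems to be the case
Index: similar to a database
Type: similar to a table
Document: a JSON document, similar to a row in a table. Every Document is a JSON object.
Each Document is stored in an index and has a type, an id, and key-value fields.
Field: similar to a column
Mapping: similar to a schema definition
Shard (not yet understood)
Primary Shard (not yet understood)
Replica Shard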
ElasticSearch Routing
Without a routing value, Elasticsearch has no idea where to look for your documents: they
are spread across the shards of your cluster. So Elasticsearch has no choice but to broadcast
the request to all shards.
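A minimal sketch of why a routing value helps. In its simplest documented form, the target shard is `hash(routing) % number_of_primary_shards`. The CRC32 hash below is only a stand-in (Elasticsearch uses its own hash function internally), and the function names are made up for illustration.

```python
import zlib


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value to a shard number.

    CRC32 is a stand-in hash for illustration only; it is NOT the hash
    function Elasticsearch uses internally.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


def shards_to_query(num_primary_shards: int, routing_value=None) -> list:
    """Return the shard numbers a search has to visit."""
    if routing_value is None:
        # No routing value: every shard may hold matching documents,
        # so the request is broadcast to all of them.
        return list(range(num_primary_shards))
    # With a routing value, only one shard can hold the documents.
    return [pick_shard(routing_value, num_primary_shards)]


if __name__ == "__main__":
    print(shards_to_query(5))             # all five shards
    print(shards_to_query(5, "user-42"))  # a single shard
```

The same value has to be supplied both when indexing and when searching, otherwise the search would look in the wrong shard.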
Searching
#4 Searching
http://localhost:9200/test-data/_search
Search explicitly for documents of type cities within the test-data index.
http://localhost:9200/test-data/cities/_search
Search explicitly for documents of type cities within the test-data index using paging
http://localhost:9200/test-data/cities/_search?size=5&from=10
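The three URL shapes above can be built with a small helper. This is only a sketch: `search_url` is a hypothetical helper name, and the base address assumes a local node on port 9200 as in the notes.

```python
from urllib.parse import urlencode


def search_url(index, doc_type=None, size=None, from_=None,
               base="http://localhost:9200"):
    """Build a _search URL for an index, optionally a type, with paging."""
    parts = [base, index]
    if doc_type:
        parts.append(doc_type)
    parts.append("_search")
    url = "/".join(parts)

    # size = number of hits per page, from = offset of the first hit.
    params = {}
    if size is not None:
        params["size"] = size
    if from_ is not None:
        params["from"] = from_
    return url + ("?" + urlencode(params) if params else "")


if __name__ == "__main__":
    print(search_url("test-data"))
    print(search_url("test-data", "cities"))
    print(search_url("test-data", "cities", size=5, from_=10))
```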
Ranking
link
link
#2 Blog
link
Asynchronous reindex approach
link
#2 Video
link
#2 Wiki
#3 Install
link
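The cluster and node names can be set at startup:
./elasticsearch -Ecluster.name=my_cluster_name -Enode.name=my_node_name
An index does not have to be created manually before indexing documents; Elasticsearch creates it automatically.
Create a document with an explicit document id
Indexing again under the same id keeps the id, increments the version by 1, and changes the content (note that this is a replace, not an update).
Using a new id creates a new document under that id.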
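The replace-and-bump-version behaviour can be pictured with a toy in-memory model. This is not Elasticsearch code and makes no claim about its internals; `ToyIndex` is a made-up class that only mirrors the behaviour described in these notes.

```python
class ToyIndex:
    """Toy model of indexing by explicit id (illustration only)."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, source):
        """Store `source` under `doc_id`, replacing any previous content."""
        previous = self._docs.get(doc_id)
        version = 1 if previous is None else previous["_version"] + 1
        # The whole source is replaced; old fields are NOT merged in.
        self._docs[doc_id] = {
            "_id": doc_id,
            "_version": version,
            "_source": dict(source),
        }
        return self._docs[doc_id]


if __name__ == "__main__":
    idx = ToyIndex()
    print(idx.index("1", {"name": "John Doe", "age": 30}))
    print(idx.index("1", {"name": "Jane Doe"}))  # same id, version 2, age is gone
    print(idx.index("2", {"name": "Jane Doe"}))  # new id, version 1
```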
Updating Documents
Updating a document is essentially deleting the old document and indexing a new one.
curl -X POST "http://localhost:9200/customer/_doc/1/_update?pretty" -H 'Content-Type: application/json' -d'
{
"doc": { "name": "Jane Doe" }
}
'
Updates can also be performed with a script
Deleting Documents
Batch Processing
A batch executes many operations in a single request, so the API does not have to be called multiple times.
Update the first document and delete the second document:
curl -X POST "http://localhost:9200/customer/_doc/_bulk?pretty" -H 'Content-Type: application/json' -d'
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'
If some actions fail, the remaining actions are still executed; the response reports the status of each action.
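A sketch of how the bulk body is put together and how per-action results could be checked. The body is newline-delimited JSON: one action line, optionally followed by a payload line, with a trailing newline. `bulk_body` and `failed_items` are hypothetical helper names, and the response handling assumes the usual `items` list in which each entry is keyed by its action name and carries a `status` code.

```python
import json


def bulk_body(actions):
    """Serialize (action, payload) pairs to newline-delimited JSON.

    `payload` is None for actions such as delete that have no second line.
    """
    lines = []
    for action, payload in actions:
        lines.append(json.dumps(action))
        if payload is not None:
            lines.append(json.dumps(payload))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def failed_items(response):
    """Return (position, action name, status) for every failed action."""
    failures = []
    for position, item in enumerate(response.get("items", [])):
        for action_name, result in item.items():
            if result.get("status", 200) >= 400:
                failures.append((position, action_name, result["status"]))
    return failures


if __name__ == "__main__":
    body = bulk_body([
        ({"update": {"_id": "1"}}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
        ({"delete": {"_id": "2"}}, None),
    ])
    print(body, end="")
```

Because one failed action does not abort the others, checking every entry of the response is what tells you which actions need a retry.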
Next, try importing a large batch of data: download-link
Searching
The endpoint is _search
Return all documents in the bank index
{
"took" : 63,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1000,
"max_score" : null,
"hits" : [ {
"_index" : "bank",
"_type" : "_doc",
"_id" : "0",
"sort": [0],
"_score" : null,
"_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"[email protected]","city":"Hobucken","state":"CO"}
}, {
"_index" : "bank",
"_type" : "_doc",
"_id" : "1",
"sort": [1],
"_score" : null,
"_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"[email protected]","city":"Brogan","state":"IL"}
}, ...
]
}
}
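Unlike a typical SQL database, Elasticsearch is completely done with a request once it has returned the results; it keeps no server-side state such as a cursor for paging through them.
Return the top 10
Return only certain fields (account_number and balance)
curl -X GET "http://localhost:9200/bank/_search" -H 'Content-Type: application/json' -d '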
{
"query": { "match_all": {} },
"_source": ["account_number", "balance"]
}
'
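match query: find the record whose account_number is 20
match query: find all records whose address contains mill or lane
Many search conditions can be combined this way
A bool can be nested inside another bool, and must, must_not, and should can all be used side by side
Find records for 40-year-olds whose state is not ID (Idaho)
Search plus a plain numeric filter: find records with a balance between 20000 and 30000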
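The queries listed above can be written as request bodies for `bank/_search`. This is a sketch built as plain Python dicts and printed as JSON; the variable names are made up, and the field names (`account_number`, `address`, `age`, `state`, `balance`) are taken from the sample bank documents shown earlier.

```python
import json

# match: the record whose account_number is 20
match_account_20 = {"query": {"match": {"account_number": 20}}}

# match: address contains "mill" OR "lane" (terms in one match are OR-ed by default)
match_mill_or_lane = {"query": {"match": {"address": "mill lane"}}}

# bool: age is 40 AND state is not ID
forty_not_in_id = {
    "query": {
        "bool": {
            "must": [{"match": {"age": "40"}}],
            "must_not": [{"match": {"state": "ID"}}],
        }
    }
}

# bool + range filter: balance between 20000 and 30000 (inclusive)
balance_20k_to_30k = {
    "query": {
        "bool": {
            "must": {"match_all": {}},
            "filter": {"range": {"balance": {"gte": 20000, "lte": 30000}}},
        }
    }
}

if __name__ == "__main__":
    for body in (match_account_20, match_mill_or_lane,
                 forty_not_in_id, balance_20k_to_30k):
        print(json.dumps(body, indent=2))
```

Each printed body can be passed to curl with `-d`, in the same way as the `_source` example earlier in these notes.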
Search + Aggregations
Group all the accounts by state, count the records in each state, and return the top 10 states sorted by count descending
Count the records in each state and compute their average balance, then return the top 10
Building on the above, sort by the average balance
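A sketch of the three aggregation bodies described above. A terms aggregation returns the top 10 buckets ordered by document count by default, so no explicit size is needed. `state_aggregation` is a hypothetical helper, and `state.keyword` assumes the default dynamic mapping, which adds a keyword sub-field to text fields.

```python
import json


def state_aggregation(with_average=False, sort_by_average=False):
    """Build a group-by-state request body for bank/_search."""
    terms = {"field": "state.keyword"}  # assumed keyword sub-field
    if sort_by_average:
        # Order buckets by the sub-aggregation instead of by count.
        terms["order"] = {"average_balance": "desc"}

    group = {"terms": terms}
    if with_average or sort_by_average:
        group["aggs"] = {"average_balance": {"avg": {"field": "balance"}}}

    # size 0: return only the aggregation results, no search hits.
    return {"size": 0, "aggs": {"group_by_state": group}}


if __name__ == "__main__":
    print(json.dumps(state_aggregation(), indent=2))
    print(json.dumps(state_aggregation(with_average=True), indent=2))
    print(json.dumps(state_aggregation(sort_by_average=True), indent=2))
```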
This example demonstrates how we can group by age brackets (ages 20-29, 30-39, and
40-49), then by gender, and then finally get the average account balance, per age bracket,
per gender:
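The request for this example did not survive in these notes. A sketch of what such a body could look like follows: a range aggregation on age, a terms sub-aggregation on gender, and an avg sub-aggregation on balance. The aggregation names and the `gender.keyword` field are assumptions. The upper bound of each range is exclusive, so 20 to 30 covers ages 20-29.

```python
import json

age_gender_balance = {
    "size": 0,  # aggregation results only, no search hits
    "aggs": {
        "group_by_age": {
            "range": {
                "field": "age",
                "ranges": [
                    {"from": 20, "to": 30},  # ages 20-29
                    {"from": 30, "to": 40},  # ages 30-39
                    {"from": 40, "to": 50},  # ages 40-49
                ],
            },
            "aggs": {
                "group_by_gender": {
                    "terms": {"field": "gender.keyword"},  # assumed keyword sub-field
                    "aggs": {
                        "average_balance": {"avg": {"field": "balance"}}
                    },
                }
            },
        }
    },
}

if __name__ == "__main__":
    print(json.dumps(age_gender_balance, indent=2))
```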
link
link
#2 Video
link
6.2 - Upgrade Elasticsearch
Full upgrade details can be found here: https://www.elastic.co/products/upgrade_guide
#2 Book
#2 Repo
#3 Searchkick
link
#1 Other
Count the lines of code in a project
link
cloc .
#2 Topic Study
#3 Index
ElasticSearch uses the Apache Lucene library to write data to and read data from the index.
A single Elasticsearch index may be built from more than one Apache Lucene index, by
using shards and replicas.