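Preface
When a new document is added, it is stored in a single primary shard. How does elasticsearch know which shard a document belongs to? When we create a document, how does it decide whether to store it in shard 1 or shard 2?
Routing algorithm
- It certainly cannot be random, otherwise we would not know where to look when we later need to fetch the document. In fact, the shard is determined by the following formula: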
shard = hash(routing) % number_of_primary_shards
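- routing is a variable value. It defaults to the document's _id, but it can also be set to a custom value. The routing value is passed through a hash function to produce a number, and that number is divided by number_of_primary_shards (the number of primary shards) to get a remainder. The remainder, which always falls between 0 and number_of_primary_shards-1, is the number of the shard that holds the document.
- This explains why the number of primary shards has to be fixed when the index is created and cannot be changed on that index afterwards: if the number changed, every previously computed routing result would become invalid and the documents could no longer be found.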
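The formula above can be tried out with a small shell sketch. It is only an illustration: cksum (CRC-32) stands in for the hash function, which is an assumption made for this example and not the hash Elasticsearch uses internally, so the shard numbers it prints will not match a real cluster. The function name shard_for is also made up here.

```shell
# Illustrative only: cksum (CRC-32) is a stand-in for the real hash function.
# shard_for is a hypothetical helper, not part of Elasticsearch.
shard_for() {
  local routing="$1" num_primary_shards="$2"
  local hash
  # Hash the routing value into an unsigned integer.
  hash=$(printf '%s' "$routing" | cksum | cut -d' ' -f1)
  # The remainder is the shard number, in 0 .. num_primary_shards-1.
  echo $(( hash % num_primary_shards ))
}

# The same routing values generally map to different shards once the shard count changes.
for id in 1 2 3 4 5; do
  echo "routing=$id  3 shards -> $(shard_for "$id" 3)  4 shards -> $(shard_for "$id" 4)"
done
```

Comparing the two columns shows why the shard count has to stay fixed: a document written when the index had 3 shards would be looked up on a different shard if the divisor became 4.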
Add a document (with a specified id)
curl -X PUT "http://172.25.45.150:9200/nba/_doc/1" -H 'Content-Type:application/json' -d '
{
  "name":"张三",
  "team_name":"火箭",
  "position":"前锋",
  "play_year":"10",
  "jerso_no":"13"
}
'
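The request above lets routing default to the document id. As a rough sketch of the custom-value case, the index API accepts a routing query parameter. The document id 2, the routing value rockets, the request body, and the helper names doc_url and index_with_routing are all made up for this example.

```shell
# Assumption: the same cluster address as in the commands of this article.
ES_URL="http://172.25.45.150:9200"

# Build the document URL with an explicit routing value (hypothetical helper).
doc_url() {
  echo "$ES_URL/$1/_doc/$2?routing=$3"
}

# Index a document under a custom routing value (hypothetical helper).
index_with_routing() {
  local index="$1" id="$2" routing="$3" body="$4"
  curl -s -X PUT "$(doc_url "$index" "$id" "$routing")" \
    -H 'Content-Type: application/json' -d "$body"
}

# Example calls (they need a reachable cluster, so they are left commented out):
# index_with_routing nba 2 rockets '{"name":"example","team_name":"火箭"}'
# curl -X GET "$(doc_url nba 2 rockets)"
```

A document indexed with a custom routing value has to be fetched with the same value, because the shard is now computed from that value and no longer from the _id.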
Check which shard the document is on
curl -X GET "http://172.25.45.150:9200/nba/_search_shards?routing=1"
Response
{
  "nodes": {
    "cFpSqC2OSJWSYZZiWNcSTQ": {
      "name": "node-3",
      "ephemeral_id": "4ahSeBu_T_yARMflb5Zqmw",
      "transport_address": "172.17.0.4:9500",
      "attributes": {
        "ml.machine_memory": "1820590080",
        "ml.max_open_jobs": "20",
        "xpack.installed": "true"
      }
    },
    "9l3jVOkhRuir-u2WcDVBLA": {
      "name": "node-2",
      "ephemeral_id": "oGqWT2rJT42-qdUZB_qREw",
      "transport_address": "172.17.0.3:9400",
      "attributes": {
        "ml.machine_memory": "1820590080",
        "ml.max_open_jobs": "20",
        "xpack.installed": "true"
      }
    },
    "3_sSMGD9RO-niM98FIvs6Q": {
      "name": "node-1",
      "ephemeral_id": "Ptu0VU1KQfujYtKM5D8maQ",
      "transport_address": "172.17.0.2:9300",
      "attributes": {
        "ml.machine_memory": "1820590080",
        "ml.max_open_jobs": "20",
        "xpack.installed": "true"
      }
    }
  },
  "indices": {
    "nba": {}
  },
  "shards": [
    [
      {
        "state": "STARTED",
        "primary": false,
        "node": "cFpSqC2OSJWSYZZiWNcSTQ",
        "relocating_node": null,
        "shard": 2,
        "index": "nba",
        "allocation_id": {
          "id": "PG_q691kQ5ack_8zIG0fGQ"
        }
      },
      {
        "state": "STARTED",
        "primary": true,
        "node": "9l3jVOkhRuir-u2WcDVBLA",
        "relocating_node": null,
        "shard": 2,
        "index": "nba",
        "allocation_id": {
          "id": "jnvDSezxR4WKsz8-5AyfVw"
        }
      },
      {
        "state": "STARTED",
        "primary": false,
        "node": "3_sSMGD9RO-niM98FIvs6Q",
        "relocating_node": null,
        "shard": 2,
        "index": "nba",
        "allocation_id": {
          "id": "Ua-yypDnROO_Hf3Jnuq08Q"
        }
      }
    ]
  ]
}
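The nodes section of the response lists the three nodes that hold a copy of the shard our document was routed to, and the shards section shows that the document landed on shard 2. Of the three copies, the one with "primary": true has "node": "9l3jVOkhRuir-u2WcDVBLA". Looking that id up under nodes gives node-2, so the primary shard lives on node-2, while node-1 and node-3 hold its replicas.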
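The lookup just described (find the copy marked primary, then resolve its node id to a node name) can also be done mechanically. The following is a minimal sketch that assumes the jq command-line tool is installed. The response variable is a trimmed copy of the response shown earlier, and primary_location is a name made up for this example.

```shell
# Trimmed copy of the response shown earlier: only the fields the filter reads.
response='{
  "nodes": {
    "cFpSqC2OSJWSYZZiWNcSTQ": {"name": "node-3"},
    "9l3jVOkhRuir-u2WcDVBLA": {"name": "node-2"},
    "3_sSMGD9RO-niM98FIvs6Q": {"name": "node-1"}
  },
  "shards": [[
    {"primary": false, "node": "cFpSqC2OSJWSYZZiWNcSTQ", "shard": 2},
    {"primary": true,  "node": "9l3jVOkhRuir-u2WcDVBLA", "shard": 2},
    {"primary": false, "node": "3_sSMGD9RO-niM98FIvs6Q", "shard": 2}
  ]]
}'

# Print "<shard number> <node name>" for every primary copy read from stdin.
# Hypothetical helper; requires jq.
primary_location() {
  jq -r '.nodes as $n | .shards[][] | select(.primary) | "\(.shard) \($n[.node].name)"'
}

printf '%s' "$response" | primary_location
```

Against a live cluster, the output of the _search_shards request can be piped into primary_location in place of the sample variable.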