GitHub: https://github.com/medcl/elasticsearch-analysis-ik
Installation
1. Check the Elasticsearch version number: http://localhost:9200/
Find the matching plugin release: https://github.com/medcl/elasticsearch-analysis-ik/releases
2. Install the plugin
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip
3. Restart Elasticsearch
4. Test the tokenizer
curl -X PUT 'localhost:9200/website'
curl -XGET "http://localhost:9200/website/_analyze" -H 'Content-Type: application/json' -d'
{
"text":"中华人民共和国国歌","tokenizer": "ik_max_word"
}'
Response:
{
"tokens": [
{
"token": "中华人民共和国",
"start_offset": 0,
"end_offset": 7,
"type": "CN_WORD",
"position": 0
},
{
"token": "中华人民",
"start_offset": 0,
"end_offset": 4,
"type": "CN_WORD",
"position": 1
},
{
"token": "中华",
"start_offset": 0,
"end_offset": 2,
"type": "CN_WORD",
"position": 2
},
{
"token": "华人",
"start_offset": 1,
"end_offset": 3,
"type": "CN_WORD",
"position": 3
},
{
"token": "人民共和国",
"start_offset": 2,
"end_offset": 7,
"type": "CN_WORD",
"position": 4
},
{
"token": "人民",
"start_offset": 2,
"end_offset": 4,
"type": "CN_WORD",
"position": 5
},
{
"token": "共和国",
"start_offset": 4,
"end_offset": 7,
"type": "CN_WORD",
"position": 6
},
{
"token": "共和",
"start_offset": 4,
"end_offset": 6,
"type": "CN_WORD",
"position": 7
},
{
"token": "国",
"start_offset": 6,
"end_offset": 7,
"type": "CN_CHAR",
"position": 8
},
{
"token": "国歌",
"start_offset": 7,
"end_offset": 9,
"type": "CN_WORD",
"position": 9
}
]
}
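Steps 1 and 2 above can be tied together with two small shell helpers. This is a minimal sketch, not part of Elasticsearch or the IK plugin: the function names are made up for illustration, and the URL pattern is inferred from the single v6.3.0 install command above, so it is an assumption that other releases follow the same naming.

```shell
# Sketch only: helper names below are hypothetical, not part of Elasticsearch or IK.

# Read the JSON returned by the Elasticsearch root endpoint on stdin
# and print the value of version.number (for example 6.3.0).
parse_es_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Build the release download URL for a given version, following the
# URL pattern of the install command in step 2 (assumed to generalize).
ik_plugin_url() {
  printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' "$1" "$1"
}

# Usage (requires a running node; run from the Elasticsearch directory):
#   version=$(curl -s http://localhost:9200/ | parse_es_version)
#   ./bin/elasticsearch-plugin install "$(ik_plugin_url "$version")"
```

Deriving the URL from the reported version keeps the plugin version and the Elasticsearch version in step, which is what step 1 is checking for.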
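If the installation fails, the plugin can be installed manually instead:
Extract the plugin package, copy its contents to plugins/ik under the Elasticsearch directory, and restart the service.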
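The manual copy step might look like the following. It is only a sketch: copy_ik_plugin is a hypothetical helper, both paths are placeholders, and it assumes the package has already been extracted into a directory of its own.

```shell
# Sketch only: copy_ik_plugin is a hypothetical helper.
# $1: directory holding the extracted plugin files
# $2: Elasticsearch installation directory
copy_ik_plugin() {
  src_dir=$1
  es_home=$2
  # Create plugins/ik if needed, then copy the directory contents into it.
  mkdir -p "$es_home/plugins/ik" &&
    cp -R "$src_dir"/. "$es_home/plugins/ik/"
}

# Usage (paths are placeholders):
#   copy_ik_plugin /path/to/extracted-ik /path/to/elasticsearch
#   then restart Elasticsearch
```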
ik_max_word: splits the text at the finest granularity.
ik_smart: splits the text at the coarsest granularity.
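To see the difference between the two tokenizers, the same text can be sent to _analyze once per tokenizer and only the tokens printed. The helpers below are a rough sketch with hypothetical names. They reuse the host and index from the test above, and the token extraction relies on plain grep and sed rather than a real JSON parser, so it assumes tokens contain no double quotes.

```shell
# Sketch only: function names are hypothetical; host and index come from the test above.

# Print the request body for _analyze.
# $1: tokenizer name, $2: text to analyze.
# Assumption: the text contains no double quotes or backslashes,
# since no JSON escaping is done here.
analyze_body() {
  printf '{"text":"%s","tokenizer":"%s"}' "$2" "$1"
}

# Read an _analyze response on stdin and print one token per line.
extract_tokens() {
  grep -o '"token"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/^"token"[[:space:]]*:[[:space:]]*"\(.*\)"$/\1/'
}

# Usage (requires a running node with the plugin installed):
#   for t in ik_max_word ik_smart; do
#     echo "== $t"
#     curl -s -XGET "http://localhost:9200/website/_analyze" \
#       -H 'Content-Type: application/json' \
#       -d "$(analyze_body "$t" "中华人民共和国国歌")" | extract_tokens
#   done
```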
Reference: Installing and using the IK analyzer on Elasticsearch 5.x