Pinyin analyzer settings
"analysis" : {
  "analyzer" : {
    "pinyin_analyzer" : {
      "tokenizer" : "my_pinyin"
    }
  },
  "tokenizer" : {
    "my_pinyin" : {
      "ignore_pinyin_offset" : "false",
      "lowercase" : "true",
      "keep_original" : "false",
      "remove_duplicated_term" : "true",
      "keep_first_letter" : "false",
      "keep_separate_first_letter" : "false",
      "type" : "pinyin",
      "limit_first_letter_length" : "16",
      "keep_full_pinyin" : "true"
    }
  }
},
Searching with "enshi" fails to find 恩施 (Enshi):
GET test/_search
{
  "query": {
    "match_phrase": {
      "cityName.pinyin": "enshi"
    }
  }
}
No results.
Tokenization of "enshi":
GET test/_analyze
{
  "analyzer": "pinyin_analyzer",
  "text": ["enshi"]
}
Result: "en" is split into "e" and "n":
{
  "tokens" : [
    {
      "token" : "e",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "n",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "shi",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "word",
      "position" : 2
    }
  ]
}
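One way to picture how Latin input can end up as e / n / shi is a dictionary-driven longest-match segmenter. The sketch below is a toy model under that assumption, not the plugin's actual code; the function name `segment` and the tiny syllable sets are made up for illustration.

```python
def segment(text, syllables, max_len=6):
    """Greedy forward longest-match segmentation (toy model).

    At each position, try the longest candidate first; if no candidate
    is in the syllable set, fall back to emitting a single letter.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in syllables or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical syllable set in which "en" is missing.
incomplete = {"e", "shi"}
print(segment("enshi", incomplete))           # ['e', 'n', 'shi']
print(segment("enshi", incomplete | {"en"}))  # ['en', 'shi']
```

In this model, a syllable missing from the set degrades into single letters, which is the same shape as the output above.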
Tokenization of 恩施:
GET test/_analyze
{
  "analyzer": "pinyin_analyzer",
  "text": ["恩施"]
}
{
  "tokens" : [
    {
      "token" : "en",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "shi",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "word",
      "position" : 1
    }
  ]
}
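Because syllables such as "en" and "ou" are split into single letters when typed as Latin text, the query tokens (e, n, shi) do not match the indexed tokens (en, shi), so the search returns nothing.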
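The mismatch can be sketched in a few lines of Python. This is a simplified model of a phrase match with no slop, assuming cityName.pinyin is analyzed with pinyin_analyzer at both index and search time (the mapping is not shown in the post); `phrase_matches` is a hypothetical helper, not an Elasticsearch API.

```python
def phrase_matches(indexed_tokens, query_tokens):
    """Return True if the query tokens occur consecutively, in order,
    somewhere in the indexed tokens (phrase match without slop)."""
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        indexed_tokens[i:i + n] == query_tokens
        for i in range(len(indexed_tokens) - n + 1)
    )


indexed = ["en", "shi"]               # tokens produced for 恩施
query_now = ["e", "n", "shi"]         # tokens produced for "enshi"
query_expected = ["en", "shi"]        # tokens if "en" stayed whole

print(phrase_matches(indexed, query_now))       # False
print(phrase_matches(indexed, query_expected))  # True
```

The terms "e" and "n" never appear in the indexed field, so the phrase cannot match regardless of position.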
https://elasticsearch.cn/question/12879
Adjusting the dictionary is enough: add the missing pinyin syllables to the pinyin_alphabet.dict file, which is inside the elasticsearch-analysis-pinyin-7.9.3.jar package.
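A jar is a zip archive, so the dictionary can be patched with a short script. The following is a minimal sketch, assuming the dictionary is UTF-8 text with one syllable per line; the entry is located by file name because its path inside the jar is not given here. The function name `add_syllables` and the output path are hypothetical. Keep a backup of the original jar; the node most likely needs a restart before the change takes effect.

```python
import zipfile

DICT_NAME = "pinyin_alphabet.dict"


def add_syllables(jar_path, missing, out_path):
    """Copy jar_path to out_path, appending syllables that are not yet
    in the dictionary entry. Returns the list of syllables added."""
    added = []
    found = False
    with zipfile.ZipFile(jar_path) as src, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.endswith(DICT_NAME):
                found = True
                # Assumption: one syllable per line, UTF-8 encoded.
                lines = data.decode("utf-8").splitlines()
                known = {line.strip() for line in lines}
                for syllable in missing:
                    if syllable not in known:
                        lines.append(syllable)
                        known.add(syllable)
                        added.append(syllable)
                data = ("\n".join(lines) + "\n").encode("utf-8")
            # Reuse the original entry metadata for every file.
            dst.writestr(info, data)
    if not found:
        raise FileNotFoundError(f"{DICT_NAME} not found in {jar_path}")
    return added


# Example call (paths are placeholders):
# add_syllables("elasticsearch-analysis-pinyin-7.9.3.jar",
#               ["en", "ou"], "patched.jar")
```

Writing to a separate output file leaves the original jar untouched, so the change is easy to revert.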