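ElasticSearch Tutorial FG014-ElasticSearch Suggesters and Autocomplete in Practice
Part01-Basic Concepts and Theory
1.1 What a Suggester Is
A suggester is the ElasticSearch component behind autocomplete, spelling correction, and search suggestions. It returns candidate queries based on what the user has typed so far, which improves the search experience.
A suggester has these characteristics:
- Returns suggestions in real time
- Supports several suggestion types
- Lets you customize the suggestion rules
- Responds with low latency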
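1.2 Suggester Types
ElasticSearch supports several suggester types:
- Term Suggester: corrects the spelling of individual terms
- Phrase Suggester: corrects spelling at the phrase level
- Completion Suggester: provides real-time autocomplete
- Context Suggester: a completion suggester whose suggestions are filtered or boosted by context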
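1.3 How Autocomplete Works
Autocomplete is implemented as follows:
- Uses the dedicated completion field type
- Builds a prefix-tree-like index (a finite state transducer, FST)
- Answers queries quickly through prefix matching
- Supports ranking by weight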
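The steps above can be illustrated with a toy model. The sketch below is not how ElasticSearch stores completion data (it uses an FST); it is a simplified sorted-list index, and the class name PrefixIndex and the weights are made up for illustration. It only shows the idea of prefix matching followed by weight ranking.

```python
from bisect import bisect_left


class PrefixIndex:
    """Toy prefix index: a sorted list of (lowercased input, weight, original)."""

    def __init__(self):
        self._entries = []

    def add(self, text, weight=1):
        self._entries.append((text.lower(), weight, text))
        self._entries.sort()

    def suggest(self, prefix, size=5):
        prefix = prefix.lower()
        # Binary search to the first entry that could start with the prefix.
        start = bisect_left(self._entries, (prefix,))
        matches = []
        for key, weight, original in self._entries[start:]:
            if not key.startswith(prefix):
                break  # entries are sorted, so no later entry can match
            matches.append((weight, original))
        # Higher weight first; ties are broken alphabetically.
        matches.sort(key=lambda m: (-m[0], m[1]))
        return [original for _, original in matches[:size]]


index = PrefixIndex()
index.add("ElasticSearch", weight=10)
index.add("ElasticStack", weight=5)
index.add("Kibana", weight=8)
print(index.suggest("ela"))  # ['ElasticSearch', 'ElasticStack']
```

Because the entries are kept sorted, all inputs sharing a prefix sit next to each other, which is the property a prefix tree or FST exploits at much larger scale.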
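Part02-Production Planning and Recommendations
2.1 Suggester Performance Optimization
Ways to optimize suggester performance:
- Use the completion field type for faster lookups
- Keep the suggestion size reasonable
- Use the context suggester to narrow the suggestion scope
- Monitor suggester performance
- Optimize the index structure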
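2.2 Autocomplete Configuration
Recommended autocomplete configuration:
- Use the completion field type
- Choose an appropriate analyzer
- Configure suitable search parameters
- Consider filtering with contexts
- Set a reasonable size parameter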
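As a rough sketch of these recommendations, the helper below assembles a mapping with a single completion field. The function name completion_mapping and the field name "suggest" are choices made for this example; the default analyzer shown, "simple", is the one completion fields use when none is configured.

```python
import json


def completion_mapping(analyzer="simple", context_names=()):
    """Build an index mapping with one completion field named "suggest"."""
    field = {
        "type": "completion",
        "analyzer": analyzer,
        "search_analyzer": analyzer,
    }
    if context_names:
        # Each category context lets queries filter suggestions by that value.
        field["contexts"] = [{"name": n, "type": "category"} for n in context_names]
    return {"mappings": {"properties": {"suggest": field}}}


print(json.dumps(completion_mapping("ik_max_word", ["category"]), indent=2))
```

The printed JSON can be used as the body of an index-creation request; ik_max_word is only available when the IK analysis plugin is installed.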
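2.3 Production Best Practices
In production, keep the following in mind for suggesters:
- Enable suggestions on frequently searched fields
- Monitor suggester performance
- Set reasonable suggestion parameters
- Consider caching suggestion results
- Benchmark the different suggesters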
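Caching suggestion results can be sketched on the application side as follows. This is a minimal illustration under assumptions, not a recommended production design: the class SuggestionCache, its limits, and the choice to key on the lowercased prefix are all hypothetical.

```python
import time
from collections import OrderedDict


class SuggestionCache:
    """Small LRU cache with a time-to-live, keyed by the normalized prefix."""

    def __init__(self, max_entries=1000, ttl_seconds=60.0, clock=time.monotonic):
        self._data = OrderedDict()
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, prefix):
        key = prefix.strip().lower()
        item = self._data.get(key)
        if item is None:
            return None
        stored_at, value = item
        if self._clock() - stored_at > self._ttl:
            del self._data[key]  # entry expired
            return None
        self._data.move_to_end(key)  # mark as recently used
        return value

    def put(self, prefix, suggestions):
        key = prefix.strip().lower()
        self._data[key] = (self._clock(), list(suggestions))
        self._data.move_to_end(key)
        if len(self._data) > self._max:
            self._data.popitem(last=False)  # evict the least recently used entry


cache = SuggestionCache(max_entries=2, ttl_seconds=30)
cache.put("ela", ["ElasticSearch", "ElasticStack"])
print(cache.get("ELA "))  # ['ElasticSearch', 'ElasticStack']
```

A short time-to-live keeps cached suggestions from drifting too far from the index, while the entry limit bounds memory use for rare prefixes.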
Part03-Production Implementation
3.1 Term Suggester in Practice
Using the term suggester:
# Create the index (ik_max_word requires the IK analysis plugin)
curl -X PUT "http://192.168.1.10:9200/fgedu-products" -H "Content-Type: application/json" -d '{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 2
  },
  "mappings": {
    "properties": {
      "product_name": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}'
# Insert test data (_bulk expects NDJSON: each action and each document on its own line)
curl -X POST "http://192.168.1.10:9200/fgedu-products/_bulk" -H "Content-Type: application/x-ndjson" --data-binary '{"index": {"_id": "1"}}
{"product_name": "ElasticSearch实战指南"}
{"index": {"_id": "2"}}
{"product_name": "Kibana权威指南"}
{"index": {"_id": "3"}}
{"product_name": "Logstash入门与实践"}
{"index": {"_id": "4"}}
{"product_name": "ElasticStack实战"}
'
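The _bulk endpoint reads newline-delimited JSON, where every action line is followed by its document on a single line and the body ends with a newline. A small helper, sketched here with the hypothetical name build_bulk_body, can generate such a body from ordinary Python data:

```python
import json


def build_bulk_body(docs):
    """Return an NDJSON body for _bulk: one action line, then one source line."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        # ensure_ascii=False keeps Chinese text readable in the payload.
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("1", {"product_name": "ElasticSearch实战指南"}),
    ("2", {"product_name": "Kibana权威指南"}),
])
print(body, end="")
```

The result can be saved to a file and sent with curl using --data-binary, which passes the newlines through unchanged.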
# Term suggester query
curl -X GET "http://192.168.1.10:9200/fgedu-products/_search" -H "Content-Type: application/json" -d '{
  "suggest": {
    "term-suggestion": {
      "text": "elastcsearch",
      "term": {
        "field": "product_name",
        "suggest_mode": "popular",
        "max_edits": 2
      }
    }
  }
}'
# Run the command
# Output
{
  "took": 5,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 0, "relation": "eq"},
    "max_score": null,
    "hits": []
  },
  "suggest": {
    "term-suggestion": [
      {
        "text": "elastcsearch",
        "offset": 0,
        "length": 12,
        "options": [
          {"text": "elasticsearch", "score": 0.8333333, "freq": 2}
        ]
      }
    ]
  }
}
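To see why "elastcsearch" is matched to "elasticsearch" with max_edits set to 2, it helps to count the edits by hand. The sketch below uses plain Levenshtein distance as a stand-in; the real term suggester uses its own string-distance implementation and also takes term frequency into account, and the vocabulary list here is invented for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def candidates(term, vocabulary, max_edits=2):
    """Keep vocabulary terms within max_edits of the input, closest first."""
    scored = [(edit_distance(term, word), word) for word in vocabulary]
    return [word for dist, word in sorted(scored) if 0 < dist <= max_edits]


vocab = ["elasticsearch", "elasticstack", "kibana", "logstash"]
print(edit_distance("elastcsearch", "elasticsearch"))  # 1
print(candidates("elastcsearch", vocab))  # ['elasticsearch']
```

Only one insertion (the missing "i") separates the two spellings, so the correct term falls well inside the allowed distance while the other vocabulary entries do not.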
3.2 Phrase Suggester in Practice
Using the phrase suggester:
curl -X GET "http://192.168.1.10:9200/fgedu-products/_search" -H "Content-Type: application/json" -d '{
  "suggest": {
    "phrase-suggestion": {
      "text": "elastcsearch guide",
      "phrase": {
        "field": "product_name",
        "gram_size": 3,
        "max_errors": 2,
        "confidence": 1
      }
    }
  }
}'
# Run the command
# Output
{
  "took": 10,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 0, "relation": "eq"},
    "max_score": null,
    "hits": []
  },
  "suggest": {
    "phrase-suggestion": [
      {
        "text": "elastcsearch guide",
        "offset": 0,
        "length": 18,
        "options": [
          {"text": "elasticsearch guide", "score": 0.8}
        ]
      }
    ]
  }
}
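On the client side, the suggest section of a response usually has to be reduced to the text shown to the user. The helper below is a minimal sketch of that step; the name top_suggestions is hypothetical, and the sample response is a trimmed copy of the shape shown in this section.

```python
def top_suggestions(response, name):
    """Map each input fragment to its highest-scoring option text, if any."""
    result = {}
    for entry in response.get("suggest", {}).get(name, []):
        options = entry.get("options", [])
        if options:
            # Term and phrase options carry "score"; completion options use "_score".
            best = max(options, key=lambda o: o.get("score", o.get("_score", 0.0)))
            result[entry["text"]] = best["text"]
    return result


response = {
    "suggest": {
        "phrase-suggestion": [
            {
                "text": "elastcsearch guide",
                "options": [{"text": "elasticsearch guide", "score": 0.8}],
            }
        ]
    }
}
print(top_suggestions(response, "phrase-suggestion"))
```

An entry with an empty options list is simply skipped, which corresponds to "no correction available" for that input.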
3.3 Completion Suggester in Practice
Using the completion suggester:
# Create an index with a completion field
curl -X PUT "http://192.168.1.10:9200/fgedu-products-completion" -H "Content-Type: application/json" -d '{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 2
  },
  "mappings": {
    "properties": {
      "product_name": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "suggest": {
        "type": "completion",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_max_word"
      }
    }
  }
}'
# Insert test data (NDJSON: one JSON object per line)
curl -X POST "http://192.168.1.10:9200/fgedu-products-completion/_bulk" -H "Content-Type: application/x-ndjson" --data-binary '{"index": {"_id": "1"}}
{"product_name": "ElasticSearch实战指南", "suggest": ["ElasticSearch", "实战指南", "ElasticSearch实战指南"]}
{"index": {"_id": "2"}}
{"product_name": "Kibana权威指南", "suggest": ["Kibana", "权威指南", "Kibana权威指南"]}
{"index": {"_id": "3"}}
{"product_name": "Logstash入门与实践", "suggest": ["Logstash", "入门与实践", "Logstash入门与实践"]}
{"index": {"_id": "4"}}
{"product_name": "ElasticStack实战", "suggest": ["ElasticStack", "实战", "ElasticStack实战"]}
'
# Completion suggester query
curl -X GET "http://192.168.1.10:9200/fgedu-products-completion/_search" -H "Content-Type: application/json" -d '{
  "suggest": {
    "product-suggestion": {
      "prefix": "ela",
      "completion": {
        "field": "suggest",
        "size": 5
      }
    }
  }
}'
# Run the command
# Output
{
  "took": 3,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 0, "relation": "eq"},
    "max_score": null,
    "hits": []
  },
  "suggest": {
    "product-suggestion": [
      {
        "text": "ela",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "ElasticSearch",
            "_index": "fgedu-products-completion",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticSearch实战指南",
              "suggest": ["ElasticSearch", "实战指南", "ElasticSearch实战指南"]
            }
          },
          {
            "text": "ElasticStack",
            "_index": "fgedu-products-completion",
            "_id": "4",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticStack实战",
              "suggest": ["ElasticStack", "实战", "ElasticStack实战"]
            }
          },
          {
            "text": "ElasticSearch实战指南",
            "_index": "fgedu-products-completion",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticSearch实战指南",
              "suggest": ["ElasticSearch", "实战指南", "ElasticSearch实战指南"]
            }
          }
        ]
      }
    ]
  }
}
Part04-Production Cases and Walkthroughs
4.1 E-commerce Search Autocomplete
An autocomplete scenario for e-commerce search:
# Create an index whose completion field carries a category context
curl -X PUT "http://192.168.1.10:9200/fgedu-ecommerce" -H "Content-Type: application/json" -d '{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 2
  },
  "mappings": {
    "properties": {
      "product_name": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "category": {
        "type": "keyword"
      },
      "price": {
        "type": "double"
      },
      "suggest": {
        "type": "completion",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_max_word",
        "contexts": [
          {
            "name": "category",
            "type": "category"
          }
        ]
      }
    }
  }
}'
# Insert test data (NDJSON: one JSON object per line)
curl -X POST "http://192.168.1.10:9200/fgedu-ecommerce/_bulk" -H "Content-Type: application/x-ndjson" --data-binary '{"index": {"_id": "1"}}
{"product_name": "ElasticSearch实战指南", "category": "技术书籍", "price": 99.9, "suggest": {"input": ["ElasticSearch", "实战指南", "ElasticSearch实战指南"], "contexts": {"category": ["技术书籍"]}}}
{"index": {"_id": "2"}}
{"product_name": "Kibana权威指南", "category": "技术书籍", "price": 79.9, "suggest": {"input": ["Kibana", "权威指南", "Kibana权威指南"], "contexts": {"category": ["技术书籍"]}}}
{"index": {"_id": "3"}}
{"product_name": "Logstash入门与实践", "category": "技术书籍", "price": 69.9, "suggest": {"input": ["Logstash", "入门与实践", "Logstash入门与实践"], "contexts": {"category": ["技术书籍"]}}}
'
# Autocomplete with a context filter
curl -X GET "http://192.168.1.10:9200/fgedu-ecommerce/_search" -H "Content-Type: application/json" -d '{
  "suggest": {
    "product-suggestion": {
      "prefix": "ela",
      "completion": {
        "field": "suggest",
        "size": 5,
        "contexts": {
          "category": ["技术书籍"]
        }
      }
    }
  }
}'
# Run the command
# Output
{
  "took": 5,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 0, "relation": "eq"},
    "max_score": null,
    "hits": []
  },
  "suggest": {
    "product-suggestion": [
      {
        "text": "ela",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "ElasticSearch",
            "_index": "fgedu-ecommerce",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticSearch实战指南",
              "category": "技术书籍",
              "price": 99.9,
              "suggest": {
                "input": ["ElasticSearch", "实战指南", "ElasticSearch实战指南"],
                "contexts": {"category": ["技术书籍"]}
              }
            }
          },
          {
            "text": "ElasticSearch实战指南",
            "_index": "fgedu-ecommerce",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticSearch实战指南",
              "category": "技术书籍",
              "price": 99.9,
              "suggest": {
                "input": ["ElasticSearch", "实战指南", "ElasticSearch实战指南"],
                "contexts": {"category": ["技术书籍"]}
              }
            }
          }
        ]
      }
    ]
  }
}
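Context queries can also boost some categories over others instead of only filtering. The sketch below builds such a request body; the function name context_completion_query and the second category "小说" are hypothetical additions for illustration, while the {"context": ..., "boost": ...} form is the object syntax that category context queries accept.

```python
import json


def context_completion_query(prefix, category_boosts, field="suggest", size=5):
    """Build a completion suggest body filtered (and boosted) by category."""
    contexts = [
        {"context": category, "boost": boost}
        for category, boost in category_boosts.items()
    ]
    return {
        "suggest": {
            "product-suggestion": {
                "prefix": prefix,
                "completion": {
                    "field": field,
                    "size": size,
                    "contexts": {"category": contexts},
                },
            }
        }
    }


# "小说" is a made-up second category used only to show boosting.
query = context_completion_query("ela", {"技术书籍": 2, "小说": 1})
print(json.dumps(query, ensure_ascii=False, indent=2))
```

Suggestions from every listed category are returned, and those in a category with a higher boost are ranked ahead of the rest.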
4.2 Document Search Suggestions
A suggestion scenario for document search:
# Create the index
curl -X PUT "http://192.168.1.10:9200/fgedu-documents" -H "Content-Type: application/json" -d '{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 2
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "content": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "suggest": {
        "type": "completion",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_max_word"
      }
    }
  }
}'
# Insert test data (NDJSON: one JSON object per line)
curl -X POST "http://192.168.1.10:9200/fgedu-documents/_bulk" -H "Content-Type: application/x-ndjson" --data-binary '{"index": {"_id": "1"}}
{"title": "ElasticSearch索引优化指南", "content": "ElasticSearch索引优化是提高搜索性能的关键…", "suggest": ["ElasticSearch", "索引优化", "ElasticSearch索引优化指南"]}
{"index": {"_id": "2"}}
{"title": "Kibana可视化实战", "content": "Kibana是ElasticStack中的可视化工具…", "suggest": ["Kibana", "可视化", "Kibana可视化实战"]}
{"index": {"_id": "3"}}
{"title": "Logstash数据采集实战", "content": "Logstash是ElasticStack中的数据采集工具…", "suggest": ["Logstash", "数据采集", "Logstash数据采集实战"]}
'
# Document search suggestion query
curl -X GET "http://192.168.1.10:9200/fgedu-documents/_search" -H "Content-Type: application/json" -d '{
  "suggest": {
    "document-suggestion": {
      "prefix": "kib",
      "completion": {
        "field": "suggest",
        "size": 5
      }
    }
  }
}'
# Run the command
# Output
{
  "took": 3,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 0, "relation": "eq"},
    "max_score": null,
    "hits": []
  },
  "suggest": {
    "document-suggestion": [
      {
        "text": "kib",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "Kibana",
            "_index": "fgedu-documents",
            "_id": "2",
            "_score": 1.0,
            "_source": {
              "title": "Kibana可视化实战",
              "content": "Kibana是ElasticStack中的可视化工具…",
              "suggest": ["Kibana", "可视化", "Kibana可视化实战"]
            }
          },
          {
            "text": "Kibana可视化实战",
            "_index": "fgedu-documents",
            "_id": "2",
            "_score": 1.0,
            "_source": {
              "title": "Kibana可视化实战",
              "content": "Kibana是ElasticStack中的可视化工具…",
              "suggest": ["Kibana", "可视化", "Kibana可视化实战"]
            }
          }
        ]
      }
    ]
  }
}
4.3 Performance Tuning in Practice
Tuning the suggester:
# Filter out duplicate suggestion text with skip_duplicates
curl -X GET "http://192.168.1.10:9200/fgedu-products-completion/_search" -H "Content-Type: application/json" -d '{
  "suggest": {
    "product-suggestion": {
      "prefix": "ela",
      "completion": {
        "field": "suggest",
        "size": 10,
        "skip_duplicates": true
      }
    }
  }
}'
# Run the command
# Output
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 0, "relation": "eq"},
    "max_score": null,
    "hits": []
  },
  "suggest": {
    "product-suggestion": [
      {
        "text": "ela",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "ElasticSearch",
            "_index": "fgedu-products-completion",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticSearch实战指南",
              "suggest": ["ElasticSearch", "实战指南", "ElasticSearch实战指南"]
            }
          },
          {
            "text": "ElasticStack",
            "_index": "fgedu-products-completion",
            "_id": "4",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticStack实战",
              "suggest": ["ElasticStack", "实战", "ElasticStack实战"]
            }
          }
        ]
      }
    ]
  }
}
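Conceptually, skip_duplicates drops options whose text has already been returned by another document and keeps walking the ranked candidates until the requested size is filled. The sketch below imitates that behaviour on a plain list; it is an illustration of the idea, not ElasticSearch's implementation, and the function name top_n_unique and the ranked list are made up.

```python
def top_n_unique(candidates, size):
    """Walk ranked candidates, keeping the first `size` distinct texts.

    Returns the kept texts and how many candidates had to be visited.
    """
    seen = set()
    kept = []
    visited = 0
    for text in candidates:
        visited += 1
        if text in seen:
            continue  # same text from another document: skip it
        seen.add(text)
        kept.append(text)
        if len(kept) == size:
            break
    return kept, visited


ranked = ["ElasticSearch", "ElasticSearch", "ElasticSearch", "ElasticStack"]
print(top_n_unique(ranked, size=2))  # (['ElasticSearch', 'ElasticStack'], 4)
```

Here four candidates are visited to return two distinct suggestions, which is why deduplication makes the result list cleaner but requires more work per request.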
# Performance tuning: limit the number of suggestions
curl -X GET "http://192.168.1.10:9200/fgedu-products-completion/_search" -H "Content-Type: application/json" -d '{
  "suggest": {
    "product-suggestion": {
      "prefix": "ela",
      "completion": {
        "field": "suggest",
        "size": 3
      }
    }
  }
}'
# Run the command
# Output
{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 0, "relation": "eq"},
    "max_score": null,
    "hits": []
  },
  "suggest": {
    "product-suggestion": [
      {
        "text": "ela",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "ElasticSearch",
            "_index": "fgedu-products-completion",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticSearch实战指南",
              "suggest": ["ElasticSearch", "实战指南", "ElasticSearch实战指南"]
            }
          },
          {
            "text": "ElasticStack",
            "_index": "fgedu-products-completion",
            "_id": "4",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticStack实战",
              "suggest": ["ElasticStack", "实战", "ElasticStack实战"]
            }
          },
          {
            "text": "ElasticSearch实战指南",
            "_index": "fgedu-products-completion",
            "_id": "1",
            "_score": 1.0,
            "_source": {
              "product_name": "ElasticSearch实战指南",
              "suggest": ["ElasticSearch", "实战指南", "ElasticSearch实战指南"]
            }
          }
        ]
      }
    ]
  }
}
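Fengge's tip: the completion field type is what makes autocomplete fast, and a small size keeps each response cheap. skip_duplicates removes repeated suggestion text, but it can slow the query down because more candidates must be visited to fill the requested size.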
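Part05-Fengge's Lessons Learned
5.1 Suggester Best Practices
- Use the completion field type to implement autocomplete
- Keep the suggestion size reasonable
- Use the context suggester to narrow the suggestion scope
- Enable skip_duplicates to avoid repeated suggestions
- Monitor suggester performance metrics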
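One simple metric to watch is the "took" value (in milliseconds) that every search response reports. The sketch below summarizes a batch of such values with nearest-rank percentiles; the sample numbers are hypothetical, and how the values are collected is left to the application.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical "took" values (ms) collected from suggest responses.
took_ms = [3, 2, 5, 1, 10, 3, 2, 4, 3, 2]
print(percentile(took_ms, 50), percentile(took_ms, 95))  # 3 10
```

Tracking a high percentile alongside the median makes occasional slow suggestions visible even when the typical request is fast.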
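5.2 Common Problems and Solutions
- Slow suggestions: use the completion field type
- Inaccurate suggestions: check the analyzer configuration
- Too many suggestions: set a reasonable size parameter
- Duplicate suggestions: enable skip_duplicates
- High memory usage: limit the size of the suggestion data, for example by indexing fewer or shorter inputs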
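5.3 Production Tuning Recommendations
- Enable suggestions on frequently searched fields
- Monitor suggester performance
- Set reasonable suggestion parameters
- Consider caching suggestion results
- Benchmark the different suggesters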
Compiled and published by 风哥教程 (Fengge Tutorials) for learning and testing purposes only. When reposting, cite the source: http://www.fgedu.net.cn/10327.html
