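ElasticSearch Tutorial FG005 - ElasticSearch Cluster Setup and Node Planning in Practice
In this document, Fengge (风哥) introduces cluster setup and node planning for ElasticSearch, covering cluster concepts, node roles, how a cluster forms, cluster sizing, node role planning, network planning, cluster deployment, node configuration, cluster initialization, cluster health checks, node management, and cluster expansion. The tutorial is based on the official ElasticSearch documentation (Discovery and cluster formation, Nodes, and related pages). It is intended for DBAs and developers to use for learning and testing; verify everything yourself before applying it to a production environment.
Part01 - Basic Concepts and Theory
1.1 Cluster Concepts
An ElasticSearch cluster is a collection of nodes that together store data and provide search and analytics services. The nodes communicate with each other over the network and form a single logical unit.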
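1.2 Node Roles
ElasticSearch node roles include:
- Master node (role `master`): manages the cluster, including index creation, shard allocation, and tracking which nodes are members
- Data node (role `data`): stores data and executes search and aggregation operations
- Coordinating node: receives client requests, routes them to the relevant nodes, and merges and returns the results; every node can coordinate, and a node with an empty role list (`node.roles: []`) is coordinating-only
- Ingest node (role `ingest`): pre-processes documents before indexing, for example transforming or enriching them
- Machine learning node (role `ml`): runs machine learning jobs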
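1.3 How a Cluster Forms
An ElasticSearch cluster forms in the following steps:
- Node discovery: each node uses the addresses in discovery.seed_hosts to find other nodes
- Master election: if there is no elected master, the master-eligible nodes hold an election, and the winner needs votes from a majority of them
- Cluster state synchronization: the elected master maintains the cluster state and publishes it to all nodes
- Shard allocation: the master assigns shards to the data nodes according to the allocation rules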
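The majority requirement in the election step is the reason an odd number of master-eligible nodes (usually three) is recommended. The following is a minimal sketch of that arithmetic only; the function names `majority` and `tolerated_failures` are illustrative and not part of any ElasticSearch tool, and ElasticSearch itself manages its voting configuration automatically.

```shell
#!/usr/bin/env bash
# Votes needed to win an election among N master-eligible nodes (strict majority).
majority() {
  local n=$1
  echo $(( n / 2 + 1 ))
}

# Number of master-eligible nodes that can be lost while a majority still exists.
tolerated_failures() {
  local n=$1
  echo $(( n - (n / 2 + 1) ))
}

for n in 1 2 3 5; do
  echo "master-eligible=$n majority=$(majority "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Two master-eligible nodes tolerate no more failures than one, which is why the step from one to three is the meaningful one.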
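Part02 - Production Environment Planning and Recommendations
2.1 Cluster Sizing
ElasticSearch cluster sizing:
## 1. Small cluster (3-5 nodes)
- Use case: small to medium applications, less than 100GB of data
- Recommended configuration:
  - 3 master-eligible nodes for high availability (in a cluster this small they can also hold data)
  - 2-3 data nodes
  - 8-16GB of memory per node
  - SSD storage
## 2. Medium cluster (6-10 nodes)
- Use case: medium applications, 100GB-1TB of data
- Recommended configuration:
  - 3 master nodes
  - 4-7 data nodes
  - 16-32GB of memory per node
  - SSD storage
## 3. Large cluster (10+ nodes)
- Use case: large applications, more than 1TB of data
- Recommended configuration:
  - 3-5 master nodes
  - 10+ data nodes
  - 32-64GB of memory per node
  - SSD or NVMe storage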
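As a quick way to apply the three tiers above, the sketch below maps an estimated data volume to a tier. The function name `recommend_tier` is hypothetical, and treating 1TB as 1024GB is an assumption; the thresholds themselves come from the list above and are starting points, not hard limits.

```shell
#!/usr/bin/env bash
# Map an estimated data volume in GB to one of the three tiers listed above.
# Assumption: 1TB is treated as 1024GB.
recommend_tier() {
  local gb=$1
  if [ "$gb" -lt 100 ]; then
    echo "small: 3 master-eligible nodes, 2-3 data nodes, 8-16GB RAM per node, SSD"
  elif [ "$gb" -le 1024 ]; then
    echo "medium: 3 master nodes, 4-7 data nodes, 16-32GB RAM per node, SSD"
  else
    echo "large: 3-5 master nodes, 10+ data nodes, 32-64GB RAM per node, SSD or NVMe"
  fi
}

recommend_tier 50
recommend_tier 500
recommend_tier 5000
```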
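2.2 Node Role Planning
ElasticSearch node role planning:
- Master nodes:
  - 3 recommended, for high availability
  - Relatively large memory (16GB+)
  - Modest CPU requirements
  - Do not store data
- Data nodes:
  - Number determined by data volume and query load
  - Relatively large memory (16GB+)
  - Multi-core CPU
  - SSD storage
- Coordinating nodes:
  - Number determined by client request volume
  - Medium memory (8GB+)
  - Multi-core CPU
  - Do not store data
- Ingest nodes:
  - Number determined by data processing requirements
  - Medium memory (8GB+)
  - Multi-core CPU
  - Do not store data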
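2.3 Network Planning
ElasticSearch cluster network planning:
- Network topology:
  - Use a dedicated network segment
  - Keep latency between nodes low (< 1ms)
  - Avoid stretching one cluster across data centers (use cross-cluster replication between separate clusters instead)
- Port configuration:
  - 9200: HTTP API access
  - 9300: node-to-node (transport) communication
  - Make sure the firewall allows both ports between the relevant hosts
- Network parameters:
  - Tune TCP parameters for network performance
  - Enable TCP keepalive
  - Adjust network buffer sizes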
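To confirm that ports 9200 and 9300 are actually reachable between hosts, a small probe can help. This is a sketch that assumes bash (for its /dev/tcp pseudo-device) and the coreutils `timeout` command are available; `port_open` and `check_cluster_ports` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# Uses the /dev/tcp pseudo-device of bash, so no extra tools are required.
port_open() {
  local host=$1 port=$2
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Print one line per host and port (9200 = HTTP API, 9300 = transport).
check_cluster_ports() {
  local host port
  for host in "$@"; do
    for port in 9200 9300; do
      if port_open "$host" "$port"; then
        echo "$host:$port open"
      else
        echo "$host:$port unreachable"
      fi
    done
  done
}

# Example (run from any cluster node):
#   check_cluster_ports 192.168.1.10 192.168.1.11 192.168.1.12
```

An "unreachable" result does not distinguish a firewall block from a service that is not running, so check the service state on the target host as well.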
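Part03 - Production Implementation Plan
3.1 Cluster Deployment in Practice
ElasticSearch cluster deployment:
## 1. Prepare the servers
# Node 1 (master node): 192.168.1.10
# Node 2 (master node): 192.168.1.11
# Node 3 (master node): 192.168.1.12
# Node 4 (data node): 192.168.1.13
# Node 5 (data node): 192.168.1.14
## 2. Install ElasticSearch on all nodes
# Import the GPG key
$ rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
# Create the yum repository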
$ cat > /etc/yum.repos.d/elasticsearch.repo << EOF
[elasticsearch-8.x]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
# Install ElasticSearch
$ yum install -y elasticsearch
## 3. Configure master node 1 (192.168.1.10)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: master-1
node.roles: [master]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.10
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
cluster.initial_master_nodes: ["master-1", "master-2", "master-3"]
bootstrap.memory_lock: true
xpack.security.enabled: true
## 4. Configure master node 2 (192.168.1.11)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: master-2
node.roles: [master]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.11
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
cluster.initial_master_nodes: ["master-1", "master-2", "master-3"]
bootstrap.memory_lock: true
xpack.security.enabled: true
## 5. Configure master node 3 (192.168.1.12)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: master-3
node.roles: [master]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.12
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
cluster.initial_master_nodes: ["master-1", "master-2", "master-3"]
bootstrap.memory_lock: true
xpack.security.enabled: true
## 6. Configure data node 1 (192.168.1.13)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: data-1
node.roles: [data, ingest]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.13
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
bootstrap.memory_lock: true
xpack.security.enabled: true
## 7. Configure data node 2 (192.168.1.14)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: data-2
node.roles: [data, ingest]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.14
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
bootstrap.memory_lock: true
xpack.security.enabled: true
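The five configuration files above differ only in node name, IP address, and roles, so they can be produced from one template. The sketch below prints the same settings for a given node; `render_node_config` is a hypothetical helper, and the output would still need to be written to /etc/elasticsearch/elasticsearch.yml on each host.

```shell
#!/usr/bin/env bash
# Print elasticsearch.yml content for one node.
# Usage: render_node_config <node-name> <ip> <roles>
render_node_config() {
  local name=$1 ip=$2 roles=$3
  cat <<EOF
cluster.name: fgedu-cluster
node.name: ${name}
node.roles: [${roles}]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: ${ip}
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
EOF
  # As in the files above, only master-eligible nodes carry the bootstrap list.
  case "$roles" in
    *master*) echo 'cluster.initial_master_nodes: ["master-1", "master-2", "master-3"]' ;;
  esac
  echo "bootstrap.memory_lock: true"
  echo "xpack.security.enabled: true"
}

render_node_config master-1 192.168.1.10 master
render_node_config data-1 192.168.1.13 "data, ingest"
```

cluster.initial_master_nodes is only used when a brand-new cluster bootstraps for the first time, which is why the data nodes do not need it.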
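3.2 Node Configuration in Practice
ElasticSearch node configuration:
## 1. Configure the JVM heap size
# Run on all nodes
$ vi /etc/elasticsearch/jvm.options
# Set the heap size: keep -Xms and -Xmx equal, and at no more than 50% of the host's RAM
# (a custom file under /etc/elasticsearch/jvm.options.d/ is the preferred place for overrides in 8.x)
-Xms16g
-Xmx16g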
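The 16g value fits a host with about 32GB of RAM. For other host sizes, the sketch below applies the usual rule of half the RAM; the 31GB cap is an assumed conservative value chosen to stay below the roughly 32GB point where the JVM stops using compressed object pointers, and `suggest_heap_gb` and `heap_options` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Suggest a heap size in GB for a host with the given RAM in GB:
# half of RAM, capped at 31GB (assumed cap), and at least 1GB.
suggest_heap_gb() {
  local ram_gb=$1 cap=31
  local half=$(( ram_gb / 2 ))
  if [ "$half" -gt "$cap" ]; then half=$cap; fi
  if [ "$half" -lt 1 ]; then half=1; fi
  echo "$half"
}

# Print the two JVM option lines for that heap size.
heap_options() {
  local heap
  heap=$(suggest_heap_gb "$1")
  printf -- '-Xms%sg\n-Xmx%sg\n' "$heap" "$heap"
}

heap_options 32
```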
## 2. Configure kernel parameters
# Run on all nodes
$ vi /etc/sysctl.conf
# Add the following parameter
vm.max_map_count=262144
$ sysctl -p
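## 3. Configure file descriptor and memory lock limits
# Run on all nodes
$ vi /etc/security/limits.conf
# Add the following parameters
elasticsearch soft nofile 65536
elasticsearch hard nofile 65536
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
# Note: the RPM package runs ElasticSearch under systemd, which does not read /etc/security/limits.conf.
# For bootstrap.memory_lock: true to take effect, also run "systemctl edit elasticsearch"
# and add LimitMEMLOCK=infinity under a [Service] section.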
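Before starting the service, it can be worth confirming that these values are actually in effect. The sketch below reads the current kernel setting and the open-file limit of the current shell; `meets_minimum` and `check_prereqs` are hypothetical helpers, and the limit seen by an interactive shell is not necessarily the one the systemd service receives.

```shell
#!/usr/bin/env bash
# Return 0 when an observed value meets the required minimum ("unlimited" always passes).
meets_minimum() {
  local actual=$1 required=$2
  [ "$actual" = "unlimited" ] && return 0
  [ "$actual" -ge "$required" ]
}

# Report vm.max_map_count and the open-file limit of the current shell.
check_prereqs() {
  local map_count nofile
  map_count=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)
  nofile=$(ulimit -n)
  if meets_minimum "$map_count" 262144; then
    echo "vm.max_map_count=$map_count ok"
  else
    echo "vm.max_map_count=$map_count below 262144"
  fi
  if meets_minimum "$nofile" 65536; then
    echo "nofile=$nofile ok"
  else
    echo "nofile=$nofile below 65536"
  fi
}

check_prereqs
```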
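## 4. Start the service
# Run on all nodes
# Note: elasticsearch.yml already sets xpack.security.enabled: true. With a non-loopback network.host,
# a node is expected to fail its bootstrap checks until transport TLS is configured,
# so complete steps 1-3 of section 3.3 before the first start.
$ systemctl daemon-reload
$ systemctl enable elasticsearch
$ systemctl start elasticsearch
## 5. Check the service status
# Run on all nodes
$ systemctl status elasticsearch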
● elasticsearch.service - Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2026-04-07 10:00:00 CST; 1min ago
Docs: https://www.elastic.co
Main PID: 12345 (java)
Tasks: 68
Memory: 2.1G
CGroup: /system.slice/elasticsearch.service
└─12345 /usr/share/elasticsearch/jdk/bin/java -Xms16g -Xmx16g -XX:+UseG1GC -XX:MaxGCPauseMillis=200…
3.3 Cluster Initialization in Practice
ElasticSearch cluster initialization:
## 1. Generate the certificate
# Run on master node 1
$ /usr/share/elasticsearch/bin/elasticsearch-certutil cert --out /etc/elasticsearch/elastic-certificates.p12 --pass ""
## 2. Copy the certificate to all nodes
# Run on master node 1
$ scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.11:/etc/elasticsearch/
$ scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.12:/etc/elasticsearch/
$ scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.13:/etc/elasticsearch/
$ scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.14:/etc/elasticsearch/
# On every node, make sure the elasticsearch user can read the file (for example, owner root:elasticsearch with mode 660)
## 3. Configure SSL
# Run on all nodes
$ vi /etc/elasticsearch/elasticsearch.yml
# Add the following settings
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
## 4. Restart all nodes
# Run on all nodes
$ systemctl restart elasticsearch
## 5. Set passwords for the built-in users
# Run on master node 1
# Note: elasticsearch-setup-passwords is deprecated in 8.x; elasticsearch-reset-password -u <user> is its replacement for individual users
$ /usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
You will be prompted to enter passwords as the process progresses.
Please confirm that you would like to continue [y/N]y
Enter password for [elastic]:
Reenter password for [elastic]:
Enter password for [apm_system]:
Reenter password for [apm_system]:
Enter password for [kibana_system]:
Reenter password for [kibana_system]:
Enter password for [logstash_system]:
Reenter password for [logstash_system]:
Enter password for [beats_system]:
Reenter password for [beats_system]:
Enter password for [remote_monitoring_user]:
Reenter password for [remote_monitoring_user]:
Changed password for user [apm_system]
Changed password for user [kibana_system]
Changed password for user [kibana]
Changed password for user [logstash_system]
Changed password for user [beats_system]
Changed password for user [remote_monitoring_user]
Changed password for user [elastic]
Part04 - Production Cases and Hands-on Walkthrough
4.1 Cluster Health Checks in Practice
ElasticSearch cluster health checks:
## 1. Check the cluster health status
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cluster/health?pretty"
{
  "cluster_name" : "fgedu-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
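For scripted monitoring, the status field of this response is usually all that is needed. The sketch below extracts it with sed so that no JSON tool is required; `health_status` and `health_exit_code` are hypothetical helpers, and the exit-code mapping (0 green, 1 yellow, 2 red, 3 unknown) is an assumed convention rather than anything defined by ElasticSearch.

```shell
#!/usr/bin/env bash
# Read the JSON body of GET _cluster/health on stdin and print the status value.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Map a status to an exit code (assumed convention: 0 green, 1 yellow, 2 red, 3 unknown).
health_exit_code() {
  case "$1" in
    green)  return 0 ;;
    yellow) return 1 ;;
    red)    return 2 ;;
    *)      return 3 ;;
  esac
}

# Demonstration with a shortened sample response.
sample='{ "cluster_name" : "fgedu-cluster", "status" : "green", "timed_out" : false }'
status=$(printf '%s\n' "$sample" | health_status)
echo "status=$status"

# Against a live cluster (not executed here):
#   curl -s -u elastic:your_password "http://192.168.1.10:9200/_cluster/health" | health_status
```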
## 2. Check node information
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cat/nodes?v"
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.1.10 25 75 2 0.10 0.15 0.12 m * master-1
192.168.1.11 22 70 1 0.05 0.10 0.08 m - master-2
192.168.1.12 20 68 1 0.08 0.12 0.10 m - master-3
192.168.1.13 18 65 1 0.05 0.08 0.09 di - data-1
192.168.1.14 15 62 1 0.03 0.06 0.07 di - data-2
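In this output the `master` column marks the elected master with `*`, and `node.role` abbreviates each role (`m` master, `d` data, `i` ingest). A short awk sketch can summarize it; `summarize_nodes` is a hypothetical helper and assumes the default column order shown above.

```shell
#!/usr/bin/env bash
# Read `_cat/nodes?v` output on stdin; print the elected master and a node count per role string.
# Assumes the default columns: field 8 = node.role, field 9 = master, field 10 = name.
summarize_nodes() {
  awk '
    NR == 1 { next }
    $9 == "*" { master = $10 }
    { roles[$8]++ }
    END {
      print "elected_master=" master
      for (r in roles) print "role=" r " count=" roles[r]
    }
  ' | sort
}

summarize_nodes <<'EOF'
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.1.10 25 75 2 0.10 0.15 0.12 m * master-1
192.168.1.11 22 70 1 0.05 0.10 0.08 m - master-2
192.168.1.12 20 68 1 0.08 0.12 0.10 m - master-3
192.168.1.13 18 65 1 0.05 0.08 0.09 di - data-1
192.168.1.14 15 62 1 0.03 0.06 0.07 di - data-2
EOF
```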
## 3. Check the cluster state
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cluster/state?pretty"
{
  "cluster_name" : "fgedu-cluster",
  "cluster_uuid" : "cluster_uuid",
  "version" : 1,
  "state_uuid" : "state_uuid",
  "master_node" : "master-1",
  "nodes" : {
    "master-1" : {
      "name" : "master-1",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.10:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    },
    "master-2" : {
      "name" : "master-2",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.11:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    },
    "master-3" : {
      "name" : "master-3",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.12:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    },
    "data-1" : {
      "name" : "data-1",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.13:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    },
    "data-2" : {
      "name" : "data-2",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.14:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    }
  }
}
4.2 Node Management in Practice
ElasticSearch node management:
## 1. View detailed node information
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_nodes?pretty"
{
  "_nodes" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "nodes" : {
    "master-1" : {
      "name" : "master-1",
      "transport_address" : "192.168.1.10:9300",
      "host" : "192.168.1.10",
      "ip" : "192.168.1.10",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "master" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    },
    "master-2" : {
      "name" : "master-2",
      "transport_address" : "192.168.1.11:9300",
      "host" : "192.168.1.11",
      "ip" : "192.168.1.11",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "master" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    },
    "master-3" : {
      "name" : "master-3",
      "transport_address" : "192.168.1.12:9300",
      "host" : "192.168.1.12",
      "ip" : "192.168.1.12",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "master" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    },
    "data-1" : {
      "name" : "data-1",
      "transport_address" : "192.168.1.13:9300",
      "host" : "192.168.1.13",
      "ip" : "192.168.1.13",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "data", "ingest" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    },
    "data-2" : {
      "name" : "data-2",
      "transport_address" : "192.168.1.14:9300",
      "host" : "192.168.1.14",
      "ip" : "192.168.1.14",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "data", "ingest" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    }
  }
}
## 2. Temporarily disable shard allocation
$ curl -u elastic:your_password -X PUT "http://192.168.1.10:9200/_cluster/settings" -H "Content-Type: application/json" -d '{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}'
{
  "acknowledged" : true,
  "persistent" : { },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "enable" : "none"
        }
      }
    }
  }
}
## 3. Restart the node
# Run on the node being restarted
$ systemctl restart elasticsearch
## 4. Re-enable shard allocation
$ curl -u elastic:your_password -X PUT "http://192.168.1.10:9200/_cluster/settings" -H "Content-Type: application/json" -d '{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}'
{
  "acknowledged" : true,
  "persistent" : { },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "enable" : "all"
        }
      }
    }
  }
}
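The disable, restart, re-enable sequence above is the same for every node, so the request body is worth generating rather than retyping. The sketch below builds it and validates the value first; `allocation_body` and `set_allocation` are hypothetical helpers, and `ES_URL`, `ES_USER`, and `ES_PASS` are assumed environment variables, not settings read by ElasticSearch.

```shell
#!/usr/bin/env bash
# Build the JSON body for PUT _cluster/settings that sets cluster.routing.allocation.enable.
# Accepts only the values the setting supports; anything else is rejected.
allocation_body() {
  local value=$1
  case "$value" in
    all|primaries|new_primaries|none)
      printf '{"transient":{"cluster.routing.allocation.enable":"%s"}}\n' "$value"
      ;;
    *)
      echo "unsupported allocation value: $value" >&2
      return 1
      ;;
  esac
}

# Send the setting to the cluster (defined only, not executed here).
# Assumed environment: ES_URL (e.g. http://192.168.1.10:9200), ES_USER, ES_PASS.
set_allocation() {
  local body
  body=$(allocation_body "$1") || return 1
  curl -s -u "${ES_USER}:${ES_PASS}" -X PUT "${ES_URL}/_cluster/settings" \
    -H "Content-Type: application/json" -d "$body"
}

allocation_body none
allocation_body all
```

A restart of one node would then be `set_allocation none`, the `systemctl restart elasticsearch` on that node, and `set_allocation all` once it has rejoined.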
4.3 Cluster Expansion in Practice
ElasticSearch cluster expansion:
## 1. Add a new data node
# Prepare the new node: 192.168.1.15
## 2. Install ElasticSearch
$ rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
$ cat > /etc/yum.repos.d/elasticsearch.repo << EOF
[elasticsearch-8.x]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
$ yum install -y elasticsearch
## 3. Configure the new node
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: data-3
node.roles: [data, ingest]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.15
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
bootstrap.memory_lock: true
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
## 4. Copy the certificate to the new node
$ scp root@192.168.1.10:/etc/elasticsearch/elastic-certificates.p12 /etc/elasticsearch/
## 5. Configure the JVM heap size
$ vi /etc/elasticsearch/jvm.options
-Xms16g
-Xmx16g
## 6. Start the new node
$ systemctl daemon-reload
$ systemctl enable elasticsearch
$ systemctl start elasticsearch
## 7. Check that the new node has joined the cluster
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cat/nodes?v"
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.1.10 25 75 2 0.10 0.15 0.12 m * master-1
192.168.1.11 22 70 1 0.05 0.10 0.08 m - master-2
192.168.1.12 20 68 1 0.08 0.12 0.10 m - master-3
192.168.1.13 18 65 1 0.05 0.08 0.09 di - data-1
192.168.1.14 15 62 1 0.03 0.06 0.07 di - data-2
192.168.1.15 10 55 1 0.02 0.04 0.05 di - data-3
## 8. Wait for shard reallocation to finish
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cluster/health?wait_for_status=green&timeout=1m&pretty"
{
"cluster_name" : "fgedu-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 6,
"number_of_data_nodes" : 3,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
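A new node can take a short while to appear after the service starts, so polling is more reliable than a single check. The sketch below waits for a node name to show up in `_cat/nodes?h=name`; `node_joined`, `fetch_node_names`, and `wait_for_node` are hypothetical helpers, and `ES_URL`, `ES_USER`, and `ES_PASS` are assumed environment variables.

```shell
#!/usr/bin/env bash
# Return 0 if the given node name appears in a list of names read from stdin.
# Trailing whitespace is stripped because _cat output may pad its columns.
node_joined() {
  local name=$1
  sed 's/[[:space:]]*$//' | grep -Fxq -- "$name"
}

# Fetch one node name per line from the cluster.
# Assumed environment: ES_URL (e.g. http://192.168.1.10:9200), ES_USER, ES_PASS.
fetch_node_names() {
  curl -s -u "${ES_USER}:${ES_PASS}" "${ES_URL}/_cat/nodes?h=name"
}

# Poll every 2 seconds until the node appears or the attempts run out.
wait_for_node() {
  local name=$1 attempts=${2:-30} i
  for (( i = 0; i < attempts; i++ )); do
    if fetch_node_names | node_joined "$name"; then
      return 0
    fi
    sleep 2
  done
  return 1
}

# Example (not executed here):
#   wait_for_node data-3 && echo "data-3 joined"
```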
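Part05 - Fengge's Lessons Learned
5.1 Cluster Setup Best Practices
ElasticSearch cluster setup best practices:
- Separate node roles:
  - Separate master nodes from data nodes
  - Separate coordinating nodes from data nodes
  - Adjust node roles according to load
- Hardware configuration:
  - Master nodes: large memory, moderate CPU
  - Data nodes: large memory, multi-core CPU, SSD storage
  - Coordinating nodes: medium memory, multi-core CPU
- Network configuration:
  - Use a low-latency network
  - Set appropriate network parameters
  - Keep the network between nodes stable
- Security configuration:
  - Enable the X-Pack security features
  - Set strong passwords
  - Configure SSL/TLS encryption
  - Restrict access permissions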
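5.2 Common Problems and Solutions
Common problems when setting up an ElasticSearch cluster, with solutions:
## 1. The cluster does not form
- Error message: master not discovered yet
- Solutions:
  - Check the discovery.seed_hosts setting
  - Check the network.host setting
  - Check the firewall settings
  - Check network connectivity between nodes
## 2. A node cannot join the cluster
- Error message: failed to join cluster
- Solutions:
  - Check that the cluster name is identical on all nodes
  - Check the node role configuration
  - Check the SSL certificate configuration
  - Check network connectivity
## 3. Cluster status is yellow or red
- Error message: cluster status is yellow/red
- Solutions:
  - Check shard allocation
  - Check node status
  - Check disk space
  - Check hardware resources
## 4. Master election fails
- Error message: master election failed
- Solutions:
  - Make sure enough master-eligible nodes are available to form a majority
  - Check network connectivity between nodes
  - Check the node configuration
  - Restart the node
## 5. Shard allocation fails
- Error message: shard allocation failed
- Solutions:
  - Check disk space
  - Check node status
  - Check the shard allocation settings
  - Manually trigger shard allocation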
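Several of the checks above come down to comparing elasticsearch.yml between nodes. The sketch below pulls out the two settings most often involved in discovery problems; `seed_hosts` and `cluster_name` are hypothetical helpers, and the parsing assumes the single-line `["a", "b"]` list form used in this tutorial, not the multi-line YAML list form.

```shell
#!/usr/bin/env bash
# Print the entries of a single-line discovery.seed_hosts list, one per line.
seed_hosts() {
  local file=$1
  sed -n 's/^discovery\.seed_hosts:[[:space:]]*\[\(.*\)\].*/\1/p' "$file" \
    | tr ',' '\n' | tr -d '" '
}

# Print the configured cluster name.
cluster_name() {
  local file=$1
  sed -n 's/^cluster\.name:[[:space:]]*//p' "$file"
}

# Example (run on each node and compare the results):
#   cluster_name /etc/elasticsearch/elasticsearch.yml
#   seed_hosts /etc/elasticsearch/elasticsearch.yml
```

If the cluster name differs on one node, or a seed host is missing or mistyped, that node will keep reporting that no master has been discovered.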
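5.3 Cluster Maintenance Recommendations
ElasticSearch cluster maintenance recommendations:
- Regular checks:
  - Cluster health status
  - Node status
  - Disk space
  - Memory usage
- Backup strategy:
  - Take snapshot backups on a regular schedule
  - Store backups on external storage
  - Test restoring from backups
- Upgrade strategy:
  - Use rolling upgrades
  - Validate in a test environment first
  - Back up data before upgrading
- Monitoring and alerting:
  - Set up a monitoring system
  - Configure alerts on key metrics
  - Review monitoring data regularly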
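Disk space deserves particular attention in the regular checks because ElasticSearch changes its shard allocation behavior as disks fill up: by default the low, high, and flood-stage disk watermarks are 85%, 90%, and 95% of disk usage. The sketch below classifies a usage percentage against those defaults; `disk_level` and `data_path_usage` are hypothetical helpers, and a cluster with customized watermarks would need different thresholds.

```shell
#!/usr/bin/env bash
# Classify a disk usage percentage against the default disk watermarks
# (low 85%, high 90%, flood stage 95%).
disk_level() {
  local used=$1
  if [ "$used" -ge 95 ]; then
    echo "flood_stage"
  elif [ "$used" -ge 90 ]; then
    echo "high"
  elif [ "$used" -ge 85 ]; then
    echo "low"
  else
    echo "ok"
  fi
}

# Print the usage percentage (digits only) of the filesystem holding the given path.
data_path_usage() {
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Example (run on a data node; /es/fgdata is the path.data used in this tutorial):
#   disk_level "$(data_path_usage /es/fgdata)"
for pct in 50 86 92 97; do
  echo "$pct% -> $(disk_level "$pct")"
done
```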
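This article was compiled and published by Fengge Tutorials and is intended for learning and testing only. Please credit the source when reposting: http://www.fgedu.net.cn/10327.html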
