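ElasticSearch Tutorial FG005-ElasticSearch Cluster Setup and Node Planning in Practice

In this document, Feng Ge covers how to build an ElasticSearch cluster and plan its nodes: cluster concepts, node roles, how a cluster forms, cluster sizing, node role planning, network planning, cluster deployment, node configuration, cluster initialization, cluster health checks, node management, and cluster expansion. The tutorial draws on the "Discovery and cluster formation" and "Nodes" sections of the official ElasticSearch documentation. It is intended for DBAs and developers to use for learning and testing; verify every setting yourself before applying it in production.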

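Part01-Basic Concepts and Theory

1.1 Cluster concepts

An ElasticSearch cluster is a set of nodes that together store data and provide search and analytics services. The nodes communicate with each other over the network and act as a single logical unit.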

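1.2 Node roles

ElasticSearch node roles include:

  • Master node (Master): manages the cluster, including index creation, shard allocation, and tracking which nodes belong to the cluster
  • Data node (Data): stores data and executes searches, aggregations, and other data operations
  • Coordinating node (Coordinating): receives client requests, routes them to the relevant nodes, and returns the combined result
  • Ingest node: pre-processes documents before indexing, for example by transforming or enriching them
  • Machine learning node: runs machine learning jobs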

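1.3 How a cluster forms

An ElasticSearch cluster forms in four steps:

  1. Node discovery: each node finds the other nodes through the addresses listed in discovery.seed_hosts
  2. Master election: if there is no elected master, the master-eligible nodes hold an election
  3. Cluster state synchronization: the master maintains the cluster state and publishes it to every node
  4. Shard allocation: the master assigns shards to the data nodes according to the allocation rules
Feng Ge's tip: every node in the cluster has a unique name. The master node manages and coordinates the cluster, while the data nodes store and process data.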
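
The election step depends on a majority of master-eligible nodes agreeing. The following is a minimal planning sketch of that arithmetic, assuming a simple-majority model; the function names are illustrative and are not Elasticsearch APIs, and Elasticsearch manages its own voting configuration automatically.

```python
def quorum(master_eligible: int) -> int:
    """Smallest number of master-eligible nodes that forms a majority."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """Master-eligible nodes that can be lost while a majority still remains."""
    return master_eligible - quorum(master_eligible)
```

Under this model, three master-eligible nodes tolerate the loss of one, while two tolerate none, which is one reason the planning sections below settle on three master nodes.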

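Part02-Production Planning and Recommendations

2.1 Cluster sizing

ElasticSearch cluster sizing:

# Cluster sizing

## 1. Small cluster (3-5 nodes)
- Use case: small to medium applications, less than 100GB of data
- Recommended configuration:
  - 3 master nodes (for high availability)
  - 2-3 data nodes
  - 8-16GB of memory per node
  - SSD storage

## 2. Medium cluster (6-10 nodes)
- Use case: medium applications, 100GB-1TB of data
- Recommended configuration:
  - 3 master nodes
  - 4-7 data nodes
  - 16-32GB of memory per node
  - SSD storage

## 3. Large cluster (10+ nodes)
- Use case: large applications, more than 1TB of data
- Recommended configuration:
  - 3-5 master nodes
  - 10+ data nodes
  - 32-64GB of memory per node
  - SSD or NVMe storage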
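
The sizing tiers above can be written as a small lookup. This is only a sketch of the table: the function and dictionary names are hypothetical, the boundaries (100GB counts as medium, 1TB is taken as 1024GB) are assumptions where the text leaves them open, and real sizing also depends on query load and retention.

```python
# Recommendations copied from the sizing tiers above.
TIERS = {
    "small": {"master_nodes": "3", "data_nodes": "2-3",
              "ram_per_node_gb": "8-16", "storage": "SSD"},
    "medium": {"master_nodes": "3", "data_nodes": "4-7",
               "ram_per_node_gb": "16-32", "storage": "SSD"},
    "large": {"master_nodes": "3-5", "data_nodes": "10+",
              "ram_per_node_gb": "32-64", "storage": "SSD or NVMe"},
}


def cluster_tier(data_gb: float) -> str:
    """Map an expected data volume in GB to a sizing tier."""
    if data_gb < 0:
        raise ValueError("data volume cannot be negative")
    if data_gb < 100:
        return "small"
    if data_gb <= 1024:  # assumption: 1TB is treated as 1024GB
        return "medium"
    return "large"
```

For example, `TIERS[cluster_tier(500)]` returns the medium-cluster recommendation.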

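2.2 Node role planning

ElasticSearch node role planning:

  • Master nodes:
    • Use 3 for high availability
    • Give them plenty of memory (16GB+)
    • CPU requirements are modest
    • They hold no index data
  • Data nodes:
    • Size the count by data volume and query load
    • Give them plenty of memory (16GB+)
    • Multi-core CPU
    • SSD storage
  • Coordinating nodes:
    • Size the count by client request volume
    • Moderate memory (8GB+)
    • Multi-core CPU
    • They hold no data
  • Ingest nodes:
    • Size the count by the data processing workload
    • Moderate memory (8GB+)
    • Multi-core CPU
    • They hold no data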

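2.3 Network planning

ElasticSearch cluster network planning:

  • Network topology:
    • Use a dedicated network segment
    • Keep latency between nodes low (< 1ms)
    • Avoid stretching one cluster across data centers (use cross-cluster replication instead)
  • Ports:
    • 9200: HTTP API access
    • 9300: node-to-node (transport) communication
    • Make sure the firewall allows both ports
  • Network parameters:
    • Tune TCP parameters for network performance
    • Enable TCP keepalive
    • Adjust network buffer sizes
Production recommendation: network planning is critical to the stability of an ElasticSearch cluster. Use a low-latency, high-bandwidth network and make sure the connections between all nodes are stable.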
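
Whether the firewall actually allows ports 9200 and 9300 can be checked from another node with a plain TCP connect. This is a minimal sketch using only the Python standard library; the helper names are hypothetical, and a successful connect shows only that the port is reachable, not that ElasticSearch is healthy.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_node(host: str, ports=(9200, 9300), timeout: float = 2.0) -> dict:
    """Check each port on one host and return {port: reachable}."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Calling `check_node("192.168.1.10")` from each of the other nodes gives a quick view of which of the two ports are blocked.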

Part03-Production Implementation

3.1 Cluster deployment in practice

ElasticSearch cluster deployment:

# Cluster deployment steps

## 1. Prepare the servers
# Node 1 (master node): 192.168.1.10
# Node 2 (master node): 192.168.1.11
# Node 3 (master node): 192.168.1.12
# Node 4 (data node): 192.168.1.13
# Node 5 (data node): 192.168.1.14

## 2. Install ElasticSearch on all nodes
# Import the GPG key
$ rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

# Create the yum repository
$ cat > /etc/yum.repos.d/elasticsearch.repo << EOF
[elasticsearch-8.x]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

# Install ElasticSearch
$ yum install -y elasticsearch

## 3. Configure master node 1 (192.168.1.10)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: master-1
node.roles: [master]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.10
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
# Remove cluster.initial_master_nodes once the cluster has formed for the first time
cluster.initial_master_nodes: ["master-1", "master-2", "master-3"]
bootstrap.memory_lock: true
xpack.security.enabled: true

## 4. Configure master node 2 (192.168.1.11)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: master-2
node.roles: [master]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.11
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
cluster.initial_master_nodes: ["master-1", "master-2", "master-3"]
bootstrap.memory_lock: true
xpack.security.enabled: true

## 5. Configure master node 3 (192.168.1.12)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: master-3
node.roles: [master]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.12
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
cluster.initial_master_nodes: ["master-1", "master-2", "master-3"]
bootstrap.memory_lock: true
xpack.security.enabled: true

## 6. Configure data node 1 (192.168.1.13)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: data-1
node.roles: [data, ingest]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.13
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
bootstrap.memory_lock: true
xpack.security.enabled: true

## 7. Configure data node 2 (192.168.1.14)
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: data-2
node.roles: [data, ingest]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.14
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
bootstrap.memory_lock: true
xpack.security.enabled: true
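
The five configuration files above differ only in node name, IP address, roles, and whether cluster.initial_master_nodes is present. A short generator can make that pattern explicit and reduce copy-paste mistakes. This is a sketch: `render_config` is a hypothetical helper, and the setting values are taken from the steps above.

```python
SEED_HOSTS = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
INITIAL_MASTERS = ["master-1", "master-2", "master-3"]


def _quoted_list(items):
    """Render a list as a YAML flow sequence of double-quoted strings."""
    return "[" + ", ".join(f'"{item}"' for item in items) + "]"


def render_config(name: str, ip: str, roles: list) -> str:
    """Build the elasticsearch.yml text for one node of fgedu-cluster."""
    lines = [
        "cluster.name: fgedu-cluster",
        f"node.name: {name}",
        f"node.roles: [{', '.join(roles)}]",
        "path.data: /es/fgdata",
        "path.logs: /var/log/elasticsearch",
        f"network.host: {ip}",
        "http.port: 9200",
        "transport.port: 9300",
        f"discovery.seed_hosts: {_quoted_list(SEED_HOSTS)}",
    ]
    # Only master-eligible nodes carry the bootstrap list.
    if "master" in roles:
        lines.append(f"cluster.initial_master_nodes: {_quoted_list(INITIAL_MASTERS)}")
    lines += ["bootstrap.memory_lock: true", "xpack.security.enabled: true"]
    return "\n".join(lines) + "\n"
```

For instance, `render_config("data-1", "192.168.1.13", ["data", "ingest"])` produces the file shown in step 6.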

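3.2 Node configuration in practice

ElasticSearch node configuration:

# Node configuration

## 1. Configure the JVM heap size
# Run on all nodes
$ vi /etc/elasticsearch/jvm.options
# Set the heap size (minimum and maximum must match)
-Xms16g
-Xmx16g

## 2. Configure kernel parameters
# Run on all nodes
$ vi /etc/sysctl.conf
# Add the following parameter
vm.max_map_count=262144

$ sysctl -p

## 3. Configure file descriptors and memory locking
# Run on all nodes
$ vi /etc/security/limits.conf
# Add the following lines
elasticsearch soft nofile 65536
elasticsearch hard nofile 65536
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited

# Note: limits.conf is not applied to services started by systemd. With the RPM
# package, also set the memlock limit in a systemd override, otherwise
# bootstrap.memory_lock: true cannot take effect
$ systemctl edit elasticsearch
[Service]
LimitMEMLOCK=infinity

## 4. Start the service
# Run on all nodes
$ systemctl daemon-reload
$ systemctl enable elasticsearch
$ systemctl start elasticsearch

## 5. Check the service status
# Run on all nodes
$ systemctl status elasticsearch
● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2026-04-07 10:00:00 CST; 1min ago
     Docs: https://www.elastic.co
 Main PID: 12345 (java)
    Tasks: 68
   Memory: 2.1G
   CGroup: /system.slice/elasticsearch.service
           └─12345 /usr/share/elasticsearch/jdk/bin/java -Xms16g -Xmx16g -XX:+UseG1GC -XX:MaxGCPauseMillis=200…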
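
The 16g heap in step 1 suits a machine with about 32GB of RAM. As a rough sketch of how the value might be derived for other machines, the helper below gives the heap half of physical memory and caps it. The 30GB cap is an assumed rule-of-thumb value for staying under the JVM compressed-pointers threshold, and the function names are hypothetical; verify the result against your own JVM.

```python
def recommended_heap_gb(ram_gb: int, cap_gb: int = 30) -> int:
    """Half of physical RAM, capped at cap_gb (assumed cap, see lead-in)."""
    if ram_gb < 2:
        raise ValueError("at least 2GB of RAM is assumed")
    return max(1, min(ram_gb // 2, cap_gb))


def jvm_options(heap_gb: int) -> str:
    """Render matching -Xms/-Xmx lines for jvm.options."""
    return f"-Xms{heap_gb}g\n-Xmx{heap_gb}g\n"
```

`jvm_options(recommended_heap_gb(32))` yields the two lines used in step 1; the remaining memory is left to the operating system's file cache.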

3.3 Cluster initialization in practice

ElasticSearch cluster initialization:

# Cluster initialization

## 1. Generate the certificate
# Run on master node 1
$ /usr/share/elasticsearch/bin/elasticsearch-certutil cert -out /etc/elasticsearch/elastic-certificates.p12 -pass ""

## 2. Copy the certificate to all nodes
# Run on master node 1
$ scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.11:/etc/elasticsearch/
$ scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.12:/etc/elasticsearch/
$ scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.13:/etc/elasticsearch/
$ scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.14:/etc/elasticsearch/

# On every node, make sure the elasticsearch user can read the file
$ chown root:elasticsearch /etc/elasticsearch/elastic-certificates.p12
$ chmod 640 /etc/elasticsearch/elastic-certificates.p12

## 3. Configure SSL
# Run on all nodes
$ vi /etc/elasticsearch/elasticsearch.yml
# Add the following settings
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12

## 4. Restart all nodes
# Run on all nodes
$ systemctl restart elasticsearch

## 5. Set passwords for the built-in users
# Run on master node 1
# Note: elasticsearch-setup-passwords is deprecated in 8.x; elasticsearch-reset-password is its replacement
$ /usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
You will be prompted to enter passwords as the process progresses.
Please confirm that you would like to continue [y/N]y

Enter password for [elastic]:
Reenter password for [elastic]:
Enter password for [apm_system]:
Reenter password for [apm_system]:
Enter password for [kibana_system]:
Reenter password for [kibana_system]:
Enter password for [logstash_system]:
Reenter password for [logstash_system]:
Enter password for [beats_system]:
Reenter password for [beats_system]:
Enter password for [remote_monitoring_user]:
Reenter password for [remote_monitoring_user]:
Changed password for user [apm_system]
Changed password for user [kibana_system]
Changed password for user [kibana]
Changed password for user [logstash_system]
Changed password for user [beats_system]
Changed password for user [remote_monitoring_user]
Changed password for user [elastic]

Part04-Production Cases and Walkthroughs

4.1 Cluster health checks in practice

ElasticSearch cluster health checks:

# Cluster health checks

## 1. Check the cluster health
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cluster/health?pretty"
{
  "cluster_name" : "fgedu-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 5,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

## 2. Check node information
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cat/nodes?v"
ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.1.10           25          75   2    0.10    0.15     0.12 m         *      master-1
192.168.1.11           22          70   1    0.05    0.10     0.08 m         -      master-2
192.168.1.12           20          68   1    0.08    0.12     0.10 m         -      master-3
192.168.1.13           18          65   1    0.05    0.08     0.09 di        -      data-1
192.168.1.14           15          62   1    0.03    0.06     0.07 di        -      data-2

## 3. Check the cluster state
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cluster/state?pretty"
{
  "cluster_name" : "fgedu-cluster",
  "cluster_uuid" : "cluster_uuid",
  "version" : 1,
  "state_uuid" : "state_uuid",
  "master_node" : "master-1",
  "nodes" : {
    "master-1" : {
      "name" : "master-1",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.10:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    },
    "master-2" : {
      "name" : "master-2",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.11:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    },
    "master-3" : {
      "name" : "master-3",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.12:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    },
    "data-1" : {
      "name" : "data-1",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.13:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    },
    "data-2" : {
      "name" : "data-2",
      "ephemeral_id" : "ephemeral_id",
      "transport_address" : "192.168.1.14:9300",
      "attributes" : {
        "ml.machine_memory" : "33554432000",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      }
    }
  }
}
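
The health response from step 1 is easy to check by script. The sketch below interprets a `_cluster/health` JSON body; the function name and message wording are illustrative. Yellow means every primary shard is assigned but at least one replica is not, and red means at least one primary shard is unassigned.

```python
import json
from typing import List


def summarize_health(payload: str, expected_nodes: int) -> List[str]:
    """Return a list of problems found in a _cluster/health JSON body."""
    health = json.loads(payload)
    problems = []
    status = health.get("status")
    if status == "yellow":
        problems.append("status yellow: some replica shards are unassigned")
    elif status == "red":
        problems.append("status red: at least one primary shard is unassigned")
    elif status != "green":
        problems.append(f"unexpected status: {status!r}")
    nodes = health.get("number_of_nodes", 0)
    if nodes < expected_nodes:
        problems.append(f"only {nodes} of {expected_nodes} nodes have joined")
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        problems.append(f"{unassigned} unassigned shards")
    return problems
```

An empty list means the cluster is green with all expected nodes present, which for this deployment means `expected_nodes=5`.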

4.2 Node management in practice

ElasticSearch node management:

# Node management

## 1. View detailed node information
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_nodes?pretty"
{
  "_nodes" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "nodes" : {
    "master-1" : {
      "name" : "master-1",
      "transport_address" : "192.168.1.10:9300",
      "host" : "192.168.1.10",
      "ip" : "192.168.1.10",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "master" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    },
    "master-2" : {
      "name" : "master-2",
      "transport_address" : "192.168.1.11:9300",
      "host" : "192.168.1.11",
      "ip" : "192.168.1.11",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "master" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    },
    "master-3" : {
      "name" : "master-3",
      "transport_address" : "192.168.1.12:9300",
      "host" : "192.168.1.12",
      "ip" : "192.168.1.12",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "master" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    },
    "data-1" : {
      "name" : "data-1",
      "transport_address" : "192.168.1.13:9300",
      "host" : "192.168.1.13",
      "ip" : "192.168.1.13",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "data", "ingest" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    },
    "data-2" : {
      "name" : "data-2",
      "transport_address" : "192.168.1.14:9300",
      "host" : "192.168.1.14",
      "ip" : "192.168.1.14",
      "version" : "8.7.0",
      "build_flavor" : "default",
      "build_type" : "rpm",
      "build_hash" : "build_hash",
      "roles" : [ "data", "ingest" ],
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 4294967296,
          "heap_used_percent" : 25,
          "heap_committed_in_bytes" : 17179869184,
          "heap_max_in_bytes" : 17179869184
        }
      }
    }
  }
}

## 2. Temporarily disable shard allocation
$ curl -u elastic:your_password -X PUT "http://192.168.1.10:9200/_cluster/settings" -H "Content-Type: application/json" -d '{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}'
{
  "acknowledged" : true,
  "persistent" : { },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "enable" : "none"
        }
      }
    }
  }
}

## 3. Restart the node
# Run on the node being maintained
$ systemctl restart elasticsearch

## 4. Re-enable shard allocation
$ curl -u elastic:your_password -X PUT "http://192.168.1.10:9200/_cluster/settings" -H "Content-Type: application/json" -d '{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}'
{
  "acknowledged" : true,
  "persistent" : { },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "enable" : "all"
        }
      }
    }
  }
}
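
Steps 2 and 4 send the same request with a different value, so a script that restarts nodes one at a time can share a single helper. This sketch builds the request with the Python standard library without sending it; `allocation_request` is a hypothetical name, and the base URL and credentials are the placeholders used in the curl commands above.

```python
import base64
import json
import urllib.request

# Values accepted by cluster.routing.allocation.enable.
VALID_MODES = {"all", "primaries", "new_primaries", "none"}


def allocation_request(base_url: str, user: str, password: str,
                       mode: str) -> urllib.request.Request:
    """Build (but do not send) the PUT that sets shard allocation mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported allocation mode: {mode}")
    body = json.dumps(
        {"transient": {"cluster.routing.allocation.enable": mode}}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_cluster/settings",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it. Building with `"none"` before the restart and `"all"` afterwards mirrors steps 2 and 4.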

4.3 Cluster expansion in practice

ElasticSearch cluster expansion:

# Cluster expansion

## 1. Add a new data node
# Prepare the new node: 192.168.1.15

## 2. Install ElasticSearch
$ rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
$ cat > /etc/yum.repos.d/elasticsearch.repo << EOF
[elasticsearch-8.x]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
$ yum install -y elasticsearch

## 3. Configure the new node
$ vi /etc/elasticsearch/elasticsearch.yml
cluster.name: fgedu-cluster
node.name: data-3
node.roles: [data, ingest]
path.data: /es/fgdata
path.logs: /var/log/elasticsearch
network.host: 192.168.1.15
http.port: 9200
transport.port: 9300
discovery.seed_hosts: ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
bootstrap.memory_lock: true
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12

## 4. Copy the certificate to the new node
$ scp root@192.168.1.10:/etc/elasticsearch/elastic-certificates.p12 /etc/elasticsearch/

## 5. Configure the JVM heap size
$ vi /etc/elasticsearch/jvm.options
-Xms16g
-Xmx16g

## 6. Start the new node
$ systemctl daemon-reload
$ systemctl enable elasticsearch
$ systemctl start elasticsearch

## 7. Check that the new node has joined the cluster
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cat/nodes?v"
ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.1.10           25          75   2    0.10    0.15     0.12 m         *      master-1
192.168.1.11           22          70   1    0.05    0.10     0.08 m         -      master-2
192.168.1.12           20          68   1    0.08    0.12     0.10 m         -      master-3
192.168.1.13           18          65   1    0.05    0.08     0.09 di        -      data-1
192.168.1.14           15          62   1    0.03    0.06     0.07 di        -      data-2
192.168.1.15           10          55   1    0.02    0.04     0.05 di        -      data-3

## 8. Wait for shard rebalancing to finish
$ curl -u elastic:your_password -X GET "http://192.168.1.10:9200/_cluster/health?wait_for_status=green&timeout=1m"
{
  "cluster_name" : "fgedu-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 6,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Feng Ge's tip: when the cluster is expanded, the new node joins automatically and takes part in shard allocation without manual intervention. Expand the cluster during off-peak hours to limit the impact on production.
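
The join check in step 7 can be automated by parsing the `_cat/nodes?v` text. This is a minimal sketch with hypothetical function names; it assumes every column is non-empty so that splitting on whitespace lines up with the header row.

```python
def parse_cat_nodes(text: str) -> list:
    """Turn `_cat/nodes?v` output into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def has_joined(text: str, node_name: str) -> bool:
    """True if a node with the given name appears in the listing."""
    return any(row.get("name") == node_name for row in parse_cat_nodes(text))
```

Requesting `_cat/nodes?format=json` instead returns JSON and avoids text parsing altogether; the text form is used here only because it matches the output shown in step 7.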

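Part05-Feng Ge's Lessons Learned

5.1 Cluster setup best practices

ElasticSearch cluster setup best practices:

  • Separate node roles:
    • Keep master nodes separate from data nodes
    • Keep coordinating nodes separate from data nodes
    • Adjust node roles according to load
  • Hardware:
    • Master nodes: high memory, moderate CPU
    • Data nodes: high memory, multi-core CPU, SSD storage
    • Coordinating nodes: moderate memory, multi-core CPU
  • Network:
    • Use a low-latency network
    • Set appropriate network parameters
    • Keep the network between nodes stable
  • Security:
    • Enable the X-Pack security features
    • Set strong passwords
    • Configure SSL/TLS encryption
    • Restrict access permissions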

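5.2 Common problems and solutions

Common problems when building an ElasticSearch cluster, and how to resolve them:

# Common problems and solutions

## 1. The cluster does not form
- Error message: master not discovered yet
- What to check:
  - The discovery.seed_hosts setting
  - The network.host setting
  - Firewall rules
  - Network connectivity between nodes

## 2. A node cannot join the cluster
- Error message: failed to join cluster
- What to check:
  - That the cluster name is identical on every node
  - The node role configuration
  - The SSL certificate configuration
  - Network connectivity

## 3. Cluster status is yellow or red
- Error message: cluster status is yellow/red
- What to check:
  - Shard allocation
  - Node status
  - Disk space
  - Hardware resources

## 4. Master election fails
- Error message: master election failed
- What to do:
  - Make sure there are enough master-eligible nodes
  - Check network connectivity between nodes
  - Check the node configuration
  - Restart the node

## 5. Shard allocation fails
- Error message: shard allocation failed
- What to do:
  - Check disk space
  - Check node status
  - Check the shard allocation settings
  - Trigger shard allocation manually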

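5.3 Cluster maintenance recommendations

ElasticSearch cluster maintenance recommendations:

  • Regular checks:
    • Cluster health
    • Node status
    • Disk space
    • Memory usage
  • Backup strategy:
    • Take snapshot backups on a regular schedule
    • Store backups on external storage
    • Test restoring from backup
  • Upgrade strategy:
    • Use rolling upgrades
    • Validate in a test environment first
    • Back up data before upgrading
  • Monitoring and alerting:
    • Set up a monitoring system
    • Configure alerts on key metrics
    • Review monitoring data regularly
Continuous improvement: maintaining an ElasticSearch cluster is an ongoing process that needs regular checks and tuning. Establish a thorough maintenance procedure to keep the cluster running reliably.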

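This article was compiled and published by Feng Ge Tutorials (风哥教程) for learning and testing only. When reposting, credit the source: http://www.fgedu.net.cn/10327.html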
