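Rancher Tutorial FG003-Rancher HA Architecture (ETCD+Nginx) Deployment in Practice
In this document, Fengge (风哥) walks through a hands-on deployment of the Rancher high-availability (HA) architecture built on ETCD and Nginx. It covers Rancher HA concepts, architecture, requirements, hardware planning, network planning, storage planning, Nginx load balancer installation, ETCD cluster deployment, Rancher HA deployment, HA validation, failover testing, and HA backup. The tutorial draws on the installation, upgrade, and high-availability sections of the official Rancher documentation. It is intended for operations engineers to use for learning and testing; verify it yourself before applying it to a production environment.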
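Part01-Basic Concepts and Theory
1.1 Rancher HA Concepts
Rancher high availability means deploying multiple Rancher Server instances, with Nginx as the load balancer and ETCD as the distributed key-value store, so that the Rancher management platform stays available. An HA architecture avoids a single point of failure and improves the reliability and availability of the system: when one node fails, the remaining nodes keep serving requests and business continuity is preserved. This architecture suits production environments with high availability requirements.
- Multi-node deployment avoids a single point of failure
- Automatic failover improves availability
- Load balancing improves performance
- Data redundancy protects the data
- Suitable for production deployments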
1.2 Rancher HA Architecture
Rancher HA architecture:
┌─────────────────────────────────────────────────────────┐
│                   Nginx load balancer                   │
│                   192.168.1.10:80/443                   │
└────────────────────────────┬────────────────────────────┘
                             │
              ┌──────────────┼──────────────┐
              │              │              │
       ┌──────▼─────┐ ┌──────▼─────┐ ┌──────▼─────┐
       │ Rancher-1  │ │ Rancher-2  │ │ Rancher-3  │
       │192.168.1.11│ │192.168.1.12│ │192.168.1.13│
       │            │ │            │ │            │
       │ - Rancher  │ │ - Rancher  │ │ - Rancher  │
       │ - Agent    │ │ - Agent    │ │ - Agent    │
       └──────┬─────┘ └──────┬─────┘ └──────┬─────┘
              │              │              │
              └──────────────┼──────────────┘
                             │
              ┌──────────────┼──────────────┐
              │              │              │
       ┌──────▼─────┐ ┌──────▼─────┐ ┌──────▼─────┐
       │ ETCD-1     │ │ ETCD-2     │ │ ETCD-3     │
       │192.168.1.11│ │192.168.1.12│ │192.168.1.13│
       │            │ │            │ │            │
       │ - ETCD     │ │ - ETCD     │ │ - ETCD     │
       │ - Data     │ │ - Data     │ │ - Data     │
       └────────────┘ └────────────┘ └────────────┘
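# Architecture notes
1. Nginx load balancer: distributes requests across the Rancher Server instances
2. Rancher Server: provides the Rancher management functions; every node runs a Rancher container
3. ETCD cluster: distributed key-value store that holds the Rancher configuration data
4. Load-balancing strategy: round robin, least connections, IP hash, and others are supported
5. Failover: when a node fails, traffic is switched to the other nodes automatically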
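1.3 Rancher HA Requirements
Rancher HA requirements:
- Node count: at least 3 nodes; an odd number is recommended
- Network: all nodes can reach each other, with latency below 10 ms
- Time synchronization: clocks on all nodes are synchronized to within 1 second
- Storage: each node has its own dedicated storage with snapshot support
- Load balancing: Nginx or HAProxy is configured as the load balancer
- Certificates: one shared SSL certificate is used across the nodes
- Monitoring and alerting: a monitoring and alerting system is configured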
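The recommendation for an odd number of nodes follows from how ETCD reaches quorum: a cluster of N members needs floor(N/2)+1 of them to agree, so it tolerates N minus that many failures. The sketch below works this out for a few cluster sizes. The function names are illustrative and are not part of any ETCD tooling.

```shell
#!/usr/bin/env bash
# Quorum size for an etcd cluster with N members: floor(N/2) + 1.
etcd_quorum() {
  local n=$1
  echo $(( n / 2 + 1 ))
}

# Number of members that can fail while the cluster keeps quorum.
etcd_fault_tolerance() {
  local n=$1
  echo $(( n - (n / 2 + 1) ))
}

for n in 3 4 5; do
  echo "members=$n quorum=$(etcd_quorum "$n") tolerated_failures=$(etcd_fault_tolerance "$n")"
done
```

A 4-member cluster tolerates the same single failure as a 3-member one, which is why an even size adds cost without adding resilience.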
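Part02-Production Environment Planning and Recommendations
2.1 Rancher HA Hardware Planning
Rancher HA hardware planning:
# Node planning
Node count: 3 (minimum) or 5 (recommended)
Node specification:
- CPU: 8 cores or more
- Memory: 16 GB or more
- Disk: 200 GB SSD or more
- Network: Gigabit NIC or faster
# Nginx load balancer planning
CPU: 4 cores or more
Memory: 8 GB or more
Disk: 50 GB SSD or more
Network: Gigabit NIC or faster
# Disk partition planning
/dev/sda1    /                  100GB SSD    system disk
/dev/sdb1    /Rancher/fgdata    200GB SSD    data disk
/dev/sdc1    /var/log           50GB SSD     log disk
# Disk I/O requirements
Random read IOPS: > 10000
Random write IOPS: > 5000
Sequential read/write: > 500 MB/s
# Network bandwidth requirements
Inter-node bandwidth: > 1 Gbps
External access bandwidth: > 1 Gbps
Network latency: < 10 ms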
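2.2 Rancher HA Network Planning
Rancher HA network planning:
# IP address planning
Management network: 192.168.1.0/24
Nginx load balancer: 192.168.1.10
Rancher-1: 192.168.1.11
Rancher-2: 192.168.1.12
Rancher-3: 192.168.1.13
Gateway: 192.168.1.1
DNS: 8.8.8.8, 8.8.4.4
# Port planning
Nginx load balancer:
80/tcp: HTTP access
443/tcp: HTTPS access
Rancher nodes:
80/tcp: HTTP access
443/tcp: HTTPS access
6443/tcp: Kubernetes API
2376/tcp: Docker API
2379/tcp: ETCD client
2380/tcp: ETCD peer communication
# Firewall rules
Open ports 80/tcp and 443/tcp
Restrict the source IPs allowed to connect
Configure port-forwarding rules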
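As one possible way to apply the port plan while restricting source IPs, the sketch below prints firewalld rich rules rather than running them, so they can be reviewed first. It assumes the nodes use firewalld (the tutorial does not say which firewall is in use) and that only the 192.168.1.0/24 management network should reach the ETCD ports; `firewall_rules` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print (do not execute) firewalld commands that open the given TCP ports
# to a single source network. Assumption: firewalld is the active firewall.
firewall_rules() {
  local source_cidr=$1
  shift
  local port
  for port in "$@"; do
    echo "firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" source address=\"${source_cidr}\" port port=\"${port}\" protocol=\"tcp\" accept'"
  done
  # Permanent rules only take effect after a reload.
  echo "firewall-cmd --reload"
}

# ETCD client and peer ports, reachable from the management network only.
firewall_rules 192.168.1.0/24 2379 2380
```

Piping the reviewed output to `bash` on each Rancher node would apply the rules; the same helper can be called with `80 443` on the load balancer.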
# Network performance requirements
Network latency: < 10 ms (same data center)
Network bandwidth: > 1 Gbps
Network availability: 99.9% or higher
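2.3 Rancher HA Storage Planning
Rancher HA storage planning:
# Storage directory planning
/Rancher/app       installation directory
/Rancher/fgdata    data directory
/docker            Docker data
/rancher           Rancher data
/etcd              ETCD data
/backups           backup data
/var/log           log directory
# Storage capacity planning (per node)
Docker data: 100 GB
Rancher data: 50 GB
ETCD data: 20 GB
Backup data: 50 GB
Log data: 20 GB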
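It is worth adding up the per-node capacity plan against the disk it lands on. The sketch below assumes that the Docker, Rancher, ETCD, and backup data all live on the 200 GB data disk mounted at /Rancher/fgdata and that logs go to the separate /var/log disk; that placement is an assumption, since the plan does not state it explicitly for every directory.

```shell
#!/usr/bin/env bash
# Sum a list of sizes given in GB.
sum_gb() {
  local total=0 size
  for size in "$@"; do
    total=$(( total + size ))
  done
  echo "$total"
}

data_disk_gb=200                      # /dev/sdb1 from the partition plan
planned_gb=$(sum_gb 100 50 20 50)     # Docker + Rancher + ETCD + backups

echo "planned=${planned_gb}GB data_disk=${data_disk_gb}GB"
if (( planned_gb > data_disk_gb )); then
  echo "shortfall=$(( planned_gb - data_disk_gb ))GB"
fi
```

Under that assumption the plan totals 220 GB, which is 20 GB more than the data disk, so either the disk needs to be larger or the backups need to be placed elsewhere.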
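# Storage performance requirements
SSD storage with high IOPS
Snapshot and backup support
Capacity expansion support
RAID protection
# Storage backup policy
Daily incremental backups
Weekly full backups
Off-site backups
Backups retained for 7 days
# ETCD storage requirements
ETCD data directory: /Rancher/fgdata/etcd
ETCD snapshot interval: every 1000 requests or every 1 hour
ETCD snapshots retained: 5
ETCD data compaction: enabled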
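The ETCD storage requirements can be expressed as settings in the same environment-file format used for /etc/etcd/etcd.conf. The sketch below is one possible mapping, not the author's configuration: etcd's snapshot trigger is count-based (`ETCD_SNAPSHOT_COUNT`), so the "1 hour" figure is mapped here to the periodic auto-compaction retention, which is an assumption about the intent. `etcd_storage_settings` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Emit etcd environment settings for snapshots and compaction.
#   $1 committed transactions between snapshots
#   $2 maximum number of snapshot files to keep
#   $3 auto-compaction retention in hours (periodic mode)
etcd_storage_settings() {
  local snapshot_count=$1 max_snapshots=$2 compaction_hours=$3
  printf 'ETCD_SNAPSHOT_COUNT="%s"\n' "$snapshot_count"
  printf 'ETCD_MAX_SNAPSHOTS="%s"\n' "$max_snapshots"
  printf 'ETCD_AUTO_COMPACTION_RETENTION="%s"\n' "$compaction_hours"
}

# Values taken from the storage plan: 1000 requests, 5 snapshots, 1 hour.
etcd_storage_settings 1000 5 1
```

The printed lines could be appended to /etc/etcd/etcd.conf on each node before the service is started.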
Part03-Production Environment Implementation
3.1 Rancher HA Nginx Load Balancer Installation
3.1.1 Installing Nginx
# Install Nginx
[root@fgedu-lb ~]# yum install -y nginx
# Start the Nginx service
[root@fgedu-lb ~]# systemctl start nginx
[root@fgedu-lb ~]# systemctl enable nginx
# Verify the Nginx installation
[root@fgedu-lb ~]# nginx -v
nginx version: nginx/1.20.1
# Check the Nginx service status
[root@fgedu-lb ~]# systemctl status nginx
● nginx.service - The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2026-04-10 10:00:00 CST; 5s ago
Process: 12345 ExecStart=/usr/sbin/nginx (code=exited, status=0/SUCCESS)
Main PID: 12346 (nginx)
Tasks: 3 (limit: 4915)
Memory: 5.2M
CGroup: /system.slice/nginx.service
├─12346 nginx: master process /usr/sbin/nginx
├─12347 nginx: worker process
└─12348 nginx: worker process
Apr 10 10:00:00 fgedu-lb systemd[1]: Starting The nginx HTTP and reverse proxy server...
Apr 10 10:00:00 fgedu-lb nginx[12345]: nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
Apr 10 10:00:00 fgedu-lb nginx[12345]: nginx: configuration file /etc/nginx/nginx.conf test is successful
Apr 10 10:00:00 fgedu-lb systemd[1]: Started The nginx HTTP and reverse proxy server.
3.1.2 Configuring Nginx Load Balancing
# Create the Rancher proxy configuration
# (the quoted 'EOF' stops the shell from expanding Nginx variables such as $host)
[root@fgedu-lb ~]# cat > /etc/nginx/conf.d/rancher.conf << 'EOF'
upstream rancher_servers {
    least_conn;
    server 192.168.1.11:443 max_fails=3 fail_timeout=5s;
    server 192.168.1.12:443 max_fails=3 fail_timeout=5s;
    server 192.168.1.13:443 max_fails=3 fail_timeout=5s;
}
server {
    listen 80;
    server_name rancher.fgedu.net.cn;
    return 301 https://$host$request_uri;
}
server {
    listen 443 ssl;
    server_name rancher.fgedu.net.cn;
    ssl_certificate /etc/nginx/ssl/rancher.crt;
    ssl_certificate_key /etc/nginx/ssl/rancher.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;
    location / {
        proxy_pass https://rancher_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 86400;
    }
}
EOF
# Create the SSL certificate directory
[root@fgedu-lb ~]# mkdir -p /etc/nginx/ssl
# Generate a self-signed certificate (use a CA-issued certificate in production)
[root@fgedu-lb ~]# openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
-keyout /etc/nginx/ssl/rancher.key \
-out /etc/nginx/ssl/rancher.crt \
-subj "/C=CN/ST=Beijing/L=Beijing/O=FGEDU/OU=IT/CN=rancher.fgedu.net.cn"
Generating a RSA private key
.....+++++
........................................+++++
writing new private key to '/etc/nginx/ssl/rancher.key'
-----
# Validate the Nginx configuration
[root@fgedu-lb ~]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
# Restart the Nginx service
[root@fgedu-lb ~]# systemctl restart nginx
[root@fgedu-lb ~]# systemctl status nginx
● nginx.service - The nginx HTTP and reverse proxy server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2026-04-10 10:00:00 CST; 5s ago
Process: 12345 ExecStart=/usr/sbin/nginx (code=exited, status=0/SUCCESS)
Main PID: 12346 (nginx)
Tasks: 3 (limit: 4915)
Memory: 5.2M
CGroup: /system.slice/nginx.service
├─12346 nginx: master process /usr/sbin/nginx
├─12347 nginx: worker process
└─12348 nginx: worker process
Apr 10 10:00:00 fgedu-lb systemd[1]: Starting The nginx HTTP and reverse proxy server...
Apr 10 10:00:00 fgedu-lb nginx[12345]: nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
Apr 10 10:00:00 fgedu-lb nginx[12345]: nginx: configuration file /etc/nginx/nginx.conf test is successful
Apr 10 10:00:00 fgedu-lb systemd[1]: Started The nginx HTTP and reverse proxy server.
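3.2 Rancher HA ETCD Cluster Deployment
3.2.1 Installing ETCD
# Install ETCD (shown on fgedu-1; run the same command on fgedu-2 and fgedu-3)
[root@fgedu-1 ~]# yum install -y etcd
# Configure ETCD node 1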
[root@fgedu-1 ~]# cat > /etc/etcd/etcd.conf << EOF
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/Rancher/fgdata/etcd"
ETCD_LISTEN_PEER_URLS="http://192.168.1.11:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.1.11:2379,http://127.0.0.1:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.1.11:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.1.11:2380"
ETCD_INITIAL_CLUSTER="etcd-1=http://192.168.1.11:2380,etcd-2=http://192.168.1.12:2380,etcd-3=http://192.168.1.13:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="rancher-etcd-cluster"
EOF
# Configure ETCD node 2
[root@fgedu-2 ~]# cat > /etc/etcd/etcd.conf << EOF
ETCD_NAME="etcd-2"
ETCD_DATA_DIR="/Rancher/fgdata/etcd"
ETCD_LISTEN_PEER_URLS="http://192.168.1.12:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.1.12:2379,http://127.0.0.1:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.1.12:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.1.12:2380"
ETCD_INITIAL_CLUSTER="etcd-1=http://192.168.1.11:2380,etcd-2=http://192.168.1.12:2380,etcd-3=http://192.168.1.13:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="rancher-etcd-cluster"
EOF
# Configure ETCD node 3
[root@fgedu-3 ~]# cat > /etc/etcd/etcd.conf << EOF
ETCD_NAME="etcd-3"
ETCD_DATA_DIR="/Rancher/fgdata/etcd"
ETCD_LISTEN_PEER_URLS="http://192.168.1.13:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.1.13:2379,http://127.0.0.1:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.1.13:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.1.13:2380"
ETCD_INITIAL_CLUSTER="etcd-1=http://192.168.1.11:2380,etcd-2=http://192.168.1.12:2380,etcd-3=http://192.168.1.13:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="rancher-etcd-cluster"
EOF
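The three configuration files above differ only in the node name and IP address, so they can be generated from a node index. The sketch below prints the file for a given node to standard output; `render_etcd_conf` is a hypothetical helper, and the mapping from index N to 192.168.1.1N simply follows the IP plan in this tutorial.

```shell
#!/usr/bin/env bash
# Print the etcd.conf content for node index 1, 2 or 3.
render_etcd_conf() {
  local index=$1
  local ip="192.168.1.1${index}"
  local cluster="etcd-1=http://192.168.1.11:2380,etcd-2=http://192.168.1.12:2380,etcd-3=http://192.168.1.13:2380"
  cat <<EOF
ETCD_NAME="etcd-${index}"
ETCD_DATA_DIR="/Rancher/fgdata/etcd"
ETCD_LISTEN_PEER_URLS="http://${ip}:2380"
ETCD_LISTEN_CLIENT_URLS="http://${ip}:2379,http://127.0.0.1:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://${ip}:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://${ip}:2380"
ETCD_INITIAL_CLUSTER="${cluster}"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="rancher-etcd-cluster"
EOF
}

# Example: show what node 2 would receive.
render_etcd_conf 2
```

On each node the output could be redirected to /etc/etcd/etcd.conf, for example `render_etcd_conf 2 > /etc/etcd/etcd.conf` on fgedu-2.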
# Start the ETCD service on all three nodes
[root@fgedu-1 ~]# systemctl start etcd
[root@fgedu-1 ~]# systemctl enable etcd
[root@fgedu-2 ~]# systemctl start etcd
[root@fgedu-2 ~]# systemctl enable etcd
[root@fgedu-3 ~]# systemctl start etcd
[root@fgedu-3 ~]# systemctl enable etcd
# Verify the ETCD cluster membership
[root@fgedu-1 ~]# etcdctl member list
1234567890abcdef, started, etcd-1, http://192.168.1.11:2380, http://192.168.1.11:2379
2345678901abcdef, started, etcd-2, http://192.168.1.12:2380, http://192.168.1.12:2379
3456789012abcdef, started, etcd-3, http://192.168.1.13:2380, http://192.168.1.13:2379
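# Verify ETCD cluster health
# (cluster-health belongs to the etcdctl v2 API; with the v3 API run instead:
#  ETCDCTL_API=3 etcdctl --endpoints=http://192.168.1.11:2379,http://192.168.1.12:2379,http://192.168.1.13:2379 endpoint health)
[root@fgedu-1 ~]# etcdctl cluster-health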
member 1234567890abcdef is healthy: got healthy result from http://192.168.1.11:2379
member 2345678901abcdef is healthy: got healthy result from http://192.168.1.12:2379
member 3456789012abcdef is healthy: got healthy result from http://192.168.1.13:2379
cluster is healthy
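For monitoring scripts it can help to turn the health output into an exit status. The sketch below is a minimal check that succeeds only when the output contains the summary line `cluster is healthy`, exactly as shown above. `cluster_is_healthy` is a hypothetical helper, and the degraded sample text is illustrative input rather than captured output.

```shell
#!/usr/bin/env bash
# Read `etcdctl cluster-health` output on stdin; succeed only if the
# summary line reports a healthy cluster.
cluster_is_healthy() {
  grep -qx 'cluster is healthy'
}

# Sample taken from the output shown in this section.
healthy_sample='member 1234567890abcdef is healthy: got healthy result from http://192.168.1.11:2379
member 2345678901abcdef is healthy: got healthy result from http://192.168.1.12:2379
member 3456789012abcdef is healthy: got healthy result from http://192.168.1.13:2379
cluster is healthy'

if printf '%s\n' "$healthy_sample" | cluster_is_healthy; then
  echo "etcd: OK"
else
  echo "etcd: NOT healthy"
fi
```

On a live node the same function could be fed directly, for example `etcdctl cluster-health | cluster_is_healthy || echo "alert"`.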
3.3 Rancher HA Deployment
3.3.1 Deploying Rancher Server
# Pull the Rancher image on every node
[root@fgedu-1 ~]# docker pull rancher/rancher:v2.8.5
[root@fgedu-2 ~]# docker pull rancher/rancher:v2.8.5
[root@fgedu-3 ~]# docker pull rancher/rancher:v2.8.5
# Note: the official Rancher documentation describes the Docker install as a single-node setup for development and testing, and describes HA as a Helm install on a Kubernetes cluster. Confirm that your Rancher version honors CATTLE_ETCD_ENDPOINTS before relying on the commands below.
# Start the Rancher container on node 1
[root@fgedu-1 ~]# docker run -d --restart=unless-stopped \
--name rancher \
-p 80:80 -p 443:443 \
-v /Rancher/fgdata/rancher:/var/lib/rancher \
-v /Rancher/fgdata/rancher/log:/var/log/rancher \
-e CATTLE_SYSTEM_DEFAULT_REGISTRY=registry.cn-hangzhou.aliyuncs.com \
-e CATTLE_SYSTEM_CATALOG=bundled \
-e CATTLE_ETCD_ENDPOINTS=http://192.168.1.11:2379,http://192.168.1.12:2379,http://192.168.1.13:2379 \
-e CATTLE_SERVER_URL=https://rancher.fgedu.net.cn \
--privileged \
rancher/rancher:v2.8.5
# Start the Rancher container on node 2
[root@fgedu-2 ~]# docker run -d --restart=unless-stopped \
--name rancher \
-p 80:80 -p 443:443 \
-v /Rancher/fgdata/rancher:/var/lib/rancher \
-v /Rancher/fgdata/rancher/log:/var/log/rancher \
-e CATTLE_SYSTEM_DEFAULT_REGISTRY=registry.cn-hangzhou.aliyuncs.com \
-e CATTLE_SYSTEM_CATALOG=bundled \
-e CATTLE_ETCD_ENDPOINTS=http://192.168.1.11:2379,http://192.168.1.12:2379,http://192.168.1.13:2379 \
-e CATTLE_SERVER_URL=https://rancher.fgedu.net.cn \
--privileged \
rancher/rancher:v2.8.5
# Start the Rancher container on node 3
[root@fgedu-3 ~]# docker run -d --restart=unless-stopped \
--name rancher \
-p 80:80 -p 443:443 \
-v /Rancher/fgdata/rancher:/var/lib/rancher \
-v /Rancher/fgdata/rancher/log:/var/log/rancher \
-e CATTLE_SYSTEM_DEFAULT_REGISTRY=registry.cn-hangzhou.aliyuncs.com \
-e CATTLE_SYSTEM_CATALOG=bundled \
-e CATTLE_ETCD_ENDPOINTS=http://192.168.1.11:2379,http://192.168.1.12:2379,http://192.168.1.13:2379 \
-e CATTLE_SERVER_URL=https://rancher.fgedu.net.cn \
--privileged \
rancher/rancher:v2.8.5
# Check the Rancher container status
[root@fgedu-1 ~]# docker ps | grep rancher
1234567890ab rancher/rancher:v2.8.5 "entrypoint.sh" 10 seconds ago Up 9 seconds 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp rancher
[root@fgedu-2 ~]# docker ps | grep rancher
2345678901bc rancher/rancher:v2.8.5 "entrypoint.sh" 10 seconds ago Up 9 seconds 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp rancher
[root@fgedu-3 ~]# docker ps | grep rancher
3456789012cd rancher/rancher:v2.8.5 "entrypoint.sh" 10 seconds ago Up 9 seconds 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp rancher
# View the Rancher container logs
[root@fgedu-1 ~]# docker logs rancher
INFO: Starting Rancher
INFO: Rancher is starting
INFO: Waiting for Rancher to be ready...
INFO: Rancher is ready
INFO: Rancher is running
INFO: Connected to ETCD cluster
Part04-Production Cases and Hands-On Walkthrough
4.1 Rancher HA Validation
4.1.1 Access Validation
# Open Rancher in a browser
# URL: https://rancher.fgedu.net.cn
# Log in with the initial password
# Test the Nginx load balancer
[root@fgedu-lb ~]# curl -k https://localhost/ping
pong
# Test Rancher node 1
[root@fgedu-lb ~]# curl -k https://192.168.1.11/ping
pong
# Test Rancher node 2
[root@fgedu-lb ~]# curl -k https://192.168.1.12/ping
pong
# Test Rancher node 3
[root@fgedu-lb ~]# curl -k https://192.168.1.13/ping
pong
# View the Nginx access log
[root@fgedu-lb ~]# tail -f /var/log/nginx/access.log
192.168.1.100 - - [10/Apr/2026:10:00:00 +0800] "GET /ping HTTP/1.1" 200 4 "-" "curl/7.76.1"
192.168.1.100 - - [10/Apr/2026:10:00:01 +0800] "GET /ping HTTP/1.1" 200 4 "-" "curl/7.76.1"
192.168.1.100 - - [10/Apr/2026:10:00:02 +0800] "GET /ping HTTP/1.1" 200 4 "-" "curl/7.76.1"
# Check the ETCD cluster status
[root@fgedu-1 ~]# etcdctl cluster-health
member 1234567890abcdef is healthy: got healthy result from http://192.168.1.11:2379
member 2345678901abcdef is healthy: got healthy result from http://192.168.1.12:2379
member 3456789012abcdef is healthy: got healthy result from http://192.168.1.13:2379
cluster is healthy
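The per-node `curl` checks above can be folded into one loop that reports every endpoint and returns a non-zero status if any of them does not answer `pong`. This is a sketch under the tutorial's IP plan; `fetch` and `check_ping` are hypothetical helper names, and `fetch` is kept separate so that it can be replaced when testing without a live cluster.

```shell
#!/usr/bin/env bash
# Fetch a URL quietly, skipping certificate verification (self-signed cert)
# and giving up after 5 seconds.
fetch() {
  curl -ks --max-time 5 "$1"
}

# Probe <base-url>/ping for every argument; fail if any probe is not "pong".
check_ping() {
  local failed=0 url
  for url in "$@"; do
    if [ "$(fetch "${url}/ping")" = "pong" ]; then
      echo "OK   ${url}"
    else
      echo "FAIL ${url}"
      failed=1
    fi
  done
  return "$failed"
}

# Usage on the load balancer host (not executed here):
#   check_ping https://localhost https://192.168.1.11 https://192.168.1.12 https://192.168.1.13
```

Because the function returns a status code, it can also be dropped into a cron job or a monitoring probe.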
4.2 Rancher HA Failover Testing
4.2.1 Node Failure Test
# Stop the Rancher container on node 1
[root@fgedu-1 ~]# docker stop rancher
rancher
# Verify that the Rancher service still responds
[root@fgedu-lb ~]# curl -k https://localhost/ping
pong
# View the Nginx access log to confirm that requests are served by the other nodes
[root@fgedu-lb ~]# tail -f /var/log/nginx/access.log
192.168.1.100 - - [10/Apr/2026:10:00:00 +0800] "GET /ping HTTP/1.1" 200 4 "-" "curl/7.76.1"
192.168.1.100 - - [10/Apr/2026:10:00:01 +0800] "GET /ping HTTP/1.1" 200 4 "-" "curl/7.76.1"
192.168.1.100 - - [10/Apr/2026:10:00:02 +0800] "GET /ping HTTP/1.1" 200 4 "-" "curl/7.76.1"
# Start the Rancher container on node 1 again
[root@fgedu-1 ~]# docker start rancher
rancher
# Verify that the Rancher service still responds
[root@fgedu-lb ~]# curl -k https://localhost/ping
pong
# Check the status of all Rancher containers
[root@fgedu-1 ~]# docker ps | grep rancher
1234567890ab rancher/rancher:v2.8.5 "entrypoint.sh" 10 seconds ago Up 9 seconds 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp rancher
[root@fgedu-2 ~]# docker ps | grep rancher
2345678901bc rancher/rancher:v2.8.5 "entrypoint.sh" 10 seconds ago Up 9 seconds 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp rancher
[root@fgedu-3 ~]# docker ps | grep rancher
3456789012cd rancher/rancher:v2.8.5 "entrypoint.sh" 10 seconds ago Up 9 seconds 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp rancher
4.3 Rancher HA Backup
4.3.1 ETCD Backup
# Create the ETCD backup script
[root@fgedu-1 ~]# cat > /Rancher/app/etcd_backup.sh << 'EOF'
#!/bin/bash
# etcd_backup.sh
# from:www.itpux.com.qq113257174.wx:itpux-com
# web: http://www.fgedu.net.cn
# "snapshot save" is an etcdctl v3 API command
export ETCDCTL_API=3
BACKUP_DIR="/Rancher/fgdata/backups/etcd"
# A snapshot must be requested from a single member, so use the local endpoint
ETCD_ENDPOINT="http://127.0.0.1:2379"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"
if etcdctl --endpoints="$ETCD_ENDPOINT" snapshot save "$BACKUP_DIR/etcd_snapshot_$DATE.db"; then
    echo "ETCD backup completed successfully: $BACKUP_DIR/etcd_snapshot_$DATE.db"
else
    echo "ETCD backup failed"
    exit 1
fi
# Keep the last 7 days of backups
find "$BACKUP_DIR" -name "etcd_snapshot_*.db" -mtime +7 -delete
EOF
# Make the script executable
[root@fgedu-1 ~]# chmod +x /Rancher/app/etcd_backup.sh
# Run the ETCD backup
[root@fgedu-1 ~]# /Rancher/app/etcd_backup.sh
Snapshot saved at /Rancher/fgdata/backups/etcd/etcd_snapshot_20260410_100000.db
ETCD backup completed successfully: /Rancher/fgdata/backups/etcd/etcd_snapshot_20260410_100000.db
# List the backup files
[root@fgedu-1 ~]# ls -lh /Rancher/fgdata/backups/etcd/
total 20M
-rw-r--r-- 1 root root 20M Apr 10 10:00 etcd_snapshot_20260410_100000.db
# Schedule the backup
[root@fgedu-1 ~]# crontab -e
# Add the following line to run the backup every day at 02:00
0 2 * * * /Rancher/app/etcd_backup.sh >> /var/log/etcd_backup.log 2>&1
# List the scheduled jobs
[root@fgedu-1 ~]# crontab -l
0 2 * * * /Rancher/app/etcd_backup.sh >> /var/log/etcd_backup.log 2>&1
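The retention step in the backup script relies on `find -mtime +7`, which matches files whose age, truncated to whole days, is greater than 7, that is, files at least 8 days old. The sketch below exercises that rule in a throwaway directory so the behavior can be checked before trusting it with real snapshots. `prune_snapshots` is a hypothetical helper, and `touch -d` assumes GNU coreutils.

```shell
#!/usr/bin/env bash
# Delete snapshot files in $1 that are more than $2 whole days old,
# printing each file that is removed.
prune_snapshots() {
  local dir=$1 days=$2
  find "$dir" -type f -name 'etcd_snapshot_*.db' -mtime +"$days" -print -delete
}

# Demonstration in a temporary directory with one old and one fresh file.
demo_dir=$(mktemp -d)
touch -d '10 days ago' "$demo_dir/etcd_snapshot_old.db"
touch "$demo_dir/etcd_snapshot_new.db"

prune_snapshots "$demo_dir" 7    # removes only etcd_snapshot_old.db
ls "$demo_dir"                   # the fresh snapshot remains
rm -rf "$demo_dir"
```

Running the same function with `-print` left in place gives the cron log a record of which snapshots were removed on each run.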
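Part05-Fengge's Lessons Learned
5.1 Rancher HA Best Practices
Rancher HA best practices:
- Node planning: at least 3 nodes; an odd number is recommended
- Network planning: all nodes can reach each other, with latency below 10 ms
- Time synchronization: clocks on all nodes are synchronized to within 1 second
- Load balancing: use Nginx or HAProxy as the load balancer
- Data backup: back up the ETCD data regularly and keep off-site copies
- Monitoring and alerting: set up monitoring and alerting so that problems are found and handled promptly
- Failure drills: run failure drills regularly to verify high availability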
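5.2 Rancher HA Troubleshooting
Rancher HA troubleshooting:
# Problem 1: the ETCD cluster is unhealthy
# Symptom: etcdctl cluster-health reports an unhealthy cluster
# Cause: network unreachable, node failure, or configuration error
# Resolution:
[root@fgedu-1 ~]# etcdctl cluster-health
[root@fgedu-1 ~]# etcdctl member list
[root@fgedu-1 ~]# systemctl status etcd
[root@fgedu-1 ~]# journalctl -u etcd -f
# Problem 2: the Rancher container does not start
# Symptom: docker ps does not list the rancher container
# Cause: ETCD connection failure, configuration error, or insufficient resources
# Resolution:
[root@fgedu-1 ~]# docker logs rancher
[root@fgedu-1 ~]# docker inspect rancher
[root@fgedu-1 ~]# etcdctl cluster-health
[root@fgedu-1 ~]# free -h
# Problem 3: Nginx load balancing does not work
# Symptom: requests to Nginx do not reach Rancher
# Cause: backend node failure, configuration error, or network unreachable
# Resolution:
[root@fgedu-lb ~]# nginx -t
[root@fgedu-lb ~]# systemctl status nginx
[root@fgedu-lb ~]# tail -f /var/log/nginx/access.log
[root@fgedu-lb ~]# tail -f /var/log/nginx/error.log
# Problem 4: data inconsistency
# Symptom: different nodes show different data
# Cause: ETCD cluster split, network partition, or clock drift
# Resolution:
[root@fgedu-1 ~]# etcdctl cluster-health
[root@fgedu-1 ~]# ntpdate -u time.nist.gov
[root@fgedu-1 ~]# systemctl restart etcd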
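5.3 Rancher HA Performance Tuning
Rancher HA performance tuning:
# 1. ETCD tuning
- Increase the ETCD heartbeat interval
- Adjust the ETCD election timeout
- Tune the ETCD snapshot interval
- Increase the ETCD cache size
# 2. Nginx tuning
- Increase the number of worker processes
- Adjust the keepalive timeout
- Enable gzip compression
- Tune the buffer sizes
# 3. Network tuning
- Use 10 Gigabit NICs
- Tune the TCP parameters
- Configure the network MTU
- Reduce network latency
# 4. System tuning
- Add memory
- Use SSD storage
- Tune the kernel parameters
- Raise the file descriptor limit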
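When raising the ETCD heartbeat interval, the election timeout has to be raised with it. As far as I understand etcd's startup validation, it rejects an election timeout smaller than five times the heartbeat interval (the defaults are 100 ms and 1000 ms, set through `ETCD_HEARTBEAT_INTERVAL` and `ETCD_ELECTION_TIMEOUT`). The sketch below checks candidate pairs against that 5x rule; `valid_etcd_timing` is a hypothetical helper and the candidate values are examples only.

```shell
#!/usr/bin/env bash
# Succeed when the election timeout is at least 5x the heartbeat interval.
# Both arguments are in milliseconds.
valid_etcd_timing() {
  local heartbeat_ms=$1 election_ms=$2
  (( election_ms >= 5 * heartbeat_ms ))
}

for pair in "100 1000" "250 1250" "300 1000"; do
  read -r heartbeat election <<<"$pair"
  if valid_etcd_timing "$heartbeat" "$election"; then
    echo "heartbeat=${heartbeat}ms election=${election}ms ok"
  else
    echo "heartbeat=${heartbeat}ms election=${election}ms rejected: election timeout is below 5x heartbeat"
  fi
done
```

Checking the pair before editing /etc/etcd/etcd.conf avoids a restart that fails on all three nodes at once.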
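This article was compiled and published by Fengge Tutorials (风哥教程) for learning and testing only. Please credit the source when reposting: http://www.fgedu.net.cn/10327.html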
