Rancher Tutorial FG017-Rancher Node Management and Maintenance Mode in Practice

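This tutorial from 风哥 covers Rancher node management and maintenance mode in practice: node concepts, node roles, maintenance mode, node preparation, node requirements, node planning, adding nodes, removing nodes, maintaining nodes, draining nodes, cordoning nodes, and tuning nodes. It draws on the nodes, maintenance mode, and node management sections of the official Rancher documentation. It is intended for operations engineers working in learning and test environments; verify every step yourself before applying it to production.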

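Part01-Basic Concepts and Theory

1.1 Rancher Node Concepts

A Rancher node is a machine in a Kubernetes cluster managed by Rancher; worker nodes are the ones that run Pods. A node can be a physical server, a virtual machine, or a cloud instance. Every node needs the Kubernetes node components (kubelet, kube-proxy) and a container runtime (such as Docker or containerd).

Characteristics of Rancher nodes:

  • Worker node: runs Pods
  • Resource management: manages CPU, memory, and storage
  • Network management: manages the node's network configuration
  • Health checks: reports and monitors node health
  • Autoscaling: supports automatic scaling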

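1.2 Rancher Node Roles

A node role describes what a node does in the cluster: control plane node, worker node, or etcd node. Control plane nodes manage the cluster, worker nodes run Pods, and etcd nodes store the cluster data.

Characteristics of Rancher node roles:

  • Control plane: manages the cluster
  • Worker: runs Pods
  • etcd: stores cluster data
  • Mixed node: one node holds several roles
  • Custom role: user-defined roles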

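1.3 Rancher Maintenance Mode

Maintenance mode puts a node into a maintenance state so that no new Pods are scheduled onto it. It consists of two operations: cordon and drain. Cordoning a node only blocks new Pods from being scheduled there; Pods that are already running stay in place. Draining a node cordons it and then evicts the Pods running on it (DaemonSet-managed Pods are skipped when --ignore-daemonsets is given).

Characteristics of maintenance mode:

  • Cordon: blocks scheduling of new Pods
  • Drain: evicts the Pods running on the node
  • Maintenance window: the planned time slot in which the work is done
  • Recovery: the node returns to normal scheduling once it is uncordoned
  • Notification: announces the maintenance state to the people affected
风哥 tip: Rancher node management helps operations engineers manage cluster nodes and improve cluster availability and performance. Maintain nodes regularly to keep them healthy.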
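
The cordon, drain, and uncordon sequence described above can be sketched as a dry run. This is a minimal illustration rather than a definitive procedure: the function name maintenance_plan is hypothetical, and it only prints the kubectl commands so they can be reviewed and run one at a time.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the kubectl commands for one maintenance cycle.
# Nothing is executed against a cluster; run the printed commands one at a
# time, doing the actual maintenance work between steps 2 and 3.
maintenance_plan() {
  local node="$1"
  # 1. Cordon: stop new Pods from being scheduled onto the node.
  printf 'kubectl cordon %s\n' "$node"
  # 2. Drain: evict running Pods (DaemonSet-managed Pods are left in place).
  printf 'kubectl drain %s --ignore-daemonsets --delete-emptydir-data\n' "$node"
  # 3. Uncordon: allow scheduling again once the maintenance is finished.
  printf 'kubectl uncordon %s\n' "$node"
}

maintenance_plan fgedu-worker-1
```

Printing instead of executing keeps the sketch safe to run on any machine, including one without kubectl installed.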

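Part02-Production Planning and Recommendations

2.1 Rancher Node Preparation

Node preparation:

# Rancher node preparation checklist

# 1. Operating system
- Operating system: Oracle Linux 9.3 / RHEL 9.3 / 8.x / 7.x
- Kernel version: >= 5.4.0
- System updates: latest patches applied

# 2. Hardware
- CPU: >= 4 cores
- Memory: >= 8 GB
- Disk: >= 100 GB
- Network: >= 1 Gbps

# 3. Network
- Connectivity: all cluster nodes can reach each other
- DNS: configured
- Time synchronization: NTP configured
- Firewall: required ports opened

# 4. Software
- Docker: >= 20.10.0
- containerd: >= 1.6.0
- Kubernetes: >= 1.23.0
- RKE2: >= 1.25.0

# 5. Configuration
- Hostname: set
- SSH keys: configured
- User permissions: configured
- System parameters: configured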

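2.2 Rancher Node Requirements

Node requirements:

# Rancher node requirements

# Control plane node requirements
CPU: >= 4 cores
Memory: >= 8 GB
Disk: >= 100 GB
Network: >= 1 Gbps
Node count: >= 3

# Worker node requirements
CPU: >= 4 cores
Memory: >= 8 GB
Disk: >= 100 GB
Network: >= 1 Gbps
Node count: >= 3

# etcd node requirements
CPU: >= 4 cores
Memory: >= 8 GB
Disk: >= 100 GB
Network: >= 1 Gbps
Node count: >= 3

# Operating system requirements
Oracle Linux: >= 9.3
RHEL: >= 9.3
CentOS: >= 7.9
Ubuntu: >= 20.04

# Network requirements
Network bandwidth: >= 1 Gbps
Network latency: < 10 ms
Open ports: 6443, 2379, 2380, 10250, and others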
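
The CPU and memory minimums above can be checked on a candidate host with a short preflight script. This is a sketch under stated assumptions: check_min is a hypothetical helper, the probes are Linux-only, and the memory threshold is set to 7 GiB because a host sold as 8 GB usually reports slightly less than 8 GiB in MemTotal.

```shell
#!/usr/bin/env bash
# Preflight sketch: compare a host against the minimums listed above.
# check_min LABEL ACTUAL MINIMUM prints one line and returns 0 (ok) or 1 (below).
check_min() {
  local label="$1" actual="$2" minimum="$3"
  if [ "$actual" -ge "$minimum" ]; then
    printf 'OK   %s: %s (minimum %s)\n' "$label" "$actual" "$minimum"
  else
    printf 'FAIL %s: %s (minimum %s)\n' "$label" "$actual" "$minimum"
    return 1
  fi
}

# Linux-only probes; each one is skipped where the tool or file is missing.
if command -v nproc >/dev/null 2>&1; then
  check_min "CPU cores" "$(nproc)" 4 || true
fi
if [ -r /proc/meminfo ]; then
  # MemTotal is reported in KiB.
  mem_kib="$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
  check_min "Memory (KiB)" "$mem_kib" $((7 * 1024 * 1024)) || true
fi
```

Disk, bandwidth, and latency are left out on purpose, since sensible checks for them depend on the mount layout and network topology of the site.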

2.3 Rancher Node Planning

Node planning:

# Rancher node plan

# Control plane nodes
Node 1: fgedu-control-plane-1 (192.168.1.10)
Node 2: fgedu-control-plane-2 (192.168.1.11)
Node 3: fgedu-control-plane-3 (192.168.1.12)

# Worker nodes
Node 1: fgedu-worker-1 (192.168.1.20)
Node 2: fgedu-worker-2 (192.168.1.21)
Node 3: fgedu-worker-3 (192.168.1.22)

# etcd nodes (co-located with the control plane nodes)
Node 1: fgedu-control-plane-1 (192.168.1.10)
Node 2: fgedu-control-plane-2 (192.168.1.11)
Node 3: fgedu-control-plane-3 (192.168.1.12)

# Resource plan
Control plane: 4 CPU cores, 8 GB memory
Worker nodes: 8 CPU cores, 16 GB memory
etcd nodes: 4 CPU cores, 8 GB memory

# Network plan
Management network: 192.168.1.0/24
Pod network: 10.244.0.0/16
Service network: 10.96.0.0/12

Production recommendation: size the number of nodes and their resources according to business needs. Check node health regularly and maintain nodes promptly.

Part03-Production Implementation Plan

3.1 Adding Rancher Nodes

3.1.1 Adding a Node Through the Web UI

# Add a node through the web UI
# Step 1: Log in to the Rancher management UI
# Step 2: Click "Clusters", select the cluster, then click "Nodes"
# Step 3: Click the "Add Node" button
# Step 4: Select the node type: Worker
# Step 5: Fill in the node details:
#   Node name: fgedu-worker-4
#   Node description: Rancher worker node 4
#   Node IP: 192.168.1.23
# Step 6: Click the "Create" button

# Add a node through the CLI
[root@rancher ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
fgedu-control-plane-1 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-2 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-3 Ready control-plane,etcd,master 10m v1.28.0
fgedu-worker-1 Ready <none> 10m v1.28.0
fgedu-worker-2 Ready <none> 10m v1.28.0
fgedu-worker-3 Ready <none> 10m v1.28.0

# Install RKE2 on the new node (agent type, since this node is a worker)
[root@fgedu-worker-4 ~]# curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -

[root@fgedu-worker-4 ~]# yum install -y rke2-1.28.0-1.el7.x86_64

# Configure the RKE2 agent
[root@fgedu-worker-4 ~]# cat > /etc/rancher/rke2/config.yaml <<EOF
[...]
fgedu-worker-1 Ready <none> 10m v1.28.0
fgedu-worker-2 Ready <none> 10m v1.28.0
fgedu-worker-3 Ready <none> 10m v1.28.0
fgedu-worker-4 Ready <none> 1m v1.28.0
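
The contents of the agent configuration file are not shown above. As a rough sketch, an RKE2 agent needs at least the address of a server node and the cluster join token. The function name render_agent_config, the choice of 192.168.1.10 as the server, and the token placeholder are assumptions for illustration, not values taken from this tutorial.

```shell
#!/usr/bin/env bash
# Sketch: render a minimal RKE2 agent config.
# The server address and token passed in are placeholders (assumptions).
render_agent_config() {
  local server="$1" token="$2"
  # 9345 is the port RKE2 agents use to register with a server node.
  cat <<EOF
server: https://${server}:9345
token: ${token}
EOF
}

# Example with a dummy token; the real one is stored on a server node in
# /var/lib/rancher/rke2/server/node-token.
render_agent_config 192.168.1.10 "<node-token>"

# On the new node (as root) the result would be written and the agent started:
#   render_agent_config 192.168.1.10 "$TOKEN" > /etc/rancher/rke2/config.yaml
#   systemctl enable --now rke2-agent.service
```

Once the agent service is running, the new node should appear in kubectl get nodes on the management host after it has registered.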

# View node details
[root@rancher ~]# kubectl describe node fgedu-worker-4
Name: fgedu-worker-4
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=fgedu-worker-4
kubernetes.io/os=linux
Annotations: node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp: Fri, 10 Apr 2026 10:00:00 +0000
Taints: <none>
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Fri, 10 Apr 2026 10:00:00 +0000 Fri, 10 Apr 2026 10:00:00 +0000 Kubelet has sufficient network available
MemoryPressure False Fri, 10 Apr 2026 10:00:00 +0000 Fri, 10 Apr 2026 10:00:00 +0000 Kubelet has sufficient memory available
DiskPressure False Fri, 10 Apr 2026 10:00:00 +0000 Fri, 10 Apr 2026 10:00:00 +0000 Kubelet has no disk pressure
PIDPressure False Fri, 10 Apr 2026 10:00:00 +0000 Fri, 10 Apr 2026 10:00:00 +0000 Kubelet has sufficient PID available
Ready True Fri, 10 Apr 2026 10:00:00 +0000 Fri, 10 Apr 2026 10:00:00 +0000 Kubelet is posting ready status
Addresses:
InternalIP: 192.168.1.23
Hostname: fgedu-worker-4
Capacity:
cpu: 8
ephemeral-storage: 100Gi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16384200Ki
pods: 110
Allocatable:
cpu: 8
ephemeral-storage: 100Gi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16384200Ki
pods: 110
System Info:
Machine ID: 1234567890abcdef
System UUID: 12345678-90ab-cdef-1234-567890abcdef
Boot ID: 12345678-90ab-cdef-1234-567890abcdef
Kernel Version: 5.4.0-91-generic
OS Image: Oracle Linux Server 9.3
Operating System: Linux
Architecture: amd64
Container Runtime Version: containerd://1.6.0
Kubelet Version: v1.28.0
Kube-Proxy Version: v1.28.0

3.2 Removing Rancher Nodes

3.2.1 Removing a Node Through the CLI

# Check node status
[root@rancher ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
fgedu-control-plane-1 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-2 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-3 Ready control-plane,etcd,master 10m v1.28.0
fgedu-worker-1 Ready <none> 10m v1.28.0
fgedu-worker-2 Ready <none> 10m v1.28.0
fgedu-worker-3 Ready <none> 10m v1.28.0
fgedu-worker-4 Ready <none> 1m v1.28.0

# Drain the Pods from the node
[root@rancher ~]# kubectl drain fgedu-worker-4 --ignore-daemonsets --delete-emptydir-data
node/fgedu-worker-4 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-proxy-1234567890-abcde
evicting pod fgedu-dev/fgedu-nginx-1234567890-abcde
pod/fgedu-nginx-1234567890-abcde evicted

# Check Pod status
[root@rancher ~]# kubectl get pods -n fgedu-dev
NAME READY STATUS RESTARTS AGE
fgedu-nginx-1234567890-fghij 1/1 Running 0 1m
fgedu-nginx-1234567890-klmno 1/1 Running 0 1m
fgedu-nginx-1234567890-opqr 1/1 Running 0 1m

# Delete the node
[root@rancher ~]# kubectl delete node fgedu-worker-4
node "fgedu-worker-4" deleted

# Check node status
[root@rancher ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
fgedu-control-plane-1 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-2 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-3 Ready control-plane,etcd,master 10m v1.28.0
fgedu-worker-1 Ready <none> 10m v1.28.0
fgedu-worker-2 Ready <none> 10m v1.28.0
fgedu-worker-3 Ready <none> 10m v1.28.0

# Stop the RKE2 agent on the removed node
[root@fgedu-worker-4 ~]# systemctl stop rke2-agent
[root@fgedu-worker-4 ~]# systemctl disable rke2-agent

# Clean up the node configuration
[root@fgedu-worker-4 ~]# rm -rf /etc/rancher/rke2
[root@fgedu-worker-4 ~]# rm -rf /var/lib/rancher/rke2
3.3 Maintaining Rancher Nodes

3.3.1 Maintaining a Node Through the CLI

# Cordon the node
[root@rancher ~]# kubectl cordon fgedu-worker-1
node/fgedu-worker-1 cordoned

# Check node status
[root@rancher ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
fgedu-control-plane-1 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-2 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-3 Ready control-plane,etcd,master 10m v1.28.0
fgedu-worker-1 Ready,SchedulingDisabled <none> 10m v1.28.0
fgedu-worker-2 Ready <none> 10m v1.28.0
fgedu-worker-3 Ready <none> 10m v1.28.0

# Uncordon the node
[root@rancher ~]# kubectl uncordon fgedu-worker-1
node/fgedu-worker-1 uncordoned

# Check node status
[root@rancher ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
fgedu-control-plane-1 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-2 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-3 Ready control-plane,etcd,master 10m v1.28.0
fgedu-worker-1 Ready <none> 10m v1.28.0
fgedu-worker-2 Ready <none> 10m v1.28.0
fgedu-worker-3 Ready <none> 10m v1.28.0
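
On a larger cluster it can be easier to filter for cordoned nodes than to scan the whole listing. A minimal sketch, assuming the STATUS value is the second column as in the output above; the function name cordoned_nodes is hypothetical, and the sample rows stand in for a live cluster.

```shell
#!/usr/bin/env bash
# Sketch: list nodes whose STATUS column contains SchedulingDisabled.
# Reads "kubectl get nodes --no-headers" style text on stdin.
cordoned_nodes() {
  awk '$2 ~ /SchedulingDisabled/ { print $1 }'
}

# Sample input so the sketch runs without a cluster.
printf '%s\n' \
  'fgedu-worker-1 Ready,SchedulingDisabled <none> 10m v1.28.0' \
  'fgedu-worker-2 Ready <none> 10m v1.28.0' | cordoned_nodes

# Against a real cluster:
#   kubectl get nodes --no-headers | cordoned_nodes
```

With the sample input, only fgedu-worker-1 is printed.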

# Add a node label
[root@rancher ~]# kubectl label node fgedu-worker-1 environment=production
node/fgedu-worker-1 labeled

# View node labels
[root@rancher ~]# kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
fgedu-control-plane-1 Ready control-plane,etcd,master 10m v1.28.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=fgedu-control-plane-1,kubernetes.io/os=linux
fgedu-worker-1 Ready <none> 10m v1.28.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,environment=production,kubernetes.io/arch=amd64,kubernetes.io/hostname=fgedu-worker-1,kubernetes.io/os=linux

# Add a node taint
[root@rancher ~]# kubectl taint node fgedu-worker-1 key=value:NoSchedule
node/fgedu-worker-1 tainted

# View node taints
[root@rancher ~]# kubectl describe node fgedu-worker-1 | grep -A 5 Taints
Taints: key=value:NoSchedule

Part04-Production Cases and Hands-On Walkthrough

4.1 Draining Rancher Nodes

4.1.1 Draining a Node Through the CLI

# View the Pods and the nodes they run on
[root@rancher ~]# kubectl get pods -o wide -n fgedu-dev
NAME READY STATUS RESTARTS AGE IP NODE
fgedu-nginx-1234567890-abcde 1/1 Running 0 5m 10.244.0.5 fgedu-worker-1
fgedu-nginx-1234567890-fghij 1/1 Running 0 5m 10.244.0.6 fgedu-worker-2
fgedu-nginx-1234567890-klmno 1/1 Running 0 5m 10.244.0.7 fgedu-worker-3

# Drain the Pods from the node
[root@rancher ~]# kubectl drain fgedu-worker-1 --ignore-daemonsets --delete-emptydir-data
node/fgedu-worker-1 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-proxy-1234567890-abcde
evicting pod fgedu-dev/fgedu-nginx-1234567890-abcde
pod/fgedu-nginx-1234567890-abcde evicted

# Check Pod status
[root@rancher ~]# kubectl get pods -o wide -n fgedu-dev
NAME READY STATUS RESTARTS AGE IP NODE
fgedu-nginx-1234567890-fghij 1/1 Running 0 5m 10.244.0.6 fgedu-worker-2
fgedu-nginx-1234567890-klmno 1/1 Running 0 5m 10.244.0.7 fgedu-worker-3
fgedu-nginx-1234567890-opqr 1/1 Running 0 1m 10.244.0.8 fgedu-worker-2

# Check node status
[root@rancher ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
fgedu-control-plane-1 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-2 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-3 Ready control-plane,etcd,master 10m v1.28.0
fgedu-worker-1 Ready,SchedulingDisabled <none> 10m v1.28.0
fgedu-worker-2 Ready <none> 10m v1.28.0
fgedu-worker-3 Ready <none> 10m v1.28.0

# Uncordon the node
[root@rancher ~]# kubectl uncordon fgedu-worker-1
node/fgedu-worker-1 uncordoned
4.2 Cordoning Rancher Nodes

4.2.1 Cordoning a Node Through the CLI

# Cordon the node
[root@rancher ~]# kubectl cordon fgedu-worker-1
node/fgedu-worker-1 cordoned

# Check node status
[root@rancher ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
fgedu-control-plane-1 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-2 Ready control-plane,etcd,master 10m v1.28.0
fgedu-control-plane-3 Ready control-plane,etcd,master 10m v1.28.0
fgedu-worker-1 Ready,SchedulingDisabled <none> 10m v1.28.0
fgedu-worker-2 Ready <none> 10m v1.28.0
fgedu-worker-3 Ready <none> 10m v1.28.0

# Try to schedule a Pod onto the cordoned node
[root@rancher ~]# cat <<EOF
[...]
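
The manifest for this test is not shown above. One way to try it, sketched here under assumptions, is a Pod pinned to the cordoned node with a nodeSelector on the kubernetes.io/hostname label. The function name render_pinned_pod, the Pod name fgedu-cordon-test, and the nginx image are illustrative choices, not taken from this tutorial.

```shell
#!/usr/bin/env bash
# Sketch: render a Pod manifest pinned to one node through nodeSelector.
# The Pod name and image are illustrative assumptions.
render_pinned_pod() {
  local node="$1"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: fgedu-cordon-test
spec:
  nodeSelector:
    kubernetes.io/hostname: ${node}
  containers:
  - name: nginx
    image: nginx
EOF
}

render_pinned_pod fgedu-worker-1

# To try it against a cluster:
#   render_pinned_pod fgedu-worker-1 | kubectl apply -f -
#   kubectl get pod fgedu-cordon-test -o wide
```

Because the only node that matches the selector is cordoned, the Pod is expected to stay Pending until the node is uncordoned. The sketch uses nodeSelector rather than spec.nodeName on purpose: setting nodeName bypasses the scheduler, so such a Pod would not be held back by the cordon.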

4.3 Tuning Rancher Nodes

4.3.1 Tuning Node Performance

# View node resource usage
[root@rancher ~]# kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
fgedu-control-plane-1 2.5 62% 5Gi 62%
fgedu-control-plane-2 2.5 62% 5Gi 62%
fgedu-control-plane-3 2.5 62% 5Gi 62%
fgedu-worker-1 5.0 62% 10Gi 62%
fgedu-worker-2 5.0 62% 10Gi 62%
fgedu-worker-3 5.0 62% 10Gi 62%
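
To spot busy nodes quickly, the CPU% column can be filtered against a threshold. A minimal sketch, assuming the column order shown above (NAME, CPU, CPU%, MEMORY, MEMORY%); the function name hot_nodes and the second sample row are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print nodes whose CPU% (third column) exceeds a threshold.
# hot_nodes THRESHOLD reads "kubectl top nodes --no-headers" style text on stdin.
hot_nodes() {
  awk -v limit="$1" '{
    cpu = $3
    sub(/%$/, "", cpu)                  # strip the trailing percent sign
    if (cpu + 0 > limit + 0) print $1   # numeric comparison
  }'
}

# Sample input so the sketch runs without a cluster or metrics-server.
printf '%s\n' \
  'fgedu-worker-1 5.0 62% 10Gi 62%' \
  'fgedu-worker-2 1.0 12% 2Gi 12%' | hot_nodes 60

# Against a real cluster:
#   kubectl top nodes --no-headers | hot_nodes 60
```

With the sample input and a threshold of 60, only fgedu-worker-1 is printed.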

# Make sure the node is schedulable (same effect as kubectl uncordon)
[root@rancher ~]# kubectl patch node fgedu-worker-1 -p '{"spec":{"unschedulable":false}}'

# View node details
[root@rancher ~]# kubectl describe node fgedu-worker-1 | grep -A 10 Capacity
Capacity:
cpu: 8
ephemeral-storage: 100Gi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16384200Ki
pods: 110
Allocatable:
cpu: 8
ephemeral-storage: 100Gi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16384200Ki
pods: 110

# Tune node kernel parameters
[root@fgedu-worker-1 ~]# cat >> /etc/sysctl.conf <<EOF
[...]
[root@fgedu-worker-1 ~]# cat > /etc/docker/daemon.json <<EOF
[...]
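
The kernel parameter values are not shown above. As a hedged illustration only, the settings below are ones commonly required on Kubernetes nodes so that bridged Pod traffic passes through iptables and the host forwards packets. They are an assumption rather than the values used in this tutorial, and render_k8s_sysctl is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: kernel parameters commonly set on Kubernetes nodes.
# These values are an assumption, not the tutorial's own settings.
render_k8s_sysctl() {
  cat <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
}

render_k8s_sysctl

# On the node (as root), one way to apply them:
#   modprobe br_netfilter
#   render_k8s_sysctl > /etc/sysctl.d/99-kubernetes.conf
#   sysctl --system
```

Writing a separate file under /etc/sysctl.d keeps these settings apart from the distribution defaults in /etc/sysctl.conf, which makes them easier to review and roll back.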

Production recommendation: check node health regularly and maintain nodes promptly. Tune node performance to improve cluster availability and performance.

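Part05-风哥's Lessons Learned

5.1 Rancher Node Best Practices

Node best practices:

  • Regular maintenance: maintain nodes on a fixed schedule
  • Health checks: check node health regularly
  • Resource optimization: tune node resource usage
  • Performance optimization: tune node performance
  • Monitoring and alerting: set up monitoring and alerts for nodes
  • Documentation: record node configuration and changes
  • Configuration backup: back up node configuration

5.2 Troubleshooting Rancher Nodes

Node troubleshooting:

# Common Rancher node problems and how to resolve them

# Problem 1: node status is NotReady
# Symptom: the node shows NotReady
# Causes: kubelet or agent service failure, network unreachable, insufficient resources
# Resolution (on RKE2 nodes the kubelet runs under the rke2-agent service):
[root@rancher ~]# kubectl get nodes
[root@rancher ~]# kubectl describe node fgedu-worker-1
[root@fgedu-worker-1 ~]# systemctl status rke2-agent
[root@fgedu-worker-1 ~]# journalctl -u rke2-agent -n 50

# Problem 2: Pods cannot be scheduled onto the node
# Symptom: Pods are not scheduled onto the node
# Causes: node is cordoned, insufficient resources, taints on the node
# Resolution:
[root@rancher ~]# kubectl get nodes
[root@rancher ~]# kubectl describe node fgedu-worker-1
[root@rancher ~]# kubectl describe pod <pod-name> -n <namespace>
[root@rancher ~]# kubectl get events --all-namespaces

# Problem 3: node is short of resources
# Symptom: high resource utilization on the node
# Causes: too many Pods, badly sized resource limits, resource-hungry applications
# Resolution:
[root@rancher ~]# kubectl top nodes
[root@rancher ~]# kubectl top pods --all-namespaces
[root@rancher ~]# kubectl describe node fgedu-worker-1
[root@rancher ~]# kubectl get pods --all-namespaces -o wide

# Problem 4: node network failure
# Symptom: the node cannot be reached over the network
# Causes: wrong network configuration, firewall rules, high network latency
# Resolution:
[root@rancher ~]# kubectl get nodes
[root@rancher ~]# kubectl describe node fgedu-worker-1
[root@fgedu-worker-1 ~]# ping 192.168.1.10
[root@fgedu-worker-1 ~]# traceroute 192.168.1.10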
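
For Problem 1, the first step is usually to find which nodes are not Ready. A small sketch of that filter, assuming the STATUS value is the second column of the node listing; the function name not_ready_nodes and the sample rows are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list nodes whose STATUS does not start with "Ready".
# Reads "kubectl get nodes --no-headers" style text on stdin.
not_ready_nodes() {
  # "Ready" and "Ready,SchedulingDisabled" are skipped; "NotReady" is reported.
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Sample input so the sketch runs without a cluster.
printf '%s\n' \
  'fgedu-worker-1 NotReady <none> 10m v1.28.0' \
  'fgedu-worker-2 Ready,SchedulingDisabled <none> 10m v1.28.0' \
  'fgedu-worker-3 Ready <none> 10m v1.28.0' | not_ready_nodes

# Against a real cluster:
#   kubectl get nodes --no-headers | not_ready_nodes
```

With the sample input, only fgedu-worker-1 is printed; a cordoned but healthy node is not reported as a failure.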

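5.3 Rancher Node Maintenance

Node maintenance:

# Rancher node maintenance recommendations

# 1. Regular checks
- Check node status
- Check node resources
- Check node health
- Check node networking

# 2. Regular tuning
- Tune node resources
- Tune node performance
- Tune node configuration
- Tune node networking

# 3. Regular backups
- Back up node configuration
- Back up node data
- Back up node logs
- Back up node certificates

# 4. Regular cleanup
- Clean up unused Pods
- Clean up unused images
- Clean up old logs
- Clean up stale caches

# 5. Regular audits
- Audit node configuration
- Audit node changes
- Audit node logs
- Audit operation records

风哥 tip: Rancher node management helps operations engineers manage cluster nodes. Maintain nodes regularly to keep them healthy, and tune node performance to improve cluster availability and performance.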

Compiled and published by 风哥教程 for learning and testing purposes only. Please credit the source when reposting: http://www.fgedu.net.cn/10327.html
