Kubernetes Tutorial FG052-Kubernetes Taints and Tolerations in Practice

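In this document, Fengge (风哥) covers Kubernetes taints and tolerations in practice: an overview of taints, an overview of tolerations, use cases, taint planning, toleration planning, best-practice planning, implementing taints, implementing tolerations, day-to-day management, a taints case study, a tolerations case study, and an integration case study. The tutorial draws on the official Kubernetes documentation and scheduling-related documentation, and is aimed at developers and operations engineers who want to use and understand Kubernetes scheduling.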

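Part01-Basic Concepts and Theory

1.1 Taints Overview

A taint is a mark placed on a node that makes the node repel Pods: the scheduler does not place a Pod on a tainted node unless the Pod carries a matching toleration.

The main characteristics of taints are:

Node marking: a taint is added to a node so that the node repels Pods that do not tolerate it
Three effects: NoSchedule (Pods that do not tolerate the taint are not scheduled onto the node), PreferNoSchedule (the scheduler tries to avoid the node but may still use it), and NoExecute (Pods that do not tolerate the taint are not scheduled, and Pods already running on the node are evicted)
Flexible matching: each taint consists of a key, an optional value, and an effect, written key=value:effect
Works with tolerations: taints and tolerations together give fine-grained control over scheduling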
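
To make the key=value:effect notation concrete, here is a minimal Python sketch that splits a taint string into its three parts and rejects unknown effects. It is an illustration only, not how kubectl itself parses its arguments; the names parse_taint and Taint and the maintenance key are hypothetical.

```python
from typing import NamedTuple, Optional

# The three taint effects named in this section.
VALID_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


class Taint(NamedTuple):
    key: str
    value: Optional[str]
    effect: str


def parse_taint(spec: str) -> Taint:
    """Split a 'key=value:effect' or 'key:effect' string into its parts."""
    body, sep, effect = spec.rpartition(":")
    if not sep or effect not in VALID_EFFECTS:
        raise ValueError(f"missing or unknown effect in {spec!r}")
    key, eq, value = body.partition("=")
    if not key:
        raise ValueError(f"missing key in {spec!r}")
    # A taint without '=' has no value at all.
    return Taint(key, value if eq else None, effect)


if __name__ == "__main__":
    for spec in ("dedicated=master:NoSchedule", "maintenance:NoExecute"):
        print(parse_taint(spec))
```

The effect is split off from the right, so keys that contain dots or slashes are left intact.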

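1.2 Tolerations Overview

A toleration is set on a Pod and allows the Pod to be scheduled onto a node that carries a matching taint. A toleration only permits placement on the tainted node; it does not force the Pod onto that node.

The main characteristics of tolerations are:

Pod marking: a toleration is added to a Pod so that the Pod can tolerate a node's taints
Matching rules: the Equal operator matches key and value exactly, while the Exists operator matches on the key alone, whatever the value
Works with taints: taints and tolerations together give fine-grained control over scheduling
Flexible configuration: a Pod can carry a separate toleration for each taint it needs to tolerate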
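
The matching rules above can be written down as a small predicate. The following Python sketch models how a single toleration is compared with a single taint, using plain dictionaries with the same field names as the Pod and Node manifests. It is a simplified model for study, not the scheduler's own code; the function name tolerates is an assumption.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if the toleration matches the taint."""
    # An empty effect in the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    # An empty key (only meaningful with operator Exists) matches every key.
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    operator = toleration.get("operator") or "Equal"  # Equal is the default
    if operator == "Exists":
        return True  # the value is ignored
    if operator == "Equal":
        return (toleration.get("value") or "") == (taint.get("value") or "")
    return False  # operators outside this tutorial are treated as no match


if __name__ == "__main__":
    taint = {"key": "dedicated", "value": "master", "effect": "NoSchedule"}
    equal = {"key": "dedicated", "operator": "Equal", "value": "master", "effect": "NoSchedule"}
    exists = {"key": "dedicated", "operator": "Exists"}
    print(tolerates(equal, taint), tolerates(exists, taint))
```

Note that the effect is part of the match: a toleration for NoSchedule does not cover a NoExecute taint with the same key.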

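1.3 Use Cases

Typical use cases for taints and tolerations:

Dedicated nodes: reserve certain nodes for a particular type of workload
Resource isolation: keep different types of workloads apart to avoid resource contention
Node maintenance: prevent new Pods from being scheduled onto a node while it is under maintenance
High availability: ensure that critical workloads can be scheduled onto suitable nodes
Security isolation: place workloads with different security levels on different nodes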

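Part02-Production Planning and Recommendations

2.1 Taints Planning

Planning for Kubernetes taints:

# Taints planning
– Goals:
  – Isolate nodes
  – Ensure that specific workloads are scheduled onto suitable nodes
  – Improve resource utilization
  – Strengthen cluster security
– Scope:
  – Node role definition
  – Taint policy design
  – Effect selection
  – Testing and validation
– Tools:
  – kubectl: manage node taints
  – Kubernetes Dashboard: visual management
  – Prometheus: monitoring
  – Grafana: visualizing monitoring data
– Process:
  – Node role definition: define the role and purpose of each node
  – Taint policy design: design the taint policy from the node roles
  – Effect selection: choose the appropriate effect (NoSchedule, PreferNoSchedule, NoExecute)
  – Testing and validation: test the taints and confirm they behave as intended
  – Monitoring and adjustment: monitor Pod scheduling and adjust the taint policy as needed
– Resources:
  – People: cluster administrators, operations engineers
  – Time: planning, deployment, and testing time
  – Infrastructure: compute, storage, and network resources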
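2.2 Tolerations Planning

Planning for Kubernetes tolerations:

# Tolerations planning
– Goals:
  – Allow Pods to tolerate node taints
  – Ensure that Pods are scheduled onto suitable nodes
  – Improve the deployment success rate of Pods
  – Make the cluster more flexible
– Scope:
  – Pod role definition
  – Toleration policy design
  – Matching rule selection
  – Testing and validation
– Tools:
  – kubectl: manage Pod tolerations
  – Kubernetes Dashboard: visual management
  – Prometheus: monitoring
  – Grafana: visualizing monitoring data
– Process:
  – Pod role definition: define the role and purpose of each kind of Pod
  – Toleration policy design: design the toleration policy from the Pod roles
  – Matching rule selection: choose the appropriate operator (Equal, Exists)
  – Testing and validation: test the tolerations and confirm they behave as intended
  – Monitoring and adjustment: monitor Pod scheduling and adjust the toleration policy as needed
– Resources:
  – People: cluster administrators, operations engineers, developers
  – Time: planning, deployment, and testing time
  – Infrastructure: compute, storage, and network resources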

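2.3 Best-Practice Planning

Best-practice planning for Kubernetes taints and tolerations:

# Best-practice planning
– Taints best practices:
  – Choose effects deliberately: pick the effect that matches the requirement
  – Standardize taint naming: use clear, consistent naming rules for taint keys and values
  – Avoid overusing taints: too many taints on the nodes can leave Pods with nowhere to be scheduled
  – Review taints regularly: review and remove taints that are no longer needed
  – Combine with node labels: use node labels alongside taints for finer scheduling control
– Tolerations best practices:
  – Minimize tolerations: add only the tolerations a Pod needs and avoid tolerating too much
  – Choose matching rules deliberately: pick the operator that matches the requirement
  – Test tolerations: verify tolerations in a test environment before using them in production
  – Document tolerations: document every toleration to make maintenance and audits easier
  – Combine with affinity: a toleration only allows a Pod onto a tainted node, so add node affinity or Pod affinity when the Pod must actually land there
– Deployment best practices:
  – Use Deployments: manage Pods through a Deployment so that failed Pods are replaced
  – Use StatefulSets: manage the Pods of stateful applications through a StatefulSet
  – Set resource requests and limits: configure requests and limits on Pods to keep resource usage reasonable
  – Monitor Pod status: monitor Pods so that problems are found and handled promptly
– Operations best practices:
  – Document taints and tolerations: document every taint and toleration to make maintenance and audits easier
  – Training: train developers and operations engineers in scheduling management
  – Review configuration regularly: review and update taints and tolerations so they keep matching application needs
  – Continuous improvement: keep refining the scheduling policy to improve scheduling efficiency and resource utilization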

Part03-Production Implementation

3.1 Implementing Taints

Steps for implementing taints:

# Implementing taints
1. Add taints to the nodes:
# List the nodes
$ kubectl get nodes
# Output
NAME    STATUS   ROLES                  AGE   VERSION
node1   Ready    control-plane,master   1d    v1.24.0
node2   Ready    <none>                 1d    v1.24.0
node3   Ready    <none>                 1d    v1.24.0
# Add taints to the nodes
$ kubectl taint nodes node1 dedicated=master:NoSchedule
$ kubectl taint nodes node2 dedicated=worker:PreferNoSchedule
$ kubectl taint nodes node3 dedicated=database:NoExecute
# View the node taints
$ kubectl describe nodes | grep Taints
# Output
Taints: dedicated=master:NoSchedule
Taints: dedicated=worker:PreferNoSchedule
Taints: dedicated=database:NoExecute
2. Test the effect of the taints:
# Create an ordinary Pod with no tolerations
$ cat > normal-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: normal-pod
spec:
  containers:
  - name: normal-container
    image: nginx:latest
EOF
$ kubectl apply -f normal-pod.yaml
# Check where the Pod was scheduled
$ kubectl get pods -o wide
# Output: node1 (NoSchedule) and node3 (NoExecute) are excluded; the PreferNoSchedule taint on node2 is only a preference, so the Pod lands on node2
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
normal-pod   1/1     Running   0          1m    10.244.1.2   node2   <none>           <none>
3. Test the NoSchedule effect:
# Create a Pod that may only run on node1. A nodeSelector is used instead of spec.nodeName, because nodeName bypasses the scheduler and the NoSchedule taint would never be checked
$ cat > master-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: master-pod
spec:
  containers:
  - name: master-container
    image: nginx:latest
  nodeSelector:
    kubernetes.io/hostname: node1
EOF
$ kubectl apply -f master-pod.yaml
# Check the Pod status
$ kubectl get pods
# Output
NAME         READY   STATUS    RESTARTS   AGE
master-pod   0/1     Pending   0          1m
normal-pod   1/1     Running   0          2m
# Check the Pod events
$ kubectl describe pod master-pod
# Output (abridged; the exact wording varies between Kubernetes versions)
Events:
Type     Reason            Age   From               Message
----     ------            ----  ----               -------
Warning  FailedScheduling  1m    default-scheduler  0/3 nodes are available: 1 node(s) had taint {dedicated: master}, that the pod didn't tolerate, 2 node(s) didn't match Pod's node affinity/selector.
4. Test the NoExecute effect:
# Bind a Pod directly to node3 with spec.nodeName, which bypasses the scheduler
$ cat > database-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: database-pod
spec:
  containers:
  - name: database-container
    image: nginx:latest
  nodeName: node3
EOF
$ kubectl apply -f database-pod.yaml
# Check the Pod status
$ kubectl get pods -o wide
# Output: database-pod does not tolerate the NoExecute taint on node3, so it is evicted shortly after being bound and no longer appears in the list
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE     NOMINATED NODE   READINESS GATES
master-pod   0/1     Pending   0          2m    <none>       <none>   <none>           <none>
normal-pod   1/1     Running   0          3m    10.244.1.2   node2    <none>           <none>
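5. Clean up:
# Delete the Pods (database-pod may already have been evicted, hence --ignore-not-found)
$ kubectl delete pod normal-pod master-pod database-pod --ignore-not-found
# Remove the node taints
$ kubectl taint nodes node1 dedicated-
$ kubectl taint nodes node2 dedicated-
$ kubectl taint nodes node3 dedicated-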
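
The walkthrough above can be summarized as a rule: an untolerated NoSchedule or NoExecute taint removes a node from consideration, while an untolerated PreferNoSchedule taint only makes the node less attractive. The Python sketch below models that rule for the three nodes used in this section. It is a simplified illustration under that assumption, not the scheduler's implementation, and the names evaluate_node and NODES are hypothetical.

```python
def tolerates(tol: dict, taint: dict) -> bool:
    """Simplified match: an empty effect or key in the toleration is a wildcard."""
    if tol.get("effect") and tol["effect"] != taint["effect"]:
        return False
    if tol.get("key") and tol["key"] != taint["key"]:
        return False
    if tol.get("operator") == "Exists":
        return True
    return (tol.get("value") or "") == (taint.get("value") or "")


def evaluate_node(taints: list, tolerations: list) -> tuple:
    """Return ("filtered", 0) when a hard taint is untolerated, otherwise
    ("feasible", n) where n counts untolerated PreferNoSchedule taints."""
    soft = 0
    for taint in taints:
        if any(tolerates(t, taint) for t in tolerations):
            continue
        if taint["effect"] in ("NoSchedule", "NoExecute"):
            return ("filtered", 0)
        soft += 1  # PreferNoSchedule only lowers the node's preference
    return ("feasible", soft)


# Taints applied to the three nodes in step 1 of this section.
NODES = {
    "node1": [{"key": "dedicated", "value": "master", "effect": "NoSchedule"}],
    "node2": [{"key": "dedicated", "value": "worker", "effect": "PreferNoSchedule"}],
    "node3": [{"key": "dedicated", "value": "database", "effect": "NoExecute"}],
}

if __name__ == "__main__":
    # A Pod without tolerations, like normal-pod.
    for name, taints in NODES.items():
        print(name, evaluate_node(taints, tolerations=[]))
```

For a Pod without tolerations only node2 remains feasible, which is consistent with normal-pod landing on node2.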

3.2 Implementing Tolerations

Steps for implementing tolerations:

# Implementing tolerations
1. Add taints to the nodes:
# Add taints to the nodes
$ kubectl taint nodes node1 dedicated=master:NoSchedule
$ kubectl taint nodes node2 dedicated=worker:PreferNoSchedule
$ kubectl taint nodes node3 dedicated=database:NoExecute
2. Create a Pod with a toleration:
# Create a Pod that tolerates dedicated=master:NoSchedule
$ cat > toleration-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: toleration-pod
spec:
  containers:
  - name: toleration-container
    image: nginx:latest
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "master"
    effect: "NoSchedule"
EOF
$ kubectl apply -f toleration-pod.yaml
# Check where the Pod was scheduled
$ kubectl get pods -o wide
# Output: the toleration permits node1; node3 is excluded by its NoExecute taint, and the untolerated PreferNoSchedule taint makes node2 less preferred
NAME             READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
toleration-pod   1/1     Running   0          1m    10.244.0.2   node1   <none>           <none>
3. Create a Pod that uses the Exists operator:
# Create a Pod that tolerates any value of the dedicated key with effect NoSchedule
$ cat > exists-toleration-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: exists-toleration-pod
spec:
  containers:
  - name: exists-toleration-container
    image: nginx:latest
  tolerations:
  - key: "dedicated"
    operator: "Exists"
    effect: "NoSchedule"
EOF
$ kubectl apply -f exists-toleration-pod.yaml
# Check where the Pod was scheduled
$ kubectl get pods -o wide
# Output
NAME                    READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
exists-toleration-pod   1/1     Running   0          1m    10.244.0.3   node1   <none>           <none>
toleration-pod          1/1     Running   0          2m    10.244.0.2   node1   <none>           <none>
4. Test the NoExecute effect:
# Create a Pod with a NoExecute toleration
$ cat > noexecute-toleration-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: noexecute-toleration-pod
spec:
  containers:
  - name: noexecute-toleration-container
    image: nginx:latest
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "database"
    effect: "NoExecute"
EOF
$ kubectl apply -f noexecute-toleration-pod.yaml
# Check where the Pod was scheduled
$ kubectl get pods -o wide
# Output
NAME                       READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
exists-toleration-pod      1/1     Running   0          2m    10.244.0.3   node1   <none>           <none>
noexecute-toleration-pod   1/1     Running   0          1m    10.244.2.2   node3   <none>           <none>
toleration-pod             1/1     Running   0          3m    10.244.0.2   node1   <none>           <none>
5. Clean up:
# Delete the Pods
$ kubectl delete pod toleration-pod exists-toleration-pod noexecute-toleration-pod
# Remove the node taints
$ kubectl taint nodes node1 dedicated-
$ kubectl taint nodes node2 dedicated-
$ kubectl taint nodes node3 dedicated-
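
The NoExecute test above uses a toleration without a time limit, so the Pod stays on node3 indefinitely. The Kubernetes Toleration type also has an optional tolerationSeconds field, not used in this walkthrough, that bounds how long a Pod may keep running after a NoExecute taint appears. The Python sketch below is a simplified model of how the eviction delay could be derived; the function name eviction_delay is hypothetical, and the logic is an approximation for study rather than the controller's code.

```python
from typing import Optional


def tolerates(tol: dict, taint: dict) -> bool:
    """Simplified match: an empty effect or key in the toleration is a wildcard."""
    if tol.get("effect") and tol["effect"] != taint["effect"]:
        return False
    if tol.get("key") and tol["key"] != taint["key"]:
        return False
    if tol.get("operator") == "Exists":
        return True
    return (tol.get("value") or "") == (taint.get("value") or "")


def eviction_delay(taints: list, tolerations: list) -> Optional[int]:
    """Seconds before a running Pod is evicted: 0 = immediately, None = never."""
    delays = []
    for taint in taints:
        if taint["effect"] != "NoExecute":
            continue  # only NoExecute taints evict running Pods
        match = next((t for t in tolerations if tolerates(t, taint)), None)
        if match is None:
            return 0  # an untolerated NoExecute taint evicts at once
        if match.get("tolerationSeconds") is not None:
            delays.append(max(0, match["tolerationSeconds"]))
    # With no time-limited toleration involved, the Pod is never evicted.
    return min(delays) if delays else None


if __name__ == "__main__":
    node3 = [{"key": "dedicated", "value": "database", "effect": "NoExecute"}]
    tol = {"key": "dedicated", "operator": "Equal", "value": "database", "effect": "NoExecute"}
    print(eviction_delay(node3, []))
    print(eviction_delay(node3, [tol]))
    print(eviction_delay(node3, [dict(tol, tolerationSeconds=300)]))
```

In a manifest, the field sits next to key, operator, value and effect inside the toleration entry.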

3.3 Implementing Management

Steps for managing taints and tolerations:

# Management
1. Monitor Pod scheduling:
# Install Prometheus and Grafana
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install prometheus prometheus-community/kube-prometheus-stack
# Open the monitoring dashboards
$ kubectl port-forward deployment/prometheus-grafana 3000:3000
# Browse to http://localhost:3000
2. Configure a scheduling alert:
# Create the alert rule
$ cat > scheduling-alert.yaml << 'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: scheduling-alerts
  labels:
    # With the chart's default values, rules are only loaded if they carry the Helm release name as a label
    release: prometheus
spec:
  groups:
  - name: pod-scheduling
    rules:
    - alert: PodPending
      expr: sum by (namespace, pod) (kube_pod_status_phase{phase="Pending"}) > 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Pod pending"
        description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} has been pending for more than 5 minutes"
EOF
$ kubectl apply -f scheduling-alert.yaml
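3. Manage node taints:
# List node taints (the Taints line shows only the first taint of each node)
$ kubectl describe nodes | grep Taints
# Add a taint
$ kubectl taint nodes node1 dedicated=master:NoSchedule
# Change a taint's effect. A taint is identified by its key and effect, so --overwrite can only replace the value of an existing key/effect pair; to change the effect, remove the old taint and add the new one
$ kubectl taint nodes node1 dedicated=master:NoSchedule-
$ kubectl taint nodes node1 dedicated=master:PreferNoSchedule
# Remove every taint with the key dedicated
$ kubectl taint nodes node1 dedicated-
4. Manage Pod tolerations:
# View a Pod's tolerations (assumes a Pod named toleration-pod, as created in section 3.2, exists)
$ kubectl get pod toleration-pod -o jsonpath='{.spec.tolerations}'
# Update a Pod's tolerations. On an existing Pod, tolerations can only be added, not changed or removed, so append one with a JSON patch; for Pods managed by a Deployment or StatefulSet, edit the Pod template instead
$ kubectl patch pod toleration-pod --type=json -p '[{"op":"add","path":"/spec/tolerations/-","value":{"key":"dedicated","operator":"Equal","value":"master","effect":"PreferNoSchedule"}}]'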
5. Configure a taint and toleration policy:
# Add taints to the nodes
$ kubectl taint nodes node1 dedicated=master:NoSchedule
$ kubectl taint nodes node2 dedicated=worker:PreferNoSchedule
# Create a Deployment
$ cat > app-deployment.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app-container
        image: nginx:latest
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "worker"
        effect: "PreferNoSchedule"
EOF
$ kubectl apply -f app-deployment.yaml
6. Test the taint and toleration policy:
# Check where the Pods were scheduled
$ kubectl get pods -o wide
# Output: node1 is excluded by its NoSchedule taint; node2 is tolerated and node3 is untainted
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
app-deployment-675949546d-2q5k2   1/1     Running   0          1m    10.244.1.2   node2   <none>           <none>
app-deployment-675949546d-5b7c8   1/1     Running   0          1m    10.244.1.3   node2   <none>           <none>
app-deployment-675949546d-7f8d9   1/1     Running   0          1m    10.244.2.2   node3   <none>           <none>
7. Clean up:
# Delete the Deployment
$ kubectl delete deployment app-deployment
# Remove the node taints
$ kubectl taint nodes node1 dedicated-
$ kubectl taint nodes node2 dedicated-
# Delete the alert rule
$ kubectl delete prometheusrule scheduling-alerts
# Uninstall Prometheus and Grafana
$ helm uninstall prometheus
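
Because kubectl describe nodes | grep Taints prints only the first taint of each node, an audit that needs every taint is easier to run against the JSON form of the node list. The Python sketch below extracts all taints per node from data shaped like kubectl get nodes -o json output. The SAMPLE document is hypothetical and trimmed to the fields used, and the function name taints_by_node is an assumption for illustration.

```python
import json

# Hypothetical sample in the shape of `kubectl get nodes -o json` output,
# trimmed to the fields used here.
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "node1"},
         "spec": {"taints": [
             {"key": "dedicated", "value": "master", "effect": "NoSchedule"},
             {"key": "dedicated", "value": "master", "effect": "PreferNoSchedule"},
         ]}},
        {"metadata": {"name": "node2"}, "spec": {}},
    ]
})


def taints_by_node(nodes_json: str) -> dict:
    """Map each node name to its taints rendered as key=value:effect strings."""
    result = {}
    for item in json.loads(nodes_json)["items"]:
        # Untainted nodes have no spec.taints field at all.
        taints = item.get("spec", {}).get("taints") or []
        result[item["metadata"]["name"]] = [
            f"{t['key']}={t['value']}:{t['effect']}" if t.get("value")
            else f"{t['key']}:{t['effect']}"
            for t in taints
        ]
    return result


if __name__ == "__main__":
    for node, taints in taints_by_node(SAMPLE).items():
        print(node, taints or "no taints")
```

Against a real cluster, the same function could be fed the text captured from kubectl get nodes -o json instead of SAMPLE.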

Part04-Production Cases and Walkthroughs

4.1 Taints Case Study

A practical case for taints.

# Case: node isolation with taints
# Scenario: in a Kubernetes cluster, use taints to isolate nodes so that different types of workloads run on different nodes
# Problem:
– The cluster contains different types of nodes, such as GPU nodes, storage nodes, and ordinary nodes
– Different types of workloads must be scheduled onto suitable nodes
– Ordinary workloads must not consume the resources of the special nodes
# Solution:
1. Add taints to the nodes:
# List the nodes
$ kubectl get nodes
# Output
NAME    STATUS   ROLES                  AGE   VERSION
node1   Ready    control-plane,master   1d    v1.24.0
node2   Ready    <none>                 1d    v1.24.0
node3   Ready    <none>                 1d    v1.24.0
# Taint the GPU node
$ kubectl taint nodes node2 hardware=gpu:NoSchedule
# Taint the storage node
$ kubectl taint nodes node3 storage=high:NoSchedule
# View the node taints
$ kubectl describe nodes | grep Taints
# Output
Taints: <none>
Taints: hardware=gpu:NoSchedule
Taints: storage=high:NoSchedule
2. Test scheduling of an ordinary Pod:
# Create an ordinary Pod
$ cat > normal-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: normal-pod
spec:
  containers:
  - name: normal-container
    image: nginx:latest
EOF
$ kubectl apply -f normal-pod.yaml
# Check where the Pod was scheduled
$ kubectl get pods -o wide
# Output
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
normal-pod   1/1     Running   0          1m    10.244.0.2   node1   <none>           <none>
3. Test scheduling of a GPU Pod:
# Create a GPU Pod
$ cat > gpu-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
  - name: gpu-container
    image: tensorflow/tensorflow:latest-gpu
    resources:
      requests:
        nvidia.com/gpu: 1
      limits:
        nvidia.com/gpu: 1
  tolerations:
  - key: "hardware"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
EOF
$ kubectl apply -f gpu-pod.yaml
# Check where the Pod was scheduled
$ kubectl get pods -o wide
# Output
NAME         READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
gpu-pod      1/1     Running   0          1m    10.244.1.2   node2   <none>           <none>
normal-pod   1/1     Running   0          2m    10.244.0.2   node1   <none>           <none>
4. Test scheduling of a storage Pod:
# Create a storage Pod
$ cat > storage-pod.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: storage-pod
spec:
  containers:
  - name: storage-container
    image: mysql:8.0
    env:
    - name: MYSQL_ROOT_PASSWORD
      value: password
  tolerations:
  - key: "storage"
    operator: "Equal"
    value: "high"
    effect: "NoSchedule"
EOF
$ kubectl apply -f storage-pod.yaml
# Check where the Pod was scheduled
$ kubectl get pods -o wide
# Output
NAME          READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
gpu-pod       1/1     Running   0          2m    10.244.1.2   node2   <none>           <none>
normal-pod    1/1     Running   0          3m    10.244.0.2   node1   <none>           <none>
storage-pod   1/1     Running   0          1m    10.244.2.2   node3   <none>           <none>
5. Clean up:
# Delete the Pods
$ kubectl delete pod normal-pod gpu-pod storage-pod
# Remove the node taints
$ kubectl taint nodes node2 hardware-
$ kubectl taint nodes node3 storage-
# Results:
# The taints were configured successfully
# The ordinary Pod was scheduled onto the ordinary node
# The GPU Pod was scheduled onto the GPU node
# The storage Pod was scheduled onto the storage node
# Node isolation was achieved

4.2 Tolerations Case Study

A practical case for tolerations.

# Case: using tolerations to schedule Pods onto specific nodes
# Scenario: in a Kubernetes cluster, use tolerations so that Pods can be scheduled onto tainted nodes
# Problem:
– Some nodes in the cluster carry taints
– Specific Pods must be scheduled onto those nodes
– The Pods must tolerate the nodes' taints
# Solution:
1. Add taints to the nodes:
# Add taints to the nodes
$ kubectl taint nodes node1 dedicated=master:NoSchedule
$ kubectl taint nodes node2 dedicated=worker:PreferNoSchedule
2. Create a Deployment with a toleration:
# Create the Deployment
$ cat > app-deployment.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app-container
        image: nginx:latest
      tolerations:
      - key: "dedicated"
        operator: "Exists"
        effect: "NoSchedule"
EOF
$ kubectl apply -f app-deployment.yaml
3. Check where the Pods were scheduled:
# List the Pods
$ kubectl get pods -o wide
# Output
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
app-deployment-675949546d-2q5k2   1/1     Running   0          1m    10.244.0.2   node1   <none>           <none>
app-deployment-675949546d-5b7c8   1/1     Running   0          1m    10.244.1.2   node2   <none>           <none>
app-deployment-675949546d-7f8d9   1/1     Running   0          1m    10.244.2.2   node3   <none>           <none>
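4. Test the NoExecute effect:
# Add a NoExecute taint to node3
$ kubectl taint nodes node3 dedicated=test:NoExecute
# Check the Pod status
$ kubectl get pods -o wide
# Output (illustrative): the Pod on node3 tolerates only NoSchedule, so it is evicted and the Deployment creates a replacement on another node; xxxxx stands for the generated suffix
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
app-deployment-675949546d-2q5k2   1/1     Running   0          2m    10.244.0.2   node1   <none>           <none>
app-deployment-675949546d-5b7c8   1/1     Running   0          2m    10.244.1.2   node2   <none>           <none>
app-deployment-675949546d-xxxxx   1/1     Running   0          30s   10.244.0.3   node1   <none>           <none>
# Add a NoExecute toleration to the Deployment's Pod template
$ kubectl patch deployment app-deployment -p '{"spec":{"template":{"spec":{"tolerations":[{"key":"dedicated","operator":"Exists","effect":"NoSchedule"},{"key":"dedicated","operator":"Exists","effect":"NoExecute"}]}}}}'
# Check the Pod status
$ kubectl get pods -o wide
# Output (illustrative): changing the Pod template triggers a rolling update, so the Pods are recreated under a new ReplicaSet hash and can be placed on node3 again
NAME                                 READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
app-deployment-<new-hash>-<suffix>   1/1     Running   0          1m    10.244.0.4   node1   <none>           <none>
app-deployment-<new-hash>-<suffix>   1/1     Running   0          1m    10.244.1.3   node2   <none>           <none>
app-deployment-<new-hash>-<suffix>   1/1     Running   0          1m    10.244.2.3   node3   <none>           <none>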
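5. Clean up:
# Delete the Deployment
$ kubectl delete deployment app-deployment
# Remove the node taints
$ kubectl taint nodes node1 dedicated-
$ kubectl taint nodes node2 dedicated-
$ kubectl taint nodes node3 dedicated-
# Results:
# The tolerations were configured successfully
# Pods could be scheduled onto tainted nodes
# The NoExecute taint evicted Pods that did not tolerate it
# Pods with matching tolerations were able to run on the tainted nodes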

4.3 Integration Case Study

An integration case for taints and tolerations.

# Case: deploying a microservice application with taints, tolerations, and node affinity
# Scenario: deploy a microservice application made up of a frontend, a backend, and a database, using taints, tolerations, and node affinity to control scheduling
# Problem:
– The components of the application have different resource requirements
– The frontend and backend need low-latency communication
– The database needs high I/O performance
– The application must be highly available
# Solution:
1. Add labels and taints to the nodes:
# List the nodes
$ kubectl get nodes
# Output
NAME    STATUS   ROLES                  AGE   VERSION
node1   Ready    control-plane,master   1d    v1.24.0
node2   Ready    <none>                 1d    v1.24.0
node3   Ready    <none>                 1d    v1.24.0
# Label the nodes
$ kubectl label nodes node1 role=frontend
$ kubectl label nodes node2 role=backend
$ kubectl label nodes node3 role=database
# Taint the nodes
$ kubectl taint nodes node1 role=frontend:NoSchedule
$ kubectl taint nodes node2 role=backend:NoSchedule
$ kubectl taint nodes node3 role=database:NoSchedule
# View the node labels and taints
$ kubectl get nodes --show-labels
$ kubectl describe nodes | grep Taints
2. Create the frontend Deployment:
# Create the Deployment
$ cat > frontend-deployment.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend-container
        image: nginx:latest
        ports:
        - containerPort: 80
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: role
                operator: In
                values:
                - frontend
      tolerations:
      - key: "role"
        operator: "Equal"
        value: "frontend"
        effect: "NoSchedule"
EOF
$ kubectl apply -f frontend-deployment.yaml
3. Create the backend Deployment:
# Create the Deployment
$ cat > backend-deployment.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: backend-container
        # Placeholder image; substitute an image that runs a long-lived server process
        image: node:latest
        ports:
        - containerPort: 3000
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: role
                operator: In
                values:
                - backend
      tolerations:
      - key: "role"
        operator: "Equal"
        value: "backend"
        effect: "NoSchedule"
EOF
$ kubectl apply -f backend-deployment.yaml
4. Create the database StatefulSet:
# Create the StatefulSet
$ cat > database-statefulset.yaml << 'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database-statefulset
spec:
  serviceName: database
  # One replica: only node3 is labelled role=database, and the required Pod anti-affinity below allows at most one database Pod per node, so a second replica would stay Pending
  replicas: 1
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: database-container
        image: mysql:8.0
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: password
        - name: MYSQL_DATABASE
          value: fgedudb
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: role
                operator: In
                values:
                - database
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - database
            topologyKey: kubernetes.io/hostname
      tolerations:
      - key: "role"
        operator: "Equal"
        value: "database"
        effect: "NoSchedule"
EOF
$ kubectl apply -f database-statefulset.yaml
5. Check where the Pods were scheduled:
# List the Pods
$ kubectl get pods -o wide
# Output
NAME                                   READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
frontend-deployment-675949546d-2q5k2   1/1     Running   0          1m    10.244.0.2   node1   <none>           <none>
frontend-deployment-675949546d-5b7c8   1/1     Running   0          1m    10.244.0.3   node1   <none>           <none>
backend-deployment-675949546d-7f8d9    1/1     Running   0          1m    10.244.1.2   node2   <none>           <none>
backend-deployment-675949546d-9p6q7    1/1     Running   0          1m    10.244.1.3   node2   <none>           <none>
database-statefulset-0                 1/1     Running   0          1m    10.244.2.2   node3   <none>           <none>
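6. Test high availability:
# Simulate a node failure
$ kubectl cordon node1
$ kubectl drain node1 --ignore-daemonsets
# Check the Pod status
$ kubectl get pods -o wide
# Output (illustrative): the frontend Pods are evicted from node1 and their replacements stay Pending, because node1 is the only node labelled role=frontend and it is now unschedulable; xxxxx and yyyyy stand for generated suffixes
NAME                                   READY   STATUS    RESTARTS   AGE   IP           NODE     NOMINATED NODE   READINESS GATES
frontend-deployment-675949546d-xxxxx   0/1     Pending   0          1m    <none>       <none>   <none>           <none>
frontend-deployment-675949546d-yyyyy   0/1     Pending   0          1m    <none>       <none>   <none>           <none>
backend-deployment-675949546d-7f8d9    1/1     Running   0          2m    10.244.1.2   node2    <none>           <none>
backend-deployment-675949546d-9p6q7    1/1     Running   0          2m    10.244.1.3   node2    <none>           <none>
database-statefulset-0                 1/1     Running   0          2m    10.244.2.2   node3    <none>           <none>
7. Restore the node:
# Make node1 schedulable again; the Pending frontend Pods are then scheduled onto it
$ kubectl uncordon node1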
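8. Clean up:
# Delete the Deployments and the StatefulSet
$ kubectl delete deployment frontend-deployment backend-deployment
$ kubectl delete statefulset database-statefulset
# Remove the node labels and taints
$ kubectl label nodes node1 role-
$ kubectl label nodes node2 role-
$ kubectl label nodes node3 role-
$ kubectl taint nodes node1 role-
$ kubectl taint nodes node2 role-
$ kubectl taint nodes node3 role-
# Results:
# Taints and tolerations were configured successfully
# Node affinity was configured successfully
# The frontend Pods were scheduled onto the frontend node
# The backend Pods were scheduled onto the backend node
# The database Pod was scheduled onto the database node
# The placement requirements of the application were met
# High availability additionally requires at least two nodes per role; with a single node per role, losing that node leaves its Pods unschedulable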
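
This case pairs every toleration with a node affinity rule, and the reason can be shown with a small model: the taint keeps other Pods off a node, while the label requirement pulls the intended Pods onto it. The Python sketch below is a simplified illustration under stated assumptions. Node affinity is reduced to exact label equality, node4 is a hypothetical untainted node added for contrast, and the names eligible, candidates, and make_node are not part of any Kubernetes API.

```python
def tolerates(tol: dict, taint: dict) -> bool:
    """Simplified match: an empty effect or key in the toleration is a wildcard."""
    if tol.get("effect") and tol["effect"] != taint["effect"]:
        return False
    if tol.get("key") and tol["key"] != taint["key"]:
        return False
    if tol.get("operator") == "Exists":
        return True
    return (tol.get("value") or "") == (taint.get("value") or "")


def eligible(node: dict, pod: dict) -> bool:
    """A node is eligible when it has every required label and no untolerated hard taint."""
    labels_ok = all(node["labels"].get(k) == v for k, v in pod["required_labels"].items())
    taints_ok = all(
        taint["effect"] == "PreferNoSchedule"
        or any(tolerates(t, taint) for t in pod["tolerations"])
        for taint in node["taints"]
    )
    return labels_ok and taints_ok


def make_node(role: str) -> dict:
    """Node labelled and tainted with the same role, as in step 1 of this case."""
    return {"labels": {"role": role},
            "taints": [{"key": "role", "value": role, "effect": "NoSchedule"}]}


NODES = {
    "node1": make_node("frontend"),
    "node2": make_node("backend"),
    "node3": make_node("database"),
    "node4": {"labels": {}, "taints": []},  # hypothetical untainted node
}
FRONTEND_TOLERATION = {"key": "role", "operator": "Equal", "value": "frontend", "effect": "NoSchedule"}


def candidates(pod: dict) -> list:
    return [name for name, node in NODES.items() if eligible(node, pod)]


if __name__ == "__main__":
    print(candidates({"required_labels": {"role": "frontend"}, "tolerations": [FRONTEND_TOLERATION]}))
    print(candidates({"required_labels": {}, "tolerations": [FRONTEND_TOLERATION]}))
    print(candidates({"required_labels": {"role": "frontend"}, "tolerations": []}))
```

With both rules the frontend Pod has exactly one candidate; with the toleration alone it could also drift onto the untainted node, and with the affinity alone it has no candidate at all.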

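Part05-Fengge's Lessons Learned

5.1 Tips for Using Taints

Tips for using Kubernetes taints:

Choose effects deliberately: pick the effect that matches the requirement (NoSchedule, PreferNoSchedule, NoExecute)
Standardize taint naming: use clear, consistent naming rules so that taints are easy to manage and maintain
Avoid overusing taints: too many taints on the nodes can leave Pods with nowhere to be scheduled
Review taints regularly: review and remove unnecessary taints to keep node configuration simple
Combine with node labels: use node labels alongside taints for finer scheduling control
Test taints: verify taints in a test environment before using them in production
Monitor taint usage: monitor the taints on the nodes so that problems are found and handled promptly
Consider cluster size: adapt the taint policy to the size of the cluster so that Pods can still be scheduled in a small cluster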

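5.2 Tips for Using Tolerations

Tips for using Kubernetes tolerations:

Minimize tolerations: add only the tolerations a Pod needs and avoid tolerating too much
Choose matching rules deliberately: pick the operator that matches the requirement (Equal, Exists)
Test tolerations: verify tolerations in a test environment before using them in production
Document tolerations: document every toleration to make maintenance and audits easier
Combine with affinity: a toleration only allows a Pod onto a tainted node, so add node affinity or Pod affinity when the Pod must actually land there
Consider Pod priority: combine tolerations with Pod priority so that important Pods are scheduled first
Monitor toleration usage: monitor the tolerations in use so that problems are found and handled promptly
Review tolerations regularly: review and update tolerations so they keep matching application needs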

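5.3 Future Trends

Future trends in Kubernetes scheduling:

Smarter scheduling policies: using AI techniques to adjust scheduling automatically to application needs and cluster state
Multi-dimensional scheduling: taking more factors into account, such as energy consumption, network latency, and cost
Edge computing support: extending scheduling policies to edge nodes
Custom schedulers: more flexible interfaces for implementing custom scheduling logic for specific needs
Scheduling visualization: more intuitive tools that help users understand and tune scheduling policies
Cross-cluster scheduling: scheduling across multiple clusters to use resources more effectively
Real-time scheduling: adjusting Pod placement promptly as the cluster state changes
Security-aware scheduling: taking security factors into account so that Pods are placed on trusted nodes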

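This article was compiled and published by Fengge Tutorials (风哥教程) for learning and testing purposes only. When reposting, cite the source: http://www.fgedu.net.cn/10327.html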