
IT Tutorial FG053: Kubernetes Container Orchestration in Practice

Contents

1. Kubernetes Overview

Kubernetes (K8s for short) is an open-source container orchestration platform that automates the deployment, scaling, and management of containers. Originally designed and open-sourced by Google, it has become a core component of the cloud-native technology stack.

Key features of Kubernetes include:

  • Automated rollouts and rollbacks: deploys applications automatically and rolls back to a previous version when needed
  • Horizontal scaling: scales applications out or in automatically based on CPU utilization or other metrics
  • Service discovery and load balancing: assigns IP addresses to containers automatically and balances traffic across multiple containers
  • Storage orchestration: automatically mounts the storage system of your choice, such as local or cloud storage
  • Configuration and secret management: manages configuration files and sensitive data centrally
  • Self-healing: restarts failed containers automatically, and replaces or reschedules affected containers
  • Batch execution: manages batch jobs and CI/CD workflows

风哥 tip: Kubernetes has become the standard for modern container orchestration and is widely used in production, particularly in microservices architectures and DevOps practices.

2. Kubernetes Core Concepts

2.1 Pod

A Pod is the smallest deployable unit in Kubernetes: a group of one or more containers that share network and storage resources. The containers in a Pod always run on the same host and share the same network namespace.
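For illustration, a minimal single-container Pod manifest might look like this (the name and image are arbitrary examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod          # example name
spec:
  containers:
  - name: web
    image: nginx:1.21   # any container image works here
    ports:
    - containerPort: 80
```

Apply it with `kubectl apply -f my-pod.yaml` and inspect it with `kubectl get pod my-pod`.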

2.2 Service

A Service is an abstraction that defines how to access a set of Pods. It gives the Pods a stable IP address and DNS name, and load-balances traffic among them.

2.3 Deployment

A Deployment is a controller that manages the rollout and updating of Pods. It ensures the specified number of Pod replicas are running and supports rolling updates and rollbacks.

2.4 ReplicaSet

A ReplicaSet is a controller that ensures a specified number of Pod replicas are running. It is the underlying mechanism used by Deployments.

2.5 Namespace

A Namespace divides a cluster into multiple logical units for resource organization and isolation.

2.6 ConfigMap

A ConfigMap stores configuration data, such as environment variables and configuration files.
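As a sketch, a ConfigMap and a Pod that consumes it as environment variables could look like this (all names and values are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  APP_MODE: "production"
---
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "env && sleep 3600"]
    envFrom:
    - configMapRef:
        name: app-config   # injects LOG_LEVEL and APP_MODE as env vars
```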

2.7 Secret

Secret用于存储敏感信息,如密码、API密钥等。

2.8 Volume

Volume用于持久化数据,支持多种存储后端,如本地存储、网络存储等。
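For example, an `emptyDir` Volume (a scratch directory that lives as long as the Pod does) can be shared between two containers in the same Pod; a minimal sketch:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod
spec:
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "echo hello > /data/msg && sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  - name: reader        # sees the same /data contents as the writer
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir: {}        # node-local scratch space, deleted with the Pod
```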

2.9 PersistentVolume (PV)

A PersistentVolume is a cluster-level storage resource, created and managed by an administrator.

2.10 PersistentVolumeClaim (PVC)

A PersistentVolumeClaim is a user's request for storage, which is bound to a PersistentVolume.

2.11 Node

A Node is a worker machine in the cluster that runs Pods.

2.12 Cluster

A Cluster is a set of Nodes that together run containerized applications.

3. Installing and Configuring Kubernetes

3.1 Installing Kubernetes with kubeadm

# Install Docker
sudo apt-get update
sudo apt-get install -y docker.io

# Install kubeadm, kubelet, and kubectl
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl

# Initialize the cluster
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Install a network plugin (Flannel)
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

[init] Using Kubernetes version: v1.21.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.100]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master fgedudb] and IPs [192.168.1.100 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master fgedudb] and IPs [192.168.1.100 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[wait-control-plane] This can take up to 4m0s
[apiclient] All control plane components are healthy after 32.501151 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.21" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the labels "node-role.kubernetes.io/master='true'" and "node-role.kubernetes.io/control-plane='true'"
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.1.100:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef

3.2 Joining Worker Nodes

# Run the join command on each worker node
sudo kubeadm join 192.168.1.100:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

3.3 Verifying Cluster Status

# Check node status
kubectl get nodes

# Check control-plane component status (componentstatuses is deprecated since v1.19 but still available in v1.21)
kubectl get componentstatuses

# Check Pods in all namespaces
kubectl get pods --all-namespaces

# Node status output
NAME          STATUS   ROLES                  AGE   VERSION
k8s-master    Ready    control-plane,master   10m   v1.21.0
k8s-worker1   Ready    <none>                 5m    v1.21.0
k8s-worker2   Ready    <none>                 3m    v1.21.0

# Component status output
NAME                 STATUS    MESSAGE   ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   ok

# Pod status output
NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
kube-system   coredns-558bd4d5db-8f45k             1/1     Running   0          10m
kube-system   coredns-558bd4d5db-vq22q             1/1     Running   0          10m
kube-system   etcd-k8s-master                      1/1     Running   0          10m
kube-system   kube-apiserver-k8s-master            1/1     Running   0          10m
kube-system   kube-controller-manager-k8s-master   1/1     Running   0          10m
kube-system   kube-flannel-ds-amd64-4f2x7          1/1     Running   0          8m
kube-system   kube-flannel-ds-amd64-8q97z          1/1     Running   0          8m
kube-system   kube-flannel-ds-amd64-tk56v          1/1     Running   0          8m
kube-system   kube-proxy-7b2x5                     1/1     Running   0          10m
kube-system   kube-proxy-8q7x6                     1/1     Running   0          5m
kube-system   kube-proxy-c9x4p                     1/1     Running   0          3m
kube-system   kube-scheduler-k8s-master            1/1     Running   0          10m

4. Kubernetes Cluster Management

4.1 Viewing Cluster Information

# View cluster information
kubectl cluster-info

# View node details
kubectl describe node k8s-master

# Cluster information output
Kubernetes control plane is running at https://192.168.1.100:6443
CoreDNS is running at https://192.168.1.100:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

4.2 Managing Nodes

# Mark a node as unschedulable
kubectl cordon k8s-worker1

# Evict the Pods running on a node
kubectl drain k8s-worker1 --ignore-daemonsets

# Mark a node as schedulable again
kubectl uncordon k8s-worker1

# Add a label to a node
kubectl label node k8s-worker1 disktype=ssd

# View node labels
kubectl get nodes --show-labels

4.3 Managing Namespaces

# Create a namespace
kubectl create namespace dev

# List namespaces
kubectl get namespaces

# Delete a namespace
kubectl delete namespace dev

# Namespace list output
NAME              STATUS   AGE
default           Active   1h
kube-node-lease   Active   1h
kube-public       Active   1h
kube-system       Active   1h
dev               Active   5m

5. Deploying Applications to Kubernetes

5.1 Deploying an Application with a Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: "1"
            memory: "1Gi"
          requests:
            cpu: "500m"
            memory: "512Mi"

# Deploy the application
kubectl apply -f nginx-deployment.yaml

# Check Deployment status
kubectl get deployments

# Check Pod status
kubectl get pods

# Deployment status output
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   3/3     3            3           5m

# Pod status output
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-75675f5897-4x7k8   1/1     Running   0          5m
nginx-deployment-75675f5897-67d9c   1/1     Running   0          5m
nginx-deployment-75675f5897-8f6k9   1/1     Running   0          5m

5.2 Creating a Service

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
  type: NodePort

# Create the Service
kubectl apply -f nginx-service.yaml

# Check Service status
kubectl get services

# Service status output
NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes      ClusterIP   10.96.0.1        <none>        443/TCP        1h
nginx-service   NodePort    10.104.162.191   <none>        80:30080/TCP   2m

5.3 Rolling Updates

# Update the Deployment's image
kubectl set image deployment/nginx-deployment nginx=nginx:1.19.0

# Watch the rollout status
kubectl rollout status deployment/nginx-deployment

# View the updated Pods
kubectl get pods

# Rollout status output
Waiting for deployment "nginx-deployment" rollout to finish: 1 out of 3 new replicas have been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 1 out of 3 new replicas have been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 out of 3 new replicas have been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 2 out of 3 new replicas have been updated...
Waiting for deployment "nginx-deployment" rollout to finish: 3 out of 3 new replicas have been updated...
deployment "nginx-deployment" successfully rolled out

5.4 Rolling Back

# View the Deployment's revision history
kubectl rollout history deployment/nginx-deployment

# Roll back to the previous revision
kubectl rollout undo deployment/nginx-deployment

# Check the rollback status
kubectl rollout status deployment/nginx-deployment

6. Kubernetes Networking

6.1 The Kubernetes Network Model

Kubernetes uses a flat network model: every Pod gets its own IP address, and Pods communicate with each other directly, without NAT. The model requires that:

  • All Pods can communicate with each other across the cluster without NAT
  • All nodes can communicate with all Pods without NAT
  • A Pod sees the same IP address for itself that other Pods and nodes see

6.2 Network Plugins

Kubernetes supports many network plugins, including:

  • Flannel: a simple, easy-to-use plugin based on VXLAN
  • Calico: a BGP-based plugin with network policy support
  • Canal: combines features of Flannel and Calico
  • Cilium: an eBPF-based plugin with network policy and service-mesh support

6.3 Network Policies

Network policies control traffic between Pods, restricting which Pods may communicate with which others.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: nginx-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 8080

# Apply the network policy
kubectl apply -f nginx-network-policy.yaml

# View network policies
kubectl get networkpolicies

7. Kubernetes Storage Management

7.1 Persistent Volumes (PV) and Persistent Volume Claims (PVC)

A persistent volume (PV) is a cluster-level storage resource, created and managed by an administrator. A persistent volume claim (PVC) is a user's request for storage, which is bound to a persistent volume.

7.2 Creating a Persistent Volume

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data

# Create the persistent volume
kubectl apply -f pv-volume.yaml

# View persistent volumes
kubectl get pv

# Persistent volume output
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
pv-volume   10Gi       RWO            Retain           Available                                   2m

7.3 Creating a Persistent Volume Claim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

# Create the persistent volume claim
kubectl apply -f pv-claim.yaml

# View persistent volume claims
kubectl get pvc

# View persistent volume status
kubectl get pv

# Persistent volume claim output
NAME       STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pv-claim   Bound    pv-volume   10Gi       RWO                           1m

# Persistent volume status output
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
pv-volume   10Gi       RWO            Retain           Bound    default/pv-claim                           3m

Note that the claim requested only 5Gi but is bound to the 10Gi volume: a PVC binds to any available PV at least as large as its request.

7.4 Using a Persistent Volume in a Pod

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
    ports:
    - containerPort: 80
    volumeMounts:
    - name: nginx-storage
      mountPath: /usr/share/nginx/html
  volumes:
  - name: nginx-storage
    persistentVolumeClaim:
      claimName: pv-claim

# Create the Pod
kubectl apply -f nginx-pod.yaml

# Check Pod status
kubectl get pods

8. Monitoring and Logging

8.1 Monitoring Kubernetes with Prometheus

Prometheus is the most widely used monitoring tool in the Kubernetes ecosystem. It collects cluster metrics, which can be visualized with Grafana.

# Install Prometheus and Grafana
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack

8.2 Viewing Pod Logs

# View a Pod's logs
kubectl logs nginx-deployment-75675f5897-4x7k8

# Stream a Pod's logs in real time
kubectl logs -f nginx-deployment-75675f5897-4x7k8

# View a specific container's logs (when the Pod has multiple containers)
kubectl logs nginx-deployment-75675f5897-4x7k8 -c nginx

8.3 Collecting Logs with the ELK Stack

The ELK Stack (Elasticsearch, Logstash, Kibana) is a tool set for collecting, storing, and analyzing logs.

# Install the ELK Stack
helm repo add elastic https://helm.elastic.co
helm repo update
helm install elk elastic/elasticsearch
helm install kibana elastic/kibana
helm install logstash elastic/logstash

9. Kubernetes Security

9.1 Authentication and Authorization

Kubernetes uses RBAC (role-based access control) to manage users' access to cluster resources.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: user1
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

# Create the Role and RoleBinding
kubectl apply -f role.yaml
kubectl apply -f role-binding.yaml

9.2 Managing Secrets

Secrets store sensitive information, such as passwords and API keys.

# Create a Secret
kubectl create secret generic my-secret --from-literal=username=admin --from-literal=password=secret123

# List Secrets
kubectl get secrets

# View Secret details
kubectl describe secret my-secret
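When writing a Secret as a YAML manifest instead of using `kubectl create secret`, the values under `data:` must be base64-encoded first:

```shell
# Secret values in a manifest's data: field are base64-encoded strings.
# The -n flag matters: it stops echo from appending a newline to the value.
echo -n 'admin' | base64       # YWRtaW4=
echo -n 'secret123' | base64   # c2VjcmV0MTIz
```

The resulting strings go into the manifest, e.g. `data: {username: YWRtaW4=, password: c2VjcmV0MTIz}`.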

9.3 Pod Security Policies

Pod security policies restrict what Pods are allowed to do, for example forbidding privileged containers or limiting container permissions. (Note that PodSecurityPolicy was deprecated in v1.21 and removed in v1.25, in favor of Pod Security Admission.)

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  fsGroup:
    rule: RunAsAny
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'persistentVolumeClaim'


10. Production Best Practices

10.1 Cluster Design

  • Highly available control plane: run multiple control-plane nodes to keep the cluster available
  • Worker node planning: size the number and specification of worker nodes to application needs
  • Network design: choose a network plugin that meets your performance and security requirements
  • Storage design: choose a storage solution that fits application needs

10.2 Resource Management

  • Set resource limits: give every Pod CPU and memory limits to avoid resource contention
  • Use resource quotas: set quotas per namespace to cap resource usage
  • Use LimitRange: set default resource limits per namespace
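The quota and default-limit practices above can be sketched as manifests (all names and values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "10"      # total CPU requests allowed in the namespace
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: dev-defaults
  namespace: dev
spec:
  limits:
  - type: Container
    default:                # applied when a container sets no limits
      cpu: "500m"
      memory: 512Mi
    defaultRequest:         # applied when a container sets no requests
      cpu: "250m"
      memory: 256Mi
```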

10.3 Deployment Strategy

  • Use rolling updates: deploy with rolling updates to minimize downtime
  • Use health checks: configure readiness and liveness probes to track application health
  • Use Horizontal Pod Autoscaling: adjust the number of Pods automatically based on load
  • Use PodDisruptionBudget: guarantee that enough Pods stay running during node maintenance
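As an illustration of the health-check practice above, readiness and liveness probes are configured per container; a sketch for the nginx Deployment used earlier (the probe values are illustrative):

```yaml
# Fragment of a Pod template's container spec
containers:
- name: nginx
  image: nginx:latest
  ports:
  - containerPort: 80
  readinessProbe:           # gate traffic until the app responds
    httpGet:
      path: /
      port: 80
    initialDelaySeconds: 5
    periodSeconds: 10
  livenessProbe:            # restart the container if it stops responding
    httpGet:
      path: /
      port: 80
    initialDelaySeconds: 15
    periodSeconds: 20
```

Horizontal Pod autoscaling from the same list can be enabled with, for example, `kubectl autoscale deployment nginx-deployment --min=3 --max=10 --cpu-percent=80`.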

10.4 Security Best Practices

  • Use RBAC: apply the principle of least privilege to users and service accounts
  • Manage sensitive data with Secrets: never hard-code credentials in configuration files
  • Use Pod security policies: restrict what Pods can do
  • Keep Kubernetes up to date: apply security patches promptly
  • Use network policies: restrict Pod-to-Pod traffic

10.5 Monitoring and Alerting

  • Deploy a monitoring system: watch cluster state with Prometheus and Grafana
  • Configure alert rules: set sensible alerts so problems are found and handled early
  • Collect and analyze logs: use the ELK Stack
  • Back up etcd regularly: protect cluster data with periodic etcd backups
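The etcd-backup practice above can be carried out with `etcdctl`; a sketch assuming a kubeadm-installed control plane (the certificate paths and backup location may differ in your cluster):

```shell
# Take an etcd snapshot on the control-plane node
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot-$(date +%F).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-snapshot-$(date +%F).db
```

Store the snapshots off the node, and test the restore procedure before you need it.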

11. Common Problems and Solutions

11.1 Pod Fails to Start

Problem: a Pod stays in the Pending state.

Solution: check that the nodes have enough free resources, and inspect the Pod's events.

# View Pod events
kubectl describe pod nginx-deployment-75675f5897-4x7k8

11.2 Pod Keeps Crashing

Problem: a Pod restarts repeatedly.

Solution: read the Pod's logs to find the cause of the crash.

# View Pod logs
kubectl logs nginx-deployment-75675f5897-4x7k8

11.3 Network Connectivity Problems

Problem: Pods cannot communicate with each other.

Solution: check that the network plugin is running correctly, and review any network policies.

# Check the network plugin's Pods
kubectl get pods -n kube-system | grep flannel

11.4 Storage Problems

Problem: a PVC will not bind to a PV.

Solution: check that the PV and PVC specs match (capacity, access modes, storage class), and check the PV status.

# Check PV and PVC status
kubectl get pv
kubectl get pvc

11.5 Cluster Node Unavailable

Problem: a node's status is NotReady.

Solution: inspect the node's condition and the kubelet logs.

# Inspect the node
kubectl describe node k8s-worker1

# View kubelet logs
sudo journalctl -u kubelet


author:www.itpux.com

Compiled and published by 风哥教程 for learning and testing purposes only. When reposting, please credit the source: http://www.fgedu.net.cn/10327.html
