Kubernetes Tutorial FG003: Kubernetes Installation and Deployment Walkthrough
This document covers the Kubernetes installation and deployment process: installing the Kubernetes components, initializing the cluster, joining nodes, installing a network plugin, and verifying the cluster state. It follows the Setup guide in the official Kubernetes documentation and is intended for DevOps engineers and system administrators to use for learning and testing; validate every step yourself before applying it to a production environment.
Part01-Basic Concepts and Theory
1.1 Installation Overview
Installing Kubernetes involves the following main steps:
- Install the Kubernetes components: kubelet, kubeadm, and kubectl
- Initialize the control plane node
- Configure kubectl
- Join the worker nodes
- Install a network plugin
- Verify the cluster state
- Install add-on components
1.2 Installation Methods
Common ways to install Kubernetes:
- kubeadm: the officially recommended installation tool, suitable for production
- minikube: suitable for development and test environments
- kubespray: an Ansible-based installer, suitable for large clusters
- Managed cloud services: e.g. EKS, GKE, AKS
1.3 Deployment Architectures
Kubernetes deployment architectures:
- Single control plane node: suitable for development and testing
- Multiple control plane nodes: suitable for production, providing high availability
- Co-located deployment: control plane and worker roles share the same nodes
- Separated deployment: control plane nodes and worker nodes run on separate machines
Part02-Production Environment Planning and Recommendations
2.1 Installation Preparation
Installation preparation for a production Kubernetes cluster:
# Set up the Kubernetes package repository
$ cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
# Install kubelet, kubeadm, and kubectl
$ yum install -y kubelet kubeadm kubectl
# Enable and start the kubelet service
$ systemctl enable kubelet
$ systemctl start kubelet
# Check the kubelet status
$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2023-04-13 10:00:00 CST; 1h ago
     Docs: https://kubernetes.io/docs/
 Main PID: 12345 (kubelet)
    Tasks: 20
   Memory: 100.0M
      CPU: 1.2s
   CGroup: /system.slice/kubelet.service
           └─12345 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7
2.2 Network Planning
Network planning for a production Kubernetes cluster:
# Example network plan
- Control plane node IPs: 192.168.1.101-103
- Worker node IPs: 192.168.1.201-202
- Pod network CIDR: 10.244.0.0/16
- Service network CIDR: 10.96.0.0/12
- Load balancer IP: 192.168.1.100
# Choosing a network plugin
- Calico: feature-rich, supports network policies
- Flannel: simple and easy to use, suitable for small clusters
- Cilium: eBPF-based, excellent performance
- Weave Net: easy to deploy, suitable for test environments
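A quick sanity check on a plan like the one above is to confirm that the Pod and Service CIDRs do not overlap. A minimal sketch, using the example values (python3 is assumed to be available and is used only for the address arithmetic):

```shell
# Verify that the planned Pod and Service CIDRs do not overlap.
POD_CIDR="10.244.0.0/16"
SVC_CIDR="10.96.0.0/12"
python3 -c "
import ipaddress
pod = ipaddress.ip_network('$POD_CIDR')
svc = ipaddress.ip_network('$SVC_CIDR')
print('overlap' if pod.overlaps(svc) else 'ok')
"
# prints: ok  (10.96.0.0/12 spans 10.96.0.0-10.111.255.255, clear of 10.244.0.0/16)
```

Running the same check against the node subnet (192.168.1.0/24 here) is equally worthwhile, since a Pod CIDR that collides with the node network is a common source of routing problems.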
2.3 Storage Planning
Storage planning for a production Kubernetes cluster:
# Example storage plan
- Control plane node storage: 200GB SSD
- Worker node storage: 500GB SSD
- etcd data storage: a dedicated disk, 100GB SSD
- Persistent storage: NFS or cloud storage
# Storage class configuration
- Standard storage class: for general applications
- High-performance storage class: for performance-sensitive applications such as databases
- Backup storage class: for data backups
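As a concrete sketch of one storage-class tier from the plan above: the manifest below defines a default class backed by NFS. The provisioner name, server address, and share path are illustrative assumptions (an NFS CSI driver is presumed to be installed); substitute the CSI driver and parameters of your actual storage backend.

```yaml
# storageclass-standard.yaml -- "standard" tier from the plan above
# (provisioner, server, and share are illustrative assumptions)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: nfs.csi.k8s.io
parameters:
  server: 192.168.1.50        # hypothetical NFS server
  share: /exports/k8s
reclaimPolicy: Retain
volumeBindingMode: Immediate
```

The high-performance and backup tiers would follow the same shape with a different provisioner and parameters.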
Part03-Production Environment Implementation
3.1 Kubernetes Installation
Installing a production Kubernetes cluster:
# Install Docker
$ yum install -y docker-ce docker-ce-cli containerd.io
$ systemctl enable docker
$ systemctl start docker
# Configure containerd
$ mkdir -p /etc/containerd
$ containerd config default > /etc/containerd/config.toml
$ sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
$ systemctl restart containerd
# Install the Kubernetes components
$ yum install -y kubelet kubeadm kubectl
$ systemctl enable kubelet
$ systemctl start kubelet
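Before running kubeadm on these nodes, the official container-runtime prerequisites also call for loading two kernel modules and enabling bridged-traffic filtering plus IP forwarding. A commonly used configuration (file paths follow the Kubernetes documentation) is:

```
# /etc/modules-load.d/k8s.conf -- kernel modules to load at boot
overlay
br_netfilter

# /etc/sysctl.d/k8s.conf -- networking settings required by Kubernetes
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
```

After writing the files, apply them immediately with `modprobe overlay`, `modprobe br_netfilter`, and `sysctl --system`. Swap should also be disabled (`swapoff -a`, and remove the swap entry from /etc/fstab), or kubeadm's preflight checks will fail.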
3.2 Cluster Initialization
Initializing a production Kubernetes cluster.
# Initialize the control plane node
$ kubeadm init --control-plane-endpoint="192.168.1.100:6443" --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --image-repository=registry.aliyuncs.com/google_containers
# Example output
[init] Using Kubernetes version: v1.28.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [fgedu-master1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.101 192.168.1.100]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [fgedu-master1 localhost] and IPs [192.168.1.101 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [fgedu-master1 localhost] and IPs [192.168.1.101 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 30.501673 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node fgedu-master1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node fgedu-master1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities and service account keys on each node and then running the following as root:
kubeadm join 192.168.1.100:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.100:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef
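The same init options can also be kept in a versioned config file and passed with `kubeadm init --config`, which is easier to review and reuse than long flag strings. A sketch equivalent to the flags above (field names per the kubeadm v1beta3 configuration API):

```yaml
# kubeadm-config.yaml -- file-based equivalent of the kubeadm init flags above
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.0
controlPlaneEndpoint: "192.168.1.100:6443"
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
```

Initialize with `kubeadm init --config kubeadm-config.yaml --upload-certs`; the `--upload-certs` option saves manually copying the CA material when joining further control plane nodes.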
3.3 Joining Nodes
Joining nodes to a production Kubernetes cluster.
# Configure kubectl
$ mkdir -p $HOME/.kube
$ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ chown $(id -u):$(id -g) $HOME/.kube/config
# Join additional control plane nodes
$ kubeadm join 192.168.1.100:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef \
--control-plane
# Join worker nodes
$ kubeadm join 192.168.1.100:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef
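If the join command has been lost, `kubeadm token create --print-join-command` regenerates it on a control plane node. The discovery hash itself can also be recomputed from the cluster CA certificate; the kubeadm docs describe it as the SHA-256 of the CA certificate's DER-encoded public key. A self-contained sketch (it generates a throwaway CA here purely so the example runs anywhere; on a real control plane node, point at /etc/kubernetes/pki/ca.crt instead):

```shell
# Throwaway CA certificate, only so this example is runnable in isolation;
# on a real node, use /etc/kubernetes/pki/ca.crt.
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -subj "/CN=demo-ca" -days 1 2>/dev/null

# Compute the discovery-token-ca-cert-hash:
# sha256 of the DER-encoded public key of the CA certificate.
openssl x509 -pubkey -in ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex \
  | sed 's/^.* //'
```

The printed 64-character hex string is what goes after `sha256:` in the `--discovery-token-ca-cert-hash` argument.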
Part04-Production Cases and Hands-On Walkthrough
4.1 Network Plugin Installation
Installing a network plugin in a production Kubernetes cluster. Install exactly one of the plugins below; running more than one CNI plugin in the same cluster leads to conflicts.
# Option 1: install the Calico network plugin
$ kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
# Option 2: install the Flannel network plugin
$ kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
# Option 3: install the Cilium network plugin
$ kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.13.0/install/kubernetes/cilium.yaml
# Check the network plugin status
$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-6d4b75cb6d-7f5f8 1/1 Running 0 10m
calico-node-4q7k8 1/1 Running 0 10m
calico-node-7c9x6 1/1 Running 0 10m
calico-node-8d2k3 1/1 Running 0 10m
calico-node-9f5g7 1/1 Running 0 10m
calico-node-b7c4d 1/1 Running 0 10m
coredns-6d4b75cb6d-7f5f8 1/1 Running 0 30m
coredns-6d4b75cb6d-8k45d 1/1 Running 0 30m
etcd-fgedu-master1 1/1 Running 0 30m
etcd-fgedu-master2 1/1 Running 0 25m
etcd-fgedu-master3 1/1 Running 0 20m
kube-apiserver-fgedu-master1 1/1 Running 0 30m
kube-apiserver-fgedu-master2 1/1 Running 0 25m
kube-apiserver-fgedu-master3 1/1 Running 0 20m
kube-controller-manager-fgedu-master1 1/1 Running 0 30m
kube-controller-manager-fgedu-master2 1/1 Running 0 25m
kube-controller-manager-fgedu-master3 1/1 Running 0 20m
kube-proxy-4q7k8 1/1 Running 0 30m
kube-proxy-7c9x6 1/1 Running 0 25m
kube-proxy-8d2k3 1/1 Running 0 20m
kube-proxy-9f5g7 1/1 Running 0 30m
kube-proxy-b7c4d 1/1 Running 0 30m
kube-scheduler-fgedu-master1 1/1 Running 0 30m
kube-scheduler-fgedu-master2 1/1 Running 0 25m
kube-scheduler-fgedu-master3 1/1 Running 0 20m
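Calico and Cilium both enforce the standard NetworkPolicy API (Flannel on its own does not). As a quick functional check after installation, a hypothetical default-deny policy like the one below should block all ingress traffic to Pods in its namespace:

```yaml
# deny-ingress.yaml -- default-deny ingress policy (illustrative example)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}        # empty selector: applies to every Pod in the namespace
  policyTypes:
    - Ingress            # no ingress rules listed, so all ingress is denied
```

Apply it with `kubectl apply -f deny-ingress.yaml`, confirm that Pod-to-Pod traffic in the namespace is dropped, then delete it with `kubectl delete -f deny-ingress.yaml`.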
4.2 Cluster Verification
Verifying a production Kubernetes cluster.
# Check the cluster status
$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.1.100:6443
CoreDNS is running at https://192.168.1.100:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
# Check node status
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
fgedu-master1 Ready control-plane 30m v1.28.0
fgedu-master2 Ready control-plane 25m v1.28.0
fgedu-master3 Ready control-plane 20m v1.28.0
fgedu-worker1 Ready <none> 15m v1.28.0
fgedu-worker2 Ready <none> 10m v1.28.0
# Check Pod status
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6d4b75cb6d-7f5f8 1/1 Running 0 10m
kube-system calico-node-4q7k8 1/1 Running 0 10m
kube-system calico-node-7c9x6 1/1 Running 0 10m
kube-system calico-node-8d2k3 1/1 Running 0 10m
kube-system calico-node-9f5g7 1/1 Running 0 10m
kube-system calico-node-b7c4d 1/1 Running 0 10m
kube-system coredns-6d4b75cb6d-7f5f8 1/1 Running 0 30m
kube-system coredns-6d4b75cb6d-8k45d 1/1 Running 0 30m
kube-system etcd-fgedu-master1 1/1 Running 0 30m
kube-system etcd-fgedu-master2 1/1 Running 0 25m
kube-system etcd-fgedu-master3 1/1 Running 0 20m
kube-system kube-apiserver-fgedu-master1 1/1 Running 0 30m
kube-system kube-apiserver-fgedu-master2 1/1 Running 0 25m
kube-system kube-apiserver-fgedu-master3 1/1 Running 0 20m
kube-system kube-controller-manager-fgedu-master1 1/1 Running 0 30m
kube-system kube-controller-manager-fgedu-master2 1/1 Running 0 25m
kube-system kube-controller-manager-fgedu-master3 1/1 Running 0 20m
kube-system kube-proxy-4q7k8 1/1 Running 0 30m
kube-system kube-proxy-7c9x6 1/1 Running 0 25m
kube-system kube-proxy-8d2k3 1/1 Running 0 20m
kube-system kube-proxy-9f5g7 1/1 Running 0 30m
kube-system kube-proxy-b7c4d 1/1 Running 0 30m
kube-system kube-scheduler-fgedu-master1 1/1 Running 0 30m
kube-system kube-scheduler-fgedu-master2 1/1 Running 0 25m
kube-system kube-scheduler-fgedu-master3 1/1 Running 0 20m
# Test cluster functionality
$ kubectl create deployment nginx --image=nginx
$ kubectl expose deployment nginx --port=80 --type=NodePort
$ kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-6d6f58987b-7f5f8 1/1 Running 0 5m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 35m
service/nginx NodePort 10.100.123.45 <none> 80:32123/TCP 5m
4.3 Add-on Installation
Installing add-on components in a production Kubernetes cluster.
# Install the Dashboard
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
# Install Metrics Server
$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# Install Prometheus and Grafana (kube-prometheus)
$ kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/manifests/setup.yaml
$ kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/manifests/kube-prometheus.yaml
# Check add-on status
$ kubectl get pods -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-6d4b75cb6d-7f5f8 1/1 Running 0 5m
kubernetes-dashboard-6d4b75cb6d-7f5f8 1/1 Running 0 5m
$ kubectl get pods -n monitoring
NAME READY STATUS RESTARTS AGE
alertmanager-main-0 2/2 Running 0 10m
blackbox-exporter-6d4b75cb6d-7f5f8 3/3 Running 0 10m
grafana-6d4b75cb6d-7f5f8 1/1 Running 0 10m
kube-state-metrics-6d4b75cb6d-7f5f8 3/3 Running 0 10m
node-exporter-4q7k8 2/2 Running 0 10m
node-exporter-7c9x6 2/2 Running 0 10m
node-exporter-8d2k3 2/2 Running 0 10m
node-exporter-9f5g7 2/2 Running 0 10m
node-exporter-b7c4d 2/2 Running 0 10m
prometheus-adapter-6d4b75cb6d-7f5f8 1/1 Running 0 10m
prometheus-main-0 2/2 Running 0 10m
prometheus-operator-6d4b75cb6d-7f5f8 2/2 Running 0 10m
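Logging in to the Dashboard requires a bearer token. A commonly used pattern (following the Dashboard project's access-control docs) is to create a dedicated ServiceAccount bound to a ClusterRole; the names below are illustrative, and binding to cluster-admin grants full cluster access, so treat this as a test-environment sketch:

```yaml
# dashboard-admin.yaml -- admin login account for the Dashboard
# (names are illustrative; cluster-admin is very broad, use with care)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard
```

After `kubectl apply -f dashboard-admin.yaml`, issue a login token with `kubectl -n kubernetes-dashboard create token admin-user`.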
Part05-Experience Summary and Sharing
5.1 Deployment Best Practices
Best practices for deploying Kubernetes:
- High availability: deploy multiple control plane nodes to keep the cluster highly available
- Networking: choose a suitable network plugin and configure sensible network CIDRs
- Storage: give etcd high-performance storage and keep its data safe
- Security hardening: configure RBAC, enable Pod security controls, and restrict network access
- Monitoring and alerting: deploy Prometheus and Grafana and set reasonable alert thresholds
- Backup and recovery: back up etcd data regularly and maintain a disaster recovery plan
- Version management: upgrade Kubernetes regularly to keep the system up to date
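The etcd-backup practice above can be automated in-cluster. The CronJob below is one possible sketch, not a canonical recipe: the schedule, image tag, and host paths are assumptions, and it presumes a kubeadm-style layout where etcd serves on 127.0.0.1:2379 with certs under /etc/kubernetes/pki/etcd.

```yaml
# etcd-backup-cronjob.yaml -- nightly etcd snapshot on a control plane node
# (schedule, image tag, and host paths are illustrative assumptions)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 2 * * *"            # every night at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true        # reach etcd on 127.0.0.1:2379
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
            - key: node-role.kubernetes.io/control-plane
              effect: NoSchedule
          containers:
            - name: etcd-backup
              image: registry.aliyuncs.com/google_containers/etcd:3.5.9-0
              command:
                - /bin/sh
                - -c
                - >
                  etcdctl snapshot save /backup/etcd-$(date +%Y%m%d).db
                  --endpoints=https://127.0.0.1:2379
                  --cacert=/etc/kubernetes/pki/etcd/ca.crt
                  --cert=/etc/kubernetes/pki/etcd/server.crt
                  --key=/etc/kubernetes/pki/etcd/server.key
              volumeMounts:
                - name: etcd-certs
                  mountPath: /etc/kubernetes/pki/etcd
                  readOnly: true
                - name: backup
                  mountPath: /backup
          restartPolicy: OnFailure
          volumes:
            - name: etcd-certs
              hostPath:
                path: /etc/kubernetes/pki/etcd
            - name: backup
              hostPath:
                path: /var/backups/etcd
```

Snapshots written to a host path still need to be shipped off the node (for example to object storage) to be useful in a real disaster-recovery scenario.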
5.2 Common Problems
Common problems when deploying Kubernetes.
# Common problems and solutions
## 1. Network plugin installation fails
- Symptom: Pods stay in Pending and cannot be assigned IP addresses
- Solution: check the network plugin configuration, make sure the network CIDR is correct, and restart the network plugin Pods
## 2. Control plane node fails to start
- Symptom: kube-apiserver, kube-controller-manager, or kube-scheduler will not start
- Solution: check the system logs, make sure etcd is running, and check the certificate configuration
## 3. Worker node cannot join the cluster
- Symptom: the kubeadm join command fails, reporting that it cannot reach the control plane
- Solution: check network connectivity, verify the control plane endpoint IP, and check whether the token has expired
## 4. Pod scheduling fails
- Symptom: Pods stay in Pending and cannot be scheduled onto any node
- Solution: check node resources, make sure nodes have enough CPU and memory, and review node affinity settings
## 5. Cluster version upgrade fails
- Symptom: components fail to start during the upgrade
- Solution: check version compatibility, back up etcd data first, and follow the official upgrade procedure
5.3 Troubleshooting
Troubleshooting a Kubernetes deployment:
- Check system logs: use journalctl to inspect system logs and locate the cause of errors
- Check Pod status: use kubectl get pods to view Pod status and kubectl describe pod for details
- Check node status: use kubectl get nodes to view node status and kubectl describe node for details
- Check network connectivity: use commands such as ping and traceroute
- Check etcd status: use etcdctl to verify that the etcd cluster is healthy
- Consult the official documentation: the Kubernetes docs cover solutions to most common problems
- Community support: seek help from the Kubernetes community, e.g. Stack Overflow or GitHub
Continuous optimization: deploying Kubernetes is an ongoing process; as versions and business requirements change, the configuration needs continual adjustment and tuning to keep the cluster stable and performant.
This document was compiled and published by the Fenge tutorial series for learning and testing purposes only; when reposting, credit the source: http://www.fgedu.net.cn/10327.html
