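Summary: This tutorial draws on the official documentation for Linux, Red Hat Enterprise Linux, Ansible Automation Platform, Docker, Kubernetes, and Podman, and describes how to configure and use the relevant technologies.
This document describes how to back up and restore a Kubernetes cluster.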
Part01-Backup Overview
1.1 Backup Strategy
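[root@k8s-master ~]# cat > /root/k8s-backup.txt << 'EOF'
Kubernetes Backup and Recovery
====================
1. What to back up
- etcd data: cluster state store
- Resource manifests: YAML configuration files
- Persistent data: PV/PVC data
- Certificates and keys: security credentials
2. Backup tools
- etcdctl: etcd backups
- Velero: cluster backup tool
- kubectl: resource export
- Custom scripts
3. Backup strategy
- Full backup: complete backups on a regular schedule
- Incremental backup: only the data that changed
- Off-site backup: disaster recovery
4. Recovery scenarios
- Cluster failure recovery
- Accidental deletion recovery
- Disaster recovery
EOF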
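The overview lists certificates and keys as backup content, but the steps that follow only cover etcd and Velero. The function below is a minimal sketch of one way to archive a certificate directory with tar. The name backup_pki and the destination /backup/pki are assumptions for illustration, not part of the original procedure.

```shell
#!/bin/sh
# Sketch: archive a certificate directory into a timestamped tarball.
# backup_pki and both of its arguments are hypothetical names.
backup_pki() {
    src_dir="$1"     # e.g. /etc/kubernetes/pki on the control-plane node
    dest_dir="$2"    # e.g. /backup/pki (assumed location)
    archive="$dest_dir/pki-$(date +%Y%m%d%H%M%S).tar.gz"

    [ -d "$src_dir" ] || { echo "missing source: $src_dir" >&2; return 1; }
    mkdir -p "$dest_dir"
    # -C stores paths relative to the parent of the source directory
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
    chmod 600 "$archive"   # the archive contains private keys
    echo "$archive"        # print the archive path for the caller
}
```

On the control-plane node this could be called as `backup_pki /etc/kubernetes/pki /backup/pki`. Because the archive holds private keys, it should be stored with the same care as the keys themselves.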
Part02-etcd Backup
2.1 Backing Up etcd Data
[root@k8s-master ~]# mkdir -p /backup/etcd
# Take an etcd snapshot
[root@k8s-master ~]# ETCDCTL_API=3 etcdctl snapshot save /backup/etcd/etcd-snapshot-$(date +%Y%m%d%H%M%S).db \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
Snapshot saved at /backup/etcd/etcd-snapshot-20260404200000.db
# List the backup files
[root@k8s-master ~]# ls -lh /backup/etcd/
total 100M
-rw------- 1 root root 100M Apr 4 20:00 etcd-snapshot-20260404200000.db
# Verify the integrity of the backup
[root@k8s-master ~]# ETCDCTL_API=3 etcdctl snapshot status /backup/etcd/etcd-snapshot-20260404200000.db --write-out=table
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| abc12345 |  1234567 |      12345 |     100 MB |
+----------+----------+------------+------------+
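The status check above confirms that the snapshot is readable on the node where it was taken. For copies that are moved off-site, a checksum stored next to the file makes it possible to detect corruption later. The following is a minimal sketch using sha256sum from GNU coreutils. The helper names write_checksum and verify_checksum are hypothetical.

```shell
#!/bin/sh
# Sketch: keep a SHA-256 sidecar file next to each snapshot.
# write_checksum / verify_checksum are hypothetical helper names.
write_checksum() {
    file="$1"
    # refuse to checksum a missing or zero-byte snapshot
    [ -s "$file" ] || { echo "empty or missing: $file" >&2; return 1; }
    # record only the file name so the sidecar stays valid after the
    # snapshot and its sidecar are copied to another directory
    ( cd "$(dirname "$file")" && \
      sha256sum "$(basename "$file")" > "$(basename "$file").sha256" )
}

verify_checksum() {
    file="$1"
    # --status suppresses output; the exit code reports the result
    ( cd "$(dirname "$file")" && \
      sha256sum -c --status "$(basename "$file").sha256" )
}
```

For example, `write_checksum /backup/etcd/etcd-snapshot-20260404200000.db` right after the snapshot is saved, and `verify_checksum` on the same file name after it has been copied to the off-site location.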
# Create an automated backup script
[root@k8s-master ~]# cat > /usr/local/bin/etcd-backup.sh << 'EOF'
#!/bin/bash
# etcd-backup.sh: take an etcd snapshot and prune old ones
set -euo pipefail   # abort before pruning if the snapshot fails
BACKUP_DIR="/backup/etcd"
DATE=$(date +%Y%m%d%H%M%S)
RETENTION_DAYS=7
mkdir -p "$BACKUP_DIR"
ETCDCTL_API=3 etcdctl snapshot save "$BACKUP_DIR/etcd-snapshot-$DATE.db" \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
# Remove backups older than the retention period
find "$BACKUP_DIR" -name "etcd-snapshot-*.db" -mtime +"$RETENTION_DAYS" -delete
echo "Backup completed: $BACKUP_DIR/etcd-snapshot-$DATE.db"
EOF
[root@k8s-master ~]# chmod +x /usr/local/bin/etcd-backup.sh
# Schedule the backup with cron
[root@k8s-master ~]# cat > /etc/cron.d/etcd-backup << 'EOF'
0 */6 * * * root /usr/local/bin/etcd-backup.sh >> /var/log/etcd-backup.log 2>&1
EOF
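A cron job can fail silently, so it helps to check that a recent snapshot actually exists. The function below is a minimal sketch of such a freshness check built on GNU find. The name backup_is_fresh and the 420-minute threshold (one six-hour cycle plus an hour of slack) are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: succeed only if the directory holds a snapshot newer than the threshold.
# backup_is_fresh is a hypothetical helper name.
backup_is_fresh() {
    dir="$1"
    max_minutes="$2"
    # -mmin -N matches files modified less than N minutes ago;
    # -quit stops at the first match (GNU find)
    recent=$(find "$dir" -maxdepth 1 -name 'etcd-snapshot-*.db' \
        -mmin "-$max_minutes" -print -quit 2>/dev/null)
    [ -n "$recent" ]
}
```

A monitoring job could run `backup_is_fresh /backup/etcd 420 || echo "etcd backup is stale" >&2` and raise an alert on a non-zero exit.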
Part03-Velero Backup
3.1 Installing Velero
[root@k8s-master ~]# curl -LO https://github.com/vmware-tanzu/velero/releases/download/v1.12.0/velero-v1.12.0-linux-amd64.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 648 100 648 0 0 1234 0 --:--:-- --:--:-- --:--:-- 1234
100 23.4M 100 23.4M 0 0 5678k 0 0:00:04 0:00:04 --:--:-- 6789k
[root@k8s-master ~]# tar -xzf velero-v1.12.0-linux-amd64.tar.gz
[root@k8s-master ~]# mv velero-v1.12.0-linux-amd64/velero /usr/local/bin/
# Create the S3 credentials file
[root@k8s-master ~]# cat > /root/velero-credentials << 'EOF'
[default]
aws_access_key_id = minioadmin
aws_secret_access_key = minioadmin
EOF
# Install Velero
[root@k8s-master ~]# velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket velero-backup \
--secret-file /root/velero-credentials \
--backup-location-config region=default,s3ForcePathStyle=true,s3Url=http://192.168.1.100:9000 \
--use-volume-snapshots=false
CustomResourceDefinition/backups.velero.io: attempting to create resource
CustomResourceDefinition/backups.velero.io: created
CustomResourceDefinition/backupstoragelocations.velero.io: attempting to create resource
CustomResourceDefinition/backupstoragelocations.velero.io: created
CustomResourceDefinition/deletebackuprequests.velero.io: attempting to create resource
CustomResourceDefinition/deletebackuprequests.velero.io: created
CustomResourceDefinition/downloadrequests.velero.io: attempting to create resource
CustomResourceDefinition/downloadrequests.velero.io: created
CustomResourceDefinition/podvolumebackups.velero.io: attempting to create resource
CustomResourceDefinition/podvolumebackups.velero.io: created
Waiting for velero deployment to be ready.
Deployment "velero" in namespace "velero" is ready.
# Check the Velero client and server versions
[root@k8s-master ~]# velero version
Client:
Version: v1.12.0
Git commit: abc123def456
Server:
Version: v1.12.0
# Create a backup
[root@k8s-master ~]# velero backup create fgedu-daily-backup --include-namespaces fgedu-prod
Backup request "fgedu-daily-backup" submitted successfully.
Run `velero backup describe fgedu-daily-backup` or `velero backup logs fgedu-daily-backup` for more details.
# Check the backup status
[root@k8s-master ~]# velero backup describe fgedu-daily-backup
Name: fgedu-daily-backup
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.28.3
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=28
Phase: Completed
Errors: 0
Warnings: 0
Estimated total items to be backed up: 50
Items backed up: 50
Backup Volumes: 0
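The describe output above is meant to be read by a person. In a script, the Phase field is usually the only part that matters. The filter below is a minimal sketch that extracts it with awk. The name velero_phase is hypothetical, and it assumes the `Phase:` line format shown in the output above.

```shell
#!/bin/sh
# Sketch: read `velero backup describe` (or `velero restore describe`)
# output on stdin and print the value of the Phase field.
# velero_phase is a hypothetical helper name.
velero_phase() {
    # print the second field of the first line whose first field is "Phase:";
    # exit non-zero if no such line was seen
    awk '$1 == "Phase:" { print $2; found = 1; exit }
         END { if (!found) exit 1 }'
}
```

For example, `[ "$(velero backup describe fgedu-daily-backup | velero_phase)" = "Completed" ] || echo "backup did not complete" >&2` lets a wrapper script fail when the backup ends in any other phase.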
Part04-Cluster Recovery
4.1 Restoring etcd
# Stop the control plane: its components run as static Pods, so the kubelet stops them once their manifests are moved out of the manifests directory
[root@k8s-master ~]# mv /etc/kubernetes/manifests/*.yaml /tmp/
# Restore the etcd data
[root@k8s-master ~]# ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd/etcd-snapshot-20260404200000.db \
--data-dir=/var/lib/etcd-restore \
--name=k8s-master \
--initial-cluster=k8s-master=https://192.168.1.100:2380 \
--initial-cluster-token=etcd-cluster \
--initial-advertise-peer-urls=https://192.168.1.100:2380
2026-04-04 21:00:00.123456 I | mvcc: restore compact to 1234567
2026-04-04 21:00:00.234567 I | etcdserver/membership: added member abc123 [http://localhost:2380] to cluster def456
# Swap in the restored etcd data directory
[root@k8s-master ~]# mv /var/lib/etcd /var/lib/etcd.bak
[root@k8s-master ~]# mv /var/lib/etcd-restore /var/lib/etcd
# Start the control plane
[root@k8s-master ~]# mv /tmp/*.yaml /etc/kubernetes/manifests/
# Verify the restore
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 10d v1.28.3
k8s-node1 Ready
k8s-node2 Ready
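Right after the manifests are moved back, kubectl can fail for a short while because the kubelet needs time to restart the static Pods. The helper below is a minimal sketch of a generic retry loop that can wrap the verification command. The name retry and the attempt and delay values in the usage note are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: run a command until it succeeds or the attempts run out.
# retry is a hypothetical helper name.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # stop as soon as the wrapped command succeeds
        "$@" && return 0
        # do not sleep after the final failed attempt
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, `retry 30 5 kubectl get --raw=/readyz` polls the API server readiness endpoint every five seconds for up to thirty attempts before `kubectl get nodes` is run.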
# Restore with Velero
[root@k8s-master ~]# velero restore create --from-backup fgedu-daily-backup
Restore request "fgedu-daily-backup-20260404210000" submitted successfully.
Run `velero restore describe fgedu-daily-backup-20260404210000` or `velero restore logs fgedu-daily-backup-20260404210000` for more details.
# Check the restore status
[root@k8s-master ~]# velero restore describe fgedu-daily-backup-20260404210000
Name: fgedu-daily-backup-20260404210000
Namespace: velero
Labels:
Annotations:
Phase: Completed
Total items to be restored: 50
Items restored: 50
Started: 2026-04-04 21:00:00 +0800 CST
Completed: 2026-04-04 21:01:00 +0800 CST
Warnings:
Velero:
Cluster:
Namespaces:
fgedu-prod:
# Verify the restored resources
[root@k8s-master ~]# kubectl get all -n fgedu-prod
NAME READY STATUS RESTARTS AGE
pod/fgedu-web-abc12-xyz789 1/1 Running 0 5m
pod/fgedu-web-abc12-abc12 1/1 Running 0 5m
pod/fgedu-web-abc12-def34 1/1 Running 0 5m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/fgedu-web ClusterIP 10.96.100.100
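- Back up etcd data on a regular schedule
- Use Velero to back up application resources
- Verify the integrity of every backup
- Rehearse the recovery procedure regularly
- Keep off-site copies of important data
This article was compiled and published by 风哥教程 (Fengge Tutorials) for learning and testing purposes only. Please credit the source when reposting: http://www.fgedu.net.cn/10327.html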
