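1. Troubleshooting Overview
Upgrading from RHEL 9 to RHEL 10 can fail at several stages. This tutorial shows how to diagnose and resolve those failures systematically, and how to roll back with LVM snapshots and GRUB.
Reference: the System administration chapters of the official Red Hat Enterprise Linux 10 documentation.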
$ cat > /backup/upgrade_failures.txt << 'EOF'
Upgrade failure categories:
1. Pre-upgrade failures
   - preupgrade check fails
   - Insufficient disk space
   - Dependency conflicts
   - Incompatible packages
2. Failures during the upgrade
   - Network connection interrupted
   - Package download fails
   - Installation interrupted
   - Configuration migration fails
3. Post-upgrade failures
   - System does not boot
   - Kernel fails to start
   - Services fail to start
   - Network connection fails
   - Application misbehaves
4. Data failures
   - Data loss
   - Data corruption
   - Missing configuration files
   - Permission problems
EOF
2. Log Analysis Techniques
Learn how to read the relevant log files to locate the cause of an upgrade failure.
$ sudo tail -100 /var/log/leapp/leapp-upgrade.log
2026-04-02 10:00:00 INFO: Starting upgrade process
2026-04-02 10:00:01 INFO: Source version: RHEL 9.5
2026-04-02 10:00:02 INFO: Target version: RHEL 10.0
2026-04-02 10:00:05 INFO: Phase 1: Preparation
2026-04-02 10:00:10 INFO: Downloading upgrade packages
2026-04-02 10:05:00 ERROR: Failed to download package: kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64
2026-04-02 10:05:01 ERROR: Error: Cannot download repomd.xml: Cannot download repodata/repomd.xml: error was
2026-04-02 10:05:02 ERROR: Upgrade process failed
# View the system journal
$ sudo journalctl -xe
-- Logs begin at Mon 2026-04-01 10:00:00 CST, end at Wed 2026-04-02 10:05:02 CST. --
Apr 02 10:05:00 rhel9-server leapp[1234]: ERROR: Failed to download package: kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64
Apr 02 10:05:01 rhel9-server leapp[1234]: ERROR: Error: Cannot download repomd.xml: Cannot download repodata/repomd.xml: error was
Apr 02 10:05:02 rhel9-server systemd[1]: leapp-upgrade.service: Main process exited, code=exited, status=1/FAILURE
Apr 02 10:05:02 rhel9-server systemd[1]: leapp-upgrade.service: Failed with result 'exit-code'
$ sudo dmesg | tail -50
[12345.678901] leapp[1234]: ERROR: Failed to download package
[12345.678902] leapp[1234]: Upgrade process failed
[12345.678903] systemd[1]: leapp-upgrade.service: Main process exited, code=exited, status=1/FAILURE
[12345.678904] systemd[1]: leapp-upgrade.service: Failed with result 'exit-code'
# View the DNF log
$ sudo tail -50 /var/log/dnf.log
2026-04-02T10:00:00+0800 DEBUG Installed: leapp-0.18.0-1.el9.x86_64
2026-04-02T10:05:00+0800 ERROR Failed to download package: kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64
2026-04-02T10:05:01+0800 ERROR Error: Cannot download repomd.xml: Cannot download repodata/repomd.xml: error was
# View the Leapp report (Leapp writes its text report to leapp-report.txt)
$ sudo cat /var/log/leapp/leapp-report.txt
Leapp Upgrade Report
====================
Upgrade Date: Wed Apr 2 10:00:00 CST 2026
Source Version: RHEL 9.5
Target Version: RHEL 10.0
Status: Failed
Error Details:
Error Type: Package Download Failure
Error Message: Failed to download package: kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64
Error Code: 1
Recommended Actions:
1. Check network connection
2. Verify repository availability
3. Clear DNF cache
4. Retry upgrade process
# View the report in JSON format
$ sudo python3 -m json.tool /var/log/leapp/leapp-report.json
{
"upgrade_date": "2026-04-02T10:00:00+08:00",
"source_version": "RHEL 9.5",
"target_version": "RHEL 10.0",
"status": "failed",
"error": {
"type": "Package Download Failure",
"message": "Failed to download package: kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64",
"code": 1
},
"recommended_actions": [
"Check network connection",
"Verify repository availability",
"Clear DNF cache",
"Retry upgrade process"
]
}
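When a log runs to thousands of lines, it can help to reduce it to the distinct error messages before reading further. The following is a minimal bash sketch, assuming log lines follow the "DATE TIME LEVEL: message" layout of the excerpt above; the function name summarize_errors is hypothetical and not part of Leapp.

```shell
#!/usr/bin/env bash
# summarize_errors LOGFILE
# Hypothetical helper: prints how many ERROR lines a log contains,
# followed by each distinct error message once.
# Assumption: lines look like "YYYY-MM-DD HH:MM:SS ERROR: message".
summarize_errors() {
    local logfile="$1" count
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 2
    fi
    # grep -c prints 0 (and returns non-zero) when nothing matches.
    count=$(grep -c ' ERROR: ' "$logfile")
    echo "errors: $count"
    # Strip the timestamp and level, then de-duplicate the messages.
    grep ' ERROR: ' "$logfile" | sed 's/^.* ERROR: //' | sort -u
}
```

It could be called as `sudo bash -c 'source ./summarize.sh; summarize_errors /var/log/leapp/leapp-upgrade.log'`, where summarize.sh is whatever file the function is saved in.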
3. Common Upgrade Failure Scenarios
Typical ways an upgrade fails, and how to resolve each one.
# Check network connectivity
$ ping -c 4 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3005ms
# Check the network interface
$ ip addr show ens33
2: ens33:
link/ether 00:0c:29:12:34:56 brd ff:ff:ff:ff:ff:ff
# Restart the network service
$ sudo systemctl restart NetworkManager
# Verify network connectivity
$ ping -c 4 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=119 time=12.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=119 time=11.8 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=119 time=12.1 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=119 time=11.9 ms
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 11.832/12.058/12.317/0.198 ms
# Check disk space
$ df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 50G 48G 2G 96% /
# Clean the DNF cache
$ sudo dnf clean all
0 files removed
# Shrink the systemd journal
$ sudo journalctl --vacuum-size=500M
Vacuuming done, freed 1.2G of archived journals from /var/log/journal.
# Delete rotated log files
$ sudo find /var/log -name "*.log.*" -delete
$ sudo find /var/log -name "*.gz" -delete
# Delete temporary files (running processes may still be using some of them)
$ sudo rm -rf /tmp/*
$ sudo rm -rf /var/tmp/*
# Verify disk space
$ df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 50G 42G 8G 84% /
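The free-space check above can be turned into a guard that runs before the upgrade is retried. This is a small bash sketch under stated assumptions: the function name has_free_space and the 5000 MB threshold in the usage line are illustrative only, since the source does not state how much space the upgrade needs.

```shell
#!/usr/bin/env bash
# has_free_space PATH MIN_MB
# Hypothetical helper: returns 0 when the file system holding PATH has
# at least MIN_MB megabytes available, non-zero otherwise.
has_free_space() {
    local path="$1" min_mb="$2" avail_kb
    # -P forces one line per file system; -k reports 1024-byte blocks.
    avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((min_mb * 1024)) ]
}
```

For example: `has_free_space / 5000 || echo "free up space on / before retrying"`.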
# Show conflicting packages
$ sudo dnf update --assumeno
Updating Subscription Management repositories.
Last metadata expiration check: 0:00:00 ago on Wed 02 Apr 2026 10:00:00 AM CST.
Dependencies resolved.
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
kernel x86_64 6.5.0-0.rc0.20260401git1234567.el10 rhel-10-baseos 150 M
Problem: problem with installed package custom-app-1.0.0-1.el9.x86_64
- package custom-app-1.0.0-1.el9.x86_64 requires libpython2.7.so.1.0()(64bit), but none of the providers can be installed
- cannot install both python2-2.7.18-18.el9.x86_64 and python3-3.9.16-1.el10.x86_64
- package custom-app-1.0.0-1.el9.x86_64 requires python2, but none of the providers can be installed
# Remove the conflicting package
$ sudo dnf remove -y custom-app
Dependencies resolved.
================================================================================
Package Arch Version Repository Size
================================================================================
Removing:
custom-app x86_64 1.0.0-1.el9 @local 50 M
Transaction Summary
================================================================================
Remove 1 Packages
Installed size: 50 M
Is this ok [y/N]: y
Running transaction
Preparing : 1/1
Erasing : custom-app-1.0.0-1.el9.x86_64 1/1
Running scriptlet: custom-app-1.0.0-1.el9.x86_64 1/1
Verifying : custom-app-1.0.0-1.el9.x86_64 1/1
Removed:
custom-app-1.0.0-1.el9.x86_64
Complete!
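Conflicts like the one above tend to come from packages that were not shipped by Red Hat, so listing those before the upgrade can surface likely trouble early. The sketch below assumes the vendor string for Red Hat packages is "Red Hat, Inc."; the function name third_party_packages is hypothetical, and entries without a vendor (for example imported GPG keys) will also be listed.

```shell
#!/usr/bin/env bash
# third_party_packages
# Hypothetical helper: reads "name<TAB>vendor" lines on stdin and prints
# the names whose vendor is not Red Hat.
third_party_packages() {
    awk -F'\t' '$2 != "Red Hat, Inc." { print $1 }' | sort -u
}

# Intended use on a real system (not run here):
#   rpm -qa --qf '%{NAME}\t%{VENDOR}\n' | third_party_packages
```

Keeping the parsing separate from the rpm query makes the filter easy to try against saved package lists from other hosts.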
# List the saved configuration file backups
$ sudo ls -la /var/log/leapp/*.rpmsave
-rw-r--r-- 1 root root 1234 Apr 2 10:00:00 /var/log/leapp/ssh_config.rpmsave
-rw-r--r-- 1 root root 5678 Apr 2 10:00:00 /var/log/leapp/sshd_config.rpmsave
# Compare the current file with the backup (PermitRootLogin is a server option, so compare sshd_config)
$ sudo diff /etc/ssh/sshd_config /var/log/leapp/sshd_config.rpmsave
123c123
< PermitRootLogin yes
---
> PermitRootLogin no
# Restore the configuration file manually
$ sudo cp /var/log/leapp/sshd_config.rpmsave /etc/ssh/sshd_config
# Restart the affected service
$ sudo systemctl restart sshd
# Verify the service status
$ sudo systemctl status sshd
● sshd.service - OpenSSH server daemon
Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2026-04-02 10:00:00 CST; 10s ago
Docs: man:sshd(8) man:sshd_config(5)
Main PID: 1234 (sshd)
Tasks: 1 (limit: 4915)
Memory: 5.2M
CGroup: /system.slice/sshd.service
└─1234 /usr/sbin/sshd -D -oCiphers=aes256-gcm@openssh.com,chacha20-poly1305@openssh.com,aes256-ctr,aes128-gcm@openssh.com,aes128-ctr
4. Handling Boot Failures
How to handle the different ways a system can fail to boot.
# In the GRUB menu, choose "Advanced options"
# Boot the old kernel
# List the available kernels
$ sudo grubby --info=ALL | grep title
title=Red Hat Enterprise Linux (6.5.0-0.rc0.20260401git1234567.el10.x86_64) 10.0 (Coughlan)
title=Red Hat Enterprise Linux (5.14.0-427.28.1.el9_5.x86_64) 9.5 (Plow)
# Make the old kernel the default boot entry
$ sudo grubby --set-default-index=1
# Verify the default boot entry
$ sudo grubby --default-index
1
# Reboot
$ sudo reboot
# After the system boots, verify the kernel version
$ uname -r
5.14.0-427.28.1.el9_5.x86_64
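Hard-coding index 1 works only while the old kernel happens to be the second entry. A sketch that looks the index up instead is shown below; it assumes the full `grubby --info=ALL` output, where each entry starts with an `index=N` line followed by a `kernel="..."` line. The function name find_boot_index is hypothetical.

```shell
#!/usr/bin/env bash
# find_boot_index PATTERN
# Hypothetical helper: reads `grubby --info=ALL` output on stdin and prints
# the index of the first entry whose kernel path contains PATTERN.
# Prints nothing when no entry matches.
find_boot_index() {
    local pattern="$1"
    awk -v pat="$pattern" '
        /^index=/                    { idx = substr($0, 7) }   # text after "index="
        /^kernel=/ && index($0, pat) { print idx; exit }       # literal substring match
    '
}

# Intended use on a real system (not run here):
#   idx=$(sudo grubby --info=ALL | find_boot_index '.el9')
#   [ -n "$idx" ] && sudo grubby --set-default-index="$idx"
```

Matching on the kernel path rather than the title avoids depending on how the menu title is worded.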
# Boot into rescue mode
# Choose "Rescue a Red Hat Enterprise Linux system"
# Reinstall GRUB (BIOS systems; grub2-install is not used on UEFI systems)
$ sudo grub2-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.
# Regenerate the GRUB configuration
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file …
Found linux image: /boot/vmlinuz-6.5.0-0.rc0.20260401git1234567.el10.x86_64
Found initrd image: /boot/initramfs-6.5.0-0.rc0.20260401git1234567.el10.x86_64.img
Found linux image: /boot/vmlinuz-5.14.0-427.28.1.el9_5.x86_64
Found initrd image: /boot/initramfs-5.14.0-427.28.1.el9_5.x86_64.img
done
# Reboot
$ sudo reboot
# Regenerate the initramfs images for all installed kernels
$ sudo dracut --regenerate-all --force
dracut: *** Including module: bash ***
dracut: *** Including module: systemd ***
dracut: *** Including module: systemd-initrd ***
dracut: *** Including module: i18n ***
dracut: *** Including module: network ***
dracut: *** Including module: ifcfg ***
dracut: *** Including module: drm ***
dracut: *** Including module: plymouth ***
dracut: *** Including module: kernel-modules ***
dracut: *** Including module: kernel-modules-extra ***
dracut: *** Including module: resume ***
dracut: *** Including module: rootfs-block ***
dracut: *** Including module: terminfo ***
dracut: *** Including module: udev-rules ***
dracut: *** Including module: base ***
dracut: *** Including module: fs-lib ***
dracut: *** Including module: shutdown ***
dracut: *** Including module: usrmount ***
dracut: *** Including module: emergency ***
dracut: *** Including module: 99base ***
dracut: *** Including module: 99shutdown ***
dracut: *** Including module: 99emergency ***
dracut: *** Creating initramfs image file '/boot/initramfs-6.5.0-0.rc0.20260401git1234567.el10.x86_64.img' ***
dracut: *** Creating initramfs image file '/boot/initramfs-5.14.0-427.28.1.el9_5.x86_64.img' ***
# Reboot
$ sudo reboot
# Check the file system (run this from rescue media, with the file system unmounted)
$ sudo fsck -y /dev/sda1
fsck from util-linux 2.37.4
e2fsck 1.46.5 (30-Dec-2021)
/dev/sda1: clean, 123456/6553600 files, 1234567/26214400 blocks
# If the file system is damaged, force a full check and repair
$ sudo fsck -y -f /dev/sda1
fsck from util-linux 2.37.4
e2fsck 1.46.5 (30-Dec-2021)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary
/dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda1: 123456/6553600 files (0.2% non-contiguous), 1234567/26214400 blocks
# Reboot
$ sudo reboot
5. Handling Network Failures
How to fix network connectivity problems after the upgrade.
# List the network interfaces
$ ip link show
1: lo:
link/loopback 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
2: enp0s3:
link/ether 00:0c:29:12:34:56 brd ff:ff:ff:ff:ff:ff
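# List the legacy network configuration files
# Note: RHEL 10 no longer supports ifcfg-format profiles, so these steps apply only where ifcfg files are still read (for example on RHEL 9 after a rollback)
$ ls -la /etc/sysconfig/network-scripts/
total 12
-rw-r--r-- 1 root root 356 Apr 2 10:00:00 ifcfg-ens33
-rw-r--r-- 1 root root 254 Apr 2 10:00:00 ifcfg-lo
# Rename the configuration file to match the new interface name
$ sudo mv /etc/sysconfig/network-scripts/ifcfg-ens33 /etc/sysconfig/network-scripts/ifcfg-enp0s3
# Update the device name inside the file
$ sudo sed -i 's/NAME="ens33"/NAME="enp0s3"/' /etc/sysconfig/network-scripts/ifcfg-enp0s3
$ sudo sed -i 's/DEVICE="ens33"/DEVICE="enp0s3"/' /etc/sysconfig/network-scripts/ifcfg-enp0s3
# Restart the network service
$ sudo systemctl restart NetworkManager
# Verify the interface configuration
$ ip addr show enp0s3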
2: enp0s3:
link/ether 00:0c:29:12:34:56 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.100/24 brd 192.168.1.255 scope global dynamic noprefixroute enp0s3
valid_lft 86399sec preferred_lft 86399sec
# Check the NetworkManager status
$ sudo systemctl status NetworkManager
● NetworkManager.service - Network Manager
Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2026-04-02 10:00:00 CST; 10s ago
Docs: man:NetworkManager(8)
Main PID: 1234 (NetworkManager)
Tasks: 3 (limit: 4915)
Memory: 8.5M
CGroup: /system.slice/NetworkManager.service
├─1234 /usr/sbin/NetworkManager --no-daemon
├─1235 /usr/sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-enp0s3.pid -lf /var/lib/NetworkManager/dhclient-5fb06bd0-af35-11eb-95e7-54e1adxxxxxx-enp0s3.lease -cf /var/lib/NetworkManager/dhclient-enp0s3.conf enp0s3
└─1236 /usr/sbin/dnsmasq --no-resolv --keep-in-foreground --no-hosts --bind-interfaces --pid-file=/var/run/NetworkManager/dnsmasq-enp0s3.pid --listen-address=127.0.0.1 --cache-size=400 --dhcp-range=192.168.1.100,192.168.1.200,12h --dhcp-option=option:router,192.168.1.1 --dhcp-option=option:dns-server,8.8.8.8 --conf-file=/var/lib/NetworkManager/dnsmasq-enp0s3.conf
# List the network connections
$ sudo nmcli connection show
NAME UUID TYPE DEVICE
ens33 abc12345-def67-8901-2345-678901234567 ethernet enp0s3
# Reload the connection profiles
$ sudo nmcli connection reload
# Bring the connection up again
$ sudo nmcli connection up ens33
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/1)
# Verify network connectivity
$ ping -c 4 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=119 time=12.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=119 time=11.8 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=119 time=12.1 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=119 time=11.9 ms
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 11.832/12.058/12.317/0.198 ms
# Check the DNS configuration
$ cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 127.0.0.1
# View the DNS settings stored in NetworkManager
$ sudo nmcli connection show ens33 | grep dns
ipv4.dns: 8.8.8.8,8.8.4.4
ipv4.dns-search: --
ipv4.dns-options: --
ipv6.dns: --
ipv6.dns-search: --
ipv6.dns-options: --
# Reconfigure DNS
$ sudo nmcli connection mod ens33 ipv4.dns "8.8.8.8 8.8.4.4"
$ sudo nmcli connection up ens33
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/1)
# Verify DNS resolution
$ nslookup www.baidu.com
Server: 8.8.8.8
Address: 8.8.8.8#53
Non-authoritative answer:
www.baidu.com canonical name = www.a.shifen.com
Name: www.a.shifen.com
Address: 14.215.177.38
Name: www.a.shifen.com
Address: 14.215.177.39
6. Handling Service Failures
How to fix services that fail to start after the upgrade.
# Check the service status
$ sudo systemctl status httpd
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2026-04-02 10:00:00 CST; 10s ago
Docs: man:httpd.service(8)
Process: 1234 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
Main PID: 1234 (code=exited, status=1/FAILURE)
Apr 02 10:00:00 rhel10-server systemd[1]: Starting The Apache HTTP Server…
Apr 02 10:00:00 rhel10-server httpd[1234]: AH00526: Syntax error on line 123 of /etc/httpd/conf/httpd.conf:
Apr 02 10:00:00 rhel10-server httpd[1234]: Invalid command 'Require', perhaps misspelled or defined by a module not included in the server configuration
Apr 02 10:00:00 rhel10-server systemd[1]: httpd.service: Main process exited, code=exited, status=1/FAILURE
Apr 02 10:00:00 rhel10-server systemd[1]: httpd.service: Failed with result 'exit-code'
Apr 02 10:00:00 rhel10-server systemd[1]: Failed to start The Apache HTTP Server.
# Inspect the configuration file around the reported line
$ sudo sed -n '120,125p' /etc/httpd/conf/httpd.conf
# Require all granted
# Check which Apache authorization modules are loaded
$ sudo httpd -M | grep auth
authz_core_module (shared)
authz_host_module (shared)
# Enable the required module
$ sudo sed -i 's/#LoadModule authz_core_module/LoadModule authz_core_module/' /etc/httpd/conf.modules.d/00-base.conf
# Restart the service
$ sudo systemctl restart httpd
# Verify the service status
$ sudo systemctl status httpd
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2026-04-02 10:00:00 CST; 10s ago
Docs: man:httpd.service(8)
Main PID: 1235 (httpd)
Tasks: 175 (limit: 4915)
Memory: 15.2M
CGroup: /system.slice/httpd.service
├─1235 /usr/sbin/httpd -DFOREGROUND
├─1236 /usr/sbin/httpd -DFOREGROUND
├─1237 /usr/sbin/httpd -DFOREGROUND
├─1238 /usr/sbin/httpd -DFOREGROUND
└─1239 /usr/sbin/httpd -DFOREGROUND
# Check the service status
$ sudo systemctl status mariadb
● mariadb.service - MariaDB 10.11 database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2026-04-02 10:00:00 CST; 10s ago
Docs: man:mariadbd(8)
Process: 1234 ExecStartPre=/usr/libexec/mariadb-check-socket (code=exited, status=0/SUCCESS)
Process: 1235 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir mariadb.service (code=exited, status=1/FAILURE)
Main PID: 1235 (code=exited, status=1/FAILURE)
Apr 02 10:00:00 rhel10-server systemd[1]: Starting MariaDB 10.11 database server…
Apr 02 10:00:00 rhel10-server mariadb-prepare-db-dir[1235]: File '/var/lib/mysql/ibdata1' size is 0 bytes
Apr 02 10:00:00 rhel10-server mariadb-prepare-db-dir[1235]: InnoDB: Error: log file ./ib_logfile0 is of different size 0 bytes than InnoDB: specified 5242880 bytes!
Apr 02 10:00:00 rhel10-server systemd[1]: mariadb.service: Control process exited, code=exited, status=1/FAILURE
Apr 02 10:00:00 rhel10-server systemd[1]: mariadb.service: Failed with result 'exit-code'
Apr 02 10:00:00 rhel10-server systemd[1]: Failed to start MariaDB 10.11 database server.
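# Inspect the database directory
$ sudo ls -la /var/lib/mysql/
total 123456
drwxr-xr-x 2 mysql mysql 4096 Apr 2 10:00:00 .
drwxr-xr-x 3 root root 4096 Apr 2 10:00:00 ..
-rw-r----- 1 mysql mysql 0 Apr 2 10:00:00 ibdata1
-rw-r----- 1 mysql mysql 5242880 Apr 2 10:00:00 ib_logfile0
-rw-r----- 1 mysql mysql 5242880 Apr 2 10:00:00 ib_logfile1
# Move the damaged redo log files aside rather than deleting them, so they can be put back if needed (the backup directory name is only an example)
$ sudo mkdir -p /var/lib/mysql-redo-backup
$ sudo mv /var/lib/mysql/ib_logfile0 /var/lib/mysql/ib_logfile1 /var/lib/mysql-redo-backup/
# Restart the service
$ sudo systemctl restart mariadb
# Verify the service status
$ sudo systemctl status mariadb
● mariadb.service - MariaDB 10.11 database server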
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2026-04-02 10:00:00 CST; 10s ago
Docs: man:mariadbd(8)
Process: 1236 ExecStartPost=/usr/libexec/mysql-check-upgrade (code=exited, status=0/SUCCESS)
Main PID: 1237 (mariadbd)
Tasks: 10 (limit: 4915)
Memory: 150.2M
CGroup: /system.slice/mariadb.service
└─1237 /usr/libexec/mariadbd --basedir=/usr
# Check the SELinux context
$ sudo ls -Z /var/www/html
system_u:object_r:user_home_t:s0 index.html
# Check the SELinux audit log
$ sudo ausearch -m avc -ts recent | tail -20
type=AVC msg=audit(1234567890.123:123): avc: denied { read } for pid=1234 comm="httpd" name="index.html" dev="sda1" ino=123456 scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=0
# Restore the default SELinux context
$ sudo restorecon -R -v /var/www/html
restorecon reset /var/www/html context system_u:object_r:user_home_t:s0->system_u:object_r:httpd_sys_content_t:s0
# Restart the service
$ sudo systemctl restart httpd
# Verify the service status
$ sudo systemctl status httpd
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2026-04-02 10:00:00 CST; 10s ago
Docs: man:httpd.service(8)
Main PID: 1235 (httpd)
Tasks: 175 (limit: 4915)
Memory: 15.2M
CGroup: /system.slice/httpd.service
├─1235 /usr/sbin/httpd -DFOREGROUND
├─1236 /usr/sbin/httpd -DFOREGROUND
├─1237 /usr/sbin/httpd -DFOREGROUND
├─1238 /usr/sbin/httpd -DFOREGROUND
└─1239 /usr/sbin/httpd -DFOREGROUND
7. Rolling Back with an LVM Snapshot
Use an LVM snapshot to roll the system back.
$ sudo lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
root rhel owi-aotz-- 50.00g
rhel10-upgrade-snapshot rhel Vri---tz-k 50.00g root 0.00
home rhel -wi-ao---- 20.00g
var rhel -wi-ao---- 20.00g
tmp rhel -wi-ao---- 10.00g
swap rhel -wi-ao---- 4.00g
# Check the snapshot status
$ sudo lvdisplay /dev/rhel/rhel10-upgrade-snapshot
--- Logical volume ---
LV Path /dev/rhel/rhel10-upgrade-snapshot
LV Name rhel10-upgrade-snapshot
VG Name rhel
LV UUID abc123-def456-7890-1234-567890123456
LV Write Access read/write
LV Creation host, time rhel9-server, 2026-04-02 10:00:00 +0800
LV Status available
# open 0
LV Size 50.00 GiB
Current LE 12800
COW-table size 4.00 GiB
COW-table LE 1024
Allocated to snapshot 0.00%
Snapshot chunk size 4.00 KiB
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:6
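A classic LVM snapshot becomes invalid once its copy-on-write space fills up, so it is worth checking the usage figure before relying on it for a rollback. The sketch below takes the percentage as an argument; the function name snapshot_usable and the default limit of 90 are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# snapshot_usable PERCENT [LIMIT]
# Hypothetical helper: returns 0 when the snapshot usage PERCENT is below
# LIMIT (default 90, an illustrative threshold), non-zero otherwise.
# An empty PERCENT is treated as "not usable".
snapshot_usable() {
    local pct="${1//[[:space:]]/}" limit="${2:-90}"
    # awk handles the decimal comparison; exit status 0 means "below limit".
    [ -n "$pct" ] && awk -v p="$pct" -v l="$limit" 'BEGIN { exit !(p + 0 < l + 0) }'
}

# Intended use on a real system (not run here):
#   pct=$(sudo lvs --noheadings -o data_percent rhel/rhel10-upgrade-snapshot)
#   snapshot_usable "$pct" || echo "snapshot is nearly full or invalid"
```

Passing the value in, rather than calling lvs inside the function, keeps the check usable with figures collected by monitoring.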
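# Prepare for the rollback: stop application services
$ sudo systemctl stop httpd
$ sudo systemctl stop mariadb
$ sudo systemctl stop docker
# Unmount file systems (only needed for volumes whose own snapshots are being merged; the root snapshot does not cover /home, /var, or /tmp)
$ sudo umount /home
$ sudo umount /var
$ sudo umount /tmp
# Merge the snapshot into its origin; while the root volume is in use, LVM defers the merge until the volume is next activated, which is why a reboot follows
$ sudo lvconvert --merge /dev/rhel/rhel10-upgrade-snapshot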
Merging of volume rhel/rhel10-upgrade-snapshot into rhel/root started.
rhel/root: Merged: 0.00%
rhel/root: Merged: 25.00%
rhel/root: Merged: 50.00%
rhel/root: Merged: 75.00%
rhel/root: Merged: 100.00%
# Reboot to complete the merge
$ sudo reboot
# After the reboot, verify the rollback
$ cat /etc/redhat-release
Red Hat Enterprise Linux release 9.5 (Plow)
$ uname -r
5.14.0-427.28.1.el9_5.x86_64
# Check the LVM state
$ sudo lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
root rhel owi-aotz-- 50.00g
home rhel -wi-ao---- 20.00g
var rhel -wi-ao---- 20.00g
tmp rhel -wi-ao---- 10.00g
swap rhel -wi-ao---- 4.00g
# The snapshot has been removed automatically
$ sudo systemctl status sshd
● sshd.service - OpenSSH server daemon
Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2026-04-02 10:00:00 CST; 10s ago
Docs: man:sshd(8) man:sshd_config(5)
Main PID: 1234 (sshd)
Tasks: 1 (limit: 4915)
Memory: 5.2M
CGroup: /system.slice/sshd.service
└─1234 /usr/sbin/sshd -D -oCiphers=aes256-gcm@openssh.com,chacha20-poly1305@openssh.com,aes256-ctr,aes128-gcm@openssh.com,aes128-ctr
$ ping -c 4 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=119 time=12.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=119 time=11.8 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=119 time=12.1 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=119 time=11.9 ms
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 11.832/12.058/12.317/0.198 ms
8. Rolling Back via GRUB
Use the GRUB boot entries to roll the system back.
$ sudo grubby --info=ALL | grep -A 2 "title"
title=Red Hat Enterprise Linux (6.5.0-0.rc0.20260401git1234567.el10.x86_64) 10.0 (Coughlan)
title=Red Hat Enterprise Linux (5.14.0-427.28.1.el9_5.x86_64) 9.5 (Plow)
# Show the current default boot entry
$ sudo grubby --default-index
0
# Make the old kernel the default boot entry
$ sudo grubby --set-default-index=1
# Verify the default boot entry
$ sudo grubby --default-index
1
# Reboot
$ sudo reboot
# After the system boots, verify the kernel version
$ uname -r
5.14.0-427.28.1.el9_5.x86_64
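# 1. Reboot and press 'e' when the GRUB menu appears
# 2. Find the line that starts with 'linux16' or 'linux'
# 3. Append 'rd.break' to the end of that line
# 4. Press Ctrl+x to boot
# In the rd.break emergency shell, remount the root file system read-write
switch_root:/# mount -o remount,rw /sysroot
# Change root into the installed system
switch_root:/# chroot /sysroot
# Reinstall GRUB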
bash-4.4# grub2-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.
# Regenerate the GRUB configuration
bash-4.4# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file …
Found linux image: /boot/vmlinuz-6.5.0-0.rc0.20260401git1234567.el10.x86_64
Found initrd image: /boot/initramfs-6.5.0-0.rc0.20260401git1234567.el10.x86_64.img
Found linux image: /boot/vmlinuz-5.14.0-427.28.1.el9_5.x86_64
Found initrd image: /boot/initramfs-5.14.0-427.28.1.el9_5.x86_64.img
done
# Leave the chroot environment
bash-4.4# exit
# Reboot
switch_root:/# reboot
# List the installed kernels
$ rpm -qa | grep kernel
kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64
kernel-core-6.5.0-0.rc0.20260401git1234567.el10.x86_64
kernel-modules-6.5.0-0.rc0.20260401git1234567.el10.x86_64
kernel-5.14.0-427.28.1.el9_5.x86_64
kernel-core-5.14.0-427.28.1.el9_5.x86_64
kernel-modules-5.14.0-427.28.1.el9_5.x86_64
# Remove the broken kernel
$ sudo dnf remove -y kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64
Dependencies resolved.
================================================================================
Package Arch Version Repository Size
================================================================================
Removing:
kernel x86_64 6.5.0-0.rc0.20260401git1234567.el10 rhel-10-baseos 150 M
kernel-core x86_64 6.5.0-0.rc0.20260401git1234567.el10 rhel-10-baseos 100 M
kernel-modules x86_64 6.5.0-0.rc0.20260401git1234567.el10 rhel-10-baseos 50 M
Transaction Summary
================================================================================
Remove 3 Packages
Installed size: 300 M
Is this ok [y/N]: y
Running transaction
Preparing : 1/1
Erasing : kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64 1/3
Erasing : kernel-core-6.5.0-0.rc0.20260401git1234567.el10.x86_64 2/3
Erasing : kernel-modules-6.5.0-0.rc0.20260401git1234567.el10.x86_64 3/3
Running scriptlet: kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64 3/3
Verifying : kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64 1/3
Verifying : kernel-core-6.5.0-0.rc0.20260401git1234567.el10.x86_64 2/3
Verifying : kernel-modules-6.5.0-0.rc0.20260401git1234567.el10.x86_64 3/3
Removed:
kernel-6.5.0-0.rc0.20260401git1234567.el10.x86_64
kernel-core-6.5.0-0.rc0.20260401git1234567.el10.x86_64
kernel-modules-6.5.0-0.rc0.20260401git1234567.el10.x86_64
Complete!
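9. Emergency Recovery Plans
Emergency recovery options for severe failures.
# 1. Boot from a Live CD
# 2. Mount the root file system
$ sudo mkdir -p /mnt/rhel
$ sudo mount /dev/sda1 /mnt/rhel
# 3. Mount the other file systems
$ sudo mount /dev/sda2 /mnt/rhel/boot
$ sudo mount /dev/rhel/home /mnt/rhel/home
$ sudo mount /dev/rhel/var /mnt/rhel/var
# Bind-mount the kernel interfaces that tools such as grub2-install need inside the chroot
$ for d in dev proc sys; do sudo mount --bind /$d /mnt/rhel/$d; done
# 4. chroot into the installed system
$ sudo chroot /mnt/rhel
# 5. Perform the recovery
# For example, reinstall GRUB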
bash-4.4# grub2-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.
# 6. Leave the chroot and reboot
bash-4.4# exit
$ sudo reboot
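# 1. Boot from a Live CD
# 2. Mount the backup disk and the installed system's root file system
$ sudo mkdir -p /mnt/backup /mnt/rhel
$ sudo mount /dev/sdb1 /mnt/backup
$ sudo mount /dev/sda1 /mnt/rhel
# 3. Restore the system configuration files into the installed system, not into the live environment
#    (assumes the archives store paths relative to /)
$ sudo tar -xzf /mnt/backup/system-config-20260402.tar.gz -C /mnt/rhel
# 4. Restore important data
$ sudo tar -xzf /mnt/backup/important-data-20260402.tar.gz -C /mnt/rhel
# 5. Reboot into the restored system
$ sudo reboot
# 6. Once MariaDB is running on the restored system, mount the backup disk again and restore the database
$ sudo mkdir -p /mnt/backup
$ sudo mount /dev/sdb1 /mnt/backup
$ sudo mysql < /mnt/backup/mysql-backup-20260402.sql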
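# 1. Boot from the installation media
# 2. Choose "Install RHEL 10"
# 3. Choose "Reinstall" during installation
# 4. Keep the data partitions
# 5. Restore the data once installation finishes
# Restore data
$ sudo tar -xzf /backup/important-data-20260402.tar.gz -C /
# Restore the database
$ sudo mysql < /backup/mysql-backup-20260402.sql
# Restore the system configuration
$ sudo tar -xzf /backup/system-config-20260402.tar.gz -C /
# Start the services
$ sudo systemctl start httpd
$ sudo systemctl start mariadb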
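Every restore path above depends on the archives being intact, so checking an archive before extracting it over a system is a cheap safeguard. A minimal bash sketch follows; the function name verify_archive is hypothetical, and the check covers structural integrity only, not whether the contents are the ones expected.

```shell
#!/usr/bin/env bash
# verify_archive FILE
# Hypothetical helper: returns 0 when FILE exists, is non-empty, passes the
# gzip integrity test, and can be listed by tar from start to end.
verify_archive() {
    local archive="$1"
    [ -s "$archive" ] || { echo "missing or empty: $archive" >&2; return 1; }
    # gzip -t validates the compressed stream; tar -t walks every member header.
    gzip -t "$archive" 2>/dev/null && tar -tzf "$archive" >/dev/null 2>&1
}
```

For example: `verify_archive /mnt/backup/system-config-20260402.tar.gz && sudo tar -xzf /mnt/backup/system-config-20260402.tar.gz -C /mnt/rhel`.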
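10. Lessons Learned from 风哥
Lessons from troubleshooting and rolling back upgrades in production environments.
$ cat > /backup/troubleshooting_experience1.txt << 'EOF'
Lesson 1: Build a solid backup strategy
1. Backups must be complete
   - System configuration files
   - Important data
   - Databases
   - Package list
   - System information
2. Backups must be regular
   - Daily incremental backups
   - Weekly full backups
   - An extra backup before every upgrade
   - Backups must be verified
3. Backups must be safe
   - Store them in more than one location
   - Encrypt sensitive data
   - Test restores regularly
   - Record backup details
EOF
$ cat > /backup/troubleshooting_experience2.txt << 'EOF'
Lesson 2: Build a solid monitoring system
1. System monitoring
   - CPU usage
   - Memory usage
   - Disk usage
   - Network traffic
2. Service monitoring
   - Service status
   - Service response time
   - Service error rate
   - Service logs
3. Application monitoring
   - Application functionality
   - Application performance
   - Application errors
   - Application logs
EOF
$ cat > /backup/troubleshooting_experience3.txt << 'EOF'
Lesson 3: Build a solid emergency plan
1. Emergency response process
   - Problem detection
   - Problem assessment
   - Problem handling
   - Problem verification
2. Emergency contacts
   - Technical lead
   - System administrator
   - Application owner
   - Technical support
3. Emergency resources
   - Backup servers
   - Spare hardware
   - Emergency documentation
   - Recovery tools
EOF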
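The "extra backup before every upgrade" and "backups must be verified" points can be combined in one small routine. The sketch below archives a directory under a dated name and stores a SHA-256 checksum beside it; the function name backup_dir and its argument order are hypothetical, and the naming simply mirrors the system-config-20260402.tar.gz style used earlier.

```shell
#!/usr/bin/env bash
# backup_dir SRC_DIR DEST_DIR LABEL
# Hypothetical helper: archives SRC_DIR into DEST_DIR/LABEL-YYYYMMDD.tar.gz,
# writes a .sha256 file next to it, and prints the archive path.
backup_dir() {
    local src="$1" dest="$2" label="$3" stamp archive name
    stamp=$(date +%Y%m%d)
    name="$label-$stamp.tar.gz"
    archive="$dest/$name"
    mkdir -p "$dest" || return 1
    # -C stores paths relative to SRC_DIR, so the archive can be
    # extracted under any target directory later.
    tar -czf "$archive" -C "$src" . || return 1
    # Record the checksum with a relative file name so that
    # `sha256sum -c` works from inside DEST_DIR.
    (cd "$dest" && sha256sum "$name" > "$name.sha256") || return 1
    echo "$archive"
}
```

A later check is then `cd /backup && sha256sum -c system-config-20260402.tar.gz.sha256`, assuming /backup was the destination.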
This article was compiled and published by 风哥教程 for study and testing purposes only. When reposting, cite the source: http://www.fgedu.net.cn/10327.html
