
Linux Tutorial FG045: Troubleshooting Commands for Common Installation Failures (Boot Failure / Media Errors)

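In this document, Fengge (风哥) covers how to troubleshoot common failures during Linux installation, including diagnostic commands and fixes for boot failures and media errors. It is based on the official Red Hat Enterprise Linux 10 documentation and is intended for operations staff to use for study and testing; verify everything yourself before applying it in a production environment.

Reference: the System administration chapter of the official Red Hat Enterprise Linux 10 documentation.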

Part01-Basic Concepts and Theory

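1.1 Common Types of Linux Installation Failures

Failures that may occur during Linux installation fall into the following categories:

  • Boot failure: corrupted GRUB, missing kernel, incorrect boot configuration
  • Media errors: corrupted ISO, faulty USB drive, scratched optical disc
  • Hardware compatibility: missing drivers, unsupported hardware
  • Partitioning problems: corrupted partition table, insufficient disk space
  • Network problems: unreachable network installation source, DNS resolution failure
  • Memory problems: insufficient memory, faulty memory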

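1.2 Causes of Boot Failure

Common causes of boot failure:

  • GRUB configuration errors: the grub.cfg file is corrupted or misconfigured
  • Missing kernel files: the vmlinuz or initramfs file is missing or corrupted
  • Boot partition problems: the /boot partition is corrupted or not mounted correctly
  • Disk failure: damaged disk or connection problem
  • BIOS/UEFI settings: wrong boot order, Secure Boot problems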

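1.3 Media Errors

Media errors commonly show up as:

  • Corrupted ISO image: incomplete download, file transfer errors
  • USB drive failure: poor-quality drive, write errors
  • Optical disc problems: scratched disc, burn errors
  • Network media: network interruption, server failure

Fengge's tip: troubleshooting installation failures calls for a systematic approach. Start with the most likely cause and rule out candidates one by one, and make good use of the log files to pinpoint the problem.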

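Part02-Production Environment Planning and Recommendations

2.1 Troubleshooting Workflow

# Linux installation troubleshooting workflow

1. Gather information
- Record the error messages
- Review the installation logs
- Confirm the hardware configuration

2. Analyze the problem
- Identify the failure type
- List the likely causes
- Draw up a troubleshooting plan

3. Test hypotheses
- Test step by step
- Record the test results
- Adjust the direction of the investigation

4. Resolve the problem
- Apply the fix
- Verify that the fix works
- Document how the problem was resolved

5. Prevent recurrence
- Analyze the root cause
- Define a prevention strategy
- Update the documentation and knowledge base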

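2.2 Prevention Strategies

Strategies for preventing installation failures:

# Installation failure prevention strategies

1. Media verification
- Verify the MD5/SHA256 checksum after downloading
- Use a reliable download tool
- Choose a reliable mirror

2. Hardware checks
- Verify hardware compatibility
- Check disk health
- Test memory integrity

3. Environment preparation
- Confirm the system requirements
- Provide enough disk space
- Check network connectivity

4. Fallback plan
- Prepare spare installation media
- Keep backups of important data
- Record the system configuration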
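
The media verification item above can be scripted so that the comparison is not done by eye. The following is a minimal sketch, assuming `sha256sum` and `awk` are available; the function name `verify_sha256` is hypothetical and not part of any tool mentioned in this article.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX   (hypothetical helper name)
# Returns 0 when the SHA-256 digest of FILE equals EXPECTED_HEX, 1 otherwise.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest field.
    actual=$(sha256sum "$file" | awk '{print $1}')
    # An empty digest means the file could not be read.
    [ -n "$actual" ] || return 1
    # Published digests are sometimes upper case; normalize before comparing.
    expected=$(printf '%s' "$expected" | tr 'A-F' 'a-f')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
        return 0
    fi
    echo "MISMATCH: $file" >&2
    return 1
}
```

A call such as `verify_sha256 rhel-10.0-x86_64-dvd.iso "$EXPECTED"` would take the expected value from wherever the vendor publishes it. When a checksum file in `sha256sum` format is available, `sha256sum -c` does the same job without a helper.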

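2.3 Preparing Troubleshooting Tools

Prepare the necessary troubleshooting tools:

  • Rescue media: Live CD/USB, rescue mode
  • Hardware diagnostics: memtest86+, smartctl
  • Log analysis: journalctl, dmesg
  • Disk tools: fdisk, gdisk, fsck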
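
Before a failure happens, it can help to confirm that the command-line tools listed above are actually installed. The following is a minimal sketch; the function name `missing_tools` is hypothetical, and it relies only on the standard `command -v` lookup.

```shell
#!/bin/sh
# missing_tools TOOL...   (hypothetical helper name)
# Prints the name of every tool that cannot be found in PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}
```

For the tools in this section the call would be `missing_tools smartctl journalctl dmesg fdisk gdisk fsck`; empty output means all of them were found. memtest86+ is typically started from the boot menu rather than from a shell, so it is left out of this check.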

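Part03-Production Environment Implementation Plan

3.1 Boot Failure Troubleshooting Commands

# Boot failure troubleshooting commands

# 1. Enter rescue mode
# At the GRUB menu, press 'e' to edit the boot entry and append the following to the end of the linux line:
systemd.unit=rescue.target

# Alternatively, boot from the installation media and choose rescue mode

# 2. View the GRUB configuration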
# cat /boot/grub2/grub.cfg
#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub2-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#

### BEGIN /etc/grub.d/00_header ###
set pager=1

if [ -s $prefix/grubenv ]; then
load_env
fi
if [ "${next_entry}" ] ; then
set default="${next_entry}"
set next_entry=
save_env next_entry
set boot_once=true
else
set default="${saved_entry}"
fi

# 3. Check the kernel files
# ls -lh /boot/vmlinuz-*
-rwxr-xr-x. 1 root root 12M Apr 2 10:00 /boot/vmlinuz-5.14.0-123.el10.x86_64

# ls -lh /boot/initramfs-*
-rw-------. 1 root root 85M Apr 2 10:00 /boot/initramfs-5.14.0-123.el10.x86_64.img

# 4. Check the boot partition
# df -h /boot
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 1014M 285M 730M 29% /boot

# 5. Check the disk partition table
# fdisk -l /dev/sda
Disk /dev/sda: 100 GiB, 107374182400 bytes, 209715200 sectors
Disk model: VBOX HARDDISK
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 12345678-1234-1234-1234-123456789ABC

Device Start End Sectors Size Type
/dev/sda1 2048 2099199 2097152 1G EFI System
/dev/sda2 2099200 209715166 207615967 99G Linux LVM

# 6. Check LVM status
# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
root rhel -wi-ao---- 90.00g
swap rhel -wi-ao---- 9.00g

# 7. Regenerate the GRUB configuration
# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.14.0-123.el10.x86_64
Found initrd image: /boot/initramfs-5.14.0-123.el10.x86_64.img
done

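# 8. Reinstall the boot loader
# This example is a UEFI system (the disk has an EFI System partition). On current RHEL
# releases grub2-install is not supported on EFI platforms because it does not support
# UEFI Secure Boot; reinstall the boot loader packages instead:
# dnf reinstall grub2-efi-x64 shim-x64

# On BIOS systems only:
# grub2-install /dev/sda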

# 9. Check the log of the previous boot
# journalctl -b -1
-- Logs begin at Fri 2026-04-01 00:00:00 CST, end at Fri 2026-04-02 16:00:00 CST. --
Apr 01 00:00:00 gf-linux-server kernel: Linux version 5.14.0-123.el10.x86_64
Apr 01 00:00:00 gf-linux-server kernel: Command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.0-123.el10.x86_64
Apr 01 00:00:00 gf-linux-server kernel: Disabled fast string operations
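
Step 3 above checks the kernel and initramfs files by eye. The following is a minimal sketch of the same check as a function; the name `list_kernels_without_initramfs` is hypothetical, and it assumes the `vmlinuz-<version>` / `initramfs-<version>.img` naming shown in the listings above.

```shell
#!/bin/sh
# list_kernels_without_initramfs [BOOT_DIR]   (hypothetical helper name)
# Prints the version of every vmlinuz-<version> file in BOOT_DIR (default /boot)
# that has no non-empty initramfs-<version>.img next to it.
list_kernels_without_initramfs() {
    boot_dir=${1:-/boot}
    for kernel in "$boot_dir"/vmlinuz-*; do
        # If the glob matched nothing, the literal pattern is left; skip it.
        [ -e "$kernel" ] || continue
        # Strip the directory and the "vmlinuz-" prefix to get the version.
        version=${kernel##*/vmlinuz-}
        [ -s "$boot_dir/initramfs-$version.img" ] || echo "$version"
    done
    return 0
}
```

Empty output means every kernel has a matching initramfs image. From a rescue shell, the installed system's boot directory can be passed as the argument instead of the default.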

3.2 Media Error Troubleshooting Commands

# Media error troubleshooting commands

# 1. Verify the integrity of the ISO image
# sha256sum rhel-10.0-x86_64-dvd.iso
a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f2 rhel-10.0-x86_64-dvd.iso

# Compare with the officially published checksum
# cat rhel-10.0-sha256.txt
a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f2 rhel-10.0-x86_64-dvd.iso

# 2. Check the health of the USB drive
# Show the USB drive's device information
# lsblk | grep -i disk
sda 8:0 0 100G 0 disk
sdb 8:16 1 16G 0 disk

# Check the USB drive's SMART data (if supported)
# smartctl -a /dev/sdb
smartctl 7.2 2020-12-30 r5338 [x86_64-linux-5.14.0-123.el10.x86_64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Generic USB Flash Disk
Device Model: USB Flash Disk
Serial Number: ABCDEF123456
Firmware Version: 1.00
User Capacity: 16,013,898,752 bytes [16.0 GB]
Sector Size: 512 bytes logical/physical

# 3. Verify the bootable USB drive
# Mount the USB drive and inspect its contents
# mkdir -p /mnt/usb
# mount /dev/sdb1 /mnt/usb
# ls -lh /mnt/usb/
total 8.5G
dr-xr-xr-x. 4 root root 2.0K Apr 2 10:00 .
drwxr-xr-x. 3 root root 4.0K Apr 2 10:00 ..
-r--r--r--. 1 root root 84 Apr 2 10:00 .discinfo
-r--r--r--. 1 root root 2.0K Apr 2 10:00 .treeinfo
dr-xr-xr-x. 2 root root 2.0K Apr 2 10:00 EFI
-r--r--r--. 1 root root 8.5G Apr 2 10:00 images
dr-xr-xr-x. 3 root root 2.0K Apr 2 10:00 isolinux
-r--r--r--. 1 root root 3.0K Apr 2 10:00 LICENSE
-r--r--r--. 1 root root 1.5K Apr 2 10:00 media.repo

# Check key files
# md5sum /mnt/usb/isolinux/vmlinuz
1a2b3c4d5e6f7g8h9i0j1k2l3m4n5o6p /mnt/usb/isolinux/vmlinuz

# 4. Check optical media
# Mount the ISO image
# mkdir -p /mnt/iso
# mount -o loop rhel-10.0-x86_64-dvd.iso /mnt/iso
mount: /mnt/iso: WARNING: device write-protected, mounted read-only.

# Verify the ISO contents
# ls -lh /mnt/iso/
total 8.5G
dr-xr-xr-x. 4 root root 2.0K Jan 1 00:00 .
drwxr-xr-x. 3 root root 4.0K Apr 2 10:00 ..
-r--r--r--. 1 root root 84 Jan 1 00:00 .discinfo
-r--r--r--. 1 root root 2.0K Jan 1 00:00 .treeinfo
dr-xr-xr-x. 2 root root 2.0K Jan 1 00:00 EFI
-r--r--r--. 1 root root 8.5G Jan 1 00:00 images
dr-xr-xr-x. 3 root root 2.0K Jan 1 00:00 isolinux
-r--r--r--. 1 root root 3.0K Jan 1 00:00 LICENSE
-r--r--r--. 1 root root 1.5K Jan 1 00:00 media.repo

# 5. Check the network installation source
# Test the HTTP installation source
# curl -I http://mirror.fgedu.net.cn/rhel10/
HTTP/1.1 200 OK
Date: Fri, 02 Apr 2026 16:00:00 GMT
Server: Apache/2.4.51 (Red Hat Enterprise Linux)
Last-Modified: Mon, 01 Jan 2026 00:00:00 GMT
ETag: "123456-1234567890abcdef"
Accept-Ranges: bytes
Content-Length: 12345678
Content-Type: text/html;charset=UTF-8

# Test the NFS installation source
# showmount -e nfs-server.fgedu.net.cn
Export list for nfs-server.fgedu.net.cn:
/export/rhel10 *

# Test the FTP installation source
# curl -I ftp://ftp.fgedu.net.cn/rhel10/
Last-Modified: Mon, 01 Jan 2026 00:00:00 GMT
Content-Length: 12345678
Accept-Ranges: bytes
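
Steps 3 and 4 above inspect a mounted USB drive or ISO by listing its top level. The following is a minimal sketch that checks for a few of the entries seen in those listings; the function name `check_media_tree` is hypothetical, and the set of entries checked (`.treeinfo`, `.discinfo`, `images`) is an assumption taken from the listings above, not an official manifest.

```shell
#!/bin/sh
# check_media_tree MOUNT_POINT   (hypothetical helper name)
# Reports entries that are expected at the top level of mounted installation
# media but are missing. Returns 0 when all are present, 1 otherwise.
check_media_tree() {
    root=$1
    rc=0
    # Assumed minimal set, based on the directory listings in this section.
    for entry in .treeinfo .discinfo images; do
        if [ ! -e "$root/$entry" ]; then
            echo "missing: $entry"
            rc=1
        fi
    done
    return $rc
}
```

With the mount points used above, the calls would be `check_media_tree /mnt/usb` or `check_media_tree /mnt/iso`. A passing check only shows that the entries exist; it does not replace the checksum comparison in step 1.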

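3.3 Installation Log Analysis

# Installation log analysis commands

# 1. Anaconda installation log locations
# During installation:
/tmp/anaconda.log # main Anaconda log
/tmp/syslog # system log
/tmp/X.log # X server log
/tmp/program.log # log of other programs
/tmp/storage.log # storage log
/tmp/network.log # network log

# After installation:
/var/log/anaconda/ # installation log directory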
/var/log/anaconda/anaconda.log
/var/log/anaconda/syslog
/var/log/anaconda/X.log
/var/log/anaconda/program.log
/var/log/anaconda/storage.log
/var/log/anaconda/network.log

# 2. Search the main installation log for errors and warnings
# grep -iE 'error|warning' /var/log/anaconda/anaconda.log
16:00:05,456 ERROR anaconda: Failed to mount /dev/sdb1
16:00:10,567 WARNING anaconda: Network device eth0 not found

# 3. Search the storage log for failures
# grep -iE 'fail|error' /var/log/anaconda/storage.log
16:00:05,345 ERROR blivet: Failed to create partition on /dev/sda
16:00:05,456 DEBUG blivet: parted exception: Error informing the kernel about modifications

# 4. Search the network log for failures
# grep -iE 'error|fail' /var/log/anaconda/network.log
16:00:10,234 ERROR network: Failed to configure eth0
16:00:10,345 DEBUG network: Connection activation failed: No suitable device found

# 5. Analyze the system journal
# journalctl --directory=/var/log/journal -b -1 | grep -i anaconda
Apr 02 16:00:00 gf-linux-server anaconda[1234]: Anaconda 33.16.5.6-1.el10 started
Apr 02 16:00:01 gf-linux-server anaconda[1234]: Hardware detection started
Apr 02 16:00:05 gf-linux-server anaconda[1234]: Failed to mount /dev/sdb1
Apr 02 16:00:15 gf-linux-server anaconda[1234]: Installation completed successfully

# 6. Search the kernel messages for errors
# dmesg | grep -i error
[ 1.234567] ACPI: Error parsing PPTT
[ 5.456789] EXT4-fs (sda2): error count: 0

# 7. Create a log collection script
# cat > /fgedu/shell/collect-install-logs.sh << 'EOF'
#!/bin/bash
# Collect installation logs and basic system information into one archive.

LOG_DIR="/var/log/install-troubleshooting-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$LOG_DIR"

echo "Collecting installation logs into $LOG_DIR ..."

# Copy the Anaconda logs
if [ -d /var/log/anaconda ]; then
    cp -r /var/log/anaconda "$LOG_DIR/"
fi

# Collect system information
uname -a > "$LOG_DIR/system-info.txt"
cat /etc/redhat-release >> "$LOG_DIR/system-info.txt"
free -h >> "$LOG_DIR/system-info.txt"
df -h >> "$LOG_DIR/system-info.txt"
lspci > "$LOG_DIR/hardware-info.txt"
lsblk > "$LOG_DIR/disk-info.txt"

# Collect the kernel messages and the journal of the current boot
dmesg > "$LOG_DIR/dmesg.txt"
journalctl -b > "$LOG_DIR/journal.txt"

# Pack the logs (relative to /var/log) and remove the working directory
tar -czf "$LOG_DIR.tar.gz" -C "$(dirname "$LOG_DIR")" "$(basename "$LOG_DIR")"
rm -rf "$LOG_DIR"

echo "Logs collected: $LOG_DIR.tar.gz"
EOF

# chmod +x /fgedu/shell/collect-install-logs.sh
# /fgedu/shell/collect-install-logs.sh
Collecting installation logs into /var/log/install-troubleshooting-20260402-160000 ...
Logs collected: /var/log/install-troubleshooting-20260402-160000.tar.gz
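
When a log is long, counting entries per severity gives a quick overview before reading individual lines. The following is a minimal sketch; the function name `count_log_level` is hypothetical, and it assumes the line layout shown in the sample output of this section (timestamp, then level, then source), which may differ between Anaconda versions.

```shell
#!/bin/sh
# count_log_level LEVEL FILE   (hypothetical helper name)
# Counts lines whose second whitespace-separated field equals LEVEL,
# e.g. "16:00:05,456 ERROR anaconda: Failed to mount /dev/sdb1".
count_log_level() {
    level=$1
    file=$2
    # "n + 0" makes awk print 0 instead of an empty string when nothing matched.
    awk -v lvl="$level" '$2 == lvl { n++ } END { print n + 0 }' "$file"
}
```

For example, `count_log_level ERROR /var/log/anaconda/anaconda.log` prints the number of ERROR lines in the main log. Matching on the level field avoids counting lines that merely contain the word "error" in their message text.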

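Part04-Production Cases and Hands-On Walkthroughs

4.1 Case 1: Repairing a GRUB Boot Failure

# GRUB boot failure repair walkthrough

# Symptom:
# At boot the system stops at a "grub>_" prompt and cannot start

# Resolution steps:

# 1. Boot into rescue mode from the installation media
# Boot from the RHEL installation disc and choose "Troubleshooting" -> "Rescue a Red Hat Enterprise Linux system"
# Choose "1) Continue" to mount the system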

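# 2. View the current disk partitions
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 100G 0 disk
├─sda1 8:1 0 1G 0 part
└─sda2 8:2 0 99G 0 part
  ├─rhel-root 253:0 0 90G 0 lvm /mnt/sysroot
  └─rhel-swap 253:1 0 9G 0 lvm

# 3. Switch into the installed system
# Current RHEL releases mount the installed system under /mnt/sysroot in rescue mode
# (RHEL 7 and earlier used /mnt/sysimage); use the path that the rescue prompt reports
# chroot /mnt/sysroot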

# 4. Check the /boot partition
# df -h /boot
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 1014M 285M 730M 29% /boot

# ls -lh /boot/
total 285M
-rw-------. 1 root root 85M Apr 2 10:00 initramfs-5.14.0-123.el10.x86_64.img
drwx------. 2 root root 4.0K Apr 2 10:00 efi
drwxr-xr-x. 6 root root 4.0K Apr 2 10:00 grub2
-rw-------. 1 root root 12M Apr 2 10:00 vmlinuz-5.14.0-123.el10.x86_64

# 5. Check the GRUB configuration file
# ls -lh /boot/grub2/grub.cfg
-rw-r--r--. 1 root root 5.1K Apr 2 10:00 /boot/grub2/grub.cfg

# If the file is missing or corrupted, regenerate it
# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.14.0-123.el10.x86_64
Found initrd image: /boot/initramfs-5.14.0-123.el10.x86_64.img
done

# 6. Reinstall the boot loader
# On BIOS systems:
# grub2-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.

# On UEFI systems, do not run grub2-install: on current RHEL releases it is not supported
# on EFI platforms because it does not support UEFI Secure Boot. Reinstall the boot loader
# packages instead:
# dnf reinstall grub2-efi-x64 shim-x64

# 7. Verify the GRUB installation
# ls -lh /boot/grub2/
total 2.5M
drwxr-xr-x. 2 root root 4.0K Apr 2 10:00 fonts
-rw-r--r--. 1 root root 5.1K Apr 2 10:00 grub.cfg
-rw-r--r--. 1 root root 1.0K Apr 2 10:00 grubenv
drwxr-xr-x. 2 root root 4.0K Apr 2 10:00 i386-pc
drwxr-xr-x. 2 root root 4.0K Apr 2 10:00 locale

# 8. Exit the chroot and reboot
# exit
# reboot
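
The boot loader reinstall step in this case differs between BIOS and UEFI systems. A common way to tell them apart on Linux is that the kernel exposes the `/sys/firmware/efi` directory only when it was booted through UEFI. The following is a minimal sketch; the function name `firmware_mode` and its optional root argument are hypothetical and exist only to make the check easy to try out.

```shell
#!/bin/sh
# firmware_mode [ROOT]   (hypothetical helper name)
# Prints "UEFI" when ROOT/sys/firmware/efi exists, otherwise "BIOS".
# ROOT defaults to the empty string, so the real /sys is inspected.
firmware_mode() {
    root=${1:-}
    if [ -d "$root/sys/firmware/efi" ]; then
        echo UEFI
    else
        echo BIOS
    fi
}
```

Running `firmware_mode` in the rescue shell before reinstalling the boot loader shows which branch applies. The result reflects how the rescue environment itself was booted, so boot the rescue media in the same firmware mode as the installed system.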

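4.2 Case 2: Detecting a Corrupted ISO Image

# Detecting and fixing a corrupted ISO image

# Symptom:
# During installation the installer reports "The installation source is not valid"

# Troubleshooting steps:

# 1. Check the integrity of the ISO file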
# sha256sum rhel-10.0-x86_64-dvd.iso
a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f2 rhel-10.0-x86_64-dvd.iso

# Compare with the official checksum
# curl -s https://access.redhat.com/downloads/content/rhel-10.0-sha256.txt | grep dvd.iso
a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0u1v2w3x4y5z6a7b8c9d0e1f2 rhel-10.0-x86_64-dvd.iso

# 2. Mount the ISO and inspect its contents
# mkdir -p /mnt/iso
# mount -o loop rhel-10.0-x86_64-dvd.iso /mnt/iso
mount: /mnt/iso: WARNING: device write-protected, mounted read-only.

# Check key files
# ls -lh /mnt/iso/isolinux/
total 56M
-r--r--r--. 1 root root 2.0K Jan 1 00:00 boot.cat
-r--r--r--. 1 root root 84 Jan 1 00:00 boot.msg
-r--r--r--. 1 root root 12M Jan 1 00:00 initrd.img
-r--r--r--. 1 root root

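This article was compiled and published by Fengge Tutorials (风哥教程) for study and testing purposes only. Please credit the source when reposting: http://www.fgedu.net.cn/10327.html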
