
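Linux Tutorial FG219-Network Monitoring and Alerting Configuration

Summary: This Fengge tutorial draws on the official Linux, Red Hat Enterprise Linux, Ansible Automation Platform, Docker, Kubernetes, and Podman documentation to explain how to configure and use the tools covered here.

Fengge's note:

This document explains how to configure network monitoring and alerting on Linux.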

Part01-Network Monitoring Tools

1.1 Installing the monitoring tools

# Install common monitoring tools
$ sudo dnf install -y net-snmp net-snmp-utils
$ sudo dnf install -y nagios-plugins
$ sudo dnf install -y monitoring-plugins

# Install Prometheus Node Exporter
$ sudo dnf install -y prometheus-node-exporter
$ sudo systemctl enable --now prometheus-node-exporter

# Check the Node Exporter status
$ sudo systemctl status prometheus-node-exporter
● prometheus-node-exporter.service - Prometheus Node Exporter
Loaded: loaded (/usr/lib/systemd/system/prometheus-node-exporter.service; enabled; preset: disabled)
Active: active (running) since Thu 2026-04-03 21:25:00 CST; 10s ago
Main PID: 12345 (node_exporter)
Tasks: 4 (limit: 49152)
Memory: 5.2M
CPU: 15ms
CGroup: /system.slice/prometheus-node-exporter.service
└─12345 /usr/bin/node_exporter

# Query the Node Exporter metrics
$ curl http://localhost:9100/metrics | head -20
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.000123
go_gc_duration_seconds{quantile="0.25"} 0.000234
go_gc_duration_seconds{quantile="0.5"} 0.000345
go_gc_duration_seconds{quantile="0.75"} 0.000456
go_gc_duration_seconds{quantile="1"} 0.001234
go_gc_duration_seconds_sum 0.012345
go_gc_duration_seconds_count 10
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 8
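
The metrics endpoint returns plain text, so a single value can be pulled out with standard tools. The following is a minimal sketch, not part of Node Exporter: `metric_value` is a hypothetical helper name, and it only handles metrics without labels (such as `go_goroutines` above).

```shell
#!/bin/sh
# metric_value NAME: read Prometheus text-format metrics on stdin and
# print the value of the first unlabelled sample called NAME.
# (Hypothetical helper for illustration; labelled samples are skipped.)
metric_value() {
    awk -v m="$1" '$1 == m { print $2; exit }'
}

# Sample input copied from the output shown above.
sample='# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 8'

printf '%s\n' "$sample" | metric_value go_goroutines   # prints 8
```

Against a live exporter, the same helper could be fed from `curl -s http://localhost:9100/metrics`.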

Part02-SNMP Monitoring Configuration

2.1 Configuring the SNMP service

# Configure SNMP
$ sudo tee /etc/snmp/snmpd.conf << EOF
# Define community strings
rocommunity public 127.0.0.1
rocommunity public 192.168.1.0/24

# Define system information
sysLocation "Data Center, Room 1"
sysContact "admin@fgedu.net.cn"
sysName "rhel10.fgedu.net.cn"

# Define the OID view
view systemview included .1.3.6.1.2.1.1
view systemview included .1.3.6.1.2.1.25.1.1

# Define access permissions
access notConfigGroup "" any noauth exact systemview none none

# Disk monitoring
disk / 10000
disk /var 5000

# Process monitoring
proc httpd 10 1
proc sshd 10 1

# CPU load monitoring
load 12 10 5
EOF

# Start the SNMP service
$ sudo systemctl enable --now snmpd

# Test SNMP
$ snmpwalk -v 2c -c public localhost system
SNMPv2-MIB::sysDescr.0 = STRING: Linux rhel10.fgedu.net.cn 5.14.0-284.11.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 3 21:25:00 CST 2026 x86_64
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (123456) 0:20:34.56
SNMPv2-MIB::sysContact.0 = STRING: admin@fgedu.net.cn
SNMPv2-MIB::sysName.0 = STRING: rhel10.fgedu.net.cn
SNMPv2-MIB::sysLocation.0 = STRING: Data Center, Room 1

# View disk information
$ snmpwalk -v 2c -c public localhost dskTable
UCD-SNMP-MIB::dskIndex.1 = INTEGER: 1
UCD-SNMP-MIB::dskPath.1 = STRING: /
UCD-SNMP-MIB::dskDevice.1 = STRING: /dev/mapper/rhel-root
UCD-SNMP-MIB::dskTotal.1 = INTEGER: 52428800
UCD-SNMP-MIB::dskAvail.1 = INTEGER: 10485760
UCD-SNMP-MIB::dskUsed.1 = INTEGER: 41943040
UCD-SNMP-MIB::dskPercent.1 = INTEGER: 80
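
The `dskTable` output can feed a simple threshold check. The sketch below is only an illustration: `disks_over_threshold` is a hypothetical helper, the 75% limit is an assumed value, and the parsing assumes the `snmpwalk` output layout shown above with mount paths that contain no spaces.

```shell
#!/bin/sh
# disks_over_threshold LIMIT: read `snmpwalk ... dskTable` output on stdin
# and print "<path> <percent>%" for every disk whose dskPercent >= LIMIT.
# (Hypothetical helper; assumes mount paths without spaces.)
disks_over_threshold() {
    awk -v limit="$1" '
        /dskPath\./    { idx = $1; sub(/^.*\./, "", idx); path[idx] = $NF }
        /dskPercent\./ { idx = $1; sub(/^.*\./, "", idx); pct[idx] = $NF + 0 }
        END { for (i in pct) if (pct[i] >= limit) print path[i], pct[i] "%" }
    '
}

# Sample input copied from the snmpwalk output shown above.
sample='UCD-SNMP-MIB::dskIndex.1 = INTEGER: 1
UCD-SNMP-MIB::dskPath.1 = STRING: /
UCD-SNMP-MIB::dskPercent.1 = INTEGER: 80'

printf '%s\n' "$sample" | disks_over_threshold 75   # prints "/ 80%"
```

In practice the input would come from `snmpwalk -v 2c -c public localhost dskTable`.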

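Part03-Network Traffic Monitoring

3.1 Configuring traffic monitoring

# Install vnstat
$ sudo dnf install -y vnstat

# Initialize the database
$ sudo vnstat --add -i eth0

# Start the vnstat service
$ sudo systemctl enable --now vnstat

# View traffic statistics
$ vnstat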
Database updated: 2026-04-03 21:30:00

eth0 since 2026-04-01
rx: 10.00 GiB tx: 5.00 GiB total: 15.00 GiB

monthly
                rx      |     tx      |    total    |   avg. rate
------------------------+-------------+-------------+---------------
  2026-04     10.00 GiB |    5.00 GiB |   15.00 GiB |    5.00 Mbit/s
------------------------+-------------+-------------+---------------
estimated     15.00 GiB |    7.50 GiB |   22.50 GiB |

daily
                rx      |     tx      |    total    |   avg. rate
------------------------+-------------+-------------+---------------
yesterday      5.00 GiB |    2.50 GiB |    7.50 GiB |   10.00 Mbit/s
    today      5.00 GiB |    2.50 GiB |    7.50 GiB |   10.00 Mbit/s
------------------------+-------------+-------------+---------------
estimated      7.50 GiB |    3.75 GiB |   11.25 GiB |

# Monitor live traffic with vnstat
$ vnstat -l
Monitoring eth0... (press CTRL-C to stop)

rx: 1.50 Mbit/s 250 p/s tx: 0.75 Mbit/s 125 p/s

# Install iftop
$ sudo dnf install -y iftop

# Monitor live traffic per connection with iftop
$ sudo iftop -i eth0
interface: eth0
IP address is: 192.168.1.100
MAC address is: 08:00:27:12:34:56
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

# Install nload
$ sudo dnf install -y nload

# View network traffic
$ nload eth0
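
The tools above report rates such as Mbit/s, while the kernel only exposes cumulative byte counters. As a rough sketch of the conversion, two counter readings taken a known number of seconds apart give an average rate. `rate_mbit` and `read_rx_bytes` are hypothetical helper names, and the result uses decimal megabits (1 Mbit = 1,000,000 bits), which may differ from how a given tool rounds or scales its display.

```shell
#!/bin/sh
# rate_mbit PREV_BYTES CUR_BYTES SECONDS: average rate in Mbit/s between
# two cumulative byte-counter readings. (Hypothetical helper.)
rate_mbit() {
    awk -v p="$1" -v c="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", (c - p) * 8 / s / 1000000 }'
}

# read_rx_bytes IFACE: cumulative received bytes from sysfs.
# (Hypothetical helper; not called here so the sketch runs on any host.)
read_rx_bytes() {
    cat "/sys/class/net/$1/statistics/rx_bytes"
}

# 1,250,000 bytes received in 1 second is 10 Mbit/s.
rate_mbit 0 1250000 1   # prints 10.00

# Possible live usage on a host with eth0:
#   prev=$(read_rx_bytes eth0); sleep 5; cur=$(read_rx_bytes eth0)
#   rate_mbit "$prev" "$cur" 5
```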

Part04-Alert Configuration

4.1 Configuring email alerts

# Install the mail service
$ sudo dnf install -y postfix mailx

# Configure Postfix
$ sudo tee /etc/postfix/main.cf << EOF
myhostname = rhel10.fgedu.net.cn
mydomain = fgedu.net.cn
myorigin = \$mydomain
inet_interfaces = localhost
inet_protocols = ipv4
mydestination = \$myhostname, localhost.\$mydomain, localhost, \$mydomain
relayhost = [smtp.fgedu.net.cn]:587
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = encrypt
EOF

# Configure SMTP authentication
$ sudo tee /etc/postfix/sasl_passwd << EOF
[smtp.fgedu.net.cn]:587 user@fgedu.net.cn:password
EOF
$ sudo postmap /etc/postfix/sasl_passwd
$ sudo chmod 600 /etc/postfix/sasl_passwd*

# Start Postfix
$ sudo systemctl enable --now postfix

# Send a test email
$ echo "Test email" | mail -s "Test Subject" admin@fgedu.net.cn

# Create the network monitoring script
$ sudo tee /usr/local/bin/network-monitor.sh > /dev/null << 'EOF'
#!/bin/bash
ALERT_EMAIL="admin@fgedu.net.cn"
PING_TARGET="8.8.8.8"
LOG_FILE="/var/log/network-monitor.log"

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

send_alert() {
    echo "$1" | mail -s "Network Alert: $2" "$ALERT_EMAIL"
}

check_ping() {
    if ! ping -c 3 "$PING_TARGET" > /dev/null 2>&1; then
        log "ERROR: Cannot ping $PING_TARGET"
        send_alert "Cannot ping $PING_TARGET" "Ping Failed"
    fi
}

check_dns() {
    if ! nslookup www.google.com > /dev/null 2>&1; then
        log "ERROR: DNS resolution failed"
        send_alert "DNS resolution failed" "DNS Failed"
    fi
}

check_bandwidth() {
    # /proc/net/dev holds cumulative byte counters, so sample twice,
    # one second apart, to get bytes per second.
    RX1=$(awk '/eth0:/ {print $2}' /proc/net/dev)
    TX1=$(awk '/eth0:/ {print $10}' /proc/net/dev)
    sleep 1
    RX2=$(awk '/eth0:/ {print $2}' /proc/net/dev)
    TX2=$(awk '/eth0:/ {print $10}' /proc/net/dev)
    RX_RATE=$((RX2 - RX1))
    TX_RATE=$((TX2 - TX1))

    if [ "$RX_RATE" -gt 1000000000 ] || [ "$TX_RATE" -gt 1000000000 ]; then
        log "WARNING: High network traffic detected"
        send_alert "High network traffic: RX=$RX_RATE B/s, TX=$TX_RATE B/s" "High Traffic"
    fi
}

log "Starting network monitor"
check_ping
check_dns
check_bandwidth
log "Network monitor completed"
EOF

$ sudo chmod +x /usr/local/bin/network-monitor.sh

# Configure the scheduled job
$ sudo tee /etc/cron.d/network-monitor << EOF
*/5 * * * * root /usr/local/bin/network-monitor.sh
EOF
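
Because the job runs every five minutes, a long outage would send the same alert email on every run. One possible way to limit this is a per-alert cooldown based on timestamp files. This is a sketch under stated assumptions, not part of the original script: `should_alert`, the `STATE_DIR` location, and the one-hour cooldown are all hypothetical choices.

```shell
#!/bin/sh
# Directory for per-alert timestamp files (hypothetical location).
STATE_DIR="${STATE_DIR:-/var/tmp/network-monitor}"

# should_alert KEY COOLDOWN_SECONDS: return 0 if no alert for KEY was
# recorded within the last COOLDOWN_SECONDS and record the current time;
# return 1 otherwise. (Hypothetical helper.)
should_alert() {
    key="$1"
    cooldown="$2"
    stamp="$STATE_DIR/$key.last"
    now=$(date +%s)
    mkdir -p "$STATE_DIR"
    if [ -f "$stamp" ]; then
        last=$(cat "$stamp")
        if [ $((now - last)) -lt "$cooldown" ]; then
            return 1
        fi
    fi
    echo "$now" > "$stamp"
    return 0
}

# Possible use inside the monitoring script:
#   if should_alert ping 3600; then
#       send_alert "Cannot ping $PING_TARGET" "Ping Failed"
#   fi
```

With this guard, a failing check would still be logged on every run but would send mail at most once per hour.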

Part05-Monitoring Dashboard

5.1 Configuring the monitoring dashboard

# Install Grafana
$ sudo tee /etc/yum.repos.d/grafana.repo << EOF
[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF
$ sudo dnf install -y grafana

# Start Grafana
$ sudo systemctl enable --now grafana-server

# Check the Grafana status
$ sudo systemctl status grafana-server
● grafana-server.service - Grafana instance
Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; enabled; preset: disabled)
Active: active (running) since Thu 2026-04-03 21:35:00 CST; 10s ago
Main PID: 12346 (grafana-server)
Tasks: 10 (limit: 49152)
Memory: 50.5M
CPU: 500ms
CGroup: /system.slice/grafana-server.service
└─12346 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini

# Access Grafana
# Open http://localhost:3000 in a browser
# Default username: admin
# Default password: admin

# Create the statistics script
$ sudo tee /usr/local/bin/network-stats.sh > /dev/null << 'EOF'
#!/bin/bash
echo "=== Network Statistics ==="
echo "Date: $(date)"
echo ""
echo "=== Interface Statistics ==="
ip -s link show eth0
echo ""
echo "=== Connection Statistics ==="
ss -s
echo ""
echo "=== Traffic Statistics ==="
vnstat --oneline
echo ""
echo "=== Active Connections ==="
ss -tunap | head -20
echo ""
echo "=== Network Errors ==="
grep eth0 /proc/net/dev
echo ""
EOF
$ sudo chmod +x /usr/local/bin/network-stats.sh

# Run the statistics script
$ sudo /usr/local/bin/network-stats.sh
=== Network Statistics ===
Date: Thu Apr 3 21:35:30 CST 2026

=== Interface Statistics ===
2: eth0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:27:12:34:56 brd ff:ff:ff:ff:ff:ff
    RX: bytes packets errors dropped overrun mcast
    10.0M 10000 0 0 0 0
    TX: bytes packets errors dropped carrier collsns
    5.0M 5000 0 0 0 0
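
The "Network Errors" section of the script prints the raw `/proc/net/dev` line, which is hard to read at a glance. As an illustrative sketch, the error and drop counters can be picked out by column. `iface_errors` is a hypothetical helper, and the column positions assume the usual `/proc/net/dev` layout (eight receive columns followed by eight transmit columns).

```shell
#!/bin/sh
# iface_errors IFACE: read /proc/net/dev-format text on stdin and print
# "rx_errs rx_drop tx_errs tx_drop" for IFACE. (Hypothetical helper.)
iface_errors() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }                    # split "eth0:" from the counters
        $1 == ifc { print $4, $5, $12, $13 }
    '
}

# Illustrative sample in /proc/net/dev format (values are made up).
sample='Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 10485760   10000    0    0    0     0          0         0  5242880    5000    0    0    0     0       0          0'

printf '%s\n' "$sample" | iface_errors eth0   # prints "0 0 0 0"
```

On a live system the input would be `/proc/net/dev` itself, for example `iface_errors eth0 < /proc/net/dev`.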
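
Fengge's monitoring recommendations:
1. Deploy a complete monitoring system
2. Configure alerts on key metrics
3. Review monitoring data regularly
4. Establish a monitoring baseline
5. Define an incident response process

This article was compiled and published by Fengge Tutorials for learning and testing purposes only. When reposting, please credit the source: http://www.fgedu.net.cn/10327.html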
