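1. Introduction to Prometheus
Prometheus is an open-source monitoring system and time-series database hosted by the CNCF and written in Go. It offers a multi-dimensional data model, a flexible query language (PromQL), no dependence on distributed storage, pull-based collection of time series over HTTP, and support for several graphing and dashboard tools. It is widely used for system monitoring, application monitoring, and alerting.
Beyond the data model, PromQL, and pull-based scraping, its main features include standalone deployment of each server, service discovery, alert management, and efficient local storage.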
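2. Prometheus Versions
Prometheus is available in several release series; choose one according to your needs:
Current releases
Prometheus 3.11.0: latest release, published 2026-04-02
Prometheus 3.5.1: LTS (long-term support) release, published 2026-01-07
Component versions
Alertmanager 0.31.1: alert handling and notification
Node Exporter 1.10.2: host metrics collection
Blackbox Exporter 0.28.0: black-box probing
MySQL Exporter 0.19.0: MySQL monitoring
Supported platforms
Linux: AMD64, ARM64
macOS: AMD64, ARM64 (Apple Silicon)
Windows: AMD64
Docker: official images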
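3. Official Download
Prometheus is fully open source and free, and can be downloaded directly from the official site.
Official download locations
Prometheus website: https://prometheus.io/
Download page: https://prometheus.io/download/
GitHub repository: https://github.com/prometheus/prometheus
Download with wget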
$ wget https://github.com/prometheus/prometheus/releases/download/v3.11.0/prometheus-3.11.0.linux-amd64.tar.gz
# Sample output:
--2026-04-04 10:15:00-- https://github.com/prometheus/prometheus/releases/download/v3.11.0/prometheus-3.11.0.linux-amd64.tar.gz
Resolving github.com... 140.82.121.4
Connecting to github.com|140.82.121.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 143210000 (137M) [application/octet-stream]
Saving to: ‘prometheus-3.11.0.linux-amd64.tar.gz’
prometheus-3.11.0.linux-amd64.tar.gz 100%[===========================================>] 136.58M 25.5MB/s in 5s
# Verify the downloaded file; compare the result with the SHA256 checksum listed on the download page
$ sha256sum prometheus-3.11.0.linux-amd64.tar.gz
# Sample output:
ff799c3e4c318e17dec14aaaa406a4da328fabb4578336b36d96d893870c3b76 prometheus-3.11.0.linux-amd64.tar.gz
# Extract the archive
$ tar -xzf prometheus-3.11.0.linux-amd64.tar.gz
# List the extracted files:
$ ls prometheus-3.11.0.linux-amd64/
console_libraries consoles LICENSE NOTICE prometheus prometheus.yml promtool
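The hash printed by sha256sum is only useful when it is compared against a trusted value. A minimal sketch of automating that comparison follows. It assumes a checksum file in the usual sha256sum format (one "<hash>  <filename>" pair per line), such as the sha256sums.txt file attached to Prometheus GitHub releases. The function name verify_sha256 is hypothetical.

```shell
#!/usr/bin/env bash
# verify_sha256 FILE SUMS_FILE
# Returns 0 when FILE's SHA-256 hash matches the entry for FILE in SUMS_FILE,
# 1 on a mismatch, and 2 when SUMS_FILE has no entry for FILE.
# verify_sha256 is an illustrative helper, not part of Prometheus.
verify_sha256() {
    local file="$1" sums="$2" expected actual
    # Pick the hash recorded for this file name (second column of the sums file).
    expected=$(awk -v f="$(basename "$file")" \
        '{ n = $2; sub(/^\*/, "", n); if (n == f) { print $1; exit } }' "$sums")
    if [ -z "$expected" ]; then
        echo "no checksum entry for $file" >&2
        return 2
    fi
    actual=$(sha256sum "$file" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
}

# Usage (assumes both files are in the current directory):
# verify_sha256 prometheus-3.11.0.linux-amd64.tar.gz sha256sums.txt && echo "checksum OK"
```

Returning distinct codes for "mismatch" and "no entry" makes it easier to tell a corrupted download from a wrong checksum file.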
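4. Installing with Docker
Docker is one of the simplest ways to deploy Prometheus.
$ docker pull prom/prometheus:v3.11.0
# Sample output:
v3.11.0: Pulling from prom/prometheus
Digest: sha256:abc123def456…
Status: Downloaded newer image for prom/prometheus:v3.11.0
docker.io/prom/prometheus:v3.11.0
# Create the data and configuration directories
$ mkdir -p /fgeudb/prometheus/data /fgeudb/prometheus/config
# The official image runs as the unprivileged user nobody (UID 65534), so the data directory must be writable by that user
$ sudo chown -R 65534:65534 /fgeudb/prometheus/data
# Create the configuration file
$ cat > /fgeudb/prometheus/config/prometheus.yml << EOF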
global:
  scrape_interval: 15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 192.168.1.51:9093
rule_files:
  - /etc/prometheus/rules/*.yml
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    static_configs:
      - targets: ['192.168.1.51:9100']
EOF
# Start the Prometheus container
$ docker run -d --name prometheus \
    -p 9090:9090 \
    -v /fgeudb/prometheus/config:/etc/prometheus \
    -v /fgeudb/prometheus/data:/prometheus \
    prom/prometheus:v3.11.0
# Sample output:
abc123def456789...
# Check the container status
$ docker ps
# Sample output:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
abc123def456 prom/prometheus:v3.11.0 "/bin/prometheus --c…" 5 seconds ago Up 4 seconds 0.0.0.0:9090->9090/tcp prometheus
# View the logs
$ docker logs prometheus
# Sample output:
ts=2026-04-04T10:30:00.000Z caller=main.go:538 level=info msg="Starting Prometheus Server" mode=server version="(version=3.11.0, branch=HEAD, revision=abc123)"
ts=2026-04-04T10:30:00.100Z caller=main.go:553 level=info msg="Build context" build_context="(go=go1.22.0, platform=linux/amd64, user=root@abc123, date=20260404-00:00:00)"
ts=2026-04-04T10:30:00.200Z caller=main.go:554 level=info msg="Host information" host_info="(Linux 5.15.0-91-generic #101-Ubuntu SMP x86_64)"
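A running container does not necessarily mean the server is ready to answer queries. Prometheus exposes the management endpoints /-/healthy and /-/ready for this purpose. The sketch below polls the readiness endpoint until it succeeds. The function name wait_for_prometheus, the default of 30 attempts, and the one-second delay are illustrative assumptions; the base URL in the usage line matches the port mapping used above.

```shell
#!/usr/bin/env bash
# wait_for_prometheus BASE_URL [ATTEMPTS]
# Polls BASE_URL/-/ready once per second until it returns a successful
# HTTP status or ATTEMPTS (default 30) polls have failed.
# Illustrative helper only, not part of Prometheus.
wait_for_prometheus() {
    local base_url="$1" attempts="${2:-30}" i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl exit non-zero on HTTP errors such as 503 (not ready yet).
        if curl -fsS -o /dev/null "$base_url/-/ready"; then
            echo "Prometheus is ready after $i attempt(s)"
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    echo "Prometheus not ready after $attempts attempts" >&2
    return 1
}

# Usage:
# wait_for_prometheus http://localhost:9090 60
```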
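5. Installation Media
Prometheus can be installed from several kinds of media; choose according to your needs.
Package types
TAR.GZ archive: generic Linux package from the official download page
RPM package: for RHEL/CentOS (typically provided by the distribution or a third-party repository)
DEB package: for Ubuntu/Debian (typically provided by the distribution)
Docker image: works across platforms
Binary installation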
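$ tar -xzf prometheus-3.11.0.linux-amd64.tar.gz
$ cd prometheus-3.11.0.linux-amd64
# Create the user and directories (from here on, commands shown after a "# " prompt are run as root)
# useradd -r -s /bin/false prometheus
# mkdir -p /fgeudb/prometheus/data /fgeudb/prometheus/config /etc/prometheus
# Copy the files
# cp prometheus promtool /usr/local/bin/
# cp -r consoles console_libraries /etc/prometheus/
# cp prometheus.yml /fgeudb/prometheus/config/
# Set ownership
# chown -R prometheus:prometheus /fgeudb/prometheus
# Create the systemd service (--web.enable-admin-api exposes endpoints that can delete data; enable it only when access to port 9090 is restricted)
# cat > /etc/systemd/system/prometheus.service << EOF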
[Unit]
Description=Prometheus Server
After=network.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/fgeudb/prometheus/config/prometheus.yml \
    --storage.tsdb.path=/fgeudb/prometheus/data \
    --storage.tsdb.retention.time=30d \
    --web.listen-address=0.0.0.0:9090 \
    --web.enable-admin-api
[Install]
WantedBy=multi-user.target
EOF
# Start the service and enable it at boot
# systemctl daemon-reload
# systemctl start prometheus
# systemctl enable prometheus
# Sample output:
Created symlink /etc/systemd/system/multi-user.target.wants/prometheus.service → /etc/systemd/system/prometheus.service.
# Check the service status
# systemctl status prometheus
# Sample output:
● prometheus.service - Prometheus Server
     Loaded: loaded (/etc/systemd/system/prometheus.service; enabled)
     Active: active (running) since Fri 2026-04-04 10:30:00 CST; 5s ago
   Main PID: 12345 (prometheus)
      Tasks: 8 (limit: 4915)
     Memory: 150.0M
     CGroup: /system.slice/prometheus.service
             └─12345 /usr/local/bin/prometheus --config.file=/fgeudb/prometheus/config/prometheus.yml
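6. System Configuration
After installation, Prometheus needs some basic configuration. The most common settings are shown below.
Configuration file
$ vi /fgeudb/prometheus/config/prometheus.yml
# Complete configuration example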
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'fgedu-monitor'
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 192.168.1.51:9093
rule_files:
  - /fgeudb/prometheus/rules/*.yml
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: 'prometheus-fgedu'
  - job_name: 'node-exporter'
    static_configs:
      - targets:
          - '192.168.1.51:9100'
          - '192.168.1.52:9100'
          - '192.168.1.53:9100'
        labels:
          env: 'production'
  - job_name: 'mysql'
    static_configs:
      - targets: ['192.168.1.51:9104']
        labels:
          instance: 'mysql-master'
  - job_name: 'nginx'
    static_configs:
      - targets: ['192.168.1.51:9113']
# Validate the configuration file
$ promtool check config /fgeudb/prometheus/config/prometheus.yml
# Sample output:
Checking /fgeudb/prometheus/config/prometheus.yml
SUCCESS: 0 potential warnings or errors found.
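Hand-editing long targets lists, as in the node-exporter job above, is error-prone because YAML indentation matters. The sketch below prints a static_configs entry from a port and a list of hosts so that it can be pasted under a job. The function name gen_static_config and the fixed indentation are illustrative choices; this is not a Prometheus tool.

```shell
#!/usr/bin/env bash
# gen_static_config PORT HOST...
# Prints one static_configs entry listing HOST:PORT for every host,
# indented to sit under the "static_configs:" key of a scrape job.
gen_static_config() {
    local port="$1" host
    shift
    printf '      - targets:\n'
    for host in "$@"; do
        printf "          - '%s:%s'\n" "$host" "$port"
    done
}

# Example: the three Node Exporter hosts used in this article.
gen_static_config 9100 192.168.1.51 192.168.1.52 192.168.1.53
```

After pasting the output into prometheus.yml, running promtool check config again confirms that the file still parses.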
Alert rule configuration
$ mkdir -p /fgeudb/prometheus/rules
# Create the alert rules file
$ cat > /fgeudb/prometheus/rules/alerts.yml << EOF
groups:
  - name: node_alerts
    rules:
      - alert: NodeDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ \$labels.instance }} is down"
          description: "Node {{ \$labels.instance }} has been down for more than 1 minute."
      - alert: HighCPU
        expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ \$labels.instance }}"
          description: "CPU usage is above 80% for more than 5 minutes."
      - alert: HighMemory
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ \$labels.instance }}"
          description: "Memory usage is above 85% for more than 5 minutes."
      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"}) * 100 < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on {{ \$labels.instance }}"
          description: "Disk {{ \$labels.mountpoint }} has less than 10% space remaining."
EOF
# Validate the alert rules
$ promtool check rules /fgeudb/prometheus/rules/alerts.yml
# Sample output:
Checking /fgeudb/prometheus/rules/alerts.yml
SUCCESS: 4 rules found.
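The HighCPU expression is easier to reason about with numbers. irate() takes the last two samples of the idle counter in the window and divides their difference by the elapsed time, which gives the fraction of each second a CPU spent idle. The rule averages that value per instance, converts it to a percentage, and subtracts it from 100 to get CPU usage. A rough sketch of the same arithmetic for a single CPU follows; the function name cpu_usage_pct and the sample values are made up for illustration.

```shell
#!/usr/bin/env bash
# cpu_usage_pct IDLE_PREV IDLE_LAST SECONDS_BETWEEN
# IDLE_PREV and IDLE_LAST are two consecutive samples of
# node_cpu_seconds_total{mode="idle"} for one CPU.
cpu_usage_pct() {
    awk -v prev="$1" -v last="$2" -v dt="$3" 'BEGIN {
        idle_rate = (last - prev) / dt        # idle seconds per second (0..1)
        printf "%.1f\n", 100 - idle_rate * 100
    }'
}

# The idle counter grew by 1.5 s over a 15 s scrape interval:
# 10% idle, so usage is 90%, which is above the 80% threshold.
cpu_usage_pct 1000 1001.5 15    # prints 90.0
```

Because of the "for: 5m" clause, the condition has to hold at every rule evaluation for five minutes before the alert fires.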
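7. Production Recommendations
When running Prometheus in production, consider the following:
Storage configuration
# Set the data retention time
--storage.tsdb.retention.time=30d
# Set the storage size limit
--storage.tsdb.retention.size=50GB
# Check the storage status
$ curl http://localhost:9090/api/v1/status/tsdb
# Sample output: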
{
  "status": "success",
  "data": {
    "headStats": {
      "numSeries": 12345,
      "numLabelValues": 67890,
      "chunkCount": 123456,
      "minTime": 1712214400000,
      "maxTime": 1712300800000,
      "numChunks": 123456,
      "numSamples": 1234567
    },
    "seriesCountByMetricName": [
      {
        "name": "node_cpu_seconds_total",
        "value": 1000
      }
    ]
  }
}
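The two retention flags interact with ingestion volume. The Prometheus storage documentation gives a rough capacity formula: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample after compression. A sketch of that estimate follows; the function name estimate_tsdb_bytes and the ingestion rate of 10,000 samples per second are assumptions chosen for illustration, not measurements.

```shell
#!/usr/bin/env bash
# estimate_tsdb_bytes RETENTION_DAYS SAMPLES_PER_SECOND BYTES_PER_SAMPLE
# Rough disk estimate: retention seconds * samples/s * bytes/sample.
# All arguments must be whole numbers (shell integer arithmetic).
estimate_tsdb_bytes() {
    local days="$1" samples_per_sec="$2" bytes_per_sample="$3"
    echo $(( days * 86400 * samples_per_sec * bytes_per_sample ))
}

# 30 days of retention at an assumed 10,000 samples/s and 2 bytes/sample:
estimate_tsdb_bytes 30 10000 2    # prints 51840000000
```

Comparing the estimate with the --storage.tsdb.retention.size value indicates which limit is likely to be reached first; Prometheus removes the oldest data when either limit is hit.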
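High-availability configuration
# Instance 1 (both instances use the same scrape configuration; on one host they need distinct ports and data directories)
--web.listen-address=0.0.0.0:9090
--storage.tsdb.path=/fgeudb/prometheus/data1
# Instance 2
--web.listen-address=0.0.0.0:9091
--storage.tsdb.path=/fgeudb/prometheus/data2
# Use Thanos for long-term storage and high availability
# (--network host lets the sidecar reach Prometheus on localhost:9090; /fgeudb/thanos is an assumed host directory containing bucket.yml)
$ docker run -d --name thanos-sidecar \
    --network host \
    -v /fgeudb/prometheus/data:/prometheus \
    -v /fgeudb/thanos:/etc/thanos \
    thanosio/thanos:v0.35.0 \
    sidecar \
    --tsdb.path=/prometheus \
    --prometheus.url=http://localhost:9090 \
    --objstore.config-file=/etc/thanos/bucket.yml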
8. Recommended Exporter Components
The Prometheus ecosystem provides many exporters for collecting metrics from different systems:
Node Exporter (system monitoring)
$ wget https://github.com/prometheus/node_exporter/releases/download/v1.10.2/node_exporter-1.10.2.linux-amd64.tar.gz
# Sample output:
--2026-04-04 10:15:00-- https://github.com/prometheus/node_exporter/releases/download/v1.10.2/node_exporter-1.10.2.linux-amd64.tar.gz
Resolving github.com... 140.82.121.4
Connecting to github.com|140.82.121.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15400000 (15M) [application/octet-stream]
Saving to: ‘node_exporter-1.10.2.linux-amd64.tar.gz’
node_exporter-1.10.2.linux-amd64.tar.gz 100%[===========================================>] 14.69M 10.0MB/s in 1.5s
# Extract and install
$ tar -xzf node_exporter-1.10.2.linux-amd64.tar.gz
# cp node_exporter-1.10.2.linux-amd64/node_exporter /usr/local/bin/
# Create the systemd service
# cat > /etc/systemd/system/node_exporter.service << EOF
[Unit]
Description=Node Exporter
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
EOF
# Reload systemd so it picks up the new unit, then start and enable the service
# systemctl daemon-reload
# systemctl start node_exporter
# systemctl enable node_exporter
# Sample output:
Created symlink /etc/systemd/system/multi-user.target.wants/node_exporter.service → /etc/systemd/system/node_exporter.service.
# Verify that it is running
$ curl http://localhost:9100/metrics | head -20
# Sample output:
# HELP node_cpu_seconds_total Seconds the cpu spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234567.89
node_cpu_seconds_total{cpu="0",mode="iowait"} 12345.67
node_cpu_seconds_total{cpu="0",mode="irq"} 123.45
node_cpu_seconds_total{cpu="0",mode="nice"} 12.34
node_cpu_seconds_total{cpu="0",mode="softirq"} 123.45
node_cpu_seconds_total{cpu="0",mode="steal"} 1.23
node_cpu_seconds_total{cpu="0",mode="system"} 12345.67
node_cpu_seconds_total{cpu="0",mode="user"} 123456.78
Alertmanager (alert management)
$ wget https://github.com/prometheus/alertmanager/releases/download/v0.31.1/alertmanager-0.31.1.linux-amd64.tar.gz
# Extract and install
$ tar -xzf alertmanager-0.31.1.linux-amd64.tar.gz
# cp alertmanager-0.31.1.linux-amd64/alertmanager /usr/local/bin/
# Create the configuration and data directories, then the configuration file
# mkdir -p /fgeudb/alertmanager/data
# cat > /fgeudb/alertmanager/alertmanager.yml << EOF
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@fgedu.net.cn'
  smtp_auth_username: 'alertmanager@fgedu.net.cn'
  smtp_auth_password: 'yourpassword'
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'email-notifications'
receivers:
  - name: 'email-notifications'
    email_configs:
      - to: 'admin@fgedu.net.cn'
        send_resolved: true
EOF
# Start Alertmanager
# alertmanager --config.file=/fgeudb/alertmanager/alertmanager.yml --storage.path=/fgeudb/alertmanager/data
# Sample output:
ts=2026-04-04T10:30:00.000Z caller=main.go:240 level=info msg="Starting Alertmanager" version="(version=0.31.1, branch=HEAD, revision=abc123)"
ts=2026-04-04T10:30:00.100Z caller=main.go:241 level=info msg="Build context" build_context="(go=go1.22.0, platform=linux/amd64, user=root@abc123)"
ts=2026-04-04T10:30:00.200Z caller=cluster.go:170 level=info msg="setting advertise address explicitly" local_addr=192.168.1.51
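To confirm that routing and email delivery work end to end, a test alert can be posted to Alertmanager's v2 HTTP API (POST /api/v2/alerts). The sketch below only builds the JSON body. The function name build_test_alert, the alert name TestAlert, and the label values are hypothetical; the host and port in the commented curl command are the Alertmanager address used earlier in this article.

```shell
#!/usr/bin/env bash
# build_test_alert ALERTNAME SEVERITY INSTANCE
# Prints a JSON array containing one alert, the body format accepted by
# POST /api/v2/alerts. Arguments must not contain double quotes or
# backslashes, because no JSON escaping is performed.
build_test_alert() {
    printf '[{"labels":{"alertname":"%s","severity":"%s","instance":"%s"},"annotations":{"summary":"Test alert from %s"}}]\n' \
        "$1" "$2" "$3" "$3"
}

build_test_alert TestAlert warning 192.168.1.51:9100

# To send it (requires a running Alertmanager):
# curl -fsS -X POST -H 'Content-Type: application/json' \
#     -d "$(build_test_alert TestAlert warning 192.168.1.51:9100)" \
#     http://192.168.1.51:9093/api/v2/alerts
```

With the group_wait: 30s setting above, the first notification for a new alert group should be sent about 30 seconds after the alert is accepted.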
This article was compiled and published by 风哥教程 for study and testing purposes only. When reposting, credit the source: http://www.fgedu.net.cn/10327.html
