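1. Loki Overview and Environment Planning
Loki is an open-source log aggregation system developed by Grafana Labs and designed for cloud-native environments. It follows a design philosophy similar to Prometheus: logs are indexed by their labels rather than by their full text, which keeps the index small and makes label-based queries efficient.
1.1 Loki Version Notes
At the time of writing, the mainstream Loki release line is 2.x; this tutorial uses Loki 2.8.0 as its example. Compared with earlier releases, the 2.x series improves performance and stability and supports more log-processing features.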
$ loki --version
loki, version 2.8.0 (branch: HEAD, revision: abcdefg1234567890abcdefg1234567890abcdefg)
build user: root@1a2b3c4d5e6f
build date: 2023-04-05T15:33:19Z
go version: go1.19.6
platform: linux/amd64
# Check the OS release
$ cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="8.9"
ID="ol"
PRETTY_NAME="Oracle Linux Server 8.9"
# Check the kernel version
$ uname -r
5.4.17-2136.302.7.2.el8uek.x86_64
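1.2 Environment Planning
The installation environment is planned as follows:
loki01.fgedu.net.cn (192.168.1.95) - Loki main server
data01.fgedu.net.cn (192.168.1.96) - Storage node 1
data02.fgedu.net.cn (192.168.1.97) - Storage node 2
data03.fgedu.net.cn (192.168.1.98) - Storage node 3
Loki version: 2.8.0
Installation method: binary installation
Storage backend for log data: S3-compatible storage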
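The host names in the plan have to resolve on every node. A minimal sketch of one way to do that, assuming name resolution is handled through /etc/hosts rather than DNS (the helper names `planned_hosts` and `missing_hosts` are made up for this illustration):

```shell
#!/bin/bash
# Print /etc/hosts entries for the four planned nodes.
# Assumption: /etc/hosts is used for name resolution instead of DNS.
planned_hosts() {
  cat <<'EOF'
192.168.1.95 loki01.fgedu.net.cn loki01
192.168.1.96 data01.fgedu.net.cn data01
192.168.1.97 data02.fgedu.net.cn data02
192.168.1.98 data03.fgedu.net.cn data03
EOF
}

# Print only the planned entries whose fully qualified name
# does not yet appear in the given hosts file.
missing_hosts() {
  local hosts_file="$1" ip fqdn short
  planned_hosts | while read -r ip fqdn short; do
    grep -qwF -- "$fqdn" "$hosts_file" 2>/dev/null || echo "$ip $fqdn $short"
  done
}
```

To apply it, write the output of `missing_hosts /etc/hosts` to a temporary file, review it, and append that file to /etc/hosts as root. Running it a second time prints nothing, because every entry is already present.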
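2. Hardware Requirements
How much hardware Loki needs depends on the log volume and the query frequency.
2.1 Physical Host Requirements
# Loki server requirements
- CPU: at least 8 cores
- Memory: at least 32 GB
- Disk: 120 GB SSD system disk + 500 GB SSD data disk
# Storage node requirements
- CPU: at least 4 cores
- Memory: at least 16 GB
- Disk: 120 GB SSD system disk + 2 TB SSD data disk
# Check the Loki server's resources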
# free -h
total used free shared buff/cache available
Mem: 32G 8.4G 22G 512M 3.6G 23G
Swap: 8G 0B 8G
# Check disk space
# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 120G 20G 100G 17% /
/dev/sdb1 500G 50G 450G 10% /data
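The manual checks above can be turned into a small script that compares a host against the stated minimums. This is only a sketch: the function names are invented for illustration, and it assumes a Linux host that provides `nproc` and /proc/meminfo.

```shell
#!/bin/bash
# Succeeds when the actual value is at least the required value (both integers).
meets_minimum() {
  local actual="$1" required="$2"
  [ "$actual" -ge "$required" ]
}

# Compare this host's CPU count and memory with the given minimums
# and print one line per requirement that is not met.
# Arguments: minimum CPU cores, minimum memory in GiB.
check_host() {
  local min_cpus="$1" min_mem_gib="$2" cpus mem_kib mem_gib
  cpus=$(nproc)
  mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  # Round up, because a "32 GB" machine usually reports slightly less than 32 GiB.
  mem_gib=$(( (mem_kib + 1048575) / 1048576 ))
  meets_minimum "$cpus" "$min_cpus" || echo "CPU: $cpus cores, need $min_cpus"
  meets_minimum "$mem_gib" "$min_mem_gib" || echo "Memory: $mem_gib GiB, need $min_mem_gib"
}
```

Run `check_host 8 32` on the Loki server and `check_host 4 16` on each storage node; no output means both minimums are met.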
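2.2 vSphere Virtual Machine Requirements
- Loki server:
  - vCPU: 8 cores
  - Memory: 32 GB
  - Disk: 120 GB SSD system disk + 500 GB SSD data disk
  - Network: VMXNET3 adapter, 10 Gbps network
- Storage nodes:
  - vCPU: 4 cores
  - Memory: 16 GB
  - Disk: 120 GB SSD system disk + 2 TB SSD data disk
  - Network: VMXNET3 adapter, 10 Gbps network
Resource pool configuration:
- CPU reservation: 4 GHz for the Loki server, 2 GHz per storage node
- Memory reservation: 16 GB for the Loki server, 8 GB per storage node
- Memory limit: 32 GB for the Loki server, 16 GB per storage node
- CPU shares: Normal
- Memory shares: Normal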
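2.3 Cloud Instance Requirements
- Loki server:
  - Instance type: ecs.g6.4xlarge or equivalent
  - vCPU: 16 cores
  - Memory: 64 GB
  - System disk: 120 GB SSD cloud disk
  - Data disk: 500 GB SSD cloud disk
  - Network bandwidth: 10 Gbps or higher
- Storage nodes:
  - Instance type: ecs.g6.2xlarge or equivalent
  - vCPU: 8 cores
  - Memory: 32 GB
  - System disk: 120 GB SSD cloud disk
  - Data disk: 2 TB SSD cloud disk
  - Network bandwidth: 10 Gbps or higher
Storage configuration:
- OSS object storage: stores the log data
- NAS file storage: holds shared configuration files
- Cloud disk snapshots: periodic backups of configuration data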
3. Operating System Preparation
Before installing Loki, apply the necessary operating system configuration and tuning.
3.1 Checking the OS Version
# cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="8.9"
ID="ol"
PRETTY_NAME="Oracle Linux Server 8.9"
# Check the kernel version
# uname -r
5.4.17-2136.302.7.2.el8uek.x86_64
# Check the SELinux status
# getenforce
Enforcing
# Check the firewall status
# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: active (running)
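3.2 Installing Dependencies
# dnf install -y wget curl tar gzip unzip
# Stop and disable the firewall (acceptable in a lab; in production, keep firewalld running and open only the required ports)
# systemctl stop firewalld
# systemctl disable firewalld
# Disable SELinux (setenforce 0 switches to permissive mode immediately; the config file change takes effect after a reboot)
# setenforce 0
# sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Create the Loki user
# useradd -r -s /bin/false loki
# Create the directory layout
# mkdir -p /data/loki/{config,bin,data,scripts}
# chown -R loki:loki /data/loki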
3.3 Network Configuration
# vi /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE=Ethernet
BOOTPROTO=static
NAME=ens33
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.1.95
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=114.114.114.114
# Restart networking
# systemctl restart NetworkManager
# Verify connectivity
# ping -c 4 google.com
4. Installing and Configuring Loki
With the environment prepared, install Loki.
4.1 Installing Loki
# wget https://github.com/grafana/loki/releases/download/v2.8.0/loki-linux-amd64.zip
# Extract the archive
# unzip loki-linux-amd64.zip
# mv loki-linux-amd64 /data/loki/bin/loki
# chmod +x /data/loki/bin/loki
# Create the configuration file
# vi /data/loki/config/loki.yml
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9095
common:
  path_prefix: /data/loki
  storage:
    filesystem:
      chunks_directory: /data/loki/chunks
      rules_directory: /data/loki/rules
  # A single instance with an in-memory ring needs replication_factor 1; with 3, writes fail because three ingesters are never available
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
storage_config:
  boltdb_shipper:
    active_index_directory: /data/loki/index
    cache_location: /data/loki/cache
    shared_store: filesystem
compactor:
  working_directory: /data/loki/compactor
  shared_store: filesystem
# Create the systemd unit file
# vi /etc/systemd/system/loki.service
[Unit]
Description=Loki Log Aggregation System
After=network.target
[Service]
User=loki
ExecStart=/data/loki/bin/loki -config.file=/data/loki/config/loki.yml
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
# Start Loki
# systemctl daemon-reload
# systemctl start loki
# systemctl enable loki
# Verify the installation
# systemctl status loki
# curl http://localhost:3100/metrics
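Loki can take a short while after `systemctl start` before it accepts requests, so a verification step may fail simply because it ran too early. The sketch below polls Loki's `/ready` endpoint until it answers successfully. The function names, the attempt count, and the delay are illustrative choices, not part of Loki.

```shell
#!/bin/bash
# Probe one URL. curl -f makes the call fail on connection errors
# and on HTTP status codes of 400 or above.
probe() {
  curl -fsS -o /dev/null --max-time 5 "$1"
}

# Poll the URL until probe succeeds or the attempts run out.
# Arguments: URL, maximum attempts (default 30), seconds between attempts (default 2).
wait_for_ready() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    if probe "$url" 2>/dev/null; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$(( i + 1 ))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}
```

For the local instance installed above, `wait_for_ready http://localhost:3100/ready 30 2` waits up to about a minute. Keeping the HTTP call in a separate `probe` function makes it easy to swap in a different check, for example one that adds `--cacert` once TLS is enabled.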
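4.2 Installing Promtail
# wget https://github.com/grafana/loki/releases/download/v2.8.0/promtail-linux-amd64.zip
# Extract the archive
# unzip promtail-linux-amd64.zip
# mv promtail-linux-amd64 /data/loki/bin/promtail
# chmod +x /data/loki/bin/promtail
# Create the configuration file
# vi /data/loki/config/promtail.yml
server:
  http_listen_port: 9080
  grpc_listen_port: 0
# Promtail records how far it has read each file here; the path must be writable by the loki user
positions:
  filename: /data/loki/data/positions.yaml
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*log
# Create the systemd unit file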
# vi /etc/systemd/system/promtail.service
[Unit]
Description=Promtail Log Collector
After=network.target
[Service]
User=loki
ExecStart=/data/loki/bin/promtail -config.file=/data/loki/config/promtail.yml
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
# Start Promtail
# systemctl daemon-reload
# systemctl start promtail
# systemctl enable promtail
# Verify the installation
# systemctl status promtail
# curl http://localhost:9080/metrics
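Promtail delivers logs by posting JSON to the `/loki/api/v1/push` URL configured under `clients`. To check the ingestion path independently of Promtail, a single line can be pushed by hand. The sketch below builds the request body; the helper names and the `smoke-test` job label are made up for illustration, and no JSON escaping is performed, so the arguments must not contain double quotes or backslashes.

```shell
#!/bin/bash
# Build the JSON body for POST /loki/api/v1/push: one stream with one log line.
# Arguments: job label, log line, timestamp in nanoseconds since the Unix epoch.
# Assumption: the job label and the line need no JSON escaping.
build_push_payload() {
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}' "$1" "$3" "$2"
}

# Send one test line to Loki (hypothetical helper).
# Argument: Loki base URL, defaulting to the local instance from section 4.1.
push_test_line() {
  local url="${1:-http://localhost:3100}"
  curl -fsS -H 'Content-Type: application/json' \
    -X POST "$url/loki/api/v1/push" \
    --data "$(build_push_payload smoke-test 'hello from curl' "$(date +%s)000000000")"
}
```

After calling `push_test_line`, the line should be returned by the query `{job="smoke-test"}`. The timestamp is built by appending nine zeros to `date +%s`, which avoids relying on the GNU-only `%N` format.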
5. Tuning the Loki Configuration
Several configuration changes improve Loki's performance and stability.
5.1 Storage Configuration
# vi /data/loki/config/loki.yml
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9095
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600
common:
  path_prefix: /data/loki
  storage:
    filesystem:
      chunks_directory: /data/loki/chunks
      rules_directory: /data/loki/rules
  replication_factor: 3
  ring:
    kvstore:
      # The consul store requires the Consul agent from section 5.2 to be running
      store: consul
      consul:
        host: localhost:8500
schema_config:
  configs:
    # Changing object_store for an existing period makes chunks already written to the filesystem unreachable;
    # on a system that already holds data, add a new period with a future "from" date instead
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: index_
        period: 24h
storage_config:
  boltdb_shipper:
    active_index_directory: /data/loki/index
    cache_location: /data/loki/cache
    shared_store: s3
  s3:
    endpoint: s3.fgedu.net.cn
    bucketnames: loki
    access_key_id: AKIAXXXXXXXXXXXXXXXX
    secret_access_key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    region: us-east-1
    insecure: false
compactor:
  working_directory: /data/loki/compactor
  shared_store: s3
  compaction_interval: 10m
  retention_enabled: true
# In Loki 2.8 the retention period is set under limits_config, not under compactor
limits_config:
  retention_period: 7d
# Restart Loki
# systemctl restart loki
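5.2 High-Availability Configuration
# Install Consul (the package is distributed through the HashiCorp repository, which must be enabled first)
# dnf install -y consul
# Configure Consul
# vi /etc/consul.d/consul.hcl
server = true
bootstrap_expect = 3
# Start Consul
# systemctl start consul
# systemctl enable consul
# Configure the Loki cluster (replication_factor: 3 needs at least three Loki instances registered in the ring)
# vi /data/loki/config/loki.yml
common:
  replication_factor: 3
  ring:
    kvstore:
      store: consul
      consul:
        host: localhost:8500
# Restart Loki
# systemctl restart loki
# Verify the cluster state (the /ring page lists the ingesters registered in the ring)
# curl http://localhost:3100/ring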
5.3 Memory Configuration
# vi /etc/systemd/system/loki.service
[Unit]
Description=Loki Log Aggregation System
After=network.target
[Service]
User=loki
Environment="GODEBUG=madvdontneed=1"
Environment="GOMAXPROCS=8"
ExecStart=/data/loki/bin/loki -config.file=/data/loki/config/loki.yml
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
# Restart Loki
# systemctl daemon-reload
# systemctl restart loki
6. Configuring Promtail
Promtail collects logs and ships them to Loki.
6.1 Configuring Promtail
# vi /data/loki/config/promtail.yml
# The host label uses ${HOSTNAME}; this only expands when Promtail is started with -config.expand-env=true
# and HOSTNAME is set in its environment. Otherwise, write the host name literally.
server:
  http_listen_port: 9080
  grpc_listen_port: 0
# Every entry under clients receives a full copy of the logs
clients:
  - url: http://loki01.fgedu.net.cn:3100/loki/api/v1/push
  - url: http://loki02.fgedu.net.cn:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          host: ${HOSTNAME}
          __path__: /var/log/*log
  - job_name: docker
    static_configs:
      - targets:
          - localhost
        labels:
          job: docker
          host: ${HOSTNAME}
          __path__: /var/lib/docker/containers/*/*-json.log
  - job_name: nginx
    static_configs:
      - targets:
          - localhost
        labels:
          job: nginx
          host: ${HOSTNAME}
          __path__: /var/log/nginx/*log
# Restart Promtail
# systemctl restart promtail
6.2 Configuring Log Labels
# vi /data/loki/config/promtail.yml
# ${HOSTNAME} only expands when Promtail is started with -config.expand-env=true; otherwise write the host name literally
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          host: ${HOSTNAME}
          environment: production
          service: system
          __path__: /var/log/*log
  - job_name: application
    static_configs:
      - targets:
          - localhost
        labels:
          job: application
          host: ${HOSTNAME}
          environment: production
          service: app
          __path__: /opt/app/logs/*log
# Restart Promtail
# systemctl restart promtail
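7. Grafana Integration
Grafana is used to visualize the log data collected by Loki.
7.1 Configuring the Loki Data Source
# 1. In the left-hand menu, click "Configuration" -> "Data sources"
# 2. Click "Add data source"
# 3. Select "Loki"
# 4. Set the URL to http://loki01.fgedu.net.cn:3100
# 5. Click "Save & Test"
# Verify the data source
# 1. Click the "Test" button
# 2. Confirm that "Data source is working" is displayed
7.2 Configuring Log Queries
# 1. In the left-hand menu, click "Explore"
# 2. Select the Loki data source
# 3. Enter a query, for example: {job="varlogs"} |= "error"
# 4. Click "Run query"
# 5. Review the log results
# Configure a log panel
# 1. Open a dashboard
# 2. Click "Add new panel"
# 3. Select the Loki data source
# 4. Enter the query
# 5. Adjust the panel settings
# 6. Click "Apply"
# 7. Click "Save dashboard"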
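The same LogQL query can also be run without Grafana, through Loki's HTTP API, which is handy for scripts and for checking that Loki itself returns data. A minimal sketch, assuming the job label and search text contain no double quotes; the function names are illustrative only:

```shell
#!/bin/bash
# Build a LogQL query: a stream selector on the job label plus a line filter.
# Assumption: neither argument contains double quotes.
logql_filter() {
  printf '{job="%s"} |= "%s"' "$1" "$2"
}

# Run the query over the last hour via /loki/api/v1/query_range
# and print the raw JSON response.
# Arguments: Loki base URL, job label, text to search for.
query_last_hour() {
  local base="$1" now
  now=$(date +%s)
  curl -fsS -G "$base/loki/api/v1/query_range" \
    --data-urlencode "query=$(logql_filter "$2" "$3")" \
    --data-urlencode "start=$(( now - 3600 ))000000000" \
    --data-urlencode "end=${now}000000000" \
    --data-urlencode "limit=20"
}
```

For example, `query_last_hour http://loki01.fgedu.net.cn:3100 varlogs error` sends the query from step 3 above. `curl -G` together with `--data-urlencode` takes care of URL-encoding the braces, quotes, and pipe in the LogQL expression.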
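8. Loki Security Configuration
Securing Loki covers multi-tenancy, authentication, and TLS encryption.
8.1 Authentication Configuration
# vi /data/loki/config/loki.yml
# auth_enabled: true turns on multi-tenancy: every request must carry an X-Scope-OrgID header.
# Loki does not ship its own authentication layer; user name and password checks such as basic auth
# are expected to be enforced by a reverse proxy placed in front of Loki.
auth_enabled: true
server:
  http_listen_port: 3100
  grpc_listen_port: 9095
  http_tls_config:
    cert_file: /data/loki/config/cert.pem
    key_file: /data/loki/config/key.pem
# Configure Promtail authentication (the basic_auth credentials are checked by the reverse proxy, not by Loki;
# with auth_enabled: true, also set tenant_id on the client so that Promtail sends X-Scope-OrgID)
# vi /data/loki/config/promtail.yml
clients:
  - url: https://loki01.fgedu.net.cn:3100/loki/api/v1/push
    basic_auth:
      username: admin
      password: password
# Restart Loki and Promtail
# systemctl restart loki promtail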
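8.2 TLS Encryption Configuration
# Generate a self-signed certificate (the subjectAltName must match the host name that clients connect to, otherwise verification fails)
# openssl req -newkey rsa:2048 -nodes -keyout /data/loki/config/key.pem -x509 -days 365 -out /data/loki/config/cert.pem -subj "/CN=loki01.fgedu.net.cn" -addext "subjectAltName=DNS:loki01.fgedu.net.cn"
# chown loki:loki /data/loki/config/key.pem /data/loki/config/cert.pem
# Configure Loki
# vi /data/loki/config/loki.yml
server:
  http_listen_port: 3100
  grpc_listen_port: 9095
  http_tls_config:
    cert_file: /data/loki/config/cert.pem
    key_file: /data/loki/config/key.pem
  grpc_tls_config:
    cert_file: /data/loki/config/cert.pem
    key_file: /data/loki/config/key.pem
# Configure Promtail
# vi /data/loki/config/promtail.yml
clients:
  - url: https://loki01.fgedu.net.cn:3100/loki/api/v1/push
    tls_config:
      insecure_skip_verify: false
      ca_file: /data/loki/config/cert.pem
# Restart Loki and Promtail
# systemctl restart loki promtail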
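9. Loki Performance Tuning
In production, Loki needs performance tuning to improve log-processing throughput.
9.1 Storage Tuning
# vi /data/loki/config/loki.yml
storage_config:
  boltdb_shipper:
    active_index_directory: /data/loki/index
    cache_location: /data/loki/cache
    shared_store: s3
    cache_ttl: 24h
  s3:
    endpoint: s3.fgedu.net.cn
    bucketnames: loki
    access_key_id: AKIAXXXXXXXXXXXXXXXX
    secret_access_key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    region: us-east-1
    insecure: false
    s3forcepathstyle: true
compactor:
  working_directory: /data/loki/compactor
  shared_store: s3
  compaction_interval: 10m
  retention_enabled: true
  # The original listed "max_compaction_objects: 1000000" here; it is not a documented compactor option in Loki 2.8
  # and is likely to be rejected when the configuration is parsed, so verify it against your version before adding it back
# In Loki 2.8 the retention period is set under limits_config, not under compactor
limits_config:
  retention_period: 7d
# Restart Loki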
# systemctl restart loki
9.2 Memory Tuning
# vi /etc/systemd/system/loki.service
[Unit]
Description=Loki Log Aggregation System
After=network.target
[Service]
User=loki
Environment="GODEBUG=madvdontneed=1"
Environment="GOMAXPROCS=16"
Environment="GOGC=20"
ExecStart=/data/loki/bin/loki -config.file=/data/loki/config/loki.yml
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
# Restart Loki
# systemctl daemon-reload
# systemctl restart loki
9.3 Query Tuning
# vi /data/loki/config/loki.yml
query_range:
  parallelise_shardable_queries: true
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100
        ttl: 24h
frontend:
  max_outstanding_per_tenant: 2048
  compress_responses: true
# Restart Loki
# systemctl restart loki
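10. Loki Upgrade and Migration
This section covers upgrading Loki to a new version and migrating its data.
10.1 Upgrading Loki
# Back up the configuration file
# mkdir -p /backup
# cp /data/loki/config/loki.yml /backup/loki-config-$(date +%Y%m%d).yml
# Stop the Loki service
# systemctl stop loki
# Download the new Loki version (saved under a versioned name so it does not collide with the 2.8.0 archive)
# wget -O loki-linux-amd64-2.8.1.zip https://github.com/grafana/loki/releases/download/v2.8.1/loki-linux-amd64.zip
# Extract the archive
# unzip -o loki-linux-amd64-2.8.1.zip
# mv loki-linux-amd64 /data/loki/bin/loki
# Start the Loki service
# systemctl start loki
# Verify the upgrade
# /data/loki/bin/loki --version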
loki, version 2.8.1 (branch: HEAD, revision: abcdefg1234567890abcdefg1234567890abcdefg)
build user: root@1a2b3c4d5e6f
build date: 2023-05-01T15:33:19Z
go version: go1.19.6
platform: linux/amd64
# Query the Loki API for the running build
# curl http://localhost:3100/loki/api/v1/status/buildinfo
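The upgrade check can be scripted by comparing the version reported by the running server with the version you expect. The sketch below assumes that the build-info endpoint `/loki/api/v1/status/buildinfo` returns a flat JSON object with a `"version"` field holding a plain `x.y.z` string; that response shape is an assumption to confirm on your installation, and both function names are made up for illustration.

```shell
#!/bin/bash
# Extract the value of the "version" field from a JSON document on stdin.
# Assumption: the document is a flat JSON object containing "version":"x.y.z".
extract_version() {
  sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on the version-aware ordering of "sort -V" (GNU coreutils).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

A possible use is to store the output of `curl -fsS http://localhost:3100/loki/api/v1/status/buildinfo | extract_version` in a variable and pass it to `version_ge` together with `2.8.1`. Using `sort -V` rather than a plain string comparison keeps 2.10.0 ordered after 2.8.1.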
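10.2 Migrating Loki Data
# Back up the data directory
# cp -r /data/loki/data /backup/loki-data-$(date +%Y%m%d)
# Restore the data on the new server (/data/loki/data must not exist there yet, otherwise scp nests the copy inside it)
# scp -r /backup/loki-data-20230405 root@new-server:/data/loki/data
# Install Loki
# Repeat the installation steps from section 4 on the new server, then fix ownership with: chown -R loki:loki /data/loki
# Start the Loki service
# systemctl start loki
# Verify the migration
# curl http://new-server:3100/ready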
11. Loki Backup and Recovery
This section covers backing up and restoring Loki.
11.1 Backing Up Loki
# vi /data/loki/scripts/backup.sh
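#!/bin/bash
BACKUP_DIR="/backup/loki"
DATE=$(date +%Y%m%d)
# Create the backup directory
mkdir -p "$BACKUP_DIR"
# Stop the Loki service so the files are not modified during the copy
systemctl stop loki
# Back up data and configuration
# NOTE: with the configuration used in this tutorial, chunks and index live under /data/loki/chunks and
# /data/loki/index (or in S3); add those paths here if they should be part of the backup
cp -r /data/loki/data "$BACKUP_DIR/data-$DATE"
cp /data/loki/config/loki.yml "$BACKUP_DIR/config-$DATE.yml"
cp /data/loki/config/promtail.yml "$BACKUP_DIR/promtail-$DATE.yml"
# Start the Loki service
systemctl start loki
# Remove backups older than 7 days (top-level entries only, both directories and files)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -mtime +7 -exec rm -rf {} +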
# Make the script executable
# chmod +x /data/loki/scripts/backup.sh
# Add a cron job
# crontab -e
0 0 * * * /data/loki/scripts/backup.sh
11.2 Restoring Loki
# Stop the Loki service
# systemctl stop loki
# Remove the existing data
# rm -rf /data/loki/data
# Restore the data
# cp -r /backup/loki/data-20230405 /data/loki/data
# cp /backup/loki/config-20230405.yml /data/loki/config/loki.yml
# cp /backup/loki/promtail-20230405.yml /data/loki/config/promtail.yml
# chown -R loki:loki /data/loki
# Start the Loki service
# systemctl start loki
# Verify the restore
# systemctl status loki
# curl http://localhost:3100/ready
11.3 Loki Monitoring Script
# vi /data/loki/scripts/monitor.sh
#!/bin/bash
LOG_FILE="/var/log/loki_monitor.log"
ALERT_EMAIL="admin@fgedu.net.cn"
# Restart Loki and send an alert if the service is not active
check_loki_status() {
  echo "$(date): Checking loki status..." >> "$LOG_FILE"
  status=$(systemctl is-active loki)
  if [ "$status" != "active" ]; then
    echo "$(date): Loki is not running" >> "$LOG_FILE"
    echo "Loki is not running" | mail -s "Loki Alert" "$ALERT_EMAIL"
    systemctl start loki
  else
    echo "$(date): Loki is running" >> "$LOG_FILE"
  fi
}
# Send an alert if the readiness endpoint does not return HTTP 200
check_loki_api() {
  echo "$(date): Checking loki API..." >> "$LOG_FILE"
  status=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3100/ready)
  if [ "$status" = "200" ]; then
    echo "$(date): Loki API: OK" >> "$LOG_FILE"
  else
    echo "$(date): Loki API: FAILED" >> "$LOG_FILE"
    echo "Loki API failed" | mail -s "Loki Alert" "$ALERT_EMAIL"
  fi
}
# Restart Promtail and send an alert if the service is not active
check_promtail_status() {
  echo "$(date): Checking promtail status..." >> "$LOG_FILE"
  status=$(systemctl is-active promtail)
  if [ "$status" != "active" ]; then
    echo "$(date): Promtail is not running" >> "$LOG_FILE"
    echo "Promtail is not running" | mail -s "Loki Alert" "$ALERT_EMAIL"
    systemctl start promtail
  else
    echo "$(date): Promtail is running" >> "$LOG_FILE"
  fi
}
main() {
  check_loki_status
  check_loki_api
  check_promtail_status
}
main
# Make the script executable
# chmod +x /data/loki/scripts/monitor.sh
# Add a cron job
# crontab -e
*/15 * * * * /data/loki/scripts/monitor.sh
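The steps above complete Loki installation and configuration, performance tuning, upgrade and migration, and backup and recovery. As an open-source log aggregation system, Loki processes and analyzes log data efficiently and is an important tool for enterprise log management.
This article was compiled and published by 风哥教程 for learning and testing purposes only; please credit the source when reposting: http://www.fgedu.net.cn/10327.html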
