1. Ceph Overview and Environment Planning
Ceph is an open-source distributed storage system that provides object storage, block storage, and file system storage. It is designed for high availability, reliability, and performance, and suits a wide range of storage scenarios.
1.1 Ceph Version Notes
The current Ceph major release line is 17.x (Quincy); this tutorial uses Ceph 17.2.6 throughout. Compared with earlier releases, 17.x delivers significant improvements in performance, stability, and functionality, and supports more storage features.
$ ceph --version
ceph version 17.2.6 (abcdefg1234567890abcdefg1234567890abcdefg)
# Check the OS version
$ cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="8.9"
ID="ol"
PRETTY_NAME="Oracle Linux Server 8.9"
# Check the kernel version
$ uname -r
5.4.17-2136.302.7.2.el8uek.x86_64
1.2 Environment Planning
The environment for this installation is planned as follows:
ceph01.fgedu.net.cn (192.168.1.116) - Monitor node 1
ceph02.fgedu.net.cn (192.168.1.117) - Monitor node 2
ceph03.fgedu.net.cn (192.168.1.118) - Monitor node 3
ceph04.fgedu.net.cn (192.168.1.119) - OSD node 1
ceph05.fgedu.net.cn (192.168.1.120) - OSD node 2
ceph06.fgedu.net.cn (192.168.1.121) - OSD node 3
Ceph version: 17.2.6
Deployment mode: distributed
Data storage: local SSD disks
2. Hardware Requirements
As a distributed storage system, Ceph's hardware requirements depend on storage capacity and concurrent access volume.
2.1 Physical Host Requirements
# Monitor node requirements
- CPU: at least 4 cores
- Memory: at least 8 GB
- Disk: 120 GB SSD system disk
# OSD node requirements
- CPU: at least 8 cores
- Memory: at least 16 GB
- Disk: 120 GB SSD system disk + 4 x 4 TB SSD data disks
# Check OSD node resources
# free -h
total used free shared buff/cache available
Mem: 16G 8.4G 7.1G 512M 512M 7.4G
Swap: 8G 0B 8G
# Check disk space
# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 120G 20G 100G 17% /
/dev/sdb1 4.0T 100G 3.9T 3% /data1
/dev/sdc1 4.0T 100G 3.9T 3% /data2
/dev/sdd1 4.0T 100G 3.9T 3% /data3
/dev/sde1 4.0T 100G 3.9T 3% /data4
2.2 vSphere Virtual Machine Requirements
- Monitor nodes:
- vCPU: 4 cores
- Memory: 8 GB
- Disk: 120 GB SSD system disk
- Network: VMXNET3 NIC, 10 Gbps network
- OSD nodes:
- vCPU: 8 cores
- Memory: 16 GB
- Disk: 120 GB SSD system disk + 4 x 4 TB SSD data disks
- Network: VMXNET3 NIC, 10 Gbps network
Resource pool settings:
- CPU reservation: 2 GHz for Monitor nodes, 4 GHz for OSD nodes
- Memory reservation: 4 GB for Monitor nodes, 8 GB for OSD nodes
- Memory limit: 8 GB for Monitor nodes, 16 GB for OSD nodes
- CPU shares: Normal
- Memory shares: Normal
2.3 Cloud Host Requirements
- Monitor nodes:
- Instance type: ecs.g6.2xlarge or equivalent
- vCPU: 8 cores
- Memory: 16 GB
- System disk: 120 GB SSD cloud disk
- Network bandwidth: 10 Gbps or higher
- OSD nodes:
- Instance type: ecs.g6.4xlarge or equivalent
- vCPU: 16 cores
- Memory: 32 GB
- System disk: 120 GB SSD cloud disk
- Data disks: 4 x 4 TB SSD cloud disks
- Network bandwidth: 10 Gbps or higher
Storage settings:
- Cloud disk snapshots: periodic data backups
- Cross-region replication: off-site copies of data
3. Operating System Preparation
Before installing Ceph, the operating system needs some required configuration and tuning.
3.1 OS Version Check
# cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="8.9"
ID="ol"
PRETTY_NAME="Oracle Linux Server 8.9"
# Check the kernel version
# uname -r
5.4.17-2136.302.7.2.el8uek.x86_64
# Check SELinux status
# getenforce
Enforcing
# Check firewall status
# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: active (running)
3.2 Installing Dependencies
# dnf install -y wget curl tar gzip python3 python3-pip
# Disable the firewall
# systemctl stop firewalld
# systemctl disable firewalld
# Disable SELinux
# setenforce 0
# sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Create the ceph service user (the ceph packages also create one if missing)
# useradd -r -s /bin/false ceph
# Create the directory structure
# mkdir -p /data/ceph/{config,bin,data}
# chown -R ceph:ceph /data/ceph
3.3 Network Configuration
# vi /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE=Ethernet
BOOTPROTO=static
NAME=ens33
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.1.116
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=114.114.114.114
# Restart networking
# systemctl restart NetworkManager
# Verify connectivity
# ping -c 4 google.com
# Set the hostname
# hostnamectl set-hostname ceph01.fgedu.net.cn
# Configure the hosts file
# vi /etc/hosts
192.168.1.116 ceph01.fgedu.net.cn ceph01
192.168.1.117 ceph02.fgedu.net.cn ceph02
192.168.1.118 ceph03.fgedu.net.cn ceph03
192.168.1.119 ceph04.fgedu.net.cn ceph04
192.168.1.120 ceph05.fgedu.net.cn ceph05
192.168.1.121 ceph06.fgedu.net.cn ceph06
3.4 Passwordless SSH
# ssh-keygen -t rsa -b 2048
# Copy the public key to all nodes
# for host in ceph01 ceph02 ceph03 ceph04 ceph05 ceph06; do ssh-copy-id $host; done
# Verify passwordless login
# ssh ceph02 hostname
ceph02.fgedu.net.cn
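Beyond spot-checking a single node, the loop below (a sketch; node names match the hosts file above) prints a strict, non-interactive SSH check for every node. Remove the leading echo to actually run the checks; with BatchMode=yes, any host where key copying failed errors out instead of falling back to a password prompt.

```shell
# Print (not run) a batch-mode SSH hostname check for every other node.
nodes="ceph02 ceph03 ceph04 ceph05 ceph06"
for host in ${nodes}; do
  echo "ssh -o BatchMode=yes ${host} hostname"
done
```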
4. Ceph Installation and Configuration
With the environment prepared, install Ceph.
4.1 Installing Ceph
# vi /etc/yum.repos.d/ceph.repo
[ceph]
name=Ceph packages for $basearch
baseurl=https://download.ceph.com/rpm-quincy/el8/$basearch
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc
enabled=1
# Install Ceph
# dnf install -y ceph ceph-mon ceph-osd ceph-mgr ceph-mds radosgw
# Verify the installation
# ceph --version
ceph version 17.2.6 (abcdefg1234567890abcdefg1234567890abcdefg)
4.2 Initializing the Ceph Cluster
# vi /etc/ceph/ceph.conf
[global]
# fsid must be a literal UUID; generate one with uuidgen and paste it here
fsid = <uuid-from-uuidgen>
mon_initial_members = ceph01, ceph02, ceph03
mon_host = 192.168.1.116,192.168.1.117,192.168.1.118
public_network = 192.168.1.0/24
cluster_network = 192.168.2.0/24
# Bootstrap the cluster on the first Monitor node (run only once; cephadm
# deploys containerized daemons)
# cephadm add-repo --release quincy
# cephadm install ceph-common
# cephadm bootstrap --mon-ip 192.168.1.116
# Add the remaining Monitor hosts via the orchestrator (running bootstrap
# again would create a separate, unrelated cluster)
# ceph orch host add ceph02 192.168.1.117
# ceph orch host add ceph03 192.168.1.118
# ceph orch apply mon --placement="ceph01,ceph02,ceph03"
# Verify cluster status
# ceph status
4.3 Configuring OSD Nodes
# ceph orch device ls
# Add the OSD hosts
# ceph orch host add ceph04 192.168.1.119
# ceph orch host add ceph05 192.168.1.120
# ceph orch host add ceph06 192.168.1.121
# Deploy OSDs
# ceph orch daemon add osd ceph04:/dev/sdb
# ceph orch daemon add osd ceph04:/dev/sdc
# ceph orch daemon add osd ceph04:/dev/sdd
# ceph orch daemon add osd ceph04:/dev/sde
# ceph orch daemon add osd ceph05:/dev/sdb
# ceph orch daemon add osd ceph05:/dev/sdc
# ceph orch daemon add osd ceph05:/dev/sdd
# ceph orch daemon add osd ceph05:/dev/sde
# ceph orch daemon add osd ceph06:/dev/sdb
# ceph orch daemon add osd ceph06:/dev/sdc
# ceph orch daemon add osd ceph06:/dev/sdd
# ceph orch daemon add osd ceph06:/dev/sde
# Verify OSD status
# ceph osd status
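The twelve `ceph orch daemon add osd` calls above can also be generated with a short loop, which is less error-prone as the cluster grows. A sketch (host and device names match the plan above): it only builds and prints the commands, so you can review them first and then pipe the output to `sh` on a live admin node.

```shell
# Generate one "ceph orch daemon add osd" command per data disk per host.
osd_hosts="ceph04 ceph05 ceph06"
data_devs="sdb sdc sdd sde"
cmds=""
for host in ${osd_hosts}; do
  for dev in ${data_devs}; do
    cmds="${cmds}ceph orch daemon add osd ${host}:/dev/${dev}
"
  done
done
# Review first; on a live cluster: printf "%s" "${cmds}" | sh
printf "%s" "${cmds}"
```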
5. Ceph Configuration Tuning
To improve Ceph's performance and stability, apply some configuration tuning.
5.1 Basic Tuning
# vi /etc/ceph/ceph.conf
[global]
# fsid must be a literal UUID; use the cluster's existing fsid (see "ceph fsid")
fsid = <cluster-fsid>
mon_initial_members = ceph01, ceph02, ceph03
mon_host = 192.168.1.116,192.168.1.117,192.168.1.118
public_network = 192.168.1.0/24
cluster_network = 192.168.2.0/24
osd_pool_default_size = 3
osd_pool_default_min_size = 2
osd_pool_default_pg_num = 128
osd_pool_default_pgp_num = 128
osd_crush_chooseleaf_type = 1   # host-level failure domain (0 would allow replicas on the same host)
# Restart Ceph services
# systemctl restart ceph-mon.target ceph-osd.target ceph-mgr.target
# Verify the configuration
# ceph config show
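The pg_num of 128 above is a conservative starting point. A common rule of thumb (an assumption on my part, not from the original text) targets roughly 100 PGs per OSD across all pools; for this 12-OSD, 3-replica layout the arithmetic works out as follows:

```shell
# total PGs ~= (OSD count * target PGs per OSD) / replica count,
# rounded up to the next power of two
osds=12
target_per_osd=100
replicas=3
raw=$(( osds * target_per_osd / replicas ))
pg=1
while [ "${pg}" -lt "${raw}" ]; do pg=$(( pg * 2 )); done
echo "raw=${raw} pg_num=${pg}"
```

The resulting pg_num is a budget for all pools combined, not per pool; in 17.x the pg_autoscaler module can also manage this automatically.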
5.2 High Availability
# ceph mon add ceph02 192.168.1.117:6789
# ceph mon add ceph03 192.168.1.118:6789
# Configure MGR high availability
# ceph mgr module enable dashboard
# ceph mgr module enable prometheus
# Verify high availability
# ceph status
5.3 Memory Settings
# vi /etc/ceph/ceph.conf
[osd]
osd_memory_target = 8589934592   # 8 GiB
[mds]
mds_cache_memory_limit = 4294967296   # 4 GiB
# Restart Ceph services
# systemctl restart ceph-osd.target ceph-mds.target
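Raw byte values like these are easy to mistype; a quick arithmetic sanity check that 8589934592 and 4294967296 really are 8 GiB and 4 GiB:

```shell
gib=$(( 1024 * 1024 * 1024 ))                    # bytes in one GiB
echo "osd_memory_target:      $(( 8 * gib ))"    # 8589934592
echo "mds_cache_memory_limit: $(( 4 * gib ))"    # 4294967296
```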
6. Ceph Monitor Configuration
Ceph Monitors maintain cluster state and configuration information.
6.1 Configuring Monitors
# vi /etc/ceph/ceph.conf
[mon]
mon_data = /var/lib/ceph/mon/ceph-$id
mon_clock_drift_allowed = 0.1
mon_max_pg_per_osd = 4096
# Restart the Monitor service
# systemctl restart ceph-mon.target
# Verify Monitor status
# ceph mon stat
6.2 Monitoring the Monitors
# ceph mon stat
# View Monitor logs
# tail -f /var/log/ceph/ceph-mon.*.log
# Check cluster health
# ceph health
7. Ceph OSD Configuration
Ceph OSDs store data and handle data replication.
7.1 Configuring OSDs
# vi /etc/ceph/ceph.conf
[osd]
osd_data = /var/lib/ceph/osd/ceph-$id
# Note: the journal and thread options below applied to the legacy FileStore
# backend; BlueStore (the default in 17.x) ignores or no longer defines them
osd_journal_size = 10240
osd_max_object_name_len = 256
osd_max_object_namespace_len = 64
osd_op_threads = 8
osd_disk_threads = 4
osd_map_cache_size = 512
osd_pool_default_size = 3
osd_pool_default_min_size = 2
# Restart OSD services
# systemctl restart ceph-osd.target
# Verify OSD status
# ceph osd status
7.2 Managing OSDs
# ceph osd status
# View OSD details
# ceph osd dump
# Restart a single OSD (osd.0 here)
# systemctl restart ceph-osd@0
# Stop an OSD
# systemctl stop ceph-osd@0
# Start an OSD
# systemctl start ceph-osd@0
8. Ceph MDS Configuration
The Ceph MDS manages metadata for the Ceph file system.
8.1 Configuring the MDS
# vi /etc/ceph/ceph.conf
[mds]
mds_data = /var/lib/ceph/mds/ceph-$id
mds_cache_memory_limit = 4294967296   # 4 GiB
mds_max_file_size = 107374182400   # 100 GiB
# Start the MDS service
# systemctl start ceph-mds.target
# systemctl enable ceph-mds.target
# Verify MDS status
# ceph mds stat
8.2 Creating a Ceph File System
# Create the metadata pool
# ceph osd pool create cephfs_metadata 128
# Create the data pool
# ceph osd pool create cephfs_data 1024
# Create the file system
# ceph fs new cephfs cephfs_metadata cephfs_data
# Verify the file system
# ceph fs status
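To use the new file system from a client, a kernel-client mount along the following lines works. The monitor address and the use of the admin key are assumptions for this lab setup (a dedicated CephFS client key is preferable in production); the sketch only prints the commands for review rather than executing them:

```shell
# Build the mount commands for the "cephfs" file system created above.
mon_addr="192.168.1.116:6789"
mountpoint="/mnt/cephfs"
mount_cmd="mount -t ceph ${mon_addr}:/ ${mountpoint} -o name=admin,secretfile=/etc/ceph/admin.secret"
echo "mkdir -p ${mountpoint}"
echo "${mount_cmd}"
```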
9. Ceph RGW Configuration
Ceph RGW (RADOS Gateway) provides S3-compatible object storage.
9.1 Configuring RGW
# vi /etc/ceph/ceph.conf
[client.rgw.ceph01]
rgw_frontends = "beast port=8080"   # the civetweb frontend was removed in Quincy; beast is the default
rgw_thread_pool_size = 100
# Start the RGW service
# systemctl start ceph-radosgw@rgw.ceph01
# systemctl enable ceph-radosgw@rgw.ceph01
# Verify RGW status
# systemctl status ceph-radosgw@rgw.ceph01
9.2 Creating an S3 User
# radosgw-admin user create --uid=testuser --display-name="Test User"
# View user information
# radosgw-admin user info --uid=testuser
# Test the S3 API (after configuring the printed keys in an aws profile)
# aws s3 ls --endpoint-url http://ceph01:8080 --region us-east-1 --profile testuser
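The aws profile needs the access/secret keys that radosgw-admin printed. A sketch of extracting them with python3; the JSON literal below is a stand-in for the real `radosgw-admin user info --uid=testuser` output (the `keys` structure matches what radosgw-admin emits, but the key values here are invented placeholders):

```shell
# Stand-in for: info=$(radosgw-admin user info --uid=testuser)
info='{"user_id":"testuser","keys":[{"user":"testuser","access_key":"AK_EXAMPLE","secret_key":"SK_EXAMPLE"}]}'
access_key=$(echo "${info}" | python3 -c 'import sys,json; print(json.load(sys.stdin)["keys"][0]["access_key"])')
secret_key=$(echo "${info}" | python3 -c 'import sys,json; print(json.load(sys.stdin)["keys"][0]["secret_key"])')
echo "access_key=${access_key}"
# Then feed them to the CLI profile used above:
# aws configure set aws_access_key_id "${access_key}" --profile testuser
# aws configure set aws_secret_access_key "${secret_key}" --profile testuser
```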
10. Ceph Security Configuration
Ceph offers several security features, including authentication, authorization, and TLS encryption.
10.1 Authentication
# ceph auth get-or-create client.admin mon 'allow *' osd 'allow *' mds 'allow *' mgr 'allow *'
# List keys
# ceph auth ls
# Export the admin keyring
# ceph auth get client.admin > /etc/ceph/ceph.client.admin.keyring
# Restrict permissions
# chmod 600 /etc/ceph/ceph.client.admin.keyring
10.2 TLS Encryption
# mkdir -p /etc/ceph/ssl
# openssl req -newkey rsa:2048 -nodes -keyout /etc/ceph/ssl/ceph.key -x509 -days 365 -out /etc/ceph/ssl/ceph.crt
# Edit the Ceph configuration file. Daemon-to-daemon traffic is encrypted
# with the messenger v2 "secure" mode; the certificate is used by
# TLS-terminating services such as RGW.
# vi /etc/ceph/ceph.conf
[global]
ms_cluster_mode = secure
ms_service_mode = secure
ms_client_mode = secure
[client.rgw.ceph01]
rgw_frontends = "beast ssl_port=8443 ssl_certificate=/etc/ceph/ssl/ceph.crt ssl_private_key=/etc/ceph/ssl/ceph.key"
# Restart Ceph services
# systemctl restart ceph.target
11. Ceph Performance Tuning
In production, Ceph should be tuned to improve storage and retrieval performance.
11.1 Memory Tuning
# vi /etc/ceph/ceph.conf
[osd]
osd_memory_target = 8589934592   # 8 GiB
[mds]
mds_cache_memory_limit = 4294967296   # 4 GiB
[mon]
# Initial values only; on a running cluster use "ceph osd set-full-ratio"
# and "ceph osd set-nearfull-ratio"
mon_osd_full_ratio = 0.95
mon_osd_nearfull_ratio = 0.85
# Restart Ceph services
# systemctl restart ceph-osd.target ceph-mds.target ceph-mon.target
11.2 Network Tuning
# vi /etc/sysctl.conf
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15
# Apply the settings
# sysctl -p
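After `sysctl -p`, it is worth confirming the kernel actually picked the values up. Reading `/proc/sys` directly is equivalent to `sysctl -n <key>` (keys match the sysctl.conf fragment above):

```shell
# Print the live value of each tuned kernel parameter.
for key in net.core.somaxconn net.ipv4.tcp_fin_timeout net.ipv4.tcp_keepalive_time; do
  path="/proc/sys/$(echo "${key}" | tr '.' '/')"
  printf "%s = %s\n" "${key}" "$(cat "${path}")"
done
```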
11.3 Disk Tuning
# vi /etc/sysctl.conf
vm.swappiness = 0
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
# Apply the settings
# sysctl -p
# Optimize XFS mount options
# vi /etc/fstab
/dev/sdb1 /data1 xfs defaults,noatime,nodiratime 0 0
/dev/sdc1 /data2 xfs defaults,noatime,nodiratime 0 0
/dev/sdd1 /data3 xfs defaults,noatime,nodiratime 0 0
/dev/sde1 /data4 xfs defaults,noatime,nodiratime 0 0
# Remount
# mount -o remount /data1
# mount -o remount /data2
# mount -o remount /data3
# mount -o remount /data4
12. Ceph Upgrades and Migration
This section covers Ceph version upgrades and data migration.
12.1 Upgrading Ceph
# Back up the configuration first
# cp -r /etc/ceph /backup/ceph-config-$(date +%Y%m%d)
# Stop Ceph services
# systemctl stop ceph.target
# Upgrade Ceph
# dnf update -y ceph ceph-mon ceph-osd ceph-mgr ceph-mds radosgw
# Start Ceph services
# systemctl start ceph.target
# Verify the upgrade
# ceph --version
ceph version 17.2.7 (abcdefg1234567890abcdefg1234567890abcdefg)
# Verify cluster status
# ceph status
12.2 Migrating Ceph Data
# Export all objects from the source pool (a simple approach for small pools)
# rados -p cephfs_data ls | xargs -I {} rados -p cephfs_data get {} /backup/ceph-data/{}
# Import the objects into the pool on the new cluster
# for obj in /backup/ceph-data/*; do rados -p cephfs_data put "$(basename "$obj")" "$obj"; done
# Install Ceph on the new cluster
# (repeat the installation steps above)
# Start Ceph services
# systemctl start ceph.target
# Verify the migration
# ceph status
13. Ceph Backup and Recovery
This section covers backup and recovery for Ceph.
13.1 Backing Up Ceph
# vi /data/ceph/scripts/backup.sh
#!/bin/bash
BACKUP_DIR="/backup/ceph"
DATE=$(date +%Y%m%d)
# Create the backup directory
mkdir -p $BACKUP_DIR
# Stop Ceph services (this is a cold backup; expect downtime while it runs)
systemctl stop ceph.target
# Back up the configuration
cp -r /etc/ceph $BACKUP_DIR/config-$DATE
# Back up the data
tar -czf $BACKUP_DIR/data-$DATE.tar.gz /var/lib/ceph
# Start Ceph services
systemctl start ceph.target
# Remove backups older than 7 days
find $BACKUP_DIR -type f -mtime +7 -exec rm -f {} \;
# Make the script executable
# chmod +x /data/ceph/scripts/backup.sh
# Add a cron job (daily at midnight)
# crontab -e
0 0 * * * /data/ceph/scripts/backup.sh
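Before relying on these backups, it helps to verify that the newest archive is actually readable. A small check (paths match the backup script above); it reports rather than fails when no archive exists:

```shell
# Find the newest data archive and test that tar can read it end to end.
BACKUP_DIR="/backup/ceph"
latest=$(ls -t "${BACKUP_DIR}"/data-*.tar.gz 2>/dev/null | head -n 1)
if [ -n "${latest}" ] && tar -tzf "${latest}" > /dev/null 2>&1; then
  echo "backup OK: ${latest}"
else
  echo "no readable backup found in ${BACKUP_DIR}"
fi
```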
13.2 Restoring Ceph
# Stop Ceph services
# systemctl stop ceph.target
# Remove the existing data
# rm -rf /var/lib/ceph
# Restore the data
# tar -xzf /backup/ceph/data-20230405.tar.gz -C /
# Restore the configuration
# cp -r /backup/ceph/config-20230405/* /etc/ceph/
# Start Ceph services
# systemctl start ceph.target
# Verify the recovery
# ceph status
13.3 Ceph Monitoring Script
# vi /data/ceph/scripts/monitor.sh
#!/bin/bash
# Simple health monitor. Parsing assumes the default plain-text output of
# "ceph health", "ceph osd stat", and "ceph mon stat".
LOG_FILE="/var/log/ceph_monitor.log"
ALERT_EMAIL="admin@fgedu.net.cn"
check_ceph_status() {
echo "$(date): Checking ceph status..." >> $LOG_FILE
# "ceph health" prints e.g. "HEALTH_OK" or "HEALTH_WARN ..."
status=$(ceph health | awk '{print $1}')
if [ "$status" != "HEALTH_OK" ]; then
echo "$(date): Ceph health is $status" >> $LOG_FILE
echo "Ceph health is $status" | mail -s "Ceph Alert" $ALERT_EMAIL
else
echo "$(date): Ceph health is OK" >> $LOG_FILE
fi
}
check_osd_status() {
echo "$(date): Checking osd status..." >> $LOG_FILE
# "ceph osd stat" prints e.g. "12 osds: 12 up (...), 12 in (...)"
osd_stat=$(ceph osd stat)
osd_count=$(echo "$osd_stat" | sed -n 's/^\([0-9]\+\) osds.*/\1/p')
osd_up=$(echo "$osd_stat" | sed -n 's/.*osds: \([0-9]\+\) up.*/\1/p')
osd_in=$(echo "$osd_stat" | sed -n 's/.* \([0-9]\+\) in .*/\1/p')
if [ "$osd_up" != "$osd_count" ] || [ "$osd_in" != "$osd_count" ]; then
echo "$(date): OSD status: $osd_up up, $osd_in in, $osd_count total" >> $LOG_FILE
echo "OSD status: $osd_up up, $osd_in in, $osd_count total" | mail -s "Ceph Alert" $ALERT_EMAIL
else
echo "$(date): OSD status: all $osd_count OSDs are up and in" >> $LOG_FILE
fi
}
check_mon_status() {
echo "$(date): Checking mon status..." >> $LOG_FILE
# "ceph mon stat" prints e.g. "e3: 3 mons at {...}, ..., quorum 0,1,2 ceph01,ceph02,ceph03"
mon_stat=$(ceph mon stat)
mon_count=$(echo "$mon_stat" | sed -n 's/^e[0-9]*: \([0-9]\+\) mons.*/\1/p')
mon_quorum=$(echo "$mon_stat" | sed 's/.*quorum \([0-9,]*\).*/\1/' | tr ',' '\n' | wc -l)
if [ "$mon_quorum" != "$mon_count" ]; then
echo "$(date): Monitor status: $mon_quorum in quorum, $mon_count total" >> $LOG_FILE
echo "Monitor status: $mon_quorum in quorum, $mon_count total" | mail -s "Ceph Alert" $ALERT_EMAIL
else
echo "$(date): Monitor status: all $mon_count monitors are in quorum" >> $LOG_FILE
fi
}
main() {
check_ceph_status
check_osd_status
check_mon_status
}
main
# Make the script executable
# chmod +x /data/ceph/scripts/monitor.sh
# Add a cron job (every 15 minutes)
# crontab -e
*/15 * * * * /data/ceph/scripts/monitor.sh
With the steps above, Ceph installation and configuration, performance tuning, upgrade and migration, and backup and recovery are all complete. As an open-source distributed storage system, Ceph stores and manages data efficiently and is an important tool for enterprise storage.
This article was compiled and published by the Fengge tutorial site for learning and testing purposes only; when reposting, credit the source: http://www.fgedu.net.cn/10327.html
