1. 首页 > 国产数据库教程 > 达梦DM教程 > 正文

DM教程FG147-达梦数据库灾难恢复

本文档风哥主要介绍DM数据库灾难恢复,包括灾难恢复概述、备份恢复概述、高可用概述、备份策略、恢复策略、高可用架构、备份实现、恢复实现、灾难恢复实现、实际案例和最佳实践等内容,风哥教程参考DM官方文档DM8备份恢复指南、DM8高可用指南,适合数据库管理员在学习和生产环境中使用。

Part01-基础概念与理论知识

1.1 灾难恢复概述

灾难恢复是指在发生灾难性事件时,恢复系统和服务的能力。

# 灾难恢复的定义
灾难恢复是指在发生灾难性事件时,恢复系统和服务的能力。
# 灾难类型
1. 自然灾难
– 地震
– 洪水
– 火灾
2. 人为灾难
– 操作失误
– 恶意攻击
– 数据删除
3. 技术灾难
– 硬件故障
– 软件故障
– 网络故障
# 灾难恢复目标
1. RPO(Recovery Point Objective)
– 恢复点目标
– 数据丢失容忍度
– 通常以时间衡量
2. RTO(Recovery Time Objective)
– 恢复时间目标
– 服务恢复时间
– 通常以时间衡量
# 灾难恢复价值
– 数据保护:保护重要数据
– 业务连续性:保证业务连续
– 降低损失:降低灾难损失
– 提高可靠性:提高系统可靠性

1.2 备份恢复概述

备份恢复是数据库管理的重要组成部分,是保证数据安全和业务连续性的重要手段。

# 备份恢复的定义
备份恢复是数据库管理的重要组成部分,是保证数据安全和业务连续性的重要手段。
# 备份类型
1. 全量备份
– 备份整个数据库
– 备份时间较长
– 恢复简单
2. 增量备份
– 备份增量数据
– 备份时间较短
– 恢复复杂
3. 差异备份
– 备份差异数据
– 备份时间适中
– 恢复适中
# 备份方式
1. 物理备份
– 备份物理文件 风哥提示:
– 备份速度快
– 跨平台限制
2. 逻辑备份
– 备份逻辑数据
– 备份速度慢
– 跨平台支持
# 备份恢复价值
– 数据保护:保护重要数据
– 数据恢复:快速恢复数据
– 业务连续:保证业务连续
– 降低风险:降低数据丢失风险

1.3 高可用概述

高可用是指系统在长时间内保持可用性的能力,是保证业务连续性的重要手段。

# 高可用的定义
高可用是指系统在长时间内保持可用性的能力,是保证业务连续性的重要手段。
# 高可用架构
1. 主备架构
– 主库处理读写
– 备库只读
– 故障切换
2. 集群架构
– 多节点集群
– 负载均衡
– 故障转移
3. 分布式架构
– 数据分布
– 负载均衡
– 故障隔离 学习交流加群风哥微信: itpux-com
# 高可用技术
1. Data Guard
– 主备复制
– 实时同步
– 故障切换
2. DMDSC
– 共享存储
– 多节点集群
– 负载均衡
# 高可用价值
– 提高可用性:提高系统可用性
– 快速故障恢复:快速恢复故障
– 负载均衡:均衡系统负载
– 数据安全:保证数据安全
风哥提示:灾难恢复是数据库管理的重要组成部分,掌握灾难恢复的方法和工具,是保证数据安全和业务连续性的关键。根据业务需求和数据重要性,选择合适的灾难恢复方案,是保证项目成功的重要手段。

Part02-生产环境规划与建议

2.1 备份策略

2.1.1 备份规划

# 1. 备份频率
– 全量备份:每周一次
– 增量备份:每天一次
– 日志备份:每小时一次
# 2. 备份保留
– 全量备份:保留4周
– 增量备份:保留1周
– 日志备份:保留3天
# 3. 备份存储
– 本地存储:快速恢复
– 异地存储:灾难恢复
– 云端存储:长期保留
# 4. 实际示例
– 备份脚本
$ vi /dm/scripts/backup.sh
#!/bin/bash
# 全量备份
/dm/bin/dmrman CTLSTMT=”BACKUP DATABASE ‘/dm/data/dm.ini’ FULL TO ‘/backup/full/full_$(date +%Y%m%d).bak'” 学习交流加群风哥QQ113257174
# 增量备份
/dm/bin/dmrman CTLSTMT=”BACKUP DATABASE ‘/dm/data/dm.ini’ INCREMENT TO ‘/backup/inc/inc_$(date +%Y%m%d).bak'”
# 日志备份
/dm/bin/dmrman CTLSTMT=”BACKUP ARCHIVELOG FROM LOG ‘/dm/data/dm.ini’ TO ‘/backup/log/log_$(date +%Y%m%d%H).bak'”
– 定时任务
$ crontab -e
# 全量备份:每周日凌晨2点
0 2 * * 0 /dm/scripts/backup.sh full
# 增量备份:每天凌晨2点
0 2 * * 1-6 /dm/scripts/backup.sh inc
# 日志备份:每小时
0 * * * * /dm/scripts/backup.sh log

2.1.2 备份验证

# 1. 备份完整性检查
– 检查备份文件
$ ls -lh /backup/full/
# 输出结果
# total 10G
# -rw-r–r– 1 dm dm 2.5G Jan 10 02:00 full_20240110.bak
# -rw-r–r– 1 dm dm 2.5G Jan 3 02:00 full_20240103.bak
– 验证备份文件
$ /dm/bin/dmrman CTLSTMT=”CHECK BACKUPSET ‘/backup/full/full_20240110.bak'”
# 输出结果
# check backupset ‘/backup/full/full_20240110.bak’ success
# 2. 备份恢复测试
– 恢复测试
$ /dm/bin/dmrman CTLSTMT=”RESTORE DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/full/full_20240110.bak'”
# 输出结果
# restore database ‘/dm/data/dm.ini’ from backupset ‘/backup/full/full_20240110.bak’ success
# 3. 实际示例
– 自动化验证脚本
$ vi /dm/scripts/verify_backup.sh
#!/bin/bash
# 检查备份文件
BACKUP_FILE=”/backup/full/full_$(date +%Y%m%d).bak”
if [ -f “$BACKUP_FILE” ]; then
echo “Backup file exists: $BACKUP_FILE”
# 验证备份文件 更多视频教程www.fgedu.net.cn
/dm/bin/dmrman CTLSTMT=”CHECK BACKUPSET ‘$BACKUP_FILE'”
if [ $? -eq 0 ]; then
echo “Backup file is valid”
else
echo “Backup file is invalid”
# 发送告警
echo “Backup file is invalid: $BACKUP_FILE” | mail -s “Backup Verification Failed” admin@fgedu.com
fi
else
echo “Backup file does not exist: $BACKUP_FILE”
# 发送告警
echo “Backup file does not exist: $BACKUP_FILE” | mail -s “Backup File Missing” admin@fgedu.com
fi
– 定时任务
$ crontab -e
# 备份验证:每天凌晨3点
0 3 * * * /dm/scripts/verify_backup.sh

2.2 恢复策略

2.2.1 恢复规划

# 1. 恢复场景
– 完全恢复
– 恢复到最新状态
– 使用全量备份+增量备份+日志
– 数据零丢失
– 时间点恢复
– 恢复到指定时间点
– 使用全量备份+增量备份+日志
– 数据部分丢失
– 表空间恢复
– 恢复指定表空间
– 使用表空间备份
– 快速恢复
# 2. 恢复流程
– 停止数据库
$ /dm/bin/dmserver /dm/data/dm.ini stop
– 恢复数据库
$ /dm/bin/dmrman CTLSTMT=”RESTORE DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/full/full_20240110.bak'”
– 恢复增量备份 更多学习教程公众号风哥教程itpux_com
$ /dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/inc/inc_20240110.bak'”
– 恢复日志
$ /dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM ARCHIVELOG ‘/backup/log/log_2024011010.bak'”
– 启动数据库
$ /dm/bin/dmserver /dm/data/dm.ini start
# 3. 实际示例
– 恢复脚本
$ vi /dm/scripts/recover.sh
#!/bin/bash
# 停止数据库
/dm/bin/dmserver /dm/data/dm.ini stop
# 恢复全量备份
/dm/bin/dmrman CTLSTMT=”RESTORE DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/full/full_20240110.bak'”
# 恢复增量备份
/dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/inc/inc_20240110.bak'”
# 恢复日志
/dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM ARCHIVELOG ‘/backup/log/log_2024011010.bak'”
# 启动数据库
/dm/bin/dmserver /dm/data/dm.ini start
# 验证恢复
/dm/bin/disql SYSDBA/SYSDBA@fgedu.localhost:5236 “SELECT * FROM V$VERSION”

2.2.2 恢复验证

# 1. 数据完整性检查
– 检查数据文件
$ ls -lh /dm/data/
# 输出结果
# total 20G
# -rw-r–r– 1 dm dm 10G Jan 10 10:00 SYSTEM.DBF from DB视频:www.itpux.com
# -rw-r–r– 1 dm dm 5G Jan 10 10:00 MAIN.DBF
# -rw-r–r– 1 dm dm 5G Jan 10 10:00 ROLL.DBF
– 检查数据库状态
SQL> SELECT * FROM V$INSTANCE;
# 输出结果
# NAME STATUS
# ——- ——-
# FGEDU OPEN
# 2. 数据一致性检查
– 检查表数据
SQL> SELECT COUNT(*) FROM fgedu_user;
# 输出结果
# COUNT(*)
# ———-
# 1000
– 检查索引
SQL> SELECT * FROM V$INDEX WHERE TABLE_NAME = ‘FGEDU_USER’;
# 输出结果
# INDEX_NAME TABLE_NAME STATUS
# ——————- ———— ——-
# IDX_FGEDU_USER_ID FGEDU_USER VALID
# IDX_FGEDU_USER_EMAIL FGEDU_USER VALID
# 3. 实际示例
– 验证脚本
$ vi /dm/scripts/verify_recovery.sh
#!/bin/bash
# 检查数据库状态
STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@fgedu.localhost:5236 -c “SELECT STATUS FROM V\$INSTANCE” | tail -n +2)
if [ “$STATUS” = “OPEN” ]; then
echo “Database is open”
# 检查表数据
COUNT=$(/dm/bin/disql SYSDBA/SYSDBA@fgedu.localhost:5236 -c “SELECT COUNT(*) FROM fgedu_user” | tail -n +2)
echo “Table fgedu_user has $COUNT rows”
# 检查索引
/dm/bin/disql SYSDBA/SYSDBA@fgedu.localhost:5236 -c “SELECT * FROM V\$INDEX WHERE TABLE_NAME = ‘FGEDU_USER'”
else
echo “Database is not open: $STATUS”
# 发送告警
echo “Database is not open: $STATUS” | mail -s “Recovery Verification Failed” admin@fgedu.com
fi

2.3 高可用架构

2.3.1 Data Guard架构

# 1. Data Guard架构
– 主库(Primary)
– 处理读写请求
– 生成Redo日志
– 发送到备库
– 备库(Standby)
– 接收Redo日志
– 应用Redo日志
– 只读查询
– 监控(Observer)
– 监控主备库状态
– 故障检测
– 自动切换
# 2. Data Guard配置
– 主库配置
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
– 备库配置
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
– Data Guard配置
SQL> CREATE STANDBY DATABASE ‘standby_db’ CONNECT TO ‘standby_host’ PORT 5236 USER ‘SYSDBA’ PASSWORD ‘SYSDBA’;
# 3. 实际示例
– 配置Data Guard
# 主库配置
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
SQL> CREATE STANDBY DATABASE ‘standby_db’ CONNECT TO ‘192.168.1.11’ PORT 5236 USER ‘SYSDBA’ PASSWORD ‘SYSDBA’;
# 备库配置
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
# 验证Data Guard
SQL> SELECT * FROM V$STANDBY;
# 输出结果
# STANDBY_NAME STANDBY_HOST STANDBY_PORT STANDBY_STATUS
# ————- ————– ————– —————
# standby_db 192.168.1.11 5236 NORMAL

2.3.2 DMDSC架构

# 1. DMDSC架构
– 共享存储
– 存储数据库文件
– 多节点共享
– 高可用存储
– 节点(Node)
– 访问共享存储
– 处理用户请求
– 故障转移
– 监控(CSS)
– 集群同步服务
– 节点状态监控
– 故障检测
# 2. DMDSC配置
– 共享存储配置
$ vi /etc/dmdsc/dmdsc.ini
DMDCS_CLUSTER_NAME = fgedu_cluster
DMDCS_EP_NAME = ep01
DMDCS_EP_HOST = 192.168.1.10
DMDCS_EP_PORT = 5236
DMDCS_SHARED_DIR = /shared/dm
– 节点配置
$ vi /etc/dmdsc/dm.ini
PORT_NUM = 5236
MAL_HOST = 192.168.1.10
MAL_PORT = 5246
MAL_INST_NAME = EP01
# 3. 实际示例
– 配置DMDSC
# 节点1配置
$ vi /etc/dmdsc/dmdsc.ini
DMDCS_CLUSTER_NAME = fgedu_cluster
DMDCS_EP_NAME = ep01
DMDCS_EP_HOST = 192.168.1.10
DMDCS_EP_PORT = 5236
DMDCS_SHARED_DIR = /shared/dm
$ vi /etc/dmdsc/dm.ini
PORT_NUM = 5236
MAL_HOST = 192.168.1.10
MAL_PORT = 5246
MAL_INST_NAME = EP01
# 节点2配置
$ vi /etc/dmdsc/dmdsc.ini
DMDCS_CLUSTER_NAME = fgedu_cluster
DMDCS_EP_NAME = ep02
DMDCS_EP_HOST = 192.168.1.11
DMDCS_EP_PORT = 5236
DMDCS_SHARED_DIR = /shared/dm
$ vi /etc/dmdsc/dm.ini
PORT_NUM = 5236
MAL_HOST = 192.168.1.11
MAL_PORT = 5246
MAL_INST_NAME = EP02
# 启动DMDSC
$ /dm/bin/dmcss /etc/dmdsc/dmdsc.ini
# 验证DMDSC
SQL> SELECT * FROM V$DMDSC_EP;
# 输出结果
# EP_NAME EP_HOST EP_PORT EP_STATUS
# ——– ———— ——– ———-
# ep01 192.168.1.10 5236 NORMAL
# ep02 192.168.1.11 5236 NORMAL
生产环境建议:根据业务需求和数据重要性,制定合适的备份策略和恢复策略。在高可用架构中,要考虑故障切换和数据一致性。建立完善的监控体系,及时发现和解决问题。

Part03-生产环境项目实施方案

3.1 备份实现

3.1.1 全量备份

# 1. 全量备份命令
– 使用dmrman备份
$ /dm/bin/dmrman CTLSTMT=”BACKUP DATABASE ‘/dm/data/dm.ini’ FULL TO ‘/backup/full/full_$(date +%Y%m%d).bak'”
# 输出结果
# backup database ‘/dm/data/dm.ini’ full to ‘/backup/full/full_20240110.bak’
# backup file: /backup/full/full_20240110.bak
# backup size: 2.5GB
# backup time: 1800s
# 2. 全量备份脚本
$ vi /dm/scripts/full_backup.sh
#!/bin/bash
# 备份目录
BACKUP_DIR=”/backup/full”
BACKUP_FILE=”$BACKUP_DIR/full_$(date +%Y%m%d).bak”
# 创建备份目录
mkdir -p $BACKUP_DIR
# 执行全量备份
/dm/bin/dmrman CTLSTMT=”BACKUP DATABASE ‘/dm/data/dm.ini’ FULL TO ‘$BACKUP_FILE'”
# 检查备份结果
if [ $? -eq 0 ]; then
echo “Full backup succeeded: $BACKUP_FILE”
# 发送成功通知
echo “Full backup succeeded: $BACKUP_FILE” | mail -s “Full Backup Succeeded” admin@fgedu.com
else
echo “Full backup failed: $BACKUP_FILE”
# 发送失败告警
echo “Full backup failed: $BACKUP_FILE” | mail -s “Full Backup Failed” admin@fgedu.com
fi
# 3. 实际示例
– 执行全量备份
$ /dm/scripts/full_backup.sh
# 输出结果
# backup database ‘/dm/data/dm.ini’ full to ‘/backup/full/full_20240110.bak’
# backup file: /backup/full/full_20240110.bak
# backup size: 2.5GB
# backup time: 1800s
# Full backup succeeded: /backup/full/full_20240110.bak
– 查看备份文件
$ ls -lh /backup/full/
# 输出结果
# total 10G
# -rw-r–r– 1 dm dm 2.5G Jan 10 02:00 full_20240110.bak

3.1.2 增量备份

# 1. 增量备份命令
– 使用dmrman备份
$ /dm/bin/dmrman CTLSTMT=”BACKUP DATABASE ‘/dm/data/dm.ini’ INCREMENT TO ‘/backup/inc/inc_$(date +%Y%m%d).bak'”
# 输出结果
# backup database ‘/dm/data/dm.ini’ increment to ‘/backup/inc/inc_20240110.bak’
# backup file: /backup/inc/inc_20240110.bak
# backup size: 500MB
# backup time: 300s
# 2. 增量备份脚本
$ vi /dm/scripts/inc_backup.sh
#!/bin/bash
# 备份目录
BACKUP_DIR=”/backup/inc”
BACKUP_FILE=”$BACKUP_DIR/inc_$(date +%Y%m%d).bak”
# 创建备份目录
mkdir -p $BACKUP_DIR
# 执行增量备份
/dm/bin/dmrman CTLSTMT=”BACKUP DATABASE ‘/dm/data/dm.ini’ INCREMENT TO ‘$BACKUP_FILE'”
# 检查备份结果
if [ $? -eq 0 ]; then
echo “Increment backup succeeded: $BACKUP_FILE”
# 发送成功通知
echo “Increment backup succeeded: $BACKUP_FILE” | mail -s “Increment Backup Succeeded” admin@fgedu.com
else
echo “Increment backup failed: $BACKUP_FILE”
# 发送失败告警
echo “Increment backup failed: $BACKUP_FILE” | mail -s “Increment Backup Failed” admin@fgedu.com
fi
# 3. 实际示例
– 执行增量备份
$ /dm/scripts/inc_backup.sh
# 输出结果
# backup database ‘/dm/data/dm.ini’ increment to ‘/backup/inc/inc_20240110.bak’
# backup file: /backup/inc/inc_20240110.bak
# backup size: 500MB
# backup time: 300s
# Increment backup succeeded: /backup/inc/inc_20240110.bak
– 查看备份文件
$ ls -lh /backup/inc/
# 输出结果
# total 2G
# -rw-r–r– 1 dm dm 500M Jan 10 02:00 inc_20240110.bak

3.2 恢复实现

3.2.1 完全恢复

# 1. 完全恢复命令
– 停止数据库
$ /dm/bin/dmserver /dm/data/dm.ini stop
– 恢复全量备份
$ /dm/bin/dmrman CTLSTMT=”RESTORE DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/full/full_20240110.bak'”
# 输出结果
# restore database ‘/dm/data/dm.ini’ from backupset ‘/backup/full/full_20240110.bak’
# restore file: /dm/data/SYSTEM.DBF
# restore file: /dm/data/MAIN.DBF
# restore file: /dm/data/ROLL.DBF
# restore time: 1200s
– 恢复增量备份
$ /dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/inc/inc_20240110.bak'”
# 输出结果
# recover database ‘/dm/data/dm.ini’ from backupset ‘/backup/inc/inc_20240110.bak’
# recover time: 300s
– 恢复日志
$ /dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM ARCHIVELOG ‘/backup/log/log_2024011010.bak'”
# 输出结果
# recover database ‘/dm/data/dm.ini’ from archivelog ‘/backup/log/log_2024011010.bak’
# recover time: 200s
– 启动数据库
$ /dm/bin/dmserver /dm/data/dm.ini start
# 2. 完全恢复脚本
$ vi /dm/scripts/full_recover.sh
#!/bin/bash
# 停止数据库
/dm/bin/dmserver /dm/data/dm.ini stop
# 恢复全量备份
/dm/bin/dmrman CTLSTMT=”RESTORE DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/full/full_20240110.bak'”
if [ $? -ne 0 ]; then
echo “Restore full backup failed”
exit 1
fi
# 恢复增量备份
/dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/inc/inc_20240110.bak'”
if [ $? -ne 0 ]; then
echo “Recover increment backup failed”
exit 1
fi
# 恢复日志
/dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM ARCHIVELOG ‘/backup/log/log_2024011010.bak'”
if [ $? -ne 0 ]; then
echo “Recover archivelog failed”
exit 1
fi
# 启动数据库
/dm/bin/dmserver /dm/data/dm.ini start
if [ $? -eq 0 ]; then
echo “Full recovery succeeded”
# 发送成功通知
echo “Full recovery succeeded” | mail -s “Full Recovery Succeeded” admin@fgedu.com
else
echo “Start database failed”
# 发送失败告警
echo “Start database failed” | mail -s “Full Recovery Failed” admin@fgedu.com
fi
# 3. 实际示例
– 执行完全恢复
$ /dm/scripts/full_recover.sh
# 输出结果
# restore database ‘/dm/data/dm.ini’ from backupset ‘/backup/full/full_20240110.bak’
# restore time: 1200s
# recover database ‘/dm/data/dm.ini’ from backupset ‘/backup/inc/inc_20240110.bak’
# recover time: 300s
# recover database ‘/dm/data/dm.ini’ from archivelog ‘/backup/log/log_2024011010.bak’
# recover time: 200s
# Full recovery succeeded

3.2.2 时间点恢复

# 1. 时间点恢复命令
– 停止数据库
$ /dm/bin/dmserver /dm/data/dm.ini stop
– 恢复全量备份
$ /dm/bin/dmrman CTLSTMT=”RESTORE DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/full/full_20240110.bak'”
– 恢复增量备份
$ /dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/inc/inc_20240110.bak'”
– 恢复日志到指定时间点
$ /dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM ARCHIVELOG ‘/backup/log/log_2024011010.bak’ UNTIL TIME ‘2024-01-10 10:00:00′”
# 输出结果
# recover database ‘/dm/data/dm.ini’ from archivelog ‘/backup/log/log_2024011010.bak’ until time ‘2024-01-10 10:00:00’
# recover time: 200s
– 启动数据库
$ /dm/bin/dmserver /dm/data/dm.ini start
# 2. 时间点恢复脚本
$ vi /dm/scripts/timepoint_recover.sh
#!/bin/bash
# 恢复时间点
RECOVER_TIME=$1
if [ -z “$RECOVER_TIME” ]; then
echo “Usage: $0
echo “Example: $0 ‘2024-01-10 10:00:00′”
exit 1
fi
# 停止数据库
/dm/bin/dmserver /dm/data/dm.ini stop
# 恢复全量备份
/dm/bin/dmrman CTLSTMT=”RESTORE DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/full/full_20240110.bak'”
if [ $? -ne 0 ]; then
echo “Restore full backup failed”
exit 1
fi
# 恢复增量备份
/dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM BACKUPSET ‘/backup/inc/inc_20240110.bak'”
if [ $? -ne 0 ]; then
echo “Recover increment backup failed”
exit 1
fi
# 恢复日志到指定时间点
/dm/bin/dmrman CTLSTMT=”RECOVER DATABASE ‘/dm/data/dm.ini’ FROM ARCHIVELOG ‘/backup/log/log_2024011010.bak’ UNTIL TIME ‘$RECOVER_TIME'”
if [ $? -ne 0 ]; then
echo “Recover archivelog failed”
exit 1
fi
# 启动数据库
/dm/bin/dmserver /dm/data/dm.ini start
if [ $? -eq 0 ]; then
echo “Timepoint recovery succeeded: $RECOVER_TIME”
# 发送成功通知
echo “Timepoint recovery succeeded: $RECOVER_TIME” | mail -s “Timepoint Recovery Succeeded” admin@fgedu.com
else
echo “Start database failed”
# 发送失败告警
echo “Start database failed” | mail -s “Timepoint Recovery Failed” admin@fgedu.com
fi
# 3. 实际示例
– 执行时间点恢复
$ /dm/scripts/timepoint_recover.sh “2024-01-10 10:00:00”
# 输出结果
# restore database ‘/dm/data/dm.ini’ from backupset ‘/backup/full/full_20240110.bak’
# restore time: 1200s
# recover database ‘/dm/data/dm.ini’ from backupset ‘/backup/inc/inc_20240110.bak’
# recover time: 300s
# recover database ‘/dm/data/dm.ini’ from archivelog ‘/backup/log/log_2024011010.bak’ until time ‘2024-01-10 10:00:00’
# recover time: 200s
# Timepoint recovery succeeded: 2024-01-10 10:00:00

3.3 灾难恢复实现

3.3.1 Data Guard故障切换

# 1. 故障检测
– 检查主库状态
SQL> SELECT * FROM V$STANDBY WHERE STANDBY_NAME = ‘primary_db’;
# 输出结果
# STANDBY_NAME STANDBY_HOST STANDBY_PORT STANDBY_STATUS
# ————- ————– ————– —————
# primary_db 192.168.1.10 5236 ABNORMAL
– 检查备库状态
SQL> SELECT * FROM V$STANDBY WHERE STANDBY_NAME = ‘standby_db’;
# 输出结果
# STANDBY_NAME STANDBY_HOST STANDBY_PORT STANDBY_STATUS
# ————- ————– ————– —————
# standby_db 192.168.1.11 5236 NORMAL
# 2. 故障切换
– 切换备库为主库
SQL> ALTER DATABASE SWITCHOVER TO PRIMARY;
# 输出结果
# executed successfully
– 验证切换结果
SQL> SELECT * FROM V$INSTANCE;
# 输出结果
# NAME STATUS ROLE
# ——- ——- —–
# FGEDU OPEN PRIMARY
# 3. 故障切换脚本
$ vi /dm/scripts/switchover.sh
#!/bin/bash
# 检查主库状态
PRIMARY_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.1.10:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘primary_db'” | tail -n +2)
if [ “$PRIMARY_STATUS” != “NORMAL” ]; then
echo “Primary database is abnormal: $PRIMARY_STATUS”
# 检查备库状态
STANDBY_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.1.11:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘standby_db'” | tail -n +2)
if [ “$STANDBY_STATUS” = “NORMAL” ]; then
echo “Standby database is normal: $STANDBY_STATUS”
# 切换备库为主库
/dm/bin/disql SYSDBA/SYSDBA@192.168.1.11:5236 -c “ALTER DATABASE SWITCHOVER TO PRIMARY”
if [ $? -eq 0 ]; then
echo “Switchover succeeded”
# 发送成功通知
echo “Switchover succeeded” | mail -s “Switchover Succeeded” admin@fgedu.com
else
echo “Switchover failed”
# 发送失败告警
echo “Switchover failed” | mail -s “Switchover Failed” admin@fgedu.com
fi
else
echo “Standby database is abnormal: $STANDBY_STATUS”
# 发送告警
echo “Both primary and standby databases are abnormal” | mail -s “Database Failure” admin@fgedu.com
fi
else
echo “Primary database is normal: $PRIMARY_STATUS”
fi
# 4. 实际示例
– 执行故障切换
$ /dm/scripts/switchover.sh
# 输出结果
# Primary database is abnormal: ABNORMAL
# Standby database is normal: NORMAL
# executed successfully
# Switchover succeeded
– 验证切换结果
SQL> SELECT * FROM V$INSTANCE;
# 输出结果
# NAME STATUS ROLE
# ——- ——- —–
# FGEDU OPEN PRIMARY

3.3.2 跨机房灾难恢复

# 1. 跨机房架构
– 主机房
– 主库:处理读写请求
– 备库:本地备库
– 存储:本地存储
– 异地机房
– 备库:异地备库
– 存储:异地存储
# 2. 跨机房配置
– 主库配置
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
SQL> CREATE STANDBY DATABASE ‘standby_local’ CONNECT TO ‘192.168.1.11’ PORT 5236 USER ‘SYSDBA’ PASSWORD ‘SYSDBA’;
SQL> CREATE STANDBY DATABASE ‘standby_remote’ CONNECT TO ‘192.168.2.11’ PORT 5236 USER ‘SYSDBA’ PASSWORD ‘SYSDBA’;
– 异地备库配置
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
# 3. 跨机房灾难恢复
– 检查主库状态
SQL> SELECT * FROM V$STANDBY WHERE STANDBY_NAME = ‘primary_db’;
# 输出结果
# STANDBY_NAME STANDBY_HOST STANDBY_PORT STANDBY_STATUS
# ————- ————– ————– —————
# primary_db 192.168.1.10 5236 ABNORMAL
– 检查异地备库状态
SQL> SELECT * FROM V$STANDBY WHERE STANDBY_NAME = ‘standby_remote’;
# 输出结果
# STANDBY_NAME STANDBY_HOST STANDBY_PORT STANDBY_STATUS
# ————- ————– ————– —————
# standby_remote 192.168.2.11 5236 NORMAL
– 切换异地备库为主库
SQL> ALTER DATABASE SWITCHOVER TO PRIMARY;
# 输出结果
# executed successfully
– 验证切换结果
SQL> SELECT * FROM V$INSTANCE;
# 输出结果
# NAME STATUS ROLE
# ——- ——- —–
# FGEDU OPEN PRIMARY
# 4. 实际示例
– 跨机房灾难恢复脚本
$ vi /dm/scripts/dr_recover.sh
#!/bin/bash
# 检查主库状态
PRIMARY_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.1.10:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘primary_db'” | tail -n +2)
if [ “$PRIMARY_STATUS” != “NORMAL” ]; then
echo “Primary database is abnormal: $PRIMARY_STATUS”
# 检查本地备库状态
STANDBY_LOCAL_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.1.11:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘standby_local'” | tail -n +2)
if [ “$STANDBY_LOCAL_STATUS” = “NORMAL” ]; then
echo “Local standby database is normal: $STANDBY_LOCAL_STATUS”
# 切换本地备库为主库
/dm/bin/disql SYSDBA/SYSDBA@192.168.1.11:5236 -c “ALTER DATABASE SWITCHOVER TO PRIMARY”
if [ $? -eq 0 ]; then
echo “Local switchover succeeded”
# 发送成功通知
echo “Local switchover succeeded” | mail -s “Local Switchover Succeeded” admin@fgedu.com
else
echo “Local switchover failed”
# 检查异地备库状态
STANDBY_REMOTE_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.2.11:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘standby_remote'” | tail -n +2)
if [ “$STANDBY_REMOTE_STATUS” = “NORMAL” ]; then
echo “Remote standby database is normal: $STANDBY_REMOTE_STATUS”
# 切换异地备库为主库
/dm/bin/disql SYSDBA/SYSDBA@192.168.2.11:5236 -c “ALTER DATABASE SWITCHOVER TO PRIMARY”
if [ $? -eq 0 ]; then
echo “Remote switchover succeeded”
# 发送成功通知
echo “Remote switchover succeeded” | mail -s “Remote Switchover Succeeded” admin@fgedu.com
else
echo “Remote switchover failed”
# 发送失败告警
echo “Remote switchover failed” | mail -s “Remote Switchover Failed” admin@fgedu.com
fi
else
echo “Remote standby database is abnormal: $STANDBY_REMOTE_STATUS”
# 发送告警
echo “All databases are abnormal” | mail -s “Database Disaster” admin@fgedu.com
fi
fi
else
echo “Local standby database is abnormal: $STANDBY_LOCAL_STATUS”
# 检查异地备库状态
STANDBY_REMOTE_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.2.11:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘standby_remote'” | tail -n +2)
if [ “$STANDBY_REMOTE_STATUS” = “NORMAL” ]; then
echo “Remote standby database is normal: $STANDBY_REMOTE_STATUS”
# 切换异地备库为主库
/dm/bin/disql SYSDBA/SYSDBA@192.168.2.11:5236 -c “ALTER DATABASE SWITCHOVER TO PRIMARY”
if [ $? -eq 0 ]; then
echo “Remote switchover succeeded”
# 发送成功通知
echo “Remote switchover succeeded” | mail -s “Remote Switchover Succeeded” admin@fgedu.com
else
echo “Remote switchover failed”
# 发送失败告警
echo “Remote switchover failed” | mail -s “Remote Switchover Failed” admin@fgedu.com
fi
else
echo “Remote standby database is abnormal: $STANDBY_REMOTE_STATUS”
# 发送告警
echo “All databases are abnormal” | mail -s “Database Disaster” admin@fgedu.com
fi
fi
else
echo “Primary database is normal: $PRIMARY_STATUS”
fi
风哥提示:灾难恢复是数据库管理的重要组成部分,掌握灾难恢复的方法和工具,是保证数据安全和业务连续性的关键。建立完善的监控体系,是保障系统稳定运行的关键。

Part04-生产案例与实战讲解

4.1 数据库备份恢复案例

4.1.1 案例描述

某企业需要定期备份DM数据库,并在需要时快速恢复数据。

4.1.2 实施步骤

# 1. 需求分析
– 业务需求
– 定期备份数据库
– 快速恢复数据
– 保证数据安全
– 技术需求
– 全量备份:每周一次
– 增量备份:每天一次
– 日志备份:每小时一次
# 2. 方案设计
– 备份策略
– 全量备份:每周日凌晨2点
– 增量备份:每天凌晨2点
– 日志备份:每小时
– 备份存储
– 本地存储:/backup/local
– 异地存储:/backup/remote
– 云端存储:s3://fgedu-backup
# 3. 实施步骤
– 创建备份脚本
$ vi /dm/scripts/backup.sh
#!/bin/bash
# 备份类型
BACKUP_TYPE=$1
# 全量备份
if [ “$BACKUP_TYPE” = “full” ]; then
BACKUP_FILE=”/backup/local/full/full_$(date +%Y%m%d).bak”
/dm/bin/dmrman CTLSTMT=”BACKUP DATABASE ‘/dm/data/dm.ini’ FULL TO ‘$BACKUP_FILE'”
# 复制到异地存储
scp $BACKUP_FILE root@backup-server:/backup/remote/full/
# 上传到云端存储
aws s3 cp $BACKUP_FILE s3://fgedu-backup/full/
# 增量备份
elif [ “$BACKUP_TYPE” = “inc” ]; then
BACKUP_FILE=”/backup/local/inc/inc_$(date +%Y%m%d).bak”
/dm/bin/dmrman CTLSTMT=”BACKUP DATABASE ‘/dm/data/dm.ini’ INCREMENT TO ‘$BACKUP_FILE'”
# 复制到异地存储
scp $BACKUP_FILE root@backup-server:/backup/remote/inc/
# 上传到云端存储
aws s3 cp $BACKUP_FILE s3://fgedu-backup/inc/
# 日志备份
elif [ “$BACKUP_TYPE” = “log” ]; then
BACKUP_FILE=”/backup/local/log/log_$(date +%Y%m%d%H).bak”
/dm/bin/dmrman CTLSTMT=”BACKUP ARCHIVELOG FROM LOG ‘/dm/data/dm.ini’ TO ‘$BACKUP_FILE'”
# 复制到异地存储
scp $BACKUP_FILE root@backup-server:/backup/remote/log/
# 上传到云端存储
aws s3 cp $BACKUP_FILE s3://fgedu-backup/log/
fi
# 检查备份结果
if [ $? -eq 0 ]; then
echo “Backup succeeded: $BACKUP_FILE”
else
echo “Backup failed: $BACKUP_FILE”
# 发送告警
echo “Backup failed: $BACKUP_FILE” | mail -s “Backup Failed” admin@fgedu.com
fi
– 配置定时任务
$ crontab -e
# 全量备份:每周日凌晨2点
0 2 * * 0 /dm/scripts/backup.sh full
# 增量备份:每天凌晨2点
0 2 * * 1-6 /dm/scripts/backup.sh inc
# 日志备份:每小时
0 * * * * /dm/scripts/backup.sh log
# 4. 验证结果
– 查看备份文件
$ ls -lh /backup/local/full/
# 输出结果
# total 10G
# -rw-r–r– 1 dm dm 2.5G Jan 10 02:00 full_20240110.bak
– 验证备份文件
$ /dm/bin/dmrman CTLSTMT=”CHECK BACKUPSET ‘/backup/local/full/full_20240110.bak'”
# 输出结果
# check backupset ‘/backup/local/full/full_20240110.bak’ success
# 5. 实施结果
– 备份策略实施成功
– 备份存储正常
– 备份验证正常

4.2 Data Guard灾难恢复案例

4.2.1 案例描述

某企业需要建立Data Guard高可用架构,实现主备库故障切换。

4.2.2 实施步骤

# 1. 需求分析
– 业务需求
– 建立主备架构
– 实现故障切换
– 保证数据安全
– 技术需求
– 主库:处理读写请求
– 备库:实时同步
– 故障切换:自动切换
# 2. 方案设计
– Data Guard架构
– 主库:192.168.1.10
– 备库:192.168.1.11
– 监控:192.168.1.12
– 故障切换策略
– 主库故障:自动切换到备库
– 备库故障:告警通知
– 数据同步:实时同步
# 3. 实施步骤
– 配置主库
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
SQL> CREATE STANDBY DATABASE ‘standby_db’ CONNECT TO ‘192.168.1.11’ PORT 5236 USER ‘SYSDBA’ PASSWORD ‘SYSDBA’;
– 配置备库
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
– 验证Data Guard
SQL> SELECT * FROM V$STANDBY;
# 输出结果
# STANDBY_NAME STANDBY_HOST STANDBY_PORT STANDBY_STATUS
# ————- ————– ————– —————
# standby_db 192.168.1.11 5236 NORMAL
– 配置故障切换脚本
$ vi /dm/scripts/switchover.sh
#!/bin/bash
# 检查主库状态
PRIMARY_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.1.10:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘primary_db'” | tail -n +2)
if [ “$PRIMARY_STATUS” != “NORMAL” ]; then
echo “Primary database is abnormal: $PRIMARY_STATUS”
# 检查备库状态
STANDBY_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.1.11:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘standby_db'” | tail -n +2)
if [ “$STANDBY_STATUS” = “NORMAL” ]; then
echo “Standby database is normal: $STANDBY_STATUS”
# 切换备库为主库
/dm/bin/disql SYSDBA/SYSDBA@192.168.1.11:5236 -c “ALTER DATABASE SWITCHOVER TO PRIMARY”
if [ $? -eq 0 ]; then
echo “Switchover succeeded”
# 发送成功通知
echo “Switchover succeeded” | mail -s “Switchover Succeeded” admin@fgedu.com
else
echo “Switchover failed”
# 发送失败告警
echo “Switchover failed” | mail -s “Switchover Failed” admin@fgedu.com
fi
else
echo “Standby database is abnormal: $STANDBY_STATUS”
# 发送告警
echo “Both primary and standby databases are abnormal” | mail -s “Database Failure” admin@fgedu.com
fi
else
echo “Primary database is normal: $PRIMARY_STATUS”
fi
– 配置定时任务
$ crontab -e
# 故障检测:每分钟
* * * * * /dm/scripts/switchover.sh
# 4. 验证结果
– 查看Data Guard状态
SQL> SELECT * FROM V$STANDBY;
# 输出结果
# STANDBY_NAME STANDBY_HOST STANDBY_PORT STANDBY_STATUS
# ————- ————– ————– —————
# standby_db 192.168.1.11 5236 NORMAL
# 5. 实施结果
– Data Guard配置成功
– 主备同步正常
– 故障切换正常

4.3 跨机房灾难恢复案例

4.3.1 案例描述

某企业需要建立跨机房灾难恢复架构,实现主机房故障时快速切换到异地机房。

4.3.2 实施步骤

# 1. 需求分析
– 业务需求
– 建立跨机房架构
– 实现异地灾难恢复
– 保证业务连续性
– 技术需求
– 主机房:主库+备库
– 异地机房:备库
– 故障切换:自动切换
# 2. 方案设计
– 跨机房架构
– 主机房:192.168.1.10(主库)、192.168.1.11(备库)
– 异地机房:192.168.2.11(备库)
– 灾难恢复策略
– 主机房故障:切换到异地机房
– 本地故障:切换到本地备库
– 数据同步:实时同步
# 3. 实施步骤
– 配置主库
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
SQL> CREATE STANDBY DATABASE ‘standby_local’ CONNECT TO ‘192.168.1.11’ PORT 5236 USER ‘SYSDBA’ PASSWORD ‘SYSDBA’;
SQL> CREATE STANDBY DATABASE ‘standby_remote’ CONNECT TO ‘192.168.2.11’ PORT 5236 USER ‘SYSDBA’ PASSWORD ‘SYSDBA’;
– 配置本地备库
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
– 配置异地备库
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_DEST’, ‘/dm/arch’)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_FILE_SIZE’, 1024)
SQL> SP_SET_PARA_VALUE(‘ARCH_INI’, ‘ARCH_SPACE_LIMIT’, 102400)
– 验证Data Guard
SQL> SELECT * FROM V$STANDBY;
# 输出结果
# STANDBY_NAME STANDBY_HOST STANDBY_PORT STANDBY_STATUS
# ————— ————– ————– —————
# standby_local 192.168.1.11 5236 NORMAL
# standby_remote 192.168.2.11 5236 NORMAL
– 配置灾难恢复脚本
$ vi /dm/scripts/dr_recover.sh
#!/bin/bash
# 检查主库状态
PRIMARY_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.1.10:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘primary_db'” | tail -n +2)
if [ “$PRIMARY_STATUS” != “NORMAL” ]; then
echo “Primary database is abnormal: $PRIMARY_STATUS”
# 检查本地备库状态
STANDBY_LOCAL_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.1.11:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘standby_local'” | tail -n +2)
if [ “$STANDBY_LOCAL_STATUS” = “NORMAL” ]; then
echo “Local standby database is normal: $STANDBY_LOCAL_STATUS”
# 切换本地备库为主库
/dm/bin/disql SYSDBA/SYSDBA@192.168.1.11:5236 -c “ALTER DATABASE SWITCHOVER TO PRIMARY”
if [ $? -eq 0 ]; then
echo “Local switchover succeeded”
# 发送成功通知
echo “Local switchover succeeded” | mail -s “Local Switchover Succeeded” admin@fgedu.com
else
echo “Local switchover failed”
# 检查异地备库状态
STANDBY_REMOTE_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.2.11:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘standby_remote'” | tail -n +2)
if [ “$STANDBY_REMOTE_STATUS” = “NORMAL” ]; then
echo “Remote standby database is normal: $STANDBY_REMOTE_STATUS”
# 切换异地备库为主库
/dm/bin/disql SYSDBA/SYSDBA@192.168.2.11:5236 -c “ALTER DATABASE SWITCHOVER TO PRIMARY”
if [ $? -eq 0 ]; then
echo “Remote switchover succeeded”
# 发送成功通知
echo “Remote switchover succeeded” | mail -s “Remote Switchover Succeeded” admin@fgedu.com
else
echo “Remote switchover failed”
# 发送失败告警
echo “Remote switchover failed” | mail -s “Remote Switchover Failed” admin@fgedu.com
fi
else
echo “Remote standby database is abnormal: $STANDBY_REMOTE_STATUS”
# 发送告警
echo “All databases are abnormal” | mail -s “Database Disaster” admin@fgedu.com
fi
fi
else
echo “Local standby database is abnormal: $STANDBY_LOCAL_STATUS”
# 检查异地备库状态
STANDBY_REMOTE_STATUS=$(/dm/bin/disql SYSDBA/SYSDBA@192.168.2.11:5236 -c “SELECT STATUS FROM V\$STANDBY WHERE STANDBY_NAME = ‘standby_remote'” | tail -n +2)
if [ “$STANDBY_REMOTE_STATUS” = “NORMAL” ]; then
echo “Remote standby database is normal: $STANDBY_REMOTE_STATUS”
# 切换异地备库为主库
/dm/bin/disql SYSDBA/SYSDBA@192.168.2.11:5236 -c “ALTER DATABASE SWITCHOVER TO PRIMARY”
if [ $? -eq 0 ]; then
echo “Remote switchover succeeded”
# 发送成功通知
echo “Remote switchover succeeded” | mail -s “Remote Switchover Succeeded” admin@fgedu.com
else
echo “Remote switchover failed”
# 发送失败告警
echo “Remote switchover failed” | mail -s “Remote Switchover Failed” admin@fgedu.com
fi
else
echo “Remote standby database is abnormal: $STANDBY_REMOTE_STATUS”
# 发送告警
echo “All databases are abnormal” | mail -s “Database Disaster” admin@fgedu.com
fi
fi
else
echo “Primary database is normal: $PRIMARY_STATUS”
fi
– 配置定时任务
$ crontab -e
# 灾难恢复检测:每分钟
* * * * * /dm/scripts/dr_recover.sh
# 4. 验证结果
– 查看Data Guard状态
SQL> SELECT * FROM V$STANDBY;
# 输出结果
# STANDBY_NAME STANDBY_HOST STANDBY_PORT STANDBY_STATUS
# ————— ————– ————– —————
# standby_local 192.168.1.11 5236 NORMAL
# standby_remote 192.168.2.11 5236 NORMAL
# 5. 实施结果
– 跨机房架构配置成功
– 主备同步正常
– 灾难恢复正常
生产环境建议:在灾难恢复项目完成后,要进行充分的测试,确保系统的可靠性和稳定性。建立完善的监控体系,及时发现和解决问题。定期进行灾难恢复演练,保持系统稳定运行。

Part05-风哥经验总结与分享

5.1 灾难恢复最佳实践

DM数据库灾难恢复最佳实践:

  • 充分规划:在建设前进行充分的规划,包括需求分析、架构设计、方案设计
  • 定期备份:定期进行备份,包括全量备份、增量备份、日志备份
  • 备份验证:定期验证备份,包括备份完整性检查、备份恢复测试
  • 异地存储:备份文件存储在异地,包括本地存储、异地存储、云端存储
  • 高可用架构:建立高可用架构,包括Data Guard、DMDSC
  • 故障切换:建立故障切换机制,包括自动切换、手动切换

    本文由风哥教程整理发布,仅用于学习测试使用,转载注明出处:http://www.fgedu.net.cn/10327.html

联系我们

在线咨询:点击这里给我发消息

微信号:itpux-com

工作日:9:30-18:30,节假日休息