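This article covers high-availability (HA) architecture design for MongoDB, including the design and implementation of replica sets and sharded clusters. This 风哥 tutorial draws on the Replication and Sharding chapters of the official MongoDB documentation.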
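Part01-Basic Concepts and Theory
1.1 High-Availability Architecture Overview
High availability is a system's ability to keep running when failures occur. For a database, it means the service stays available through hardware failures, network outages, and similar events.
MongoDB offers several HA options, including replica sets and sharded clusters, to suit applications of different scale and requirements.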
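1.2 MongoDB High-Availability Options
MongoDB supports the following HA options:
- Replica set: provides data redundancy and automatic failover
- Sharded cluster: provides horizontal scaling and load balancing
- Hybrid architecture: combines the strengths of both, typically as a sharded cluster whose shards and config servers are each deployed as replica sets
Each option fits different business scenarios; choose based on business requirements and data volume.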
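Part02-Production Environment Planning and Recommendations
2.1 High-Availability Architecture Planning
HA architecture planning should take the following factors into account:
- Business availability requirements: RTO (recovery time objective) and RPO (recovery point objective)
- Data volume and growth trend
- Performance requirements
- Budget constraints
风哥 tip: match the HA design to the business requirements, and avoid both over-engineering and under-engineering.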
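Availability requirements are often stated as a percentage alongside RTO and RPO. As a rough illustration, the sketch below converts such a target into a yearly downtime budget. The function name `downtimeBudgetMinutes` and the 99.9% and 99.99% targets are assumptions made for this example, not figures from the tutorial.

```javascript
// Hypothetical helper: converts an availability target (percent) into the
// maximum downtime allowed per year, in minutes. Assumes a 365-day year.
function downtimeBudgetMinutes(availabilityPercent) {
  if (!(availabilityPercent >= 0 && availabilityPercent <= 100)) {
    throw new RangeError("availabilityPercent must be between 0 and 100");
  }
  const minutesPerYear = 365 * 24 * 60; // 525,600 minutes
  return (minutesPerYear * (100 - availabilityPercent)) / 100;
}

// Example targets (assumed values, not taken from the tutorial).
for (const target of [99.9, 99.99]) {
  const minutes = downtimeBudgetMinutes(target).toFixed(1);
  console.log(`${target}% availability -> ${minutes} minutes of downtime per year`);
}
```

A target of 99.9% leaves roughly 526 minutes per year, while 99.99% leaves about 53. Every failover, including detection and election time, draws on that budget.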
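2.2 Hardware and Network Planning
Hardware and network planning covers:
- Server configuration: CPU, memory, storage
- Network architecture: bandwidth, latency, redundancy
- Data center layout: single site or multiple sites
- Backup strategy: local and off-site backups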
Part03-Production Implementation Plan
3.1 Replica Set Deployment
Deploy a three-node replica set:
# Prepare the configuration file (node 1)
vi /mongodb/app/mongod1.conf
systemLog:
  destination: file
  path: /mongodb/logs/mongod1.log
  logAppend: true
storage:
  dbPath: /mongodb/fgdata1
  # Journaling is always on from MongoDB 6.1, where this option was removed;
  # omit the journal block on those versions (applies to every file below)
  journal:
    enabled: true
processManagement:
  fork: true
  pidFilePath: /mongodb/run/mongod1.pid
net:
  port: 27017
  bindIp: 192.168.1.100
replication:
  replSetName: fgedu-repl
# Prepare the configuration file (node 2)
vi /mongodb/app/mongod2.conf
systemLog:
  destination: file
  path: /mongodb/logs/mongod2.log
  logAppend: true
storage:
  dbPath: /mongodb/fgdata2
  journal:
    enabled: true
processManagement:
  fork: true
  pidFilePath: /mongodb/run/mongod2.pid
net:
  port: 27018
  bindIp: 192.168.1.101
replication:
  replSetName: fgedu-repl
# Prepare the configuration file (node 3)
vi /mongodb/app/mongod3.conf
systemLog:
  destination: file
  path: /mongodb/logs/mongod3.log
  logAppend: true
storage:
  dbPath: /mongodb/fgdata3
  journal:
    enabled: true
processManagement:
  fork: true
  pidFilePath: /mongodb/run/mongod3.pid
net:
  port: 27019
  bindIp: 192.168.1.102
replication:
  replSetName: fgedu-repl
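The three files differ only in the log path, data path, PID file, port, and bind address, so they can be generated from one template instead of being edited by hand. The sketch below shows one way to do that; `renderMongodConf` is a hypothetical helper, and it only prints the text rather than writing any files.

```javascript
// Hypothetical helper: renders the mongod YAML shown above for one node.
// Only the per-node values change; the rest mirrors the tutorial's files.
function renderMongodConf({ index, port, bindIp, replSetName }) {
  return [
    "systemLog:",
    "  destination: file",
    `  path: /mongodb/logs/mongod${index}.log`,
    "  logAppend: true",
    "storage:",
    `  dbPath: /mongodb/fgdata${index}`,
    "  journal:",
    "    enabled: true",
    "processManagement:",
    "  fork: true",
    `  pidFilePath: /mongodb/run/mongod${index}.pid`,
    "net:",
    `  port: ${port}`,
    `  bindIp: ${bindIp}`,
    "replication:",
    `  replSetName: ${replSetName}`,
    "",
  ].join("\n");
}

// Per-node values taken from the three configuration files above.
const nodes = [
  { index: 1, port: 27017, bindIp: "192.168.1.100" },
  { index: 2, port: 27018, bindIp: "192.168.1.101" },
  { index: 3, port: 27019, bindIp: "192.168.1.102" },
];

for (const node of nodes) {
  console.log(`# /mongodb/app/mongod${node.index}.conf`);
  console.log(renderMongodConf({ ...node, replSetName: "fgedu-repl" }));
}
```

Keeping the shared settings in one place makes it harder for a single node to drift, for example by ending up with a different `replSetName`.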
Start the MongoDB instances:
# Start node 1
/mongodb/app/bin/mongod --config /mongodb/app/mongod1.conf
# Start node 2
/mongodb/app/bin/mongod --config /mongodb/app/mongod2.conf
# Start node 3
/mongodb/app/bin/mongod --config /mongodb/app/mongod3.conf
Initiate the replica set:
# Connect to node 1
/mongodb/app/bin/mongosh --host 192.168.1.100 --port 27017
# Initiate the replica set
rs.initiate({
  _id: "fgedu-repl",
  members: [
    { _id: 0, host: "192.168.1.100:27017" },
    { _id: 1, host: "192.168.1.101:27018" },
    { _id: 2, host: "192.168.1.102:27019" }
  ]
})
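The configuration document passed to `rs.initiate()` follows a regular pattern: the replica set name plus one numbered member per host. As a minimal sketch, it can be built from a host list. `buildReplSetConfig` is a hypothetical helper, not part of mongosh.

```javascript
// Hypothetical helper: builds the document passed to rs.initiate() from a
// replica set name and a list of "host:port" strings.
function buildReplSetConfig(name, hosts) {
  if (hosts.length === 0) {
    throw new Error("at least one member is required");
  }
  if (new Set(hosts).size !== hosts.length) {
    throw new Error("duplicate member host");
  }
  return {
    _id: name,
    // Member _id values are assigned in list order, starting at 0.
    members: hosts.map((host, i) => ({ _id: i, host })),
  };
}

const replConfig = buildReplSetConfig("fgedu-repl", [
  "192.168.1.100:27017",
  "192.168.1.101:27018",
  "192.168.1.102:27019",
]);
console.log(JSON.stringify(replConfig, null, 2));
// In mongosh, the same object could be passed directly: rs.initiate(replConfig)
```

The duplicate check catches a common copy-and-paste slip before the configuration ever reaches the server.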
3.2 Sharded Cluster Deployment
Deploy a sharded cluster:
# Prepare the configuration file (config server)
vi /mongodb/app/configsvr.conf
systemLog:
  destination: file
  path: /mongodb/logs/configsvr.log
  logAppend: true
storage:
  dbPath: /mongodb/configdata
  journal:
    enabled: true
processManagement:
  fork: true
  pidFilePath: /mongodb/run/configsvr.pid
net:
  port: 27019
  bindIp: 192.168.1.103
sharding:
  clusterRole: configsvr
replication:
  replSetName: config-repl
# Prepare the configuration file (query router, mongos)
vi /mongodb/app/mongos.conf
systemLog:
  destination: file
  path: /mongodb/logs/mongos.log
  logAppend: true
processManagement:
  fork: true
  pidFilePath: /mongodb/run/mongos.pid
net:
  port: 27020
  bindIp: 192.168.1.104
sharding:
  configDB: config-repl/192.168.1.103:27019
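Start the config server and the query router:
# Start the config server
/mongodb/app/bin/mongod --config /mongodb/app/configsvr.conf
# Initiate the config server replica set (configsvr: true marks it as a config server replica set)
/mongodb/app/bin/mongosh --host 192.168.1.103 --port 27019 --eval "rs.initiate({ _id: 'config-repl', configsvr: true, members: [ { _id: 0, host: '192.168.1.103:27019' } ] })"
# Start the query router
/mongodb/app/bin/mongos --config /mongodb/app/mongos.conf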
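Part04-Production Cases and Hands-On Walkthroughs
4.1 Replica Set High Availability in Practice
Check the replica set status:
# Connect to the replica set
/mongodb/app/bin/mongosh --host 192.168.1.100 --port 27017
# Check the replica set status
rs.status()
# Output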
{
  "set": "fgedu-repl",
  "date": ISODate("2026-04-08T10:00:00Z"),
  "myState": 1,
  "term": NumberLong(1),
  "syncingTo": "",
  "syncSourceHost": "",
  "syncSourceId": -1,
  "heartbeatIntervalMillis": NumberLong(2000),
  "majorityVoteCount": 2,
  "writeMajorityCount": 2,
  "optimes": {
    "lastCommittedOpTime": {
      "ts": Timestamp(1712544000, 1),
      "t": NumberLong(1)
    },
    "lastCommittedWallTime": ISODate("2026-04-08T10:00:00Z"),
    "readConcernMajorityOpTime": {
      "ts": Timestamp(1712544000, 1),
      "t": NumberLong(1)
    },
    "readConcernMajorityWallTime": ISODate("2026-04-08T10:00:00Z"),
    "appliedOpTime": {
      "ts": Timestamp(1712544000, 1),
      "t": NumberLong(1)
    },
    "durableOpTime": {
      "ts": Timestamp(1712544000, 1),
      "t": NumberLong(1)
    },
    "lastAppliedWallTime": ISODate("2026-04-08T10:00:00Z"),
    "lastDurableWallTime": ISODate("2026-04-08T10:00:00Z")
  },
  "lastStableRecoveryTimestamp": Timestamp(1712544000, 1),
  "electionCandidateMetrics": {
    "lastElectionReason": "electionTimeout",
    "lastElectionDate": ISODate("2026-04-08T09:50:00Z"),
    "electionTerm": NumberLong(1),
    "lastCommittedOpTimeAtElection": {
      "ts": Timestamp(0, 0),
      "t": NumberLong(-1)
    },
    "lastSeenOpTimeAtElection": {
      "ts": Timestamp(1712543400, 1),
      "t": NumberLong(-1)
    },
    "numVotesNeeded": 2,
    "priorityAtElection": 1,
    "electionTimeoutMillis": NumberLong(10000),
    "newTermStartDate": ISODate("2026-04-08T09:50:00Z"),
    "wMajorityWriteAvailabilityDate": ISODate("2026-04-08T09:50:00Z")
  },
  "members": [
    {
      "_id": 0,
      "name": "192.168.1.100:27017",
      "health": 1,
      "state": 1,
      "stateStr": "PRIMARY",
      "uptime": 3600,
      "optime": {
        "ts": Timestamp(1712544000, 1),
        "t": NumberLong(1)
      },
      "optimeDate": ISODate("2026-04-08T10:00:00Z"),
      "syncingTo": "",
      "syncSourceHost": "",
      "syncSourceId": -1,
      "infoMessage": "",
      "electionTime": Timestamp(1712543400, 1),
      "electionDate": ISODate("2026-04-08T09:50:00Z"),
      "configVersion": 1,
      "configTerm": 1,
      "self": true,
      "lastHeartbeatMessage": ""
    },
    {
      "_id": 1,
      "name": "192.168.1.101:27018",
      "health": 1,
      "state": 2,
      "stateStr": "SECONDARY",
      "uptime": 3540,
      "optime": {
        "ts": Timestamp(1712544000, 1),
        "t": NumberLong(1)
      },
      "optimeDurable": {
        "ts": Timestamp(1712544000, 1),
        "t": NumberLong(1)
      },
      "optimeDate": ISODate("2026-04-08T10:00:00Z"),
      "optimeDurableDate": ISODate("2026-04-08T10:00:00Z"),
      "lastHeartbeat": ISODate("2026-04-08T10:00:00Z"),
      "lastHeartbeatRecv": ISODate("2026-04-08T10:00:00Z"),
      "pingMs": NumberLong(1),
      "lastHeartbeatMessage": "",
      "syncingTo": "192.168.1.100:27017",
      "syncSourceHost": "192.168.1.100:27017",
      "syncSourceId": 0,
      "infoMessage": "",
      "configVersion": 1,
      "configTerm": 1
    },
    {
      "_id": 2,
      "name": "192.168.1.102:27019",
      "health": 1,
      "state": 2,
      "stateStr": "SECONDARY",
      "uptime": 3540,
      "optime": {
        "ts": Timestamp(1712544000, 1),
        "t": NumberLong(1)
      },
      "optimeDurable": {
        "ts": Timestamp(1712544000, 1),
        "t": NumberLong(1)
      },
      "optimeDate": ISODate("2026-04-08T10:00:00Z"),
      "optimeDurableDate": ISODate("2026-04-08T10:00:00Z"),
      "lastHeartbeat": ISODate("2026-04-08T10:00:00Z"),
      "lastHeartbeatRecv": ISODate("2026-04-08T10:00:00Z"),
      "pingMs": NumberLong(1),
      "lastHeartbeatMessage": "",
      "syncingTo": "192.168.1.100:27017",
      "syncSourceHost": "192.168.1.100:27017",
      "syncSourceId": 0,
      "infoMessage": "",
      "configVersion": 1,
      "configTerm": 1
    }
  ],
  "ok": 1,
  "$clusterTime": {
    "clusterTime": Timestamp(1712544000, 1),
    "signature": {
      "hash": BinData(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      "keyId": 0
    }
  },
  "operationTime": Timestamp(1712544000, 1)
}
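The full status document is long, and in day-to-day checks only a few fields per member matter: state, health, and how far each secondary trails the primary. The sketch below condenses an `rs.status()`-shaped document into one row per member. `summarizeReplSet` is a hypothetical helper, and the 5-second lag in the sample data is invented for the demonstration.

```javascript
// Hypothetical helper: condenses an rs.status()-shaped document into one row
// per member, with replication lag measured against the primary's optimeDate.
function summarizeReplSet(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  return status.members.map((m) => ({
    name: m.name,
    state: m.stateStr,
    healthy: m.health === 1,
    // Lag is only reported for a SECONDARY while a primary exists.
    lagSeconds:
      primary && m.stateStr === "SECONDARY"
        ? (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000
        : null,
  }));
}

// Trimmed stand-in for rs.status(); the third member's lag is made up.
const sampleStatus = {
  set: "fgedu-repl",
  members: [
    { name: "192.168.1.100:27017", health: 1, stateStr: "PRIMARY", optimeDate: new Date("2026-04-08T10:00:00Z") },
    { name: "192.168.1.101:27018", health: 1, stateStr: "SECONDARY", optimeDate: new Date("2026-04-08T10:00:00Z") },
    { name: "192.168.1.102:27019", health: 1, stateStr: "SECONDARY", optimeDate: new Date("2026-04-08T09:59:55Z") },
  ],
};

for (const row of summarizeReplSet(sampleStatus)) {
  console.log(JSON.stringify(row));
}
// In mongosh, the live document could be passed instead: summarizeReplSet(rs.status())
```

mongosh also ships `rs.printSecondaryReplicationInfo()`, which prints per-secondary lag directly. A helper like this one is mainly useful when the numbers need to feed a script or an alert.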
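Simulate a failover:
# Stop the primary (node 1). The instances above were started directly with mongod, so shut node 1 down the same way
/mongodb/app/bin/mongod --config /mongodb/app/mongod1.conf --shutdown
# If mongod runs under systemd instead, stop its unit (the unit name is site-specific): systemctl stop mongod1
# Connect to a surviving member and check the replica set status once the election has finished
/mongodb/app/bin/mongosh --host 192.168.1.101 --port 27018
rs.status()
# Output (a new primary has been elected)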
{
  "set": "fgedu-repl",
  "date": ISODate("2026-04-08T10:05:00Z"),
  "myState": 1,
  "term": NumberLong(2),
  "syncingTo": "",
  "syncSourceHost": "",
  "syncSourceId": -1,
  "heartbeatIntervalMillis": NumberLong(2000),
  "majorityVoteCount": 2,
  "writeMajorityCount": 2,
  "optimes": {
    "lastCommittedOpTime": {
      "ts": Timestamp(1712544300, 1),
      "t": NumberLong(2)
    },
    "lastCommittedWallTime": ISODate("2026-04-08T10:05:00Z"),
    "readConcernMajorityOpTime": {
      "ts": Timestamp(1712544300, 1),
      "t": NumberLong(2)
    },
    "readConcernMajorityWallTime": ISODate("2026-04-08T10:05:00Z"),
    "appliedOpTime": {
      "ts": Timestamp(1712544300, 1),
      "t": NumberLong(2)
    },
    "durableOpTime": {
      "ts": Timestamp(1712544300, 1),
      "t": NumberLong(2)
    },
    "lastAppliedWallTime": ISODate("2026-04-08T10:05:00Z"),
    "lastDurableWallTime": ISODate("2026-04-08T10:05:00Z")
  },
  "lastStableRecoveryTimestamp": Timestamp(1712544300, 1),
  "electionCandidateMetrics": {
    "lastElectionReason": "stepUpRequestSkipDryRun",
    "lastElectionDate": ISODate("2026-04-08T10:05:00Z"),
    "electionTerm": NumberLong(2),
    "lastCommittedOpTimeAtElection": {
      "ts": Timestamp(1712544000, 1),
      "t": NumberLong(1)
    },
    "lastSeenOpTimeAtElection": {
      "ts": Timestamp(1712544000, 1),
      "t": NumberLong(1)
    },
    "numVotesNeeded": 2,
    "priorityAtElection": 1,
    "electionTimeoutMillis": NumberLong(10000),
    "newTermStartDate": ISODate("2026-04-08T10:05:00Z"),
    "wMajorityWriteAvailabilityDate": ISODate("2026-04-08T10:05:00Z")
  },
  "members": [
    {
      "_id": 0,
      "name": "192.168.1.100:27017",
      "health": 0,
      "state": 8,
      "stateStr": "(not reachable/healthy)",
      "uptime": 0,
      "optime": {
        "ts": Timestamp(0, 0),
        "t": NumberLong(-1)
      },
      "optimeDurable": {
        "ts": Timestamp(0, 0),
        "t": NumberLong(-1)
      },
      "optimeDate": ISODate("1970-01-01T00:00:00Z"),
      "optimeDurableDate": ISODate("1970-01-01T00:00:00Z"),
      "lastHeartbeat": ISODate("2026-04-08T10:05:00Z"),
      "lastHeartbeatRecv": ISODate("2026-04-08T10:00:00Z"),
      "pingMs": NumberLong(0),
      "lastHeartbeatMessage": "Connection refused",
      "syncingTo": "",
      "syncSourceHost": "",
      "syncSourceId": -1,
      "infoMessage": "",
      "configVersion": -1,
      "configTerm": -1
    },
    {
      "_id": 1,
      "name": "192.168.1.101:27018",
      "health": 1,
      "state": 1,
      "stateStr": "PRIMARY",
      "uptime": 3840,
      "optime": {
        "ts": Timestamp(1712544300, 1),
        "t": NumberLong(2)
      },
      "optimeDurable": {
        "ts": Timestamp(1712544300, 1),
        "t": NumberLong(2)
      },
      "optimeDate": ISODate("2026-04-08T10:05:00Z"),
      "optimeDurableDate": ISODate("2026-04-08T10:05:00Z"),
      "lastHeartbeat": ISODate("2026-04-08T10:05:00Z"),
      "lastHeartbeatRecv": ISODate("2026-04-08T10:05:00Z"),
      "pingMs": NumberLong(1),
      "lastHeartbeatMessage": "",
      "syncingTo": "",
      "syncSourceHost": "",
      "syncSourceId": -1,
      "infoMessage": "",
      "electionTime": Timestamp(1712544300, 1),
      "electionDate": ISODate("2026-04-08T10:05:00Z"),
      "configVersion": 1,
      "configTerm": 2,
      "self": true,
      "lastHeartbeatMessage": ""
    },
    {
      "_id": 2,
      "name": "192.168.1.102:27019",
      "health": 1,
      "state": 2,
      "stateStr": "SECONDARY",
      "uptime": 3840,
      "optime": {
        "ts": Timestamp(1712544300, 1),
        "t": NumberLong(2)
      },
      "optimeDurable": {
        "ts": Timestamp(1712544300, 1),
        "t": NumberLong(2)
      },
      "optimeDate": ISODate("2026-04-08T10:05:00Z"),
      "optimeDurableDate": ISODate("2026-04-08T10:05:00Z"),
      "lastHeartbeat": ISODate("2026-04-08T10:05:00Z"),
      "lastHeartbeatRecv": ISODate("2026-04-08T10:05:00Z"),
      "pingMs": NumberLong(1),
      "lastHeartbeatMessage": "",
      "syncingTo": "192.168.1.101:27018",
      "syncSourceHost": "192.168.1.101:27018",
      "syncSourceId": 1,
      "infoMessage": "",
      "configVersion": 1,
      "configTerm": 2
    }
  ],
  "ok": 1,
  "$clusterTime": {
    "clusterTime": Timestamp(1712544300, 1),
    "signature": {
      "hash": BinData(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      "keyId": 0
    }
  },
  "operationTime": Timestamp(1712544300, 1)
}
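During a failover drill, it helps to poll for the new primary rather than rerun the status command by hand. The sketch below shows a simple polling loop under stated assumptions: `waitForPrimary` is a hypothetical helper, the status source and the pause function are passed in so the loop can be exercised without a live cluster, and the sample polls are simulated.

```javascript
// Hypothetical helper: polls a status source until some member reports
// state 1 (PRIMARY), or gives up after maxAttempts polls.
//   getStatus: function returning an rs.status()-shaped document
//   pause:     function called between polls (e.g. () => sleep(1000) in mongosh)
function waitForPrimary(getStatus, pause, maxAttempts = 30) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const primary = getStatus().members.find((m) => m.state === 1);
    if (primary) {
      return { primary: primary.name, attempts: attempt };
    }
    if (attempt < maxAttempts) pause();
  }
  return { primary: null, attempts: maxAttempts };
}

// Simulated election: two polls without a primary, then node 2 wins.
const noPrimary = {
  members: [
    { name: "192.168.1.101:27018", state: 2 },
    { name: "192.168.1.102:27019", state: 2 },
  ],
};
const withPrimary = {
  members: [
    { name: "192.168.1.101:27018", state: 1 },
    { name: "192.168.1.102:27019", state: 2 },
  ],
};
const simulatedPolls = [noPrimary, noPrimary, withPrimary];
let pollIndex = 0;
const nextStatus = () => simulatedPolls[Math.min(pollIndex++, simulatedPolls.length - 1)];

console.log(waitForPrimary(nextStatus, () => {}));
// In mongosh, connected to a surviving member:
//   waitForPrimary(() => rs.status(), () => sleep(1000))
```

The attempt count, multiplied by the pause interval, gives a rough measure of how long the set went without a primary, which can be compared against the RTO.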
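4.2 Sharded Cluster High Availability in Practice
Add a shard:
# Connect to the query router (mongos)
/mongodb/app/bin/mongosh --host 192.168.1.104 --port 27020
# Add the shard (every fgedu-repl member must be running with sharding.clusterRole: shardsvr in its configuration file before it can join the cluster)
sh.addShard("fgedu-repl/192.168.1.100:27017,192.168.1.101:27018,192.168.1.102:27019")
# Check the sharding status
sh.status()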
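The argument to `sh.addShard()` has the form `<replSetName>/<host1>,<host2>,...`, and typing it by hand is error-prone. As a small sketch, the string can be derived from the shard's own replica set configuration. `shardConnectionString` is a hypothetical helper, and the sample document is a trimmed stand-in for what `rs.conf()` returns.

```javascript
// Hypothetical helper: turns an rs.conf()-shaped document into the
// "<replSetName>/<host1>,<host2>,..." string that sh.addShard() expects.
function shardConnectionString(replConf) {
  const hosts = replConf.members.map((m) => m.host);
  if (hosts.length === 0) {
    throw new Error("replica set has no members");
  }
  return `${replConf._id}/${hosts.join(",")}`;
}

// Trimmed stand-in for rs.conf() on the fgedu-repl replica set.
const sampleConf = {
  _id: "fgedu-repl",
  members: [
    { _id: 0, host: "192.168.1.100:27017" },
    { _id: 1, host: "192.168.1.101:27018" },
    { _id: 2, host: "192.168.1.102:27019" },
  ],
};

console.log(shardConnectionString(sampleConf));
```

One possible workflow is to run the helper against `rs.conf()` on a shard member, then paste the resulting string into `sh.addShard()` on the mongos.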
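Enable sharding:
# Enable sharding for the database
sh.enableSharding("fgedudb")
# Shard the collection on the score field (ranged shard key)
sh.shardCollection("fgedudb.fgedu_users", { "score": 1 })
# Check the sharding status
sh.status()
风哥 tip: the choice of shard key is critical to sharded-cluster performance; choose it based on the application's access patterns.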
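Before committing to a shard key, it can help to look at a sample of real values for the candidate field. The sketch below computes three rough indicators: how many distinct values exist, how dominant the most common value is, and whether values only ever increase. `shardKeyStats` is a hypothetical helper, and the `score` sample is invented for illustration.

```javascript
// Hypothetical helper: summarizes a sample of candidate shard key values.
//   cardinality:   number of distinct values (more values allow finer splits)
//   topValueShare: fraction of the sample held by the most common value
//                  (a high share hints at a potential hot spot)
//   monotonic:     true if the sample never decreases in insertion order
//                  (ever-increasing keys send new inserts to one range)
function shardKeyStats(values) {
  if (values.length === 0) {
    throw new Error("sample must not be empty");
  }
  const counts = new Map();
  for (const v of values) {
    counts.set(v, (counts.get(v) || 0) + 1);
  }
  const topCount = Math.max(...counts.values());
  const monotonic = values.every((v, i) => i === 0 || values[i - 1] <= v);
  return {
    cardinality: counts.size,
    topValueShare: topCount / values.length,
    monotonic,
  };
}

// Made-up sample of score values, in insertion order.
const sampleScores = [72, 85, 85, 90, 60, 85, 100, 72];
console.log(shardKeyStats(sampleScores));
```

A field such as `score` with only a few dozen possible values would show low cardinality on real data, which is one reason to test candidates against production-like samples rather than rely on intuition.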
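Part05-风哥's Lessons Learned
5.1 High-Availability Best Practices
HA best practices recommended by 风哥:
- Use an odd number of voting members in a replica set; 3 to 5 is recommended
- Place replica set members on different physical servers
- Configure a suitable election timeout for the replica set (settings.electionTimeoutMillis, 10 seconds by default)
- Monitor replica set status and replication lag regularly
- Use a monitoring tool (such as MongoDB Atlas or Prometheus) to track cluster health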
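The preference for an odd member count follows from majority voting: electing a primary needs more than half of the voting members, so adding a fourth member to a three-member set raises the majority without raising the number of failures the set can absorb. The sketch below works through that arithmetic. `electionMath` is a hypothetical helper name.

```javascript
// Hypothetical helper: for a replica set with `votingMembers` voting members,
// returns the votes needed to elect a primary and how many voting members
// can be lost while a majority still remains.
function electionMath(votingMembers) {
  if (!Number.isInteger(votingMembers) || votingMembers < 1 || votingMembers > 7) {
    // MongoDB allows at most 7 voting members in a replica set.
    throw new RangeError("votingMembers must be an integer between 1 and 7");
  }
  const majority = Math.floor(votingMembers / 2) + 1;
  return { votingMembers, majority, faultTolerance: votingMembers - majority };
}

for (let n = 1; n <= 5; n++) {
  const { majority, faultTolerance } = electionMath(n);
  console.log(`${n} voting members: majority ${majority}, tolerates ${faultTolerance} failure(s)`);
}
```

For three members the majority is 2, which matches the `majorityVoteCount` shown in the `rs.status()` output earlier. Four members still tolerate only one failure, while five tolerate two.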
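5.2 Failure-Handling Recommendations
Recommendations for handling failures:
- Prepare a detailed failure-response runbook
- Run failure drills regularly
- Build thorough monitoring and alerting
- Keep database backups current and verify that they can be restored
- Establish a fast recovery procedure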
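Notes
- A replica set needs at least three voting members to provide high availability: with three, a majority of two survives the loss of one
- Deploy the config servers of a sharded cluster as a replica set as well, with three members in production rather than the single member used in this walkthrough
- Choose the shard key with both data distribution and query patterns in mind
- Check replication lag in the replica set regularly
- Use professional monitoring tools in production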
Compiled and published by 风哥教程 for learning and testing purposes only. When reposting, credit the source: http://www.fgedu.net.cn/10327.html
