本文主要介绍MongoDB数据库集群主从架构的部署与管理,包括复制集配置、主从切换、数据同步监控等内容。风哥教程参考MongoDB官方文档Replication相关章节。
目录大纲
Part01-基础概念与理论知识
1.1 主从复制概述
主从复制是MongoDB实现高可用和数据冗余的核心机制。通过主从复制,可以将数据从主节点同步到一个或多个从节点,实现数据的备份和读写分离。
主从复制的主要特点:
- 数据冗余:从节点保存主节点的数据副本,防止数据丢失
- 高可用性:主节点故障时,从节点可以自动提升为主节点
- 读写分离:读操作可以分发到从节点,减轻主节点压力
- 灾难恢复:支持跨数据中心部署,实现异地容灾
主从复制的基本原理:
- 主节点记录所有写操作到oplog(操作日志)
- 从节点异步拉取主节点的oplog并应用到本地
- 通过心跳机制监控节点状态,自动进行故障转移
学习交流加群风哥微信: itpux-com
1.2 复制集架构
复制集成员角色:
- Primary(主节点):接收所有写操作,是复制集中唯一的可写节点
- Secondary(从节点):复制主节点的数据,可以处理读请求
- Arbiter(仲裁节点):不存储数据,只参与选举投票
- Hidden(隐藏节点):对应用不可见,用于备份或报表
- Delayed(延迟节点):延迟应用oplog,用于数据恢复
- Priority 0(优先级为0的节点):不会被选举为主节点
复制集选举机制:
- 当主节点不可用时,从节点会发起选举
- 获得大多数投票的节点成为新的主节点
- 优先级高的节点优先被选举
- 数据最新的节点优先被选举
更多视频教程www.fgedu.net.cn
Part02-生产环境规划与建议
2.1 集群架构设计
复制集架构设计:
- 三节点架构:1主节点 + 2从节点(推荐)
- 五节点架构:1主节点 + 3从节点 + 1仲裁节点
- 跨数据中心架构:主节点在A数据中心,从节点分布在B、C数据中心
架构设计建议:
- 复制集成员数量建议为奇数(3、5、7)
- 仲裁节点不存储数据,可以部署在资源较少的机器上
- 跨数据中心部署时,考虑网络延迟和带宽
- 设置合适的优先级,控制主节点的位置
风哥提示:复制集架构设计需要考虑高可用性、数据安全和网络延迟等因素。
2.2 硬件与环境规划
硬件规划:
- CPU:建议8核以上,主节点配置可以更高
- 内存:建议32GB以上,确保足够的缓存空间
- 存储:使用SSD存储,确保足够的IO性能
- 网络:千兆网络,确保节点间通信顺畅
环境规划:
- 操作系统:CentOS 7/8、Ubuntu 20.04等Linux发行版
- MongoDB版本:建议使用5.0或更高版本
- 网络配置:确保节点间可以互相通信,开放27017端口
- 时间同步:所有节点配置NTP时间同步
更多学习教程公众号风哥教程itpux_com
Part03-生产环境项目实施方案
3.1 复制集部署
复制集部署步骤:
# 1. 准备三个节点
# 节点1: 192.168.1.100 (主节点)
# 节点2: 192.168.1.101 (从节点)
# 节点3: 192.168.1.102 (从节点)
# 2. 配置mongod.conf
# 在所有节点上配置
vi /mongodb/app/mongod.conf
# 添加复制集配置
replication:
replSetName: fgedu-repl
oplogSizeMB: 2048
net:
bindIp: 0.0.0.0
port: 27017
storage:
dbPath: /mongodb/fgdata
journal:
enabled: true
# 3. 启动所有节点
systemctl start mongod
# 4. 初始化复制集
mongosh –host 192.168.1.100 –port 27017
rs.initiate({
_id: “fgedu-repl”,
members: [
{ _id: 0, host: “192.168.1.100:27017”, priority: 2 },
{ _id: 1, host: “192.168.1.101:27017”, priority: 1 },
{ _id: 2, host: “192.168.1.102:27017”, priority: 1 }
]
})
# 5. 查看复制集状态
rs.status()
3.2 主从切换
手动主从切换:
# 1. 查看当前复制集状态
rs.status()
# 2. 将主节点降级为从节点
# 在主节点上执行
rs.stepDown(60)
# 3. 强制将某个从节点提升为主节点
# 在目标从节点上执行
rs.freeze(0)
rs.reconfig({
_id: “fgedu-repl”,
members: [
{ _id: 0, host: “192.168.1.100:27017”, priority: 1 },
{ _id: 1, host: “192.168.1.101:27017”, priority: 2 },
{ _id: 2, host: “192.168.1.102:27017”, priority: 1 }
]
})
# 4. 验证切换结果
rs.status()
3.3 数据同步监控
数据同步监控:
# 1. 查看复制集状态
rs.status()
# 2. 查看oplog信息
rs.printReplicationInfo()
# 3. 查看从节点同步状态
rs.printSlaveReplicationInfo()
# 4. 查看复制延迟
db.adminCommand({ replSetGetStatus: 1 }).members.forEach(function(member) {
if (member.stateStr === “SECONDARY”) {
print(“节点: ” + member.name + “, 延迟: ” + member.optimeDate + “秒”);
}
})
# 5. 监控脚本
vi /mongodb/scripts/check_replication.sh
#!/bin/bash
mongosh –host 192.168.1.100 –port 27017 –eval ”
var status = rs.status();
status.members.forEach(function(member) {
if (member.stateStr === ‘SECONDARY’) {
var lag = (new Date() – member.optimeDate) / 1000;
if (lag > 10) {
print(‘WARNING: 节点 ‘ + member.name + ‘ 延迟 ‘ + lag + ‘ 秒’);
} else {
print(‘OK: 节点 ‘ + member.name + ‘ 延迟 ‘ + lag + ‘ 秒’);
}
}
});
”
Part04-生产案例与实战讲解
4.1 复制集部署实战
复制集部署实战:
# 1. 在三台服务器上安装MongoDB
# 服务器1: 192.168.1.100
# 服务器2: 192.168.1.101
# 服务器3: 192.168.1.102
# 2. 配置mongod.conf
# 在服务器1上
cat > /mongodb/app/mongod.conf << EOF
systemLog:
destination: file
logAppend: true
path: /mongodb/logs/mongod.log
storage:
dbPath: /mongodb/fgdata
journal:
enabled: true
wiredTiger:
engineConfig:
cacheSizeGB: 8
processManagement:
fork: true
pidFilePath: /mongodb/logs/mongod.pid
timeZoneInfo: /usr/share/zoneinfo
net:
port: 27017
bindIp: 0.0.0.0
replication:
replSetName: fgedu-repl
oplogSizeMB: 2048
EOF
# 3. 启动MongoDB服务
systemctl start mongod
systemctl status mongod
# 4. 初始化复制集
mongosh --host 192.168.1.100 --port 27017
> rs.initiate({
… _id: “fgedu-repl”,
… members: [
… { _id: 0, host: “192.168.1.100:27017”, priority: 2 },
… { _id: 1, host: “192.168.1.101:27017”, priority: 1 },
… { _id: 2, host: “192.168.1.102:27017”, priority: 1 }
… ]
… })
# 输出:
{
“ok”: 1,
“$clusterTime”: {
“clusterTime”: Timestamp(1617820800, 1),
“signature”: {
“hash”: BinData(0, “AAAAAAAAAAAAAAAAAAAAAAAAAAA=”),
“keyId”: NumberLong(0)
}
},
“operationTime”: Timestamp(1617820800, 1)
}
# 5. 验证复制集状态
> rs.status()
# 输出:
{
“set”: “fgedu-repl”,
“date”: ISODate(“2026-04-08T10:00:00.000Z”),
“myState”: 1,
“term”: NumberLong(1),
“syncSource”: “”,
“syncSourceHost”: “”,
“heartbeatIntervalMillis”: NumberLong(2000),
“majorityVoteCount”: 2,
“writeMajorityCount”: 2,
“votingMembersCount”: 3,
“writableVotingMembersCount”: 3,
“optimes”: {
“lastAppliedOpTime”: { “ts”: Timestamp(1617820800, 1), “t”: NumberLong(1) },
“lastDurableOpTime”: { “ts”: Timestamp(1617820800, 1), “t”: NumberLong(1) },
“lastAppliedWallTime”: ISODate(“2026-04-08T10:00:00.000Z”),
“lastDurableWallTime”: ISODate(“2026-04-08T10:00:00.000Z”)
},
“lastStableRecoveryTimestamp”: Timestamp(1617820800, 1),
“electionCandidateMetrics”: {
“lastElectionReason”: “electionTimeout”,
“lastElectionDate”: ISODate(“2026-04-08T10:00:00.000Z”),
“electionTerm”: NumberLong(1),
“lastCommittedOpTimeAtElection”: { “ts”: Timestamp(1617820800, 1), “t”: NumberLong(-1) },
“lastSeenOpTimeAtElection”: { “ts”: Timestamp(1617820800, 1), “t”: NumberLong(-1) },
“numVotesNeeded”: 2,
“priorityAtElection”: 1,
“electionTimeoutMillis”: NumberLong(10000),
“numCatchUpOps”: NumberLong(0),
“newTermStartDate”: ISODate(“2026-04-08T10:00:00.000Z”),
“wMajorityWriteAvailabilityDate”: ISODate(“2026-04-08T10:00:00.000Z”)
},
“members”: [
{
“_id”: 0,
“name”: “192.168.1.100:27017”,
“health”: 1,
“state”: 1,
“stateStr”: “PRIMARY”,
“uptime”: 1200,
“optime”: { “ts”: Timestamp(1617820800, 1), “t”: NumberLong(1) },
“optimeDate”: ISODate(“2026-04-08T10:00:00.000Z”),
“syncSource”: “”,
“syncSourceHost”: “”,
“infoMessage”: “”,
“electionTime”: Timestamp(1617820800, 1),
“electionDate”: ISODate(“2026-04-08T10:00:00.000Z”),
“configVersion”: 1,
“configTerm”: 1,
“self”: true,
“lastHeartbeatMessage”: “”
},
{
“_id”: 1,
“name”: “192.168.1.101:27017”,
“health”: 1,
“state”: 2,
“stateStr”: “SECONDARY”,
“uptime”: 1200,
“optime”: { “ts”: Timestamp(1617820800, 1), “t”: NumberLong(1) },
“optimeDurable”: { “ts”: Timestamp(1617820800, 1), “t”: NumberLong(1) },
“optimeDate”: ISODate(“2026-04-08T10:00:00.000Z”),
“optimeDurableDate”: ISODate(“2026-04-08T10:00:00.000Z”),
“lastAppliedWallTime”: ISODate(“2026-04-08T10:00:00.000Z”),
“lastDurableWallTime”: ISODate(“2026-04-08T10:00:00.000Z”),
“lastHeartbeat”: ISODate(“2026-04-08T10:00:00.000Z”),
“lastHeartbeatRecv”: ISODate(“2026-04-08T10:00:00.000Z”),
“pingMs”: NumberLong(0),
“syncSource”: “192.168.1.100:27017”,
“syncSourceHost”: “192.168.1.100:27017”,
“infoMessage”: “”,
“configVersion”: 1,
“configTerm”: 1
},
{
“_id”: 2,
“name”: “192.168.1.102:27017”,
“health”: 1,
“state”: 2,
“stateStr”: “SECONDARY”,
“uptime”: 1200,
“optime”: { “ts”: Timestamp(1617820800, 1), “t”: NumberLong(1) },
“optimeDurable”: { “ts”: Timestamp(1617820800, 1), “t”: NumberLong(1) },
“optimeDate”: ISODate(“2026-04-08T10:00:00.000Z”),
“optimeDurableDate”: ISODate(“2026-04-08T10:00:00.000Z”),
“lastAppliedWallTime”: ISODate(“2026-04-08T10:00:00.000Z”),
“lastDurableWallTime”: ISODate(“2026-04-08T10:00:00.000Z”),
“lastHeartbeat”: ISODate(“2026-04-08T10:00:00.000Z”),
“lastHeartbeatRecv”: ISODate(“2026-04-08T10:00:00.000Z”),
“pingMs”: NumberLong(0),
“syncSource”: “192.168.1.100:27017”,
“syncSourceHost”: “192.168.1.100:27017”,
“infoMessage”: “”,
“configVersion”: 1,
“configTerm”: 1
}
],
“ok”: 1,
“$clusterTime”: {
“clusterTime”: Timestamp(1617820800, 1),
“signature”: {
“hash”: BinData(0, “AAAAAAAAAAAAAAAAAAAAAAAAAAA=”),
“keyId”: NumberLong(0)
}
},
“operationTime”: Timestamp(1617820800, 1)
}
from MongoDB视频:www.itpux.com
4.2 主从切换实战
主从切换实战:
# 1. 查看当前主节点
rs.status().members.forEach(function(member) {
if (member.stateStr === “PRIMARY”) {
print(“当前主节点: ” + member.name);
}
})
# 输出:当前主节点: 192.168.1.100:27017
# 2. 手动触发主从切换
# 在主节点上执行
rs.stepDown(60)
# 3. 查看切换结果
rs.status()
# 输出:
{
“set”: “fgedu-repl”,
“date”: ISODate(“2026-04-08T10:05:00.000Z”),
“myState”: 2,
“term”: NumberLong(2),
“syncSource”: “192.168.1.101:27017”,
“syncSourceHost”: “192.168.1.101:27017”,
“members”: [
{
“_id”: 0,
“name”: “192.168.1.100:27017”,
“health”: 1,
“state”: 2,
“stateStr”: “SECONDARY”,
“uptime”: 1500,
“optime”: { “ts”: Timestamp(1617821100, 1), “t”: NumberLong(2) },
“syncSource”: “192.168.1.101:27017”,
“syncSourceHost”: “192.168.1.101:27017”
},
{
“_id”: 1,
“name”: “192.168.1.101:27017”,
“health”: 1,
“state”: 1,
“stateStr”: “PRIMARY”,
“uptime”: 1500,
“optime”: { “ts”: Timestamp(1617821100, 1), “t”: NumberLong(2) },
“syncSource”: “”,
“syncSourceHost”: “”,
“electionTime”: Timestamp(1617821100, 1),
“electionDate”: ISODate(“2026-04-08T10:05:00.000Z”)
},
{
“_id”: 2,
“name”: “192.168.1.102:27017”,
“health”: 1,
“state”: 2,
“stateStr”: “SECONDARY”,
“uptime”: 1500,
“optime”: { “ts”: Timestamp(1617821100, 1), “t”: NumberLong(2) },
“syncSource”: “192.168.1.101:27017”,
“syncSourceHost”: “192.168.1.101:27017”
}
]
}
# 4. 验证新主节点
mongosh –host 192.168.1.101 –port 27017 –eval “db.isMaster()”
# 输出:
{
“ismaster”: true,
“secondary”: false,
“hosts”: [
“192.168.1.100:27017”,
“192.168.1.101:27017”,
“192.168.1.102:27017”
],
“setName”: “fgedu-repl”,
“setVersion”: 1,
“primary”: “192.168.1.101:27017”,
“me”: “192.168.1.101:27017”,
“electionId”: ObjectId(“7fffffff0000000000000002”),
“lastWrite”: {
“opTime”: { “ts”: Timestamp(1617821100, 1), “t”: NumberLong(2) },
“lastWriteDate”: ISODate(“2026-04-08T10:05:00.000Z”),
“majorityOpTime”: { “ts”: Timestamp(1617821100, 1), “t”: NumberLong(2) },
“majorityWriteDate”: ISODate(“2026-04-08T10:05:00.000Z”)
}
}
4.3 故障恢复实战
故障恢复实战:
# 1. 模拟主节点故障
# 停止主节点服务
ssh 192.168.1.101 “systemctl stop mongod”
# 2. 查看复制集状态
rs.status()
# 输出:
{
“set”: “fgedu-repl”,
“date”: ISODate(“2026-04-08T10:10:00.000Z”),
“myState”: 2,
“term”: NumberLong(3),
“syncSource”: “192.168.1.102:27017”,
“syncSourceHost”: “192.168.1.102:27017”,
“members”: [
{
“_id”: 0,
“name”: “192.168.1.100:27017”,
“health”: 1,
“state”: 2,
“stateStr”: “SECONDARY”,
“uptime”: 1800,
“syncSource”: “192.168.1.102:27017”,
“syncSourceHost”: “192.168.1.102:27017”
},
{
“_id”: 1,
“name”: “192.168.1.101:27017”,
“health”: 0,
“state”: 8,
“stateStr”: “(not reachable/healthy)”,
“uptime”: 0,
“lastHeartbeatMessage”: “Connection refused”
},
{
“_id”: 2,
“name”: “192.168.1.102:27017”,
“health”: 1,
“state”: 1,
“stateStr”: “PRIMARY”,
“uptime”: 1800,
“electionTime”: Timestamp(1617821400, 1),
“electionDate”: ISODate(“2026-04-08T10:10:00.000Z”)
}
]
}
# 3. 恢复故障节点
ssh 192.168.1.101 “systemctl start mongod”
# 4. 验证节点恢复
rs.status()
# 输出:
{
“set”: “fgedu-repl”,
“date”: ISODate(“2026-04-08T10:15:00.000Z”),
“myState”: 2,
“term”: NumberLong(3),
“syncSource”: “192.168.1.102:27017”,
“syncSourceHost”: “192.168.1.102:27017”,
“members”: [
{
“_id”: 0,
“name”: “192.168.1.100:27017”,
“health”: 1,
“state”: 2,
“stateStr”: “SECONDARY”,
“uptime”: 2100,
“syncSource”: “192.168.1.102:27017”,
“syncSourceHost”: “192.168.1.102:27017”
},
{
“_id”: 1,
“name”: “192.168.1.101:27017”,
“health”: 1,
“state”: 2,
“stateStr”: “SECONDARY”,
“uptime”: 300,
“syncSource”: “192.168.1.102:27017”,
“syncSourceHost”: “192.168.1.102:27017”
},
{
“_id”: 2,
“name”: “192.168.1.102:27017”,
“health”: 1,
“state”: 1,
“stateStr”: “PRIMARY”,
“uptime”: 2100,
“electionTime”: Timestamp(1617821400, 1),
“electionDate”: ISODate(“2026-04-08T10:10:00.000Z”)
}
]
}
风哥提示:复制集具有自动故障转移能力,当主节点故障时,从节点会自动选举新的主节点。
Part05-风哥经验总结与分享
5.1 主从复制最佳实践
风哥建议的主从复制最佳实践:
- 节点数量:复制集成员数量建议为奇数(3、5、7)
- 优先级设置:合理设置节点优先级,控制主节点位置
- oplog大小:根据数据写入量设置合适的oplog大小
- 读写分离:读操作分发到从节点,减轻主节点压力
- 监控告警:监控复制延迟和节点状态,及时发现异常
- 备份策略:使用隐藏节点或延迟节点进行备份
- 安全配置:启用认证,配置防火墙,限制访问
- 跨数据中心:跨数据中心部署时,考虑网络延迟
学习交流加群风哥QQ113257174
5.2 常见问题与解决方案
常见问题与解决方案:
- 问题:复制延迟高
- 解决方案:优化网络,增加从节点资源,调整读偏好
- 问题:主节点频繁切换
- 解决方案:检查网络稳定性,调整electionTimeoutMillis
- 问题:从节点无法同步
- 解决方案:检查oplog是否足够大,重新初始化从节点
- 问题:读写分离不生效
- 解决方案:检查readPreference配置,确保从节点健康
- 问题:复制集无法选举主节点
- 解决方案:检查节点数量是否满足大多数,检查网络连通性
更多视频教程www.fgedu.net.cn
注意事项
- 复制集成员数量建议为奇数
- 合理设置节点优先级
- 根据数据写入量设置合适的oplog大小
- 读操作分发到从节点,减轻主节点压力
- 监控复制延迟和节点状态
- 启用认证,配置防火墙
- 跨数据中心部署时考虑网络延迟
本文由风哥教程整理发布,仅用于学习测试使用,转载注明出处:http://www.fgedu.net.cn/10327.html
