本文详细介绍Oracle GoldenGate Extract进程的概念、类型、工作原理和配置方法,以及在生产环境中的规划、实施和管理。风哥教程参考Oracle官方文档GoldenGate等内容。学习交流加群风哥微信: itpux-com
Part01-基础概念与理论知识
1.1 Extract进程概念
Extract进程是Oracle GoldenGate的源端核心组件,负责从源数据库捕获数据变化。它读取源数据库的事务日志,捕获插入、更新和删除操作,并将这些操作写入本地的Trail文件,为后续的数据传输和应用做准备。
1.2 Extract进程类型
Oracle GoldenGate支持以下类型的Extract进程:
- 经典Extract(Classic Extract):使用传统的方法从数据库日志中捕获数据变化,适用于各种数据库平台
- 集成Extract(Integrated Extract):与Oracle数据库的日志挖掘服务器集成,提供更好的性能和功能,仅适用于Oracle数据库
- 初始加载Extract(Initial Load Extract):用于初始数据加载,一次性将源数据库的数据复制到目标数据库
1.3 Extract工作原理
Extract进程的工作原理包括以下步骤:
- 连接到源数据库,读取数据库的事务日志
- 解析日志记录,识别数据变化操作(插入、更新、删除)
- 根据配置的过滤条件,筛选需要复制的数据
- 将数据变化写入本地的Trail文件
- 记录检查点信息,便于故障恢复
$ cd /u01/app/goldengate
$ ./ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 21.3.0.0.0 OGGCORE_21.3.0.0.0_PLATFORMS_210715.1004_FBO
Linux, x64, 64bit (optimized), Oracle 19c on Jul 17 2021 05:46:21
Copyright (C) 1995, 2021, Oracle and/or its affiliates. All rights reserved.
GGSCI (source-db) 1> INFO EXTRACT *
EXTRACT EXT1 Last Started 2026-04-03 16:30 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:05 ago)
Process ID 12345
Log Read Checkpoint Oracle Redo Logs
2026-04-03 17:45:30 Seqno 52, RBA 12345678
SCN 0.12345678 (12345678)
EXTRACT DP1 Last Started 2026-04-03 16:31 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:06 ago)
Process ID 12346
Log Read Checkpoint File /u01/app/goldengate/dirdat/et000005
2026-04-03 17:45:30.123456 RBA 12345
Part02-生产环境规划与建议
2.1 Extract规划考虑因素
在规划Extract进程时,需要考虑以下因素:
- Extract类型选择:根据数据库类型和版本选择合适的Extract类型(经典Extract或集成Extract)
- 过滤条件配置:配置适当的过滤条件,只复制需要的数据
- Trail文件配置:配置Trail文件的大小、位置和保留策略
- 检查点配置:配置检查点的频率和保留策略
- 性能配置:配置Extract进程的性能参数,如并行度、缓存大小等
2.2 Extract性能优化建议
以下是Extract进程的性能优化建议:
- 使用集成Extract:对于Oracle数据库,建议使用集成Extract,它提供更好的性能和功能
- 配置合适的Trail文件大小:根据数据量和变化频率,配置合适的Trail文件大小(如100MB)
- 启用并行处理:对于大表,可以启用并行处理,提高Extract的性能
- 优化过滤条件:使用高效的过滤条件,减少需要处理的数据量
- 配置适当的缓存大小:根据系统内存情况,配置适当的缓存大小
2.3 Extract高可用性规划
为了确保Extract进程的高可用性,可以采取以下措施:
- 配置自动重启:在Manager进程中配置Extract的自动重启参数
- 使用故障转移:配置Extract的故障转移机制,确保在节点故障时能够快速恢复
- 定期备份配置:定期备份Extract的配置文件和检查点文件
- 监控进程状态:使用GoldenGate的监控工具或第三方监控工具监控Extract进程的状态
Part03-生产环境项目实施方案
3.1 经典Extract配置
经典Extract适用于各种数据库平台,配置步骤包括:
- 配置Extract参数文件
- 添加Extract进程
- 添加Trail文件
- 启动Extract进程
$ cd /u01/app/goldengate
$ ./ggsci
GGSCI (source-db) 1> EDIT PARAMS EXT1
EXTRACT EXT1
USERID ggsadmin, PASSWORD Ggsadmin_123
EXTTRAIL /u01/app/goldengate/dirdat/et
TABLE fgedu.customers;TABLE fgedu.orders;# 2. 添加Extract进程
GGSCI (source-db) 2> ADD EXTRACT EXT1, TRANLOG, BEGIN NOW
EXTRACT added.
# 3. 添加Trail文件
GGSCI (source-db) 3> ADD EXTTRAIL /u01/app/goldengate/dirdat/et, EXTRACT EXT1, MEGABYTES 100
EXTTRAIL added.
# 4. 启动Extract进程
GGSCI (source-db) 4> START EXTRACT EXT1
Sending START request to MANAGER …
EXTRACT EXT1 starting
# 5. 验证Extract进程状态
GGSCI (source-db) 5> INFO EXTRACT EXT1
EXTRACT EXT1 Last Started 2026-04-03 17:00 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:03 ago)
Process ID 12347
Log Read Checkpoint Oracle Redo Logs
2026-04-03 17:01:00 Seqno 53, RBA 1234567
SCN 0.12345679 (12345679)
3.2 集成Extract配置
集成Extract仅适用于Oracle数据库,与数据库的日志挖掘服务器集成,提供更好的性能和功能。配置步骤包括:
- 配置数据库参数
- 创建日志挖掘服务器用户
- 配置Extract参数文件
- 添加Extract进程
- 添加Trail文件
- 启动Extract进程
$ sqlplus / as sysdba
SQL> ALTER SYSTEM SET enable_goldengate_replication=true SCOPE=BOTH;System altered.
SQL> ALTER SYSTEM SET streams_pool_size=2G SCOPE=SPFILE;System altered.
SQL> SHUTDOWN IMMEDIATE;Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> STARTUP;ORACLE instance started.
Total System Global Area 8589934592 bytes
Fixed Size 9135808 bytes
Variable Size 4294967296 bytes
Database Buffers 4278190080 bytes
Redo Buffers 7639040 bytes
Database mounted.
Database opened.
# 2. 创建日志挖掘服务器用户
SQL> GRANT LOGMINING TO ggsadmin;Grant succeeded.
# 3. 配置Extract参数文件
$ cd /u01/app/goldengate
$ ./ggsci
GGSCI (source-db) 1> EDIT PARAMS EXT1
EXTRACT EXT1
USERID ggsadmin, PASSWORD Ggsadmin_123
TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 256)
EXTTRAIL /u01/app/goldengate/dirdat/et
TABLE fgedu.*;# 4. 添加Extract进程
GGSCI (source-db) 2> ADD EXTRACT EXT1, INTEGRATED TRANLOG, BEGIN NOW
EXTRACT added.
# 5. 添加Trail文件
GGSCI (source-db) 3> ADD EXTTRAIL /u01/app/goldengate/dirdat/et, EXTRACT EXT1, MEGABYTES 100
EXTTRAIL added.
# 6. 启动Extract进程
GGSCI (source-db) 4> START EXTRACT EXT1
Sending START request to MANAGER …
EXTRACT EXT1 starting
# 7. 验证Extract进程状态
GGSCI (source-db) 5> INFO EXTRACT EXT1
EXTRACT EXT1 Last Started 2026-04-03 17:15 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:04 ago)
Process ID 12348
Log Read Checkpoint Oracle Integrated Redo Logs
2026-04-03 17:16:00 Seqno 53, RBA 2345678
SCN 0.12345680 (12345680)
LSN 1234567890
3.3 Extract监控与管理
Extract进程的监控与管理包括:
- 查看Extract进程状态
- 查看Extract报告文件
- 停止和启动Extract进程
- 修改Extract参数
- 重置Extract检查点
GGSCI (source-db) 1> INFO EXTRACT EXT1
EXTRACT EXT1 Last Started 2026-04-03 17:15 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:04 ago)
Process ID 12348
Log Read Checkpoint Oracle Integrated Redo Logs
2026-04-03 17:16:00 Seqno 53, RBA 2345678
SCN 0.12345680 (12345680)
LSN 1234567890
# 2. 查看Extract报告文件
GGSCI (source-db) 2> VIEW REPORT EXT1
# 3. 停止和启动Extract进程
GGSCI (source-db) 3> STOP EXTRACT EXT1
Sending STOP request to MANAGER …
EXTRACT EXT1 stopping
GGSCI (source-db) 4> START EXTRACT EXT1
Sending START request to MANAGER …
EXTRACT EXT1 starting
# 4. 修改Extract参数
GGSCI (source-db) 5> EDIT PARAMS EXT1
# 修改参数后需要重启Extract进程
# 5. 重置Extract检查点
GGSCI (source-db) 6> ALTER EXTRACT EXT1, BEGIN NOW
Extract EXT1 altered.
GGSCI (source-db) 7> ALTER EXTRACT EXT1, ETROLLOVER
Extract EXT1 altered.
# 6. 查看Extract统计信息
GGSCI (source-db) 8> STATS EXTRACT EXT1
Sending STATS request to EXTRACT EXT1 …
Start of Statistics at 2026-04-03 17:30:00.
Output to /u01/app/goldengate/dirdat/et:
From Table FGEDU.CUSTOMERS:
# inserts: 100
# updates: 50
# deletes: 10
# discards: 0
From Table FGEDU.ORDERS:
# inserts: 200
# updates: 80
# deletes: 20
# discards: 0
End of Statistics.
Part04-生产案例与实战讲解
4.1 经典Extract配置案例
以下是经典Extract的配置案例:
# 源数据库:source-db,IP:192.168.1.20
# 数据库:Oracle 19c Enterprise Edition
# GoldenGate版本:21.3.0.0.0
# 1. 配置Extract参数文件
$ cd /u01/app/goldengate
$ ./ggsci
GGSCI (source-db) 1> EDIT PARAMS EXT_CLASSIC
EXTRACT EXT_CLASSIC
USERID ggsadmin, PASSWORD Ggsadmin_123
EXTTRAIL /u01/app/goldengate/dirdat/ce
TABLE fgedu.products;TABLE fgedu.fgsales WHERE (sales_date > SYSDATE – 7);# 2. 添加Extract进程
GGSCI (source-db) 2> ADD EXTRACT EXT_CLASSIC, TRANLOG, BEGIN NOW
EXTRACT added.
# 3. 添加Trail文件
GGSCI (source-db) 3> ADD EXTTRAIL /u01/app/goldengate/dirdat/ce, EXTRACT EXT_CLASSIC, MEGABYTES 200
EXTTRAIL added.
# 4. 启动Extract进程
GGSCI (source-db) 4> START EXTRACT EXT_CLASSIC
Sending START request to MANAGER …
EXTRACT EXT_CLASSIC starting
# 5. 验证Extract进程状态
GGSCI (source-db) 5> INFO EXTRACT EXT_CLASSIC
EXTRACT EXT_CLASSIC Last Started 2026-04-03 17:45 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:02 ago)
Process ID 12349
Log Read Checkpoint Oracle Redo Logs
2026-04-03 17:46:00 Seqno 53, RBA 3456789
SCN 0.12345681 (12345681)
# 6. 测试数据复制
# 在源数据库上插入数据
$ sqlplus ggsadmin/Ggsadmin_123@source-db
SQL> INSERT INTO fgedu.products VALUES (1001, ‘Laptop’, 5000);1 row created.
SQL> INSERT INTO fgedu.fgsales VALUES (2001, 1001, SYSDATE, 2);1 row created.
SQL> COMMIT;Commit complete.
# 查看Extract统计信息
GGSCI (source-db) 6> STATS EXTRACT EXT_CLASSIC
Sending STATS request to EXTRACT EXT_CLASSIC …
Start of Statistics at 2026-04-03 17:50:00.
Output to /u01/app/goldengate/dirdat/ce:
From Table FGEDU.PRODUCTS:
# inserts: 1
# updates: 0
# deletes: 0
# discards: 0
From Table FGEDU.SALES:
# inserts: 1
# updates: 0
# deletes: 0
# discards: 0
End of Statistics.
4.2 集成Extract配置案例
以下是集成Extract的配置案例:
# 源数据库:source-db,IP:192.168.1.20
# 数据库:Oracle 19c Enterprise Edition
# GoldenGate版本:21.3.0.0.0
# 1. 配置数据库参数
$ sqlplus / as sysdba
SQL> ALTER SYSTEM SET enable_goldengate_replication=true SCOPE=BOTH;System altered.
SQL> ALTER SYSTEM SET streams_pool_size=3G SCOPE=SPFILE;System altered.
SQL> SHUTDOWN IMMEDIATE;Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> STARTUP;ORACLE instance started.
Total System Global Area 8589934592 bytes
Fixed Size 9135808 bytes
Variable Size 4294967296 bytes
Database Buffers 4278190080 bytes
Redo Buffers 7639040 bytes
Database mounted.
Database opened.
# 2. 创建日志挖掘服务器用户
SQL> GRANT LOGMINING TO ggsadmin;Grant succeeded.
# 3. 配置Extract参数文件
$ cd /u01/app/goldengate
$ ./ggsci
GGSCI (source-db) 1> EDIT PARAMS EXT_INTEG
EXTRACT EXT_INTEG
USERID ggsadmin, PASSWORD Ggsadmin_123
TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 512)
EXTTRAIL /u01/app/goldengate/dirdat/ie
TABLE fgedu.employees;TABLE fgedu.departments;# 4. 添加Extract进程
GGSCI (source-db) 2> ADD EXTRACT EXT_INTEG, INTEGRATED TRANLOG, BEGIN NOW
EXTRACT added.
# 5. 添加Trail文件
GGSCI (source-db) 3> ADD EXTTRAIL /u01/app/goldengate/dirdat/ie, EXTRACT EXT_INTEG, MEGABYTES 300
EXTTRAIL added.
# 6. 启动Extract进程
GGSCI (source-db) 4> START EXTRACT EXT_INTEG
Sending START request to MANAGER …
EXTRACT EXT_INTEG starting
# 7. 验证Extract进程状态
GGSCI (source-db) 5> INFO EXTRACT EXT_INTEG
EXTRACT EXT_INTEG Last Started 2026-04-03 18:00 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:03 ago)
Process ID 12350
Log Read Checkpoint Oracle Integrated Redo Logs
2026-04-03 18:01:00 Seqno 53, RBA 4567890
SCN 0.12345682 (12345682)
LSN 1234567891
# 8. 测试数据复制
# 在源数据库上更新数据
$ sqlplus ggsadmin/Ggsadmin_123@source-db
SQL> UPDATE fgedu.employees SET salary = salary * 1.1 WHERE department_id = 10;3 rows updated.
SQL> COMMIT;Commit complete.
# 查看Extract统计信息
GGSCI (source-db) 6> STATS EXTRACT EXT_INTEG
Sending STATS request to EXTRACT EXT_INTEG …
Start of Statistics at 2026-04-03 18:05:00.
Output to /u01/app/goldengate/dirdat/ie:
From Table FGEDU.EMPLOYEES:
# inserts: 0
# updates: 3
# deletes: 0
# discards: 0
From Table FGEDU.DEPARTMENTS:
# inserts: 0
# updates: 0
# deletes: 0
# discards: 0
End of Statistics.
4.3 Extract性能优化案例
以下是Extract性能优化的案例:
# 源数据库:source-db,IP:192.168.1.20
# 数据库:Oracle 19c Enterprise Edition
# GoldenGate版本:21.3.0.0.0
# 问题:Extract进程性能较差,复制延迟较大
# 1. 查看当前Extract配置
GGSCI (source-db) 1> VIEW PARAMS EXT1
EXTRACT EXT1
USERID ggsadmin, PASSWORD Ggsadmin_123
EXTTRAIL /u01/app/goldengate/dirdat/et
TABLE fgedu.large_table;# 2. 查看Extract性能指标
GGSCI (source-db) 2> INFO EXTRACT EXT1, DETAIL
EXTRACT EXT1 Last Started 2026-04-03 18:15 Status RUNNING
Checkpoint Lag 00:10:30 (updated 00:00:05 ago)
Process ID 12351
Log Read Checkpoint Oracle Redo Logs
2026-04-03 18:05:00 Seqno 53, RBA 5678901
SCN 0.12345683 (12345683)
# 3. 优化Extract配置
# 升级为集成Extract
GGSCI (source-db) 3> STOP EXTRACT EXT1
Sending STOP request to MANAGER …
EXTRACT EXT1 stopping
GGSCI (source-db) 4> DELETE EXTRACT EXT1
EXTRACT deleted.
GGSCI (source-db) 5> DELETE EXTTRAIL /u01/app/goldengate/dirdat/et
EXTTRAIL deleted.
GGSCI (source-db) 6> EDIT PARAMS EXT1
EXTRACT EXT1
USERID ggsadmin, PASSWORD Ggsadmin_123
TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 1024)
EXTTRAIL /u01/app/goldengate/dirdat/et
TABLE fgedu.large_table;# 4. 重新创建Extract进程
GGSCI (source-db) 7> ADD EXTRACT EXT1, INTEGRATED TRANLOG, BEGIN NOW
EXTRACT added.
GGSCI (source-db) 8> ADD EXTTRAIL /u01/app/goldengate/dirdat/et, EXTRACT EXT1, MEGABYTES 500
EXTTRAIL added.
# 5. 启动Extract进程
GGSCI (source-db) 9> START EXTRACT EXT1
Sending START request to MANAGER …
EXTRACT EXT1 starting
# 6. 验证性能优化效果
GGSCI (source-db) 10> INFO EXTRACT EXT1, DETAIL
EXTRACT EXT1 Last Started 2026-04-03 18:30 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:02 ago)
Process ID 12352
Log Read Checkpoint Oracle Integrated Redo Logs
2026-04-03 18:30:00 Seqno 53, RBA 6789012
SCN 0.12345684 (12345684)
LSN 1234567892
# 7. 查看优化后的统计信息
GGSCI (source-db) 11> STATS EXTRACT EXT1, TOTALSONLY
Sending STATS request to EXTRACT EXT1 …
Start of Statistics at 2026-04-03 18:35:00.
Output to /u01/app/goldengate/dirdat/et:
From Table FGEDU.LARGE_TABLE:
# inserts: 10000
# updates: 5000
# deletes: 1000
# discards: 0
End of Statistics.
Part05-风哥经验总结与分享
1. Extract进程是GoldenGate的源端核心组件,负责从源数据库捕获数据变化,其配置和性能直接影响数据复制的效率和可靠性。更多视频教程www.fgedu.net.cn
2. GoldenGate支持两种类型的Extract进程:经典Extract和集成Extract。对于Oracle数据库,建议使用集成Extract,它提供更好的性能和功能。
3. Extract进程的配置包括参数文件配置、进程创建、Trail文件添加和进程启动等步骤,需要根据业务需求和技术环境进行合理配置。
4. Extract进程的监控与管理包括查看进程状态、查看报告文件、停止和启动进程、修改参数和重置检查点等操作,需要定期进行以确保进程的正常运行。
5. Extract进程的性能优化包括使用集成Extract、配置合适的Trail文件大小、启用并行处理、优化过滤条件和配置适当的缓存大小等措施。学习交流加群风哥QQ113257174
6. 对于关键业务系统,建议配置Extract进程的高可用性措施,如自动重启、故障转移等,确保数据复制的连续性和可靠性。
7. 定期监控Extract进程的状态和性能,及时发现和解决问题,确保数据复制的正常运行。
8. 建议寻求专业的服务支持,特别是对于复杂的Extract配置和性能优化,确保GoldenGate的高效运行。
更多学习教程公众号风哥教程itpux_com。from oracle:www.itpux.com
本文由风哥教程整理发布,仅用于学习测试使用,转载注明出处:http://www.fgedu.net.cn/10327.html
