Analysis of 5 Causes of Oracle RAC Database Cluster Instance Crashes

Published by: 风哥 Category: ITPUX技术网 Updated: 2022-02-12


The following is from Oracle's official documentation:
Top 5 RAC Instance Crash Issues

Issue #1: ORA-29770 LMHB Terminate Instance

Symptoms: LMON (ospid: 31216) waits for event 'control file sequential read' for 88 secs.
Errors in file /oracle/base/diag/rdbms/prod/prod3/trace/prod3_lmhb_31304.trc (incident=2329):
ORA-29770: global enqueue process LMON (OSID 31216) is hung for more than 70 seconds
LMHB (ospid: 31304) is terminating the instance.

or
LMON (ospid: 8594) waits for event 'control file sequential read' for 118 secs.
ERROR: LMON is not healthy and has no heartbeat.
ERROR: LMHB (ospid: 8614) is terminating the instance.

Possible Causes: Bug 8888434 LMHB crashes the instance with LMON waiting on controlfile read
Bug 11890804 LMHB crashes instance with ORA-29770 after long "control file sequential read" waits

Solutions: Bug 8888434 has been fixed in 11.2.0.2+
Bug 11890804 has been fixed in 11.2.0.3+
Please refer to Document 1197674.1, Document 8888434.8 and Document 11890804.8 for more details.
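
The symptom lines quoted for Issue #1 follow a regular pattern, so a small script can help spot them in a large alert log. The sketch below is only an illustration: the function name `find_lmhb_symptoms` and both regular expressions are assumptions modelled on the sample lines above, not an Oracle-provided tool, and real alert logs may format these messages differently.

```python
import re

# Patterns modelled on the sample alert-log lines quoted above (assumed format).
WAIT_RE = re.compile(
    r"LMON \(ospid: (?P<pid>\d+)\) waits for event "
    r"'(?P<event>[^']+)' for (?P<secs>\d+) secs"
)
HUNG_RE = re.compile(
    r"ORA-29770: global enqueue process (?P<proc>\w+) "
    r"\(OSID (?P<pid>\d+)\) is hung"
)


def find_lmhb_symptoms(log_text):
    """Return (lmon_waits, hung_processes) found in alert-log text."""
    waits = [
        (int(m["pid"]), m["event"], int(m["secs"]))
        for m in WAIT_RE.finditer(log_text)
    ]
    hung = [(m["proc"], int(m["pid"])) for m in HUNG_RE.finditer(log_text)]
    return waits, hung


# Sample text copied from the symptoms of Issue #1.
sample = """LMON (ospid: 31216) waits for event 'control file sequential read' for 88 secs.
ORA-29770: global enqueue process LMON (OSID 31216) is hung for more than 70 seconds
LMHB (ospid: 31304) is terminating the instance."""

waits, hung = find_lmhb_symptoms(sample)
print(waits)
print(hung)
```

A long LMON wait on 'control file sequential read' that is followed by ORA-29770 matches the signature described for this issue; the script only flags candidate lines and does not confirm which bug is involved.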

Issue #2: Instance crash with ORA-481
Symptoms: 1. PMON (ospid: 12585): terminating the instance due to error 481
LMON trace shows:
Begin DRM(107) (swin 0)
* drm quiesce for more information.

2. Fix HAIP issue per Document 1383737.1

Issue #3: ORA-600 [kjbmprlst:shadow], ORA-600 [kjbrref:pkey], ORA-600 [kjbmocvt:rid], [kjbclose_remaster:!drm], ORA-600 [kjbrasr:pkey], instance crash
Symptoms: RAC instance crashes with ORA-600 [kjbmprlst:shadow], ORA-600 [kjbrref:pkey], ORA-600 [kjbmocvt:rid], [kjbclose_remaster:!drm] or ORA-600 [kjbrasr:pkey]

Possible Causes: This group of ORA-600 errors is related to DRM (dynamic resource remastering) messaging or read-mostly locking. Quite a few bugs are involved:
Document 9458781.8 Missing close message to master leaves closed lock dangling crashing the instance with assorted Internal error
Document 9835264.8 ORA-600 [kjbrasr:pkey] / ORA-600 [kjbmocvt:rid] in RAC with dynamic remastering
Document 10200390.8 ORA-600[kjbclose_remaster:!drm] in RAC with fix for 9979039
Document 10121589.8 ORA-600 [kjbmprlst:shadow] can occur in RAC
Document 11785390.8 Stack corruption / incorrect behaviour possible in RAC
Document 12408350.8 ORA-600 [kjbrasr:pkey] in RAC with read mostly locking
Document 12834027.8 ORA-600 [kjbmprlst:shadow] / ORA-600 [kjbrasr:pkey] with RAC read mostly locking

Solutions: Most of the above bugs are fixed in 11.2.0.3, so applying the 11.2.0.3 patchset should avoid them, with the exception of Bug 12834027, which will be fixed in 12.1. The workaround for that bug is:

Disable DRM
or
Disable read-mostly object locking
e.g. run with "_gc_read_mostly_locking"=FALSE

Please refer to the Document numbers above for the explanation and solution of each bug.
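
For triage, the bug titles listed for Issue #3 can be turned into a lookup from the first ORA-600 argument to the Documents that name it. The sketch below is a hedged illustration: `DRM_ORA600_DOCS` and `lookup_drm_docs` are hypothetical names, the mapping is copied only from the titles above, and Documents whose titles name no specific argument (9458781.8 and 11785390.8) are deliberately left out.

```python
import re

# First ORA-600 argument -> Document numbers, taken from the bug titles above.
# kjbrref:pkey appears in the symptoms but in none of the listed titles,
# so it maps to an empty list here.
DRM_ORA600_DOCS = {
    "kjbmprlst:shadow": ["10121589.8", "12834027.8"],
    "kjbrref:pkey": [],
    "kjbmocvt:rid": ["9835264.8"],
    "kjbclose_remaster:!drm": ["10200390.8"],
    "kjbrasr:pkey": ["9835264.8", "12408350.8", "12834027.8"],
}

# Accepts both "ORA-600 [arg]" and "ORA-00600: ... arguments: [arg], ...".
ORA600_RE = re.compile(r"ORA-0*600.*?\[([^\]]+)\]")


def lookup_drm_docs(alert_line):
    """Return (argument, documents) for a DRM-related ORA-600, else None."""
    m = ORA600_RE.search(alert_line)
    if m is None:
        return None
    arg = m.group(1).strip()
    if arg not in DRM_ORA600_DOCS:
        return None
    return arg, DRM_ORA600_DOCS[arg]


print(lookup_drm_docs("ORA-600 [kjbrasr:pkey]"))
print(lookup_drm_docs("ORA-600 [kclpdc_21]"))
```

An argument that maps to Document 12834027.8 is the case where, per the solutions above, the 11.2.0.3 patchset alone is not enough and one of the workarounds applies.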

Issue #4: Dumps on kcldle / kclfplz / kcbbxsv_l2 / kclfprm using flash
Symptoms: ORA-7445 [kcldle]
ORA-7445 [kclfplz]
ORA-7445 [kcbbxsv_l2]
ORA-7445 [kclfprm] reported in alert log

Possible Causes: These dumps are caused by various bugs that were closed as duplicates of base Bug 12337941 Dumps on kcldle / kclfplz / kcbbxsv_l2 / kclfprm using flash

Solutions: The bug has been fixed in 11.2.0.3. Either apply the patchset or use the workaround: disable the flash cache.
Refer to Document 12337941.8 for more details.

Issue #5: LMS gets ORA-600 [kclpdc_21] and instance crashes
Symptoms: ORA-600 [kclpdc_21] reported in alert log

Possible Causes: Document 10040035.8 LMS gets ORA-600 [kclpdc_21] and instance crashes

Solutions: The bug has been fixed in 11.2.0.3

Issue for 10.2.0.5
Symptoms: 1. LMS reports ORA-600 [kjccgmb:1], and the instance crashes with "LMS: terminating instance due to error 484"
2. Instance crash with:
Received an instance abort message from instance 2 (reason 0x0)
Please check instance 2 alert and LMON trace files for detail.
LMD0: terminating instance due to error 481

Possible Causes: 1. Bug 11893577 - LMD CRASHED WITH ORA-00600 [KJCCGMB:1]
2. Bug 9577274 - 1OFF:UNABLE TO VIEW REQUEST OUTPUT AND LOG AFTER APPLYING FIX TO ISSUE IN BUG 9400041

Solutions:
1. For 10.2.0.5.0, apply merge patch 12616787 only
2. For 10.2.0.5.5, apply merge patch 13470618 only
At the time of writing, the patches are available only for certain platforms. It is not necessary to apply both of the above patches to any 10.2.0.5.x release.
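
Because exactly one merge patch applies per release, the choice can be written as a simple lookup. This is a minimal sketch: `MERGE_PATCHES` and `merge_patch_for` are hypothetical names, and the table contains only the two releases named above; any other version returns `None` rather than a guess.

```python
# Merge patch per release, as stated in the solutions above.
MERGE_PATCHES = {
    "10.2.0.5.0": "12616787",
    "10.2.0.5.5": "13470618",
}


def merge_patch_for(version):
    """Return the single merge patch for a listed 10.2.0.5.x release, else None."""
    return MERGE_PATCHES.get(version.strip())


print(merge_patch_for("10.2.0.5.0"))
print(merge_patch_for("10.2.0.5.5"))
```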
