Top 5 RAC Instance Crash Issues
Details
Issue #1: ORA-29770 LMHB Terminate Instance
Symptoms:
LMON (ospid: 31216) waits for event 'control file sequential read' for 88 secs.
Errors in file /oracle/base/diag/rdbms/prod/prod3/trace/prod3_lmhb_31304.trc (incident=2329):
ORA-29770: global enqueue process LMON (OSID 31216) is hung for more than 70 seconds
LMHB (ospid: 31304) is terminating the instance.
LMON (ospid: 8594) waits for event 'control file sequential read' for 118 secs.
ERROR: LMON is not healthy and has no heartbeat.
ERROR: LMHB (ospid: 8614) is terminating the instance.
Possible Causes:
Bug 8888434 LMHB crashes the instance with LMON waiting on controlfile read
Bug 11890804 LMHB crashes instance with ORA-29770 after long "control file sequential read" waits
Solutions:
Bug 8888434 has been fixed in 126.96.36.199+
Bug 11890804 has been fixed in 188.8.131.52+
Please refer to Document 1197674.1, Document 8888434.8 and Document 11890804.8 for more details.
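The symptom lines above follow a regular pattern, so an alert log can be screened for them before LMHB steps in. The following is a minimal sketch, not an Oracle-supplied tool: the function name `find_long_waits`, the regular expression and the third sample line (a 12-second wait) are assumptions made for illustration, and the 70-second threshold is taken only from the ORA-29770 message text quoted above.

```python
import re

# Matches alert-log lines of the form quoted in the Symptoms section, e.g.
#   LMON (ospid: 31216) waits for event 'control file sequential read' for 88 secs.
WAIT_LINE = re.compile(
    r"(?P<proc>\w+) \(ospid: (?P<ospid>\d+)\) waits for event "
    r"'(?P<event>[^']+)' for (?P<secs>\d+) secs"
)

# Threshold taken from the ORA-29770 text above ("hung for more than 70 seconds").
HANG_THRESHOLD_SECS = 70


def find_long_waits(lines, threshold=HANG_THRESHOLD_SECS):
    """Return (process, ospid, event, seconds) for each wait longer than threshold."""
    hits = []
    for line in lines:
        match = WAIT_LINE.search(line)
        if match and int(match.group("secs")) > threshold:
            hits.append(
                (
                    match.group("proc"),
                    int(match.group("ospid")),
                    match.group("event"),
                    int(match.group("secs")),
                )
            )
    return hits


# The first two lines are copied from the Symptoms section; the third is a
# hypothetical short wait added to show that it is filtered out.
sample = [
    "LMON (ospid: 31216) waits for event 'control file sequential read' for 88 secs.",
    "LMON (ospid: 8594) waits for event 'control file sequential read' for 118 secs.",
    "LMON (ospid: 8594) waits for event 'control file sequential read' for 12 secs.",
]

for hit in find_long_waits(sample):
    print(hit)
```

In practice the `lines` argument could be an open alert log file object, since the function only iterates over it line by line.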
Issue #2: Instance crash with ORA-481
Symptoms:
1. PMON (ospid: 12585): terminating the instance due to error 481
LMON trace shows:
Begin DRM(107) (swin 0)
* drm quiesce
2. Fix HAIP issue per Document 1383737.1
Issue #3: ORA-600 [kjbmprlst:shadow], ORA-600 [kjbrref:pkey], ORA-600 [kjbmocvt:rid], ORA-600 [kjbclose_remaster:!drm], ORA-600 [kjbrasr:pkey], instance crash
Symptoms: RAC instance crashes with ORA-600 [kjbmprlst:shadow], ORA-600 [kjbrref:pkey], ORA-600 [kjbmocvt:rid], ORA-600 [kjbclose_remaster:!drm] or ORA-600 [kjbrasr:pkey]
Possible Causes: This group of ORA-600 errors is related to DRM (dynamic resource remastering) messaging or read-mostly locking. Several bugs are involved:
Document 9458781.8 Missing close message to master leaves closed lock dangling crashing the instance with assorted Internal error
Document 9835264.8 ORA-600 [kjbrasr:pkey] / ORA-600 [kjbmocvt:rid] in RAC with dynamic remastering
Document 10200390.8 ORA-600[kjbclose_remaster:!drm] in RAC with fix for 9979039
Document 10121589.8 ORA-600 [kjbmprlst:shadow] can occur in RAC
Document 11785390.8 Stack corruption / incorrect behaviour possible in RAC
Document 12408350.8 ORA-600 [kjbrasr:pkey] in RAC with read mostly locking
Document 12834027.8 ORA-600 [kjbmprlst:shadow] / ORA-600 [kjbrasr:pkey] with RAC read mostly locking
Solutions: Most of the above bugs are fixed in 184.108.40.206. Applying the 220.127.116.11 patchset should avoid these bugs, with the exception of Bug 12834027, which will be fixed in 12.1. The workaround for Bug 12834027 is:
Disable read-mostly object locking
e.g. run with "_gc_read_mostly_locking"=FALSE
Please refer to the Document listed above for each bug for its explanation and solution.
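Because several Documents cover overlapping ORA-600 arguments, a small lookup can help route an alert-log line to the notes worth reading first. This is only a sketch: the table is transcribed from the Document titles listed above, the names `DOCS_BY_ARGUMENT` and `documents_for` are made up for illustration, and Documents 9458781.8 and 11785390.8 are left out of the table because their titles do not name a specific argument.

```python
import re

# ORA-600 first argument -> Documents whose titles (as listed above) name it.
DOCS_BY_ARGUMENT = {
    "kjbrasr:pkey": ["9835264.8", "12408350.8", "12834027.8"],
    "kjbmocvt:rid": ["9835264.8"],
    "kjbclose_remaster:!drm": ["10200390.8"],
    "kjbmprlst:shadow": ["10121589.8", "12834027.8"],
}

# Accepts "ORA-600 [arg]", "ORA-600[arg]" and "ORA-00600 [arg]".
ORA600 = re.compile(r"ORA-0*600\s*\[([^\]]+)\]")


def documents_for(line):
    """Return (argument, [document numbers]) for the first ORA-600 in line.

    Returns None when the line contains no ORA-600 with a bracketed argument.
    An empty list means the argument is not named in any title in the table.
    """
    match = ORA600.search(line)
    if match is None:
        return None
    argument = match.group(1)
    return argument, DOCS_BY_ARGUMENT.get(argument, [])


print(documents_for("ORA-600 [kjbmprlst:shadow]"))
print(documents_for("ORA-600[kjbrref:pkey]"))
```

An empty result, as for `kjbrref:pkey`, does not mean the error is unrelated to this group; it only means none of the listed titles names that argument, so the generically titled Documents are the place to look.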
Issue #4: Dumps on kcldle / kclfplz / kcbbxsv_l2 / kclfprm using flash
Symptoms: ORA-744[kclfprm] reported in alert log
Possible Causes: These dumps are caused by various bugs, all of which were closed against base Bug 12337941 Dumps on kcldle / kclfplz / kcbbxsv_l2 / kclfprm using flash
Solutions: The bug has been fixed in 18.104.22.168. Either apply the patchset or use the workaround: disable the flash cache.
Refer to Document 12337941.8 for more details.
Issue #5: LMS gets ORA-600 [kclpdc_21] and instance crashes
Symptoms: ORA-600 [kclpdc_21] reported in alert log
Possible Causes: Document 10040035.8 LMS gets ORA-600 [kclpdc_21] and instance crashes
Solutions: The bug has been fixed in 22.214.171.124
Issue for 10.2.0.5
Symptoms:
1. LMS reports ORA-600 [kjccgmb:1] and the instance crashes
2. Instance crash with:
Received an instance abort message from instance 2 (reason 0x0)
Please check instance 2 alert and LMON trace files for detail.
LMD0: terminating instance due to error 481
Possible Causes:
1. Bug 11893577 - LMD CRASHED WITH ORA-00600 [KJCCGMB:1]
2. Bug 9577274 - 1OFF:UNABLE TO VIEW REQUEST OUTPUT AND LOG AFTER APPLYING FIX TO ISSUE IN BUG 9400041
Solutions:
1. For 10.2.0.5.0, please apply merge patch 12616787 only
2. For 10.2.0.5.5, please apply merge patch 13470618 only
At the time of writing, the patches are available only for certain platforms. It is not required to apply both of the above patches for any 10.2.0.5.x release.