Summary of bugs behind the Oracle Database ORA-00600 [2103] error

Published by: 风哥  Category: ITPUX技术网  Updated: 2022-02-12

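About the Oracle ORA-00600 [2103] error:
This error means that a wait for the CONTROL FILE ENQUEUE timed out. The timeout is 900 seconds, which is the value shown as an argument at the end of the error message. 900 seconds is 15 minutes, so RAC hung for 15 minutes before the database resolved the enqueue conflict. That is a long time and can be fatal for a business system. An internal parameter controls this timeout: _controlfile_enqueue_timeout, whose default value is 900 seconds.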
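
The link between the error argument and the 15-minute hang can be sketched in a few lines of Python. The sample error line is an assumption made for illustration (the note does not quote one), the helper name `last_numeric_argument` is hypothetical, and treating the last non-empty argument as the timeout simply follows the description above rather than any documented argument layout.

```python
import re

# Assumed sample of how the error might appear in an alert log;
# the original note does not quote an actual message.
SAMPLE = "ORA-00600: internal error code, arguments: [2103], [1], [0], [1], [900], [], [], []"


def last_numeric_argument(error_line):
    """Return the last non-empty numeric argument of the line, or None."""
    args = re.findall(r"\[([^\]]*)\]", error_line)
    numeric = [int(a) for a in args if a.strip().isdigit()]
    return numeric[-1] if numeric else None


timeout_seconds = last_numeric_argument(SAMPLE)
timeout_minutes = timeout_seconds / 60
print(timeout_seconds, "seconds =", timeout_minutes, "minutes")
```

If the argument layout on a given release differs from the assumed sample, only the selection rule inside the helper would need to change.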

The following content is from Oracle ID 429943.1:
Error Description:
An ORA-600 [2103] is signaled by a session when it times out trying to acquire the CF enqueue. Holding the CF enqueue is required before performing IO against the database controlfiles (this locking ensures readers and writers see a consistent version of the controlfile contents).

The most common reasons for high CF enqueue contention leading to timeout errors are frequent logfile switches and IO contention.

Summary Of Bugs On ORA-00600 [2103] Error

Bug 4671216 (Unpublished)
Abstract : ASM operations on a file are blocked while it is resized
Versions affected : 10.0
Fixed Releases : 10.1.0.6, 10.2.0.2

Details : Creating, deleting, or resizing a large ASM file may block other
ASM operations for an extended period of time and may cause
instances to crash with ORA-600 [2103] or ORA-600 [2116] errors.

Backportable: Yes

Symptoms :

1. When creating/dropping/resizing a large file,

2. Mounting or opening a database on DB instances or

3. ASM v$_ table queries are blocked for an extended period of time.

4. DB instance crashes with ORA-600 [2103] or ORA-600 [2116].

Workaround :

None

Patch Details:

One-off patch available for few platforms on top of 10.1.0.4, 10.1.0.5, 10.2.0.1
Check Metalink for Patch 4671216 availability.
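
Reading a "Fixed Releases" line against an installed patch set can be sketched as below. This is only an illustration under a stated assumption: a release counts as fixed if it is at or above the listed release of the same family (for example 10.2.x), or newer than every listed release when its family is not listed. The function names `parse_release` and `includes_fix` are hypothetical, and one-off patches are not considered.

```python
def parse_release(text):
    """Turn '10.2.0.2' (or '10.2.0.2.') into a tuple of integers."""
    return tuple(int(part) for part in text.strip().rstrip(".").split("."))


def includes_fix(installed, fixed_releases):
    """Assumed rule: compare within the same release family first."""
    inst = parse_release(installed)
    fixed = [parse_release(f) for f in fixed_releases]
    same_family = [f for f in fixed if f[:2] == inst[:2]]
    if same_family:
        return inst >= min(same_family)
    # Family not listed: treat as fixed only if newer than every listed release.
    return inst > max(fixed)


# Fixed Releases as listed for Bug 4671216
FIXED_4671216 = ["10.1.0.6", "10.2.0.2"]

for release in ["10.1.0.5", "10.2.0.1", "10.2.0.2", "11.1.0.6"]:
    print(release, includes_fix(release, FIXED_4671216))
```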

Bug 5134663

Abstract : OERI[2103] with ASM

Versions affected : 10.2

Fixed Releases : 10.1.0.6, 10.2.0.3, 10.0.0.0

Details : ASM may get into a GCS locking deadlock and cause the DB client to
crash with ORA-600 [2103].

Backportable: Yes

Symptoms :

The macro symptom is that ASM hangs and cannot satisfy requests from the DB
client. DB may assert [2103]. The ASM systemstate dumps show one instance
waiting to escalate a lock from S to X and another instance waiting to open or
convert a lock from NL to X.

Workaround:

Kill the ASM instance that is escalating the lock from S to X.

Patch Details:

One-off patch available for few platforms on top of 10.2.0.1, 10.2.0.2
Check Metalink for Patch 5134663 availability.

Bug 5011019

Abstract : OERI[2103] on bystander standby during target standby failover

Versions affected : 10.2

Fixed Releases : 10.2.0.3, 11.0.0.0

Details : During a failover of a standby database to a primary, other standby databases
may not accept redo from the new primary right away. The other standby
databases may experience a delay in accepting redo from the new primary due
to the connections these other standby databases have to the old primary.
This fix alleviates the delay in cases where the connections to the old
primary are preventing new connections to the new primary.

Backportable: Yes

Symptoms:

Depending on the timing, you may see ORA-600 with argument 2103 in the
alert log. However, typically, the bystander standby database does not have
any connections to the old primary by the time the new primary is created.

Workaround:

None

Patch Details:

One-off patch available for few platforms on top of 10.2.0.1, 10.2.0.2
Check Metalink for Patch 5011019 availability.

Bug 3187730

Abstract : Various hangs possible if process expecting SIGALRM does not get it

Versions affected : 9.2

Fixed Releases : 9.2.0.7, 10.1.0.5, 10.2.0.1

Details : Processes may hang while waiting for a SIGALRM signal to arrive before
they continue. This can lead to various hang symptoms such as:
ORA-600[2103]
Systemwide hang with sessions waiting for "library cache load lock"
Parallel queries hanging

Backportable: Yes but only to 9.2

Symptoms

Processes hang, leading to symptoms like

1) ORA-600[2103]
2) SYSTEM HANG DUE TO WAIT FOR LIBRARY CACHE LOAD LOCK. (bug:3187730)
3) QC AND SLAVES ARE WAITING FOR IDLE EVENT AND PARALLEL QUERY HANG
(Bug: 3861580).

After you have successfully used the workaround you can be sure that you are
hitting this issue.

Workaround:

Find the process which is hanging and manually send it SIGALRM (Unix only)

Patch Details:

One-off patch available for few platforms on top of 9.2.0.4, 9.2.0.5, 9.2.0.6, 10.1.0.4
Check Metalink for Patch 3187730 availability.
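
The workaround for Bug 3187730 (sending SIGALRM to the hung process by hand) can be sketched in Python with the standard `os.kill` call. The helper names `send_sigalrm` and `demo_on_child` are hypothetical, and the demo targets a throwaway child process rather than a database process. That child installs no SIGALRM handler, so the default action terminates it, which is only a way to show the signal arrived.

```python
import os
import signal
import subprocess
import sys


def send_sigalrm(pid):
    """Send SIGALRM to the process with the given PID (Unix only)."""
    os.kill(pid, signal.SIGALRM)


def demo_on_child():
    """Start a sleeping child, send it SIGALRM, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    send_sigalrm(child.pid)
    # A negative return code means the child was ended by that signal number.
    return child.wait(timeout=30)


if __name__ == "__main__":
    print("child exit status:", demo_on_child())
```

On a real system the PID would be that of the hung process identified beforehand; choosing the wrong PID can terminate an unrelated process, so the target should be verified first.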

Bug 2950375

Abstract : OERI[2103] in RAC if LMD0 is producing a dump

Versions affected : 9.2

Fixed Releases : 9.2.0.4, 10.0.0.0

Details : LMD0 may stall for a long period of time (in ksdxdmpproc) when dumping
diagnostic information. This can lead to an ORA-600 [2103] error
causing an instance crash.

Backportable: Yes, but with some internal exceptions.

Symptoms:

If LMD0 was stalled for a long period of time in ksdxdmpproc, it may be the cause
of the problem.

Workaround:

Set distributed_lock_timeout to 59.

Patch Details:

One-off patch available for few platforms on top of 9.2.0.3
Check Metalink for availability of patch using link Patch 2950375

Bug 3724485

Abstract : Enqueue waits may occur with no obvious holder in RAC

Versions affected : 9.2

Fixed Releases : 9.2.0.6, 10.1.0.4, 10.2.0.1

Details : This problem is specific to RAC environments.
When a process requesting an enqueue gets an error and resignals, the enqueue
might not be cleaned up properly. This can result in enqueue requests for that
enqueue blocking but there being no apparent holder from systemstate dumps.
If the enqueue happens to be the CF enqueue then this can result in ORA-600 [2103]
and an instance crash.

Backportable: Yes

Symptoms:

The global enqueue does not appear in any of the systemstate dumps; the problem
may be visible in the in-memory traces of the owning processes.

Workaround

No workaround.

Patch Details:

One-off patch available for few platforms on top of 9.2.0.4, 9.2.0.5, 10.1.0.3
Check Metalink for availability of patch using link Patch 3724485

Bug 2872299 (Unpublished)

Abstract : OERI:2103 / instance crash can occur if foreground hits data/index internal error

Versions affected : 9.2

Fixed Releases : 9.2.0.4, 10.0.0.0

Details : Instance may crash with ORA-600 [2103] in a RAC environment if
a session hits a data/index/txn layer internal error.
If a foreground runs into a data/index/txn layer internal error it
first tries to dump the redo from all online threads in a RAC cluster
and then raises the internal error. Dumping redo from all logs will
cause the foreground to hold the controlfile enqueue CF [0] [0] while
the routine is scanning the logs for the specific redo records.

This can cause background processes like LGWR/SMON that are waiting to get
the CF enqueue in X mode to get blocked and time out after 15 minutes,
crashing the instance with ORA-600 [2103].

Backportable: Yes

Symptoms:

A foreground dumping redo for any internal error may indirectly cause ORA-600 [2103].

Workaround:

None

Patch Details:

One-off patch available for few platforms on top of 9.2.0.2, 9.2.0.3
Check Metalink for Patch 2872299 availability.

Bug 3342182

Abstract : Instance hang on startup possible in RAC

Versions affected : 9.2

Fixed Releases : 9.2.0.5, 10.1.0.3, 10.2.0.1

Details : After startup in a RAC environment LGWR may hang waiting for a
GES operation and user logins may also hang. This is a rare
scenario.
For this problem LGWR waits on one of the following:
o "ges inquiry response"
o "wait for scn from all nodes"
o "enqueue"
o "DFS lock handle"
o "wait for master scn"

Backportable: Yes

Symptoms:

1. RAC + UDP
2. Oracle release 10.2
3. Send timeouts
4. trace suggests that an active connection was mistakenly cleaned up, eg:
"WARN: acconn .... getting closed. inactive: threshold: 0x0
WARN: potential problem in keep alive connection protocol"

Workaround:

Restart each instance within 24 days to prevent the problem.
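
A restart deadline for this workaround can be sketched with plain date arithmetic. The 24-day window comes from the workaround text; the startup time shown is hypothetical, and in practice it would be read from the instance. The helper names are illustrative only.

```python
from datetime import datetime, timedelta

RESTART_WINDOW = timedelta(days=24)  # taken from the workaround text


def restart_deadline(startup_time):
    """Latest time by which the instance should be restarted."""
    return startup_time + RESTART_WINDOW


def restart_overdue(startup_time, now):
    """True once the instance has been up for the full window or longer."""
    return now >= restart_deadline(startup_time)


started = datetime(2022, 1, 1, 8, 0, 0)  # hypothetical startup time
print("restart before:", restart_deadline(started))
print("overdue:", restart_overdue(started, datetime(2022, 1, 20, 8, 0, 0)))
```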

Patch Details:

One-off patch available for few platforms on top of 9.2.0.3, 9.2.0.4, 8.1.7.4
Check Metalink for Patch 3342182 availability.

Bug 4047167 (Unpublished)

Abstract : RAC hang possible processing synchronized dump request

Versions affected : 10.2

Fixed Releases : 9.2.0.8, 10.2.0.1

Details : There is no workaround for the hang scenario but it may be
possible to avoid the synchronized dump itself by addressing
whatever error led to the dump being invoked.

Backportable: Yes

Symptoms:

A process may hang on a synchronized dump request in a RAC
environment due to an inter-dependency between IPC and the
wait facility. This only occurs when the dump request is
made from the IPC layer.
If a process hangs and contains the following in the call stack,
you are likely to experience this problem.
ksarcr
ksbwco
kjzddmp
ksxpsrvdt
ksxpwait
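
Checking a call stack against the function list above can be sketched as an ordered-subsequence match. Several things here are assumptions: the frames are compared in the order the note lists them, gaps between them are allowed, and a frame may carry a suffix such as `()+120` that is stripped before comparison. The names `frame_name` and `contains_in_order` are hypothetical.

```python
import re

# Function names as listed in the note for Bug 4047167
SIGNATURE = ["ksarcr", "ksbwco", "kjzddmp", "ksxpsrvdt", "ksxpwait"]


def frame_name(frame):
    """Strip an assumed '()+offset' style suffix from a stack frame."""
    return re.split(r"[(+\s]", frame.strip(), maxsplit=1)[0]


def contains_in_order(stack, signature=SIGNATURE):
    """True if every signature function occurs in the stack, in order."""
    names = iter(frame_name(f) for f in stack)
    # Each search resumes where the previous one stopped, so order matters.
    return all(any(name == wanted for name in names) for wanted in signature)


stack = ["ksarcr()+120", "ksbwco()+88", "other", "kjzddmp", "ksxpsrvdt", "ksxpwait"]
print(contains_in_order(stack))
```

If a trace prints the stack in the opposite direction, the list would have to be reversed before the check.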

Workaround:

There is no workaround for the hang scenario but it may be
possible to avoid the synchronized dump itself by addressing
whatever error led to the dump being invoked.

Patch Details:

One-off patch available for few platforms on top of 9.2.0.7, 10.1.0.4
Check Metalink for Patch 4047167 availability.

Bug 3885499

Abstract : ASM hang possible

Versions affected : 10.1

Fixed Releases : 10.1.0.6, 10.2.0.3, 11.0.0.0

Details : A database session may hang in the middle of an alias scan,
either through RMAN, dbms_file_transfer, ASM PL/SQL fixed package,
or XDB/FTP, causing an ASM instance hang
if DB_BLOCK_CHECKING is enabled.

Backportable: Yes

Symptoms:

If a DB instance hangs in the middle of an alias scan, either through RMAN,
dbms_file_transfer, ASM PL/SQL fixed package, or XDB/FTP, and the ASM
instance hangs as a result, then we have the bug. This problem is seen in
LRG hangs in rare instances.

Workaround:

Manually kill the slave background of the DB-instance process that is hung.
This operation will release the ASM foreground's resources.

Patch Details:

One-off patch available for few platforms on top of 10.2.0.2
Check Metalink for Patch 3430832 availability.

Bug 4164021

Abstract : OERI [2103] during BACKUP CONTROLFILE to TRACE

Versions affected : 9.2

Fixed Releases : 9.2.0.8, 10.1.0.5, 10.2.0.2, 11.0.0.0

Details : Under rare circumstances an "alter database backup controlfile to trace"
can cause a deadlock between the foreground, the log writer and the
CR server process, causing an ORA-600[2103] in log writer in RAC
environments.

Various internal errors in queries using old connect by.

Error stack has qerix and connect by rowsources.

Backportable: Yes

Workaround

Avoid backing up the controlfile to trace when the instance is mounted
shared.

Patch Details:

One-off patch available for few platforms on top of 9.2.0.6, 9.2.0.7, 10.1.0.4
Check Metalink for availability of patch using link Patch 4164021

Bug 3253153

Abstract : RFS may error starting managed recovery

Versions affected : 9.2

Fixed Releases : 9.2.0.5, 10.0.0.0

Details : When starting up managed recovery, the RFS process reports the following
in the alert.log:
Mon Nov 10 11:58:08 2003
RFS: Forced Shutdown due to RFS_ERROR state
Mon Nov 10 12:11:49 2003
RFS: controlfile enqueue unavailable
Possible invalid cross-instance archival configuration
Mon Nov 10 12:11:57 2003
RFS: controlfile enqueue unavailable
Possible invalid cross-instance archival configuration
Mon Nov 10 12:14:26 2003
Media Recovery Log /oracle/PRD/saparch/PRDarch1_270139.dbf
Mon Nov 10 12:16:57 2003
RFS: Error State mode '8'

Backportable: Yes

Symptoms:

Mon Nov 10 11:58:08 2003
RFS: Forced Shutdown due to RFS_ERROR state
Mon Nov 10 12:11:49 2003
RFS: controlfile enqueue unavailable
Possible invalid cross-instance archival configuration
Mon Nov 10 12:11:57 2003
RFS: controlfile enqueue unavailable
Possible invalid cross-instance archival configuration
Mon Nov 10 12:14:26 2003
Media Recovery Log /oracle/PRD/saparch/PRDarch1_270139.dbf
Mon Nov 10 12:16:57 2003
RFS: Error State mode '8'
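
Spotting this symptom in an alert log can be sketched as a small scan that pairs each "controlfile enqueue unavailable" message with the timestamp line before it. The excerpt is copied from the note; the function name `enqueue_unavailable_times` is hypothetical, and the timestamp format is inferred from the excerpt and assumes English day and month names (the default C locale).

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"  # inferred from the excerpt
MESSAGE = "RFS: controlfile enqueue unavailable"

ALERT_EXCERPT = """\
Mon Nov 10 11:58:08 2003
RFS: Forced Shutdown due to RFS_ERROR state
Mon Nov 10 12:11:49 2003
RFS: controlfile enqueue unavailable
Possible invalid cross-instance archival configuration
Mon Nov 10 12:11:57 2003
RFS: controlfile enqueue unavailable
Possible invalid cross-instance archival configuration
Mon Nov 10 12:16:57 2003
RFS: Error State mode '8'
""".splitlines()


def enqueue_unavailable_times(lines):
    """Return the timestamp preceding each matching RFS message."""
    hits = []
    current = None
    for raw in lines:
        line = raw.strip()
        try:
            current = datetime.strptime(line, TIMESTAMP_FORMAT)
            continue  # a timestamp line carries no message of its own
        except ValueError:
            pass
        if line == MESSAGE:
            hits.append(current)
    return hits


for moment in enqueue_unavailable_times(ALERT_EXCERPT):
    print(moment.isoformat(), MESSAGE)
```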

Workaround

None

Patch Details:

One-off patch available for few platforms on top of 9.2.0.4
Check Metalink for Patch 3253153 availability.

Bug 4029799

Abstract : Generate alert messages for enqueue timeouts

Versions affected : 9.2

Fixed Releases : 10.1.0.5, 10.2.0.1

Details : Generate alert messages for enqueue timeouts.
(eg: CF timeouts)

Backportable: Yes

Symptoms:
Generate alert messages for enqueue timeouts.
(eg: CF timeouts)

Workaround:

No workarounds

Patch Details:

One-off patch available for few platforms on top of 10.1.0.4
Check Metalink for availability of patch using link Patch 4029799

Bug 3400979

Abstract : Dump of FILE_HDRS may hang or error (can result in OERI [2103])

Versions affected : 9.2

Fixed Releases : 9.2.0.6, 10.1.0.4, 10.2.0.1

Details : Using the diagnostic FILE_HDRS dump a process may hang or get
an IO error. In the case of a hang this process could be holding
the CF enqueue and so can lead to ORA-600 [2103] errors and
an instance crash.

Backportable: Yes

Workaround

Do not issue FILE_HDRS dumps.

Patch Details:

Currently one-off patches are not available for Bug 3400979.

Bug 3394085 (Unpublished)

Abstract : CRS may hang after node hard shutdown / node disconnect

Versions affected : 10.1
Fixed Releases : 10.1.0.3, 10.2.0.1

Details : Cluster ready services (CRS) may hang after node
hard shutdown / node disconnect.

Backportable: Yes

Symptoms:

Do a hard reset of one of the machines in the cluster. CRSDs will hang in
prom_rpc call.

Workaround:

Kill and restart CRSDs.

Patch Details:

One-off patch available for few platforms on top of 10.1.0.2
Check Metalink for Patch 3394085 availability.

Bug 5181800 (Unpublished)

Abstract : Async LNS holds CF enqueue while issuing network calls rfsopen/rfsclose

Versions affected : 10.2

Fixed Releases : 10.2.0.3, 11.0.0.0

Details : A primary may be affected due to network hangs between primary and
standby with async LNS in operation in that it may hold the CF
enqueue longer than desired.

Backportable: Yes

Symptoms:

Look for situations where primary is affected due to network hangs
between primary and standby with async lns in operation.

Workaround:

No workaround available

Patch Details:

One-off patch available for few platforms on top of 10.2.0.1
Check Metalink for availability of patch using link Patch 5181800

Bug 4074603

Abstract : RMAN backup of primary with a standby using FAL may omit archivelogs

Versions affected : 9.2

Fixed Releases : 9.2.0.7, 10.1.0.5

Details : This problem is introduced in 9.2.0.6 by the fix for bug 3533351.
The fix for bug 3533351 can cause silent loss of archivelogs
in the backup sets when using RMAN to backup the primary if
the situation in bug 3533351 arises.
This occurs as that fix marks the archive log as DELETED on
the primary which then affects any subsequent RMAN backup as
RMAN will ignore DELETED archive logs.

Backportable: Yes

Symptoms:

Look for non-shared logs, and FAL being performed by one instance
(the one which doesn't have access to the file), which marks the
log's CF entry as DELETED.

Workaround:

Make sure archive logs are available from all nodes. This can be
achieved by either having the archive logs in shared locations, or
by having local copies of all logs.
Another workaround is to fix the gap manually.
A third workaround is to copy deleted logs to the FAL serving instance,
and register them.

Patch Details:

One-off patch available for few platforms on top of 9.2.0.6, 10.1.0.4
Check Metalink for availability of patch using link Patch 4074603

Bug 4997470 (Unpublished)

Abstract : CSS startup terminates when another node comes up during reconfig

Versions affected : 10.2

Fixed Releases : 10.2.0.3, 11.1.0.0

Details : CSS startup incorrectly terminates when another node comes up during reconfig

Backportable: Yes

Symptoms:

Workaround:

No workaround available

Patch Details:

One-off patch available for few platforms on top of 10.2.0.2
Check Metalink for Patch 3430832 availability.

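Site notice: This article was compiled and published by 风哥. Please keep this notice when reposting. The site makes no commitment regarding the consequences of using any of its content; readers should use it with caution.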