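Oracle database server high CPU: Database/session hang with 'CSS initialization'

A system at one customer site suddenly saw its CPU usage climb from 5% to over 90% within an hour. Among the database wait events, a large number of sessions were waiting on 'CSS initialization'. A search on Metalink turned up Bug 10024824 - Database/session hang with 'CSS initialization'.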
Affected scope:
Product (Component): Oracle Server (PCW)
Range of versions believed to be affected: Versions BELOW 12.1
Versions confirmed as being affected: •
Platforms affected: Generic (all / most platforms affected)

Fix: This issue is fixed in • (Base Release) • (Server Patch Set)

Symptoms:
• Hang (Process Hang)
• Hang (Involving Shared Resource)
• Waits for "CSS initialization"

Hang (Involving Shared Resource): A process may hold a shared resource a lot longer than normally expected, leading to many other processes having to wait for that resource. Such a resource could be a lock, a library cache pin, a latch, etc. The overall symptom is that typically numerous processes all appear to be stuck, although some processes may continue unhindered if they do not need the blocked resource.

Hang (Process Hang): A process may hang, typically in a wait state. Note that this is different to a process which is spinning and consuming CPU.
Description: Database/session hang with 'CSS initialization' can occur when the OH/log//client directory has the wrong permissions in a RAC environment.

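Workaround: Change the permission of OH/log//client directory to 771.

On analysis, we found that cssN.log files were being created continuously: one every 2 to 3 minutes on average, and 5 to 6 per minute at busy times. Tens of thousands had already accumulated. Judging by the file ownership, the permissions were set correctly, so this did not look like a problem that setting the directory to 771 would solve.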
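To put a number on how many cssN.log files have piled up, a small helper along these lines may be handy. It is only a sketch: the function name count_css_logs is made up for illustration, and the directory is passed as an argument because the host-specific part of the real client path is not shown in the note.

```shell
#!/bin/sh
# Hypothetical helper: print how many css*.log files exist under a directory.
# The caller supplies the directory; the real client log path contains a
# host-specific component that is not spelled out in the note.
count_css_logs() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # find is used instead of "ls css*.log" because a shell glob can fail
    # with "argument list too long" once tens of thousands of files exist.
    # Note that find also descends into any subdirectories of $dir.
    find "$dir" -type f -name 'css*.log' | wc -l | tr -d ' '
}
```

Running it twice a few minutes apart against the client directory gives a rough idea of how fast new files are appearing.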
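We kept digging and traced the process with truss, which finally exposed the root cause: the process was spending most of its time traversing the cssN.log files under client:
truss -o /app/oracle/css.log -laefdD -p xxx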
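The large number of files continually generated under client is related to Oracle unpublished bug 6004127, described in Olsnodes Produces CPU Spikes With Many Logs in $CRS_HOME/log//client Directory (Doc ID 729349.1). No patch is currently available. The workaround given in the note is to schedule a job that automatically cleans up the cssN.log files under client: Remove the log files under $ORA_CRS_HOME/log//client/css*.log in regular intervals with a cronjob.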
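The cleanup job the note recommends could look roughly like the sketch below. The function name purge_css_logs, the retention argument, and its default of one day are all assumptions for illustration; the note only says to remove the css*.log files at regular intervals. The directory is passed as an argument because the host-specific part of the client path is not shown in the note.

```shell
#!/bin/sh
# Hypothetical cleanup helper: delete css*.log files under a directory whose
# modification time is more than DAYS days old. The retention period and its
# default are assumptions; the note does not specify one.
purge_css_logs() {
    dir=$1
    days=${2:-1}
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    case $days in
        ''|*[!0-9]*)
            echo "days must be a non-negative integer" >&2
            return 1
            ;;
    esac
    # -mtime +N matches files whose age, counted in whole 24-hour periods,
    # is greater than N. "-exec rm -f {} +" batches the deletions, which
    # avoids the argument-list limit a shell glob would hit here.
    find "$dir" -type f -name 'css*.log' -mtime +"$days" -exec rm -f {} +
}
```

A script that defines this function and calls it with the client directory as its argument can then be scheduled from the Clusterware owner's crontab, for example once an hour. Only files matching css*.log are touched, so other logs in the same directory are left alone.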