1. 首页 > Oracle教程 > 正文

Oracle教程FG006-Oracle数据库日常维护任务

1.2 维护任务分类

Oracle数据库维护任务可以按时间周期分类:

  • 每日任务:健康检查、告警检查、备份验证等
  • 每周任务:统计信息收集、表空间检查、日志清理等
  • 每月任务:性能分析、安全审计、存储规划等
  • 季度任务:全面健康检查、容量规划、补丁评估等
  • 年度任务:大型升级、架构评审、灾难恢复演练等

1.3 维护重要性

日常维护的重要性体现在:

  1. 预防故障:提前发现潜在问题,避免重大故障
  2. 优化性能:持续监控和调整,保持最佳性能
  3. 保障安全:定期安全检查,防止安全漏洞
  4. 延长寿命:良好的维护可以延长系统使用寿命
  5. 降低成本:预防性维护比事后修复成本更低
风哥提示:日常维护是DBA最重要的工作之一,做好日常维护可以大幅减少紧急故障处理的次数。

Part02-生产环境规划与建议

2.1 维护计划制定

制定有效的维护计划需要考虑:

考虑因素
说明

业务需求
了解业务高峰期,选择合适的维护窗口

系统特点
根据系统大小、复杂度制定相应计划

人员配置
考虑团队能力和工作安排

工具支持
利用自动化工具提高效率

合规要求
满足行业和企业的合规要求

2.2 维护时间表

建议的维护时间安排:

# 每日维护(工作日早上9点前)
– 检查数据库状态
– 检查告警日志
– 检查备份状态
– 检查表空间使用情况
– 检查监听器状态

# 每周维护(周日凌晨)
– 收集统计信息
– 清理过期日志
– 检查索引状态
– 验证备份完整性
– 分析AWR报告

# 每月维护(每月第一个周末)
– 全面性能分析
– 安全审计
– 存储容量规划
– 检查碎片情况
– review变更记录

# 季度维护(每季度末)
– 全面健康检查
– 容量规划评估
– 补丁评估和规划
– 灾难恢复演练
– 架构评审

2.3 维护工具选择

常用的Oracle维护工具:

  • Oracle Enterprise Manager:图形化管理工具
  • AWR/ASH/ADDM:性能分析工具
  • RMAN:备份恢复工具
  • SQL*Plus:命令行管理工具
  • 自定义脚本:Shell/Python脚本自动化
  • 第三方工具:如Zabbix、Nagios监控
生产环境建议:结合使用多种工具,自动化程度越高,人工错误越少。

Part03-生产环境项目实施方案

3.1 每日维护任务

3.1.1 数据库状态检查

# 连接数据库
$ sqlplus / as sysdba

# 检查实例状态
SQL> select instance_name, status, database_status, startup_time from v$instance;INSTANCE_NAME STATUS DATABASE_STATUS STARTUP_TIME
—————- ———— —————– ——————-
fgedudb OPEN ACTIVE 2026-03-31 19:00:00

# 检查数据库状态
SQL> select name, open_mode, database_role, log_mode from v$database;NAME OPEN_MODE DATABASE_ROLE LOG_MODE
——— ——————– —————- ————
FGEDUDB READ WRITE PRIMARY ARCHIVELOG

# 检查表空间使用情况
SQL> select tablespace_name,
round((1 – free/total)*100, 2) as used_pct,
round(total/1024/1024, 2) as total_mb,
round(free/1024/1024, 2) as free_mb
from (
select tablespace_name,
sum(bytes) as total,
sum(maxbytes) as max_total
from dba_data_files
group by tablespace_name
) a,
(
select tablespace_name,
sum(bytes) as free
from dba_free_space
group by tablespace_name
) b
where a.tablespace_name = b.tablespace_name
order by used_pct desc;TABLESPACE_NAME USED_PCT TOTAL_MB FREE_MB
—————————— ——— ———- ———-
SYSTEM 85.2 1024 151
SYSAUX 72.5 2048 562
UNDOTBS1 45.3 2048 1121
USERS 25.1 512 383
TEMP 10.5 1024 916

3.1.2 告警日志检查

# 查看告警日志位置
SQL> show parameter background_dump_dest;NAME TYPE VALUE
———————————— ———– ——————————
background_dump_dest string /oracle/app/oracle/diag/rdbms/fgedudb/fgedudb/trace

# 查看最近的告警
$ cd /oracle/app/oracle/diag/rdbms/fgedudb/fgedudb/trace
$ tail -100 alert_fgedudb.log

# 检查是否有ORA-错误
$ grep “ORA-” alert_fgedudb.log | tail -20

# 检查是否有严重错误
$ grep -E “ORA-00600|ORA-07445|ORA-01555” alert_fgedudb.log | tail -10

# 检查检查点信息
$ grep “Completed checkpoint” alert_fgedudb.log | tail -10
Wed Mar 31 20:00:00 2026
Completed checkpoint up to RBA [0x123.4567.89], SCN: 123456789
Wed Mar 31 21:00:00 2026
Completed checkpoint up to RBA [0x123.5678.90], SCN: 123456790

3.1.3 备份状态检查

# 检查RMAN备份状态
$ rman target /RMAN> list backup summary;List of Backups
===============
Key TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag
——- — — – ———– ————— ——- ——- ———- —
1234 B F A DISK 31-MAR-26 1 1 YES TAG20260331T200000
1235 B F A DISK 31-MAR-26 1 1 YES TAG20260331T210000

# 检查最近的备份
RMAN> list backup of database completed after ‘sysdate-1’;# 检查归档日志备份
RMAN> list backup of archivelog all completed after ‘sysdate-1’;# 验证备份有效性
RMAN> validate backupset 1234;Starting validate at 31-MAR-26
using channel ORA_DISK_1
channel ORA_DISK_1: starting validation of datafile backup set
channel ORA_DISK_1: reading from backup piece /backup/rman/fgedudb_1234_1.bkp
channel ORA_DISK_1: piece handle=/backup/rman/fgedudb_1234_1.bkp tag=TAG20260331T200000
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: validation complete, elapsed time: 00:01:30
Finished validate at 31-MAR-26

3.1.4 监听器和会话检查

# 检查监听器状态
$ lsnrctl status

LSNRCTL for Linux: Version 19.0.0.0.0 – Production on 31-MAR-2026 22:00:00

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rac1)(PORT=1521)))
STATUS of the LISTENER
————————
Alias LISTENER
Version TNSLSNR for Linux: Version 19.0.0.0.0 – Production
Start Date 31-MAR-2026 19:00:00
Uptime 0 days 3 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /oracle/app/oracle/product/19c/db_1/network/admin/listener.ora
Listener Log File /oracle/app/oracle/diag/tnslsnr/rac1/listener/alert/log.xml
Listening Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=rac1)(PORT=1521)))
Services Summary…
Service “fgedudb” has 1 instance(s).
Instance “fgedudb”, status READY, has 1 handler(s) for this service…
The command completed successfully

# 检查当前会话数
SQL> select count(*) from v$session where type=’USER’;COUNT(*)
———-
25

# 检查活跃会话
SQL> select sid, serial#, username, status, machine, program
from v$session
where type=’USER’ and status=’ACTIVE’;SID SERIAL# USERNAME STATUS MACHINE PROGRAM
———- ———- ———- ——– ————— —————-
123 45 FGAPP_USER ACTIVE app-server-01 JDBC Thin Client
124 67 FGAPP_USER ACTIVE app-server-02 JDBC Thin Client

# 检查阻塞会话
SQL> select blocking_session, sid, serial#, username, status
from v$session
where blocking_session is not null;no rows selected

3.2 每周维护任务

3.2.1 统计信息收集

# 检查统计信息状态
SQL> select owner, table_name, last_analyzed
from dba_tables
where owner in (‘SYS’, ‘SYSTEM’, ‘FGAPP_USER’)
and last_analyzed < sysdate - 7 order by last_analyzed;OWNER TABLE_NAME LAST_ANALYZED ---------- ------------------------------ ------------------- FGAPP_USER ORDERS 2026-03-20 10:00:00 FGAPP_USER CUSTOMERS 2026-03-21 14:00:00 # 收集schema统计信息 SQL> exec dbms_stats.gather_schema_stats(
ownname => ‘FGAPP_USER’,
estimate_percent => dbms_stats.auto_sample_size,
method_opt => ‘for all columns size auto’,
cascade => true,
degree => 4,
no_invalidate => false
);PL/SQL procedure successfully completed.

# 收集字典统计信息
SQL> exec dbms_stats.gather_dictionary_stats;PL/SQL procedure successfully completed.

# 收集固定对象统计信息
SQL> exec dbms_stats.gather_fixed_objects_stats;PL/SQL procedure successfully completed.

3.2.2 索引检查和维护

# 检查索引状态
SQL> select owner, index_name, table_name, status
from dba_indexes
where status != ‘VALID’
and owner not in (‘SYS’, ‘SYSTEM’);no rows selected

# 检查索引使用情况
SQL> select owner, index_name, table_name, used
from dba_indexes di
left outer join v$object_usage voi
on di.owner = voi.owner
and di.index_name = voi.index_name
where di.owner in (‘FGAPP_USER’)
order by used nulls first;OWNER INDEX_NAME TABLE_NAME USED
———- ——————– ——————– —-
FGAPP_USER IDX_ORDERS_DATE ORDERS YES
FGAPP_USER IDX_CUST_EMAIL CUSTOMERS YES
FGAPP_USER IDX_ORDERS_CUSTID ORDERS NO

# 重建碎片严重的索引
SQL> alter index FGAPP_USER.IDX_ORDERS_DATE rebuild online;Index altered.

# 或者收缩索引
SQL> alter index FGAPP_USER.IDX_CUST_EMAIL coalesce;Index altered.

3.2.3 日志清理

# 检查跟踪文件大小
$ cd /oracle/app/oracle/diag/rdbms/fgedudb/fgedudb/trace
$ ls -lh *.trc | head -20

-rw-r—– 1 oracle oinstall 10M Mar 31 19:00 fgedudb_ora_1234.trc
-rw-r—– 1 oracle oinstall 5M Mar 31 20:00 fgedudb_ora_5678.trc

# 清理7天前的跟踪文件
$ find /oracle/app/oracle/diag/rdbms/fgedudb/fgedudb/trace -name “*.trc” -mtime +7 -delete

# 清理7天前的trm文件
$ find /oracle/app/oracle/diag/rdbms/fgedudb/fgedudb/trace -name “*.trm” -mtime +7 -delete

# 检查监听器日志
$ cd /oracle/app/oracle/diag/tnslsnr/rac1/listener/alert
$ ls -lh log.xml

-rw-r—– 1 oracle oinstall 50M Mar 31 22:00 log.xml

# 轮换监听器日志
$ lsnrctl set current_listener LISTENER
$ lsnrctl set log_status off
$ mv log.xml log_$(date +%Y%m%d).xml
$ lsnrctl set log_status on

# 清理旧的归档日志
RMAN> delete noprompt archivelog all completed before ‘sysdate-7’;released channel: ORA_DISK_1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=123 device type=DISK
List of Archived Log Copies for database with db_unique_name FGEDUDB
=====================================================================
Key Thrd Seq S Low Time
——- —- ——- – ———
1234 1 1001 A 31-MAR-26
1235 1 1002 A 31-MAR-26
Deleted 2 objects

3.3 每月维护任务

3.3.1 AWR性能分析

# 查看可用的AWR快照
SQL> select snap_id, begin_interval_time, end_interval_time
from dba_hist_snapshot
order by snap_id desc
fetch first 10 rows only;SNAP_ID BEGIN_INTERVAL_TIME END_INTERVAL_TIME
———- —————————– —————————–
1234 31-MAR-26 09.00.00.000 PM 31-MAR-26 10.00.00.000 PM
1233 31-MAR-26 08.00.00.000 PM 31-MAR-26 09.00.00.000 PM

# 生成AWR报告
SQL> @?/rdbms/admin/awrrpt.sql

Current Instance
~~~~~~~~~~~~~~~~

DB Id DB Name Inst Num Instance
———– ———— ——– ————
123456789 FGEDUDB 1 fgedudb

Specify the Report Type
~~~~~~~~~~~~~~~~~~~~~~~
Would you like an HTML report, or a plain text report?
Enter ‘html’ for an HTML report, or ‘text’ for plain text
Defaults to ‘html’
Enter value for report_type: html

Type Specified: html

Instances in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

DB Id Inst Num DB Name Instance Host
———— ——– ———— ———— ————
123456789 1 FGEDUDB fgedudb rac1

Using 123456789 for database Id
Using 1 for instance number

Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed. Pressing without
specifying a number lists all completed snapshots.

Enter value for num_days: 7

Listing the last 7 days of Completed Snapshots

Snap
Instance DB Name Snap Id Snap Started Level
———— ———— ——— —————— —–
fgedudb FGEDUDB 1200 24-Mar-26 00:00 1
fgedudb FGEDUDB 1201 24-Mar-26 01:00 1

Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 1230
Begin Snapshot Id specified: 1230

Enter value for end_snap: 1234
End Snapshot Id specified: 1234

Specify the Report Name
~~~~~~~~~~~~~~~~~~~~~~~
The default report file name is awrrpt_1_1230_1234.html. To use this name,
press to continue, otherwise enter an alternative.

Enter value for report_name: /backup/awr_report_20260331.html

Report written to /backup/awr_report_20260331.html

3.3.2 安全审计检查

# 检查审计设置
SQL> show parameter audit_trail;NAME TYPE VALUE
———————————— ———– ——————————
audit_trail string DB, EXTENDED

# 查看审计记录
SQL> select username, action_name, action, timestamp, returncode
from dba_audit_trail
where timestamp > sysdate – 30
order by timestamp desc
fetch first 20 rows only;USERNAME ACTION_NAME ACTION TIMESTAMP RETURNCODE
———- ——————– —— ——————- ———-
SYS LOGON 100 2026-03-31 22:00:00 0
FGAPP_USER SELECT 3 2026-03-31 21:50:00 0
SYS ALTER USER 145 2026-03-31 21:00:00 0

# 检查失败的登录
SQL> select username, userhost, timestamp, returncode
from dba_audit_trail
where action_name = ‘LOGON’
and returncode != 0
and timestamp > sysdate – 30;no rows selected

# 检查权限变更
SQL> select grantee, privilege, admin_option, grantor, timestamp
from dba_audit_trail
where action_name in (‘GRANT’, ‘REVOKE’)
and timestamp > sysdate – 30;no rows selected

3.4 季度维护任务

3.4.1 全面健康检查

# 使用ORAchk工具进行健康检查
# 下载并安装ORAchk
$ cd /backup
$ unzip orachk.zip
$ cd orachk

# 运行ORAchk
$ ./orachk

ORAchk Version : 19.3.0
Created on : 2026-03-31
Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.

Using ssh for checks
Checking prompts for user equivalence …
User equivalence OK for rac1
User equivalence OK for rac2

Enter which system(s) to check (comma separated list of host names).
List must include ‘localhost’ if local system is to be checked.
Using ‘rac1,rac2’ as target list.

Running checks on rac1,rac2

Collecting – AIX
Collecting – HP-UX
Collecting – Linux
Collecting – Solaris
Collecting – ZLinux
Collecting – Oracle Home
Collecting – Database

Report written to: /backup/orachk/orachk_fgedudb_rac1_rac2_20260331_220000.html

3.4.2 容量规划评估

# 检查表空间增长趋势
SQL> select to_char(creation_time, ‘YYYY-MM’) as month,
count(*) as num_files,
sum(bytes)/1024/1024/1024 as total_gb
from dba_data_files
group by to_char(creation_time, ‘YYYY-MM’)
order by month;MONTH NUM_FILES TOTAL_GB
——– ———- ———-
2026-01 5 15.2
2026-02 6 18.5
2026-03 8 25.3

# 检查数据增长历史
SQL> select snap_id,
to_char(begin_interval_time, ‘YYYY-MM-DD HH24:MI’) as snap_time,
(select sum(bytes) from dba_hist_tbspc_space_usage where snap_id = dh.snap_id)/1024/1024/1024 as total_gb
from dba_hist_snapshot dh
where begin_interval_time > sysdate – 30
order by snap_id desc
fetch first 10 rows only;SNAP_ID SNAP_TIME TOTAL_GB
———- —————— ———-
1234 2026-03-31 22:00 25.3
1233 2026-03-31 21:00 25.2
1232 2026-03-31 20:00 25.1

# 预测未来6个月的存储需求
# 根据历史增长率计算
# 当前月增长率:(25.3 – 18.5) / 2 = 3.4 GB/月
# 6个月预测:25.3 + 3.4 * 6 = 45.7 GB

Part04-生产案例与实战讲解

4.1 自动化脚本编写

4.1.1 每日检查脚本

#!/bin/bash
# daily_check.sh – Oracle数据库每日检查脚本
# from:www.itpux.com.qq113257174.wx:itpux-com
# web `http://www.fgedu.net.cn`

ORACLE_HOME=/oracle/app/oracle/product/19c/db_1
ORACLE_SID=fgedudb
export ORACLE_HOME ORACLE_SID
PATH=$ORACLE_HOME/bin:$PATH
DATE=$(date +%Y%m%d_%H%M%S)
LOG_DIR=/backup/daily_checks
LOG_FILE=$LOG_DIR/daily_check_$DATE.log

mkdir -p $LOG_DIR

echo “========================================” >> $LOG_FILE
echo “Oracle Daily Check – $DATE” >> $LOG_FILE
echo “========================================” >> $LOG_FILE

# 1. 检查数据库状态
echo “” >> $LOG_FILE
echo “1. Database Status Check” >> $LOG_FILE
echo “————————” >> $LOG_FILE
sqlplus -s / as sysdba << EOF >> $LOG_FILE
set pagesize 0 feedback off verify off heading off
select ‘Instance: ‘ || instance_name || ‘, Status: ‘ || status from v\$instance;select ‘Database: ‘ || name || ‘, Mode: ‘ || open_mode from v\$database;exit;EOF

# 2. 检查表空间使用
echo “” >> $LOG_FILE
echo “2. Tablespace Usage” >> $LOG_FILE
echo “——————-” >> $LOG_FILE
sqlplus -s / as sysdba << EOF >> $LOG_FILE
set linesize 100 pagesize 50
col tablespace_name for a20
col used_pct for 999.99
col total_mb for 999999
col free_mb for 999999
select tablespace_name,
round((1 – free/total)*100, 2) as used_pct,
round(total/1024/1024, 2) as total_mb,
round(free/1024/1024, 2) as free_mb
from (
select tablespace_name,
sum(bytes) as total,
sum(maxbytes) as max_total
from dba_data_files
group by tablespace_name
) a,
(
select tablespace_name,
sum(bytes) as free
from dba_free_space
group by tablespace_name
) b
where a.tablespace_name = b.tablespace_name
order by used_pct desc;exit;EOF

# 3. 检查告警日志错误
echo “” >> $LOG_FILE
echo “3. Alert Log Errors” >> $LOG_FILE
echo “——————–” >> $LOG_FILE
ALERT_LOG=$(sqlplus -s / as sysdba << EOF set pagesize 0 feedback off verify off heading off select value from v\$diag_info where name = 'Diag Trace';exit;EOF) grep "ORA-" $ALERT_LOG/alert_$ORACLE_SID.log | tail -20 >> $LOG_FILE

# 4. 检查备份状态
echo “” >> $LOG_FILE
echo “4. Backup Status” >> $LOG_FILE
echo “—————–” >> $LOG_FILE
rman target / << EOF >> $LOG_FILE
list backup summary;exit;EOF

echo “” >> $LOG_FILE
echo “========================================” >> $LOG_FILE
echo “Daily Check Completed – $DATE” >> $LOG_FILE
echo “========================================” >> $LOG_FILE

# 发送邮件通知
# mail -s “Oracle Daily Check – $DATE” dba@fgedu.net.cn < $LOG_FILE echo "Daily check completed. Log file: $LOG_FILE"

4.1.2 设置定时任务

# 编辑crontab
$ crontab -e

# 添加每日检查任务(每天早上8点执行)
0 8 * * * /backup/scripts/daily_check.sh >> /backup/scripts/daily_check_cron.log 2>&1

# 添加每周统计信息收集(每周日凌晨2点)
0 2 * * 0 /backup/scripts/gather_stats.sh >> /backup/scripts/gather_stats_cron.log 2>&1

# 添加每周日志清理(每周日凌晨3点)
0 3 * * 0 /backup/scripts/cleanup_logs.sh >> /backup/scripts/cleanup_logs_cron.log 2>&1

# 查看当前crontab
$ crontab -l

0 8 * * * /backup/scripts/daily_check.sh >> /backup/scripts/daily_check_cron.log 2>&1
0 2 * * 0 /backup/scripts/gather_stats.sh >> /backup/scripts/gather_stats_cron.log 2>&1
0 3 * * 0 /backup/scripts/cleanup_logs.sh >> /backup/scripts/cleanup_logs_cron.log 2>&1

4.2 健康检查实战

4.2.1 使用SQL脚本进行健康检查

— health_check.sql – 数据库健康检查脚本
set echo off
set feedback off
set linesize 120
set pagesize 50
set serveroutput on

spool /backup/health_check_$(date +%Y%m%d).log

prompt ========================================
prompt Oracle Database Health Check
prompt ========================================
prompt

prompt 1. Database Information
prompt ————————
col host_name for a20
col instance_name for a15
col database_role for a15
col open_mode for a20
select i.host_name,
i.instance_name,
i.status,
d.database_role,
d.open_mode,
d.log_mode
from v$instance i, v$database d;prompt
prompt 2. Tablespace Usage (Critical > 90%)
prompt ————————————-
col tablespace_name for a25
col used_pct for 999.99
col total_gb for 9999.99
col free_gb for 9999.99
select tablespace_name,
round((1 – free/total)*100, 2) as used_pct,
round(total/1024/1024/1024, 2) as total_gb,
round(free/1024/1024/1024, 2) as free_gb
from (
select tablespace_name,
sum(bytes) as total,
sum(nvl(maxbytes, bytes)) as max_total
from dba_data_files
group by tablespace_name
) a,
(
select tablespace_name,
sum(bytes) as free
from dba_free_space
group by tablespace_name
) b
where a.tablespace_name = b.tablespace_name
order by used_pct desc;prompt
prompt 3. Invalid Objects
prompt ——————
col owner for a15
col object_type for a20
col object_name for a30
select owner, object_type, object_name, status
from dba_objects
where status != ‘VALID’
and owner not in (‘SYS’, ‘SYSTEM’)
order by owner, object_type, object_name;prompt
prompt 4. Locked Objects
prompt —————–
col owner for a15
col object_name for a30
col oracle_username for a15
col os_user_name for a15
select lo.oracle_username,
lo.os_user_name,
do.owner,
do.object_name,
do.object_type,
lo.locked_mode
from v$locked_object lo, dba_objects do
where lo.object_id = do.object_id;prompt
prompt 5. Long Running Sessions
prompt ————————
col username for a15
col status for a10
col machine for a20
col program for a30
col sql_id for a15
select s.username,
s.status,
s.machine,
s.program,
s.sql_id,
s.last_call_et as sec_running,
round(s.last_call_et/60, 2) as min_running
from v$session s
where s.type = ‘USER’
and s.status = ‘ACTIVE’
and s.last_call_et > 600
order by s.last_call_et desc;prompt
prompt ========================================
prompt Health Check Completed
prompt ========================================

spool off

4.3 告警处理流程

4.3.1 告警响应流程

# 告警处理流程图
1. 收到告警通知

2. 确认告警真实性

3. 评估影响范围

4. 确定处理优先级

5. 执行故障排查

6. 实施解决方案

7. 验证恢复正常

8. 记录处理过程

9. 根因分析

10. 预防措施

# 告警级别定义
P1 – 严重:数据库不可用,业务完全中断
P2 – 高:严重性能问题,部分业务受影响
P3 – 中:一般性能问题或警告
P4 – 低:信息性告警,无业务影响

# 响应时间要求
P1:15分钟内响应,2小时内恢复
P2:30分钟内响应,4小时内恢复
P3:2小时内响应,24小时内处理
P4:24小时内响应,按需处理

Part05-风哥经验总结与分享

5.1 最佳实践

  • 自动化优先:尽可能自动化日常维护任务,减少人为错误
  • 文档齐全:维护操作文档和故障处理手册
  • 定期演练:定期进行灾难恢复和故障处理演练
  • 监控完善:建立完善的监控告警体系
  • 备份验证:定期验证备份的有效性
  • 变更管理:所有变更都要有记录和回滚计划
  • 容量规划:提前3-6个月进行容量规划
  • 安全加固:定期进行安全审计和加固

5.2 维护检查清单

任务
频率
负责人
状态

检查数据库状态
每日
DBA

检查表空间使用
每日
DBA

检查告警日志
每日
DBA

检查备份状态
每日
DBA

收集统计信息
每周
DBA

清理过期日志
每周
DBA

分析AWR报告
每周
DBA

安全审计检查
每月
DBA/Security

全面健康检查
季度
DBA

容量规划评估
季度
DBA

灾难恢复演练
半年
DBA/IT

5.3 常见问题与解决

5.3.1 表空间满

# 问题:表空间使用率超过90%
# 解决方案:

# 1. 检查是哪个表空间
SQL> select tablespace_name,
round((1 – free/total)*100, 2) as used_pct
from (
select tablespace_name,
sum(bytes) as total
from dba_data_files
group by tablespace_name
) a,
(
select tablespace_name,
sum(bytes) as free
from dba_free_space
group by tablespace_name
) b
where a.tablespace_name = b.tablespace_name
and (1 – free/total)*100 > 90;# 2. 添加数据文件
SQL> alter tablespace USERS add datafile ‘/oracle/app/oracle/oradata/fgedudb/users02.dbf’ size 10G autoextend on next 1G maxsize 32G;Tablespace altered.

# 3. 或者扩展现有数据文件
SQL> alter database datafile ‘/oracle/app/oracle/oradata/fgedudb/users01.dbf’ resize 20G;Database altered.

# 4. 启用自动扩展
SQL> alter database datafile ‘/oracle/app/oracle/oradata/fgedudb/users01.dbf’ autoextend on next 1G maxsize 32G;Database altered.

5.3.2 归档日志满

# 问题:归档目录满导致数据库挂起
# 解决方案:

# 1. 检查归档目录空间
$ df -h /backup/archivelog
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 500G 490G 10G 98% /backup/archivelog

# 2. 备份归档日志
RMAN> backup archivelog all delete input;# 3. 或者直接删除旧的归档日志
RMAN> delete noprompt archivelog all completed before ‘sysdate-7’;# 4. 验证数据库恢复正常
SQL> alter system archive log current;System altered.

风哥提示:日常维护工作看似繁琐,但却是数据库稳定运行的基础。建立完善的维护流程和自动化脚本,可以大幅减轻DBA的工作负担。学习交流加群风哥微信: itpux-com

生产环境建议:在执行任何维护操作前,先在测试环境验证,确保操作的安全性。同时,维护操作最好安排在业务低峰期进行。

风哥提示:监控告警是日常维护的重要组成部分,及时发现问题可以避免小问题演变成大故障。建议配置多级告警,不同级别的问题通知不同的人员。更多学习教程公众号风哥教程itpux_com

注意事项:自动化脚本虽然方便,但也要定期检查脚本执行结果,防止脚本失效导致问题没有及时发现。

风哥提示:维护工作要有记录,每次维护操作都要记录在维护日志中,包括操作内容、时间、人员、结果等信息,便于后续审计和问题追溯。学习交流加群风哥QQ113257174

持续改进:定期review维护流程,根据实际情况调整维护计划,使维护工作更加高效和完善。

风哥提示:请根据实际情况调整配置和参数,确保生产环境的安全性和稳定性。学习交流加群风哥QQ113257174

更多学习教程公众号风哥教程itpux_com

from oracle:www.itpux.com

本文由风哥教程整理发布,仅用于学习测试使用,转载注明出处:http://www.fgedu.net.cn/10327.html

联系我们

在线咨询:点击这里给我发消息

微信号:itpux-com

工作日:9:30-18:30,节假日休息