风哥教程

培训 . 交流 . 分享
Make progress together!

Oracle大数据量分区表统计信息的管理(DBMS_STATS.COPY_TABLE_STATS)

[复制链接]
内容发布:风哥| 发布时间:2014-2-10 11:09:06
Oracle大数据量分区表统计信息的管理(DBMS_STATS.COPY_TABLE_STATS)

对于大数据量分区表,当数据加载后,如果不能及时更新统计信息,将导致sql产生不正确的执行计划,引起查询性能的下降,而对于一个数据加载比较频繁的分区表,往往无法在每次加载数据后及时收集统计信息,而且对于数据量比较大的表来说,收集统计信息本身就是比较耗费资源的操作。
这种情况下,可以采用DBMS_STATS.COPY_TABLE_STATS这个存储过程来将一个分区的
统计信息复制到另外一个分区,从而避免由于无法及时更新统计信息而导致的sql执行计划出现问题。

关于DBMS_STATS.COPY_TABLE_STATS的介绍
Statistics can be copied from one partition of a table to another partition using DBMS_STATS.COPY_TABLE_STATS.
Statistics of all dependent object such as columns and local indexes will also be copied as will histogram information for all columns apart from the partition key column.
The histogram information for the partition key cannot be copied as we do not know the exact values and mapping of the partition key.

Statistics can optionally be scaled based on the given scale_factor.

The definition of the procedure is as follows:
DBMS_STATS.COPY_TABLE_STATS (
   ownname    VARCHAR2,
   tabname    VARCHAR2,
   srcpartnameVARCHAR2,
   dstpartnameVARCHAR2,
   scale_factor     VARCHAR2 DEFAULT 1,
   flagsNUMBER DEFAULT,
   forceBOOLEAN DEFAULT FALSE);

Where:
· Ownname: Name of the schema
· Tabname:  Table name of source and destination [sub] partition
· Srcpartname: Source [sub] partition
· Dtspartname: Destination [sub] partition
· Scale factor:   To scale nblks, nrows etc. in destination partition
· Flags:  For internal Oracle use (should be left as NULL)
· Force:  When TRUE, copy statistics will be copied even when statisitcs are locked by DBMS_STATS.LOCK_*_STATISTICS

Note: If there are no statistics present for the source partition then nothing is copied.
Note:
In 10g, this procedure is FOR INTERNAL USE ONLY.  Use of the procedure in 10g may work but is not supported due to this.
In 11g the procedure is externalised and is supported.

我们下面来看一个关于DBMS_STATS.COPY_TABLE_STATS的测试:
SQL>
create or replace function display_raw(rawval raw, type varchar2)
  return varchar2 is
  cn  number;
  cv  varchar2(32);
  cd  date;
  cnv nvarchar2(32);
  cr  rowid;
  cc  char(32);
begin
  if (type = 'NUMBER') then
    dbms_stats.convert_raw_value(rawval, cn);
    return to_char(cn);
  elsif (type = 'VARCHAR2') then
    dbms_stats.convert_raw_value(rawval, cv);
    return to_char(cv);
  elsif (type = 'DATE') then
    dbms_stats.convert_raw_value(rawval, cd);
    return to_char(cd);
  elsif (type = 'NVARCHAR2') then
    dbms_stats.convert_raw_value(rawval, cnv);
    return to_char(cnv);
  elsif (type = 'ROWID') then
    dbms_stats.convert_raw_value(rawval, cr);
    return to_char(cnv);
  elsif (type = 'CHAR') then
    dbms_stats.convert_raw_value(rawval, cc);
    return to_char(cc);
  else
    return 'UNKNOWN DATATYPE';
  end if;
end;
/

Function created.

SQL> set lines 132 pages 100
SQL> col COLUMN_NAME for a20
SQL> col PARTITION_NAME for a10
SQL> col low_val for a30
SQL> col high_val for a30
SQL> col data_type for a20
SQL> alter session set nls_date_format='yyyymmdd hh24:mi:ss';

Session altered.

SQL> SELECT TABLE_NAME,PARTITION_NAME,NUM_ROWS,TO_CHAR(LAST_ANALYZED,'YYYYMMDD HH24:MI:SS') FROM USER_TAB_PARTITIONS WHERE TABLE_NAME='SALES';


TABLE_NAME     PARTITION_NAME   NUM_ROWS TO_CHAR(LAST_ANAL
------------------------------ ------------------------------ ---------- -----------------
SALES    P201301 99920 20140207 20:50:59
SALES    P201302 96640 20140207 20:50:59
SALES    P201303100680 20140207 20:50:59
SALES    P201304100500 20140207 20:50:59
SALES    P201305 99780 20140207 20:50:59
SALES    P201306  0 20140207 20:50:59
SALES    P201307  0 20140207 20:50:59
SALES    P201308  0 20140207 20:50:59
SALES    P201309  0 20140207 20:50:59
SALES    P201310  0 20140207 20:50:59
SALES    P201311  0 20140207 20:50:59
SALES    P201312  0 20140207 20:50:59

12 rows selected.

SQL>
select a.column_name,
a.partition_name,
a.num_distinct,
display_raw(a.low_value, b.data_type) as low_val,
display_raw(a.high_value, b.data_type) as high_val,
b.data_type
  from dba_part_col_statistics a, dba_tab_cols b
where a.owner = 'SYS'
   and b.owner = 'SYS'
   and a.table_name = 'SALES'
   and a.table_name = b.table_name
   and a.column_name = b.column_name
   and a.column_name = 'TIME_ID'
   and a.low_value is not null
order by 1, 2
/

COLUMN_NAME    PARTITION_ NUM_DISTINCT LOW_VAL  HIGH_VAL DATA_TYPE
-------------------- ---------- ------------ ------------------------------ ------------------------------
TIME_ID  P201301   1 20130110 00:00:00  20130110 00:00:00  DATE
TIME_ID  P201302   1 20130210 00:00:00  20130210 00:00:00  DATE
TIME_ID  P201303   1 20130310 00:00:00  20130310 00:00:00  DATE
TIME_ID  P201304   1 20130410 00:00:00  20130410 00:00:00  DATE
TIME_ID  P201305   1 20130510 00:00:00  20130510 00:00:00  DATE

12 rows selected.

SQL> EXEC DBMS_STATS.COPY_TABLE_STATS (user, 'SALES', 'p201305', 'p201306', FORCE=>TRUE);

PL/SQL procedure successfully completed.

SQL>
select a.column_name,
       a.partition_name,
       a.num_distinct,
       display_raw(a.low_value, b.data_type) as low_val,
       display_raw(a.high_value, b.data_type) as high_val,
       b.data_type
  from dba_part_col_statistics a, dba_tab_cols b
where a.owner = 'SYS'
   and b.owner = 'SYS'
   and a.table_name = 'SALES'
   and a.table_name = b.table_name
   and a.column_name = b.column_name
   and a.column_name = 'TIME_ID'
   and a.low_value is not null
order by 1, 2
/

COLUMN_NAME    PARTITION_ NUM_DISTINCT LOW_VAL  HIGH_VAL DATA_TYPE
-------------------- ---------- ------------ ------------------------------ ------------------------------
TIME_ID  P201301   1 20130110 00:00:00  20130110 00:00:00  DATE
TIME_ID  P201302   1 20130210 00:00:00  20130210 00:00:00  DATE
TIME_ID  P201303   1 20130310 00:00:00  20130310 00:00:00  DATE
TIME_ID  P201304   1 20130410 00:00:00  20130410 00:00:00  DATE
TIME_ID  P201305   1 20130510 00:00:00  20130510 00:00:00  DATE
TIME_ID  P201306   1 20130601 00:00:00  20130701 00:00:00  DATE

6 rows selected.

SQL>
SELECT TABLE_NAME,PARTITION_NAME,NUM_ROWS,TO_CHAR(LAST_ANALYZED,'YYYYMMDD HH24:MI:SS') FROM USER_TAB_PARTITIONS WHERE TABLE_NAME='SALES';

TABLE_NAME     PARTITION_   NUM_ROWS TO_CHAR(LAST_ANAL
------------------------------ ---------- ---------- -----------------
SALES    P201301  100020 20140207 08:42:09
SALES    P201302   98820 20140207 08:42:09
SALES    P201303   98960 20140207 08:42:09
SALES    P201304  100520 20140207 08:42:09
SALES    P201305  100280 20140207 08:42:09
SALES    P201306  100280 20140207 08:42:09
SALES    P201307  0 20140207 08:42:09
SALES    P201308  0 20140207 08:42:09
SALES    P201309  0 20140207 08:42:09
SALES    P201310 0 20140207 08:42:09
SALES    P201311  0 20140207 08:42:09
SALES    P201312  0 20140207 08:42:09

12 rows selected.

SQL> EXEC DBMS_STATS.COPY_TABLE_STATS (user, 'SALES', 'p201305', 'p201307', FORCE=>TRUE);

PL/SQL procedure successfully completed.

SQL>
select a.column_name,
       a.partition_name,
       a.num_distinct,
       display_raw(a.low_value, b.data_type) as low_val,
       display_raw(a.high_value, b.data_type) as high_val,
       b.data_type
  from dba_part_col_statistics a, dba_tab_cols b
where a.owner = 'SYS'
   and b.owner = 'SYS'
   and a.table_name = 'SALES'
   and a.table_name = b.table_name
   and a.column_name = b.column_name
   and a.column_name = 'TIME_ID'
   and a.low_value is not null
order by 1, 2
/
COLUMN_NAME  PARTITION_ NUM_DISTINCT LOW_VAL  HIGH_VAL DATA_TYPE
-------------------- ---------- ------------ ------------------------------ ------------------------------
TIME_ID  P201301   1 20130110 00:00:00  20130110 00:00:00  DATE
TIME_ID  P201302   1 20130210 00:00:00  20130210 00:00:00  DATE
TIME_ID  P201303   1 20130310 00:00:00  20130310 00:00:00  DATE
TIME_ID  P201304   1 20130410 00:00:00  20130410 00:00:00  DATE
TIME_ID  P201305   1 20130510 00:00:00  20130510 00:00:00  DATE
TIME_ID  P201306   1 20130601 00:00:00  20130701 00:00:00  DATE
TIME_ID  P201307   1 20130701 00:00:00  20130801 00:00:00  DATE

7 rows selected.

SQL> SELECT TABLE_NAME,PARTITION_NAME,NUM_ROWS,TO_CHAR(LAST_ANALYZED,'YYYYMMDD HH24:MI:SS') FROM USER_TAB_PARTITIONS WHERE TABLE_NAME='SALES';

TABLE_NAME     PARTITION_   NUM_ROWS TO_CHAR(LAST_ANAL
------------------------------ ---------- ---------- -----------------
SALES    P201301  100020 20140207 08:42:09
SALES    P201302   98820 20140207 08:42:09
SALES    P201303   98960 20140207 08:42:09
SALES    P201304  100520 20140207 08:42:09
SALES    P201305  100280 20140207 08:42:09
SALES    P201306  100280 20140207 08:42:09
SALES    P201307  100280 20140207 08:42:09
SALES    P201308  0 20140207 08:42:09
SALES    P201309  0 20140207 08:42:09
SALES    P201310  0 20140207 08:42:09
SALES    P201311  0 20140207 08:42:09
SALES    P201312  0 20140207 08:42:09

12 rows selected.

可以看出,对于分区键所在的列,oracle自动将列的最大值和最小值置为分区边界值,对于其它列的统计信息则是直接复制,虽然这不能完全反映数据分布的真实情况,但能最大程度的接近真实的统计信息。

以下我们看一个官方的例子:
Example

Setup
REM CREATE TABLE
REM ============
CREATE TABLE sales_range
(salesman_id NUMBER(5),
salesman_name VARCHAR2(30),
sales_amount NUMBER(10),
sales_date DATE)
PARTITION BY RANGE(sales_date)
(
PARTITION sales_jan2000 VALUES LESS THAN(TO_DATE('02/01/2000','DD/MM/YYYY')),
PARTITION sales_feb2000 VALUES LESS THAN(TO_DATE('03/01/2000','DD/MM/YYYY')),
PARTITION sales_mar2000 VALUES LESS THAN(TO_DATE('04/01/2000','DD/MM/YYYY')),
PARTITION sales_apr2000 VALUES LESS THAN(TO_DATE('05/01/2000','DD/MM/YYYY'))
);
REM CREATE LOCAL INDEX
REM ==================
create index salesman_id_ix on sales_range (salesman_id) local;
REM POPULATE TABLE
REM ===============
insert into sales_range values(10,'SCOTT',1000,'01-JAN-2000');
insert into sales_range values(20,'SMITH',1200,'01-JAN-2000');
insert into sales_range values(30,'ALLEN',1300,'01-JAN-2000');
commit;
REM GATHER STATISTICS
REM ==================
exec dbms_stats.gather_table_stats(ownname=>'', tabname=> 'sales_range', granularity=>'ALL');
· Table SALES_RANGE has been created with 3 partitions containing a local index.
· Data has been inserted into partition SALES_JAN2000 and statistics gathered on all partitions

SQL> select count(*) from sales_range partition (SALES_JAN2000);
  COUNT(*)
----------
         3

SQL> select count(*) from sales_range partition (SALES_FEB2000);
  COUNT(*)
----------
         0

SQL>  select num_rows,AVG_ROW_LEN,BLOCKS,PARTITION_NAME from user_tab_partitions where table_name='SALES_RANGE';
  NUM_ROWS AVG_ROW_LEN     BLOCKS PARTITION_NAME
---------- ----------- ---------- ------------------------------
         3          20         46 SALES_JAN2000
         0           0          0 SALES_FEB2000
         0           0          0 SALES_MAR2000
         0           0          0 SALES_APR2000

SQL> select partition_name, num_rows,leaf_blocks,avg_leaf_blocks_per_key from  user_ind_partitions where index_name='SALESMAN_ID_IX';
PARTITION_NAME                   NUM_ROWS LEAF_BLOCKS AVG_LEAF_BLOCKS_PER_KEY
------------------------------ ---------- ----------- -----------------------
SALES_JAN2000                           3           1                       1
SALES_FEB2000                           0           0                       0
SALES_MAR2000                           0           0                       0
SALES_APR2000                           0           0                       0
· SALES_JAN2000 shows a count of  3 rows whereas SALES_FEB2000 shows 0 rows
· USER_TAB_PARTITIONS and USER_TAB_INDEXES show non-zero information for partition SALES_JAN2000 that has been populated with data
· The other partitions that do not contain data show all statisitc information as 0.
Copying Statistics

SQL> EXEC DBMS_STATS.COPY_TABLE_STATS ('','SALES_RANGE','SALES_JAN2000','SALES_FEB2000',2);
PL/SQL procedure successfully completed.
Statistics are copied from SALES_JAN2000 partition to SALES_FEB2000 partition using a scale-factor of 2.
SQL> select num_rows,AVG_ROW_LEN,BLOCKS,PARTITION_NAME from user_tab_partitions where table_name='SALES_RANGE';
  NUM_ROWS AVG_ROW_LEN     BLOCKS PARTITION_NAME
---------- ----------- ---------- ------------------------------
         3          20         46 SALES_JAN2000
         6          20         92 SALES_FEB2000
         0           0          0 SALES_MAR2000
         0           0          0 SALES_APR2000

SQL> select partition_name, num_rows,leaf_blocks,avg_leaf_blocks_per_key from  user_ind_partitions where index_name='SALESMAN_ID_IX';
PARTITION_NAME                   NUM_ROWS LEAF_BLOCKS AVG_LEAF_BLOCKS_PER_KEY
------------------------------ ---------- ----------- -----------------------
SALES_JAN2000                           3           1                       1
SALES_FEB2000                           6           2                       2
SALES_MAR2000                           0           0                       0
SALES_APR2000                           0           0                       0
USER_TAB_PARTITIONS and USER_TAB_INDEXES  now show non-zero information for partition SALES_FEB2000.
As we have used a scale-factor of 2, the copied statistics have been mutliplied by a factor of 2.



上一篇:Oracle数据库收集优化统计信息数据(Optimizer Statistics)的最佳实践方法
下一篇:Oracle分区表的维护和索引的相关注意事项
专业提供Oracle/MySQL/NoSQL/Linux数据库培训与技术支持服务,QQ号:113257174
关注风哥教程微信公众号itpux_com  ,了解本站最新技术资料的分享.

欢迎加QQ群,提供超多高质量Oracle/Unix/Linux技术文档与视频教程的下载。

Oracle/MySQL/Linux群4-5:189070296  150201289  
Oracle/MySQL/Linux群6-8:244609803   522261684   522651731
备注:请勿重复加群,另请注明 from itpux
回复

使用道具 举报

内容发布:transdeli| 发布时间:2016-7-5 16:50:12
太好了 太实用了太好了 太实用了
回复 支持 反对

使用道具 举报

内容发布:刘元俊| 发布时间:2014-2-15 12:26:28
不错的帖子,顶
回复 支持 反对

使用道具 举报

1框架
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

热门文章教程

  • 大数据技术与应用入门培训教程(电子版下载
  • Oracle 12cR2 九大新功能全面曝光_详解云数
  • Oracle OCP认证考试IZ0-053题库共712题数据
  • MySQL5权威指南(第3版)PDF电子版下载
  • 风哥Oracle数据库巡检工具V1.0(附2.6网页
  • Oracle19c数据库发布与下载地址
快速回复 返回顶部 返回列表