zl程序教程

您现在的位置是:首页 >  后端

当前栏目

OracleCBO优化模式中的5种索引访问方法浅析

方法索引模式 优化 访问 浅析 oraclecbo
2023-06-13 09:15:38 时间

本文主要讨论以下几种索引访问方法:

1.索引唯一扫描(INDEXUNIQUESCAN)
2.索引范围扫描(INDEXRANGESCAN)
3.索引全扫描(INDEXFULLSCAN)
4.索引跳跃扫描(INDEXSKIPSCAN)
5.索引快速全扫描(INDEXFASTFULLSCAN)

索引唯一扫描(INDEXUNIQUESCAN)

通过这种索引访问数据的特点是对于某个特定的值只返回一行数据,通常如果在查询谓语中使用UNIQE和PRIMARYKEY索引的列作为条件的时候会选用这种扫描;访问的高度总是索引的高度加一,除了某些特殊的情况,如另外存储的LOB对象。

复制代码代码如下:


SQL>setautotracetraceonlyexplain
SQL>select*fromhr.employeeswhereemployee_id=100;

ExecutionPlan
----------------------------------------------------------
Planhashvalue:1833546154

---------------------------------------------------------------------------------------------
|Id |Operation                  |Name         |Rows |Bytes|Cost(%CPU)|Time    |
---------------------------------------------------------------------------------------------
|  0|SELECTSTATEMENT           |              |    1|   69|    1  (0)|00:00:01|
|  1| TABLEACCESSBYINDEXROWID|EMPLOYEES    |    1|   69|    1  (0)|00:00:01|
|* 2|  INDEXUNIQUESCAN        |EMP_EMP_ID_PK|    1|      |    0  (0)|00:00:01|
---------------------------------------------------------------------------------------------

PredicateInformation(identifiedbyoperationid):
---------------------------------------------------

  2-access("EMPLOYEE_ID"=100)

索引范围扫描(INDEXRANGESCAN)

谓语中包含将会返回一定范围数据的条件时就会选用索引范围扫描,索引可以是唯一的亦可以是不唯一的;所指定的条件可以是(<,>,LIKE,BETWEEN,=)等运算符,不过使用LIKE的时候,如果使用了通配符%,极有可能就不会使用范围扫描,因为条件过于的宽泛了,下面是一个示例:

复制代码代码如下:


SQL>select*fromhr.employeeswhereDEPARTMENT_ID=30;

6rowsselected.

ExecutionPlan
----------------------------------------------------------
Planhashvalue:2056577954

-------------------------------------------------------------------------------------------------
|Id |Operation                  |Name             |Rows |Bytes|Cost(%CPU)|Time    |
-------------------------------------------------------------------------------------------------
|  0|SELECTSTATEMENT           |                  |    6|  414|    2  (0)|00:00:01|
|  1| TABLEACCESSBYINDEXROWID|EMPLOYEES        |    6|  414|    2  (0)|00:00:01|
|* 2|  INDEXRANGESCAN         |EMP_DEPARTMENT_IX|    6|      |    1  (0)|00:00:01|
-------------------------------------------------------------------------------------------------

PredicateInformation(identifiedbyoperationid):
---------------------------------------------------

  2-access("DEPARTMENT_ID"=30)

Statistics
----------------------------------------------------------
         8 recursivecalls
         0 dbblockgets
         7 consistentgets
         1 physicalreads
         0 redosize
      1716 bytessentviaSQL*Nettoclient
       523 bytesreceivedviaSQL*Netfromclient
         2 SQL*Netroundtripsto/fromclient
         0 sorts(memory)
         0 sorts(disk)
         6 rowsprocessed

范围扫描的条件需要准确的分析返回数据的数目,范围越大就越可能执行全表扫描;

复制代码代码如下:
SQL>selectdepartment_id,count(*)fromhr.employeesgroupbydepartment_idorderbycount(*);

DEPARTMENT_ID  COUNT(*)
-----------------------
          10         1
          40         1
                      1
          70         1
          20         2
         110         2
          90         3
          60         5
          30         6
         100         6
          80        34
          50        45

12rowsselected.

--这里使用数值最多的50来执行范围扫描
SQL>setautotracetraceonlyexplain
SQL>select*fromhr.employeeswhereDEPARTMENT_ID=50;

45rowsselected.

ExecutionPlan
----------------------------------------------------------
Planhashvalue:1445457117

-------------------------------------------------------------------------------
|Id |Operation        |Name     |Rows |Bytes|Cost(%CPU)|Time    |
-------------------------------------------------------------------------------
|  0|SELECTSTATEMENT |          |   45| 3105|    3  (0)|00:00:01|
|* 1| TABLEACCESSFULL|EMPLOYEES|   45| 3105|    3  (0)|00:00:01|
-------------------------------------------------------------------------------

PredicateInformation(identifiedbyoperationid):
---------------------------------------------------

  1-filter("DEPARTMENT_ID"=50)

Statistics
----------------------------------------------------------
         0 recursivecalls
         0 dbblockgets
        10 consistentgets
         0 physicalreads
         0 redosize
      4733 bytessentviaSQL*Nettoclient
       545 bytesreceivedviaSQL*Netfromclient
         4 SQL*Netroundtripsto/fromclient
         0 sorts(memory)
         0 sorts(disk)
        45 rowsprocessed

可以看到在获取范围数据较大的时候,优化器还是执行了全表扫描方法。

一种对于索引范围扫描的优化方法是使用升序排列的索引来获得降序排列的数据行,这种情况多发生在查询中包含有索引列上的ORDERBY子句的时候,这样就可避免一次排序操作了,如下:

复制代码代码如下:
SQL>setautotracetraceonlyexplain
SQL>select*fromhr.employees
 2 wheredepartment_idin(90,100)
 3 orderbydepartment_iddesc;

 ExecutionPlan
----------------------------------------------------------
Planhashvalue:3707994525

---------------------------------------------------------------------------------------------------
|Id |Operation                    |Name             |Rows |Bytes|Cost(%CPU)|Time    |
---------------------------------------------------------------------------------------------------
|  0|SELECTSTATEMENT             |                  |    9|  621|    2  (0)|00:00:01|
|  1| INLISTITERATOR             |                  |      |      |           |         |
|  2|  TABLEACCESSBYINDEXROWID|EMPLOYEES        |    9|  621|    2  (0)|00:00:01|
|* 3|   INDEXRANGESCANDESCENDING|EMP_DEPARTMENT_IX|    9|      |    1  (0)|00:00:01|
---------------------------------------------------------------------------------------------------

PredicateInformation(identifiedbyoperationid):
---------------------------------------------------

  3-access("DEPARTMENT_ID"=90OR"DEPARTMENT_ID"=100)

上例中,索引条目被相反的顺序读取,避免了排序操作。

索引全扫描(INDEXFULLSCAN)

索引全扫描的操作将会扫描索引结构的每一个叶子块,读取每个条目的的行编号,并取出数据行,既然是访问每一个索引叶子块,那么它相对的全表扫描的优势在哪里呢?实际上在索引块中因为包含的信息列数较少,通常都是索引键和ROWID,所以对于同一个数据块和索引块,包含的索引键的条目数通常都是索引块中居多,因此如果查询字段列表中所有字段都是索引的一部分的时候,就可以完全跳过对表数据的访问了,这种情况索引全扫描的方法会获得更高的效率。

发生索引全扫描的情况有很多,几种典型的场景:

1,查询总缺少谓语,但获取的列可以通过索引直接获得

复制代码代码如下:
SQL>selectemailfromhr.employees;

ExecutionPlan
----------------------------------------------------------
Planhashvalue:2196514524

---------------------------------------------------------------------------------
|Id |Operation       |Name        |Rows |Bytes|Cost(%CPU)|Time    |
---------------------------------------------------------------------------------
|  0|SELECTSTATEMENT|             |  107|  856|    1  (0)|00:00:01|
|  1| INDEXFULLSCAN|EMP_EMAIL_UK|  107|  856|    1  (0)|00:00:01|
---------------------------------------------------------------------------------

2,查询谓语中包含一个位于索引中非引导列上的条件(其实也取决于引导列值的基数大小,如果引导列的唯一值较少,也可能出现跳跃扫描的情况)

复制代码代码如下:
SQL>selectfirst_name,last_namefromhr.employees
 2 wherefirst_namelike"A%";

ExecutionPlan
----------------------------------------------------------
Planhashvalue:2228653197

--------------------------------------------------------------------------------
|Id |Operation       |Name       |Rows |Bytes|Cost(%CPU)|Time    |
--------------------------------------------------------------------------------
|  0|SELECTSTATEMENT|            |    3|   45|    1  (0)|00:00:01|
|* 1| INDEXFULLSCAN|EMP_NAME_IX|    3|   45|    1  (0)|00:00:01|
--------------------------------------------------------------------------------

PredicateInformation(identifiedbyoperationid):
---------------------------------------------------

  1-access("FIRST_NAME"LIKE"A%")
      filter("FIRST_NAME"LIKE"A%")

SQL>SETLONG2000000
SQL>selectdbms_metadata.get_ddl("INDEX","EMP_NAME_IX","HR")fromdual;

DBMS_METADATA.GET_DDL("INDEX","EMP_NAME_IX","HR")
--------------------------------------------------------------------------------

 CREATEINDEX"HR"."EMP_NAME_IX"ON"HR"."EMPLOYEES"("LAST_NAME","FIRST_NAME"
)
 PCTFREE10INITRANS2MAXTRANS255NOLOGGINGCOMPUTESTATISTICS
 STORAGE(INITIAL65536NEXT1048576MINEXTENTS1MAXEXTENTS2147483645
 PCTINCREASE0FREELISTS1FREELISTGROUPS1BUFFER_POOLDEFAULTFLASH_CACHEDE
FAULTCELL_FLASH_CACHEDEFAULT)
 TABLESPACE"EXAMPLE"
--可以看到EMP_NAME_IX索引是建立在列(("LAST_NAME","FIRST_NAME")上的,使用了带非引导列FIRST_NAME的谓语

3,数据通过一个已经排序的索引获得从而省去单独的排序操作

复制代码代码如下:
SQL>select*fromhr.employeesorderbyemployee_id;

ExecutionPlan
----------------------------------------------------------
Planhashvalue:2186312383

---------------------------------------------------------------------------------------------
|Id |Operation                  |Name         |Rows |Bytes|Cost(%CPU)|Time    |
---------------------------------------------------------------------------------------------
|  0|SELECTSTATEMENT           |              |  107| 7383|    3  (0)|00:00:01|
|  1| TABLEACCESSBYINDEXROWID|EMPLOYEES    |  107| 7383|    3  (0)|00:00:01|
|  2|  INDEXFULLSCAN          |EMP_EMP_ID_PK|  107|      |    1  (0)|00:00:01|
---------------------------------------------------------------------------------------------

--同样可以使用升序索引返回降序数据
SQL>selectemployee_idfromhr.employeesorderbyemployee_iddesc;

ExecutionPlan
----------------------------------------------------------
Planhashvalue:753568220

--------------------------------------------------------------------------------------------
|Id |Operation                 |Name         |Rows |Bytes|Cost(%CPU)|Time    |
--------------------------------------------------------------------------------------------
|  0|SELECTSTATEMENT          |              |  107|  428|    1  (0)|00:00:01|
|  1| INDEXFULLSCANDESCENDING|EMP_EMP_ID_PK|  107|  428|    1  (0)|00:00:01|
--------------------------------------------------------------------------------------------


在上面的例子中可以看出,索引全扫描也可以想范围扫描一样,通过升序索引返回降序数据,而它的优化不止这一种,当我们查询某一列的最大值或最小值而这一列又是索引列的时候,索引全扫描就会获得非常显著的优势,因为这时的优化器并没有对索引的数据进行全部叶子节点的检索,而只是对一个根块,第一个或最后一个叶子块的扫描,这无疑会显著的提高性能!!

复制代码代码如下:
--索引全扫描获得最小值
SQL>selectmin(department_id)fromhr.employees;

ExecutionPlan
----------------------------------------------------------
Planhashvalue:613773769

------------------------------------------------------------------------------------------------
|Id |Operation                 |Name             |Rows |Bytes|Cost(%CPU)|Time    |
------------------------------------------------------------------------------------------------
|  0|SELECTSTATEMENT          |                  |    1|    3|    1  (0)|00:00:01|
|  1| SORTAGGREGATE           |                  |    1|    3|           |         |
|  2|  INDEXFULLSCAN(MIN/MAX)|EMP_DEPARTMENT_IX|    1|    3|    1  (0)|00:00:01|
------------------------------------------------------------------------------------------------

--如果同时包含MAX和MIN的求值,优化器并不会主动选择效率较高的索引全扫描方法
SQL>selectmin(department_id),max(department_id)fromhr.employees;

ExecutionPlan
----------------------------------------------------------
Planhashvalue:1756381138

--------------------------------------------------------------------------------
|Id |Operation         |Name     |Rows |Bytes|Cost(%CPU)|Time    |
--------------------------------------------------------------------------------
|  0|SELECTSTATEMENT  |          |    1|    3|    3  (0)|00:00:01|
|  1| SORTAGGREGATE   |          |    1|    3|           |         |
|  2|  TABLEACCESSFULL|EMPLOYEES|  107|  321|    3  (0)|00:00:01|
--------------------------------------------------------------------------------
--一种替代的优化方案
SQL>select
 2 (selectmin(department_id)fromhr.employees)min_id,
 3 (selectmax(department_id)fromhr.employees)max_id
 4 fromdual;

ExecutionPlan
----------------------------------------------------------
Planhashvalue:2189307159

------------------------------------------------------------------------------------------------
|Id |Operation                 |Name             |Rows |Bytes|Cost(%CPU)|Time    |
------------------------------------------------------------------------------------------------
|  0|SELECTSTATEMENT          |                  |    1|      |    2  (0)|00:00:01|
|  1| SORTAGGREGATE           |                  |    1|    3|           |         |
|  2|  INDEXFULLSCAN(MIN/MAX)|EMP_DEPARTMENT_IX|    1|    3|    1  (0)|00:00:01|
|  3| SORTAGGREGATE           |                  |    1|    3|           |         |
|  4|  INDEXFULLSCAN(MIN/MAX)|EMP_DEPARTMENT_IX|    1|    3|    1  (0)|00:00:01|
|  5| FASTDUAL                |                  |    1|      |    2  (0)|00:00:01|
------------------------------------------------------------------------------------------------

索引跳跃扫描(INDEXSKIPSCAN)

这种扫描方式也是一种特例,因为在早期的版本中,优化器会因为使用了非引导列而拒绝使用索引。跳跃扫描的前提有着对应的情景,当谓语中包含索引中非引导列上的条件,并且引导列的唯一值较小的时候,就有极有可能使用索引跳跃扫描方法;同索引全扫描,范围扫描一样,它也可以升序或降序的访问索引;不同的是跳跃扫描会根据引导列的唯一值数目将复合索引分成多个较小的逻辑子索引,引导列的唯一值数目越小,分割的子索引数目也就越少,就越可能达到相对全表扫描较高的运算效率。

复制代码代码如下:
--创建测试表,以dba_objects表为例
SQL>createtabletestasselect*fromdba_objects;

Tablecreated.

--创建一个复合索引,这里选取了一个唯一值较少的owner列作为引导列
SQL>createindexi_testontest(owner,object_id,object_type);

Indexcreated.

--分析表收集统计信息
SQL>execdbms_stats.gather_table_stats("SYS","TEST");

PL/SQLproceduresuccessfullycompleted.

--先看一下引导列的唯一值的比较
SQL>selectcount(*),count(distinctowner)fromtest;

 COUNT(*)COUNT(DISTINCTOWNER)
------------------------------
    72482                  29

--使用非引导列的条件查询来访问触发SKIPSCAN
SQL>select*fromtestwhereobject_id=46;

ExecutionPlan
----------------------------------------------------------
Planhashvalue:1001786056

--------------------------------------------------------------------------------------
|Id |Operation                  |Name  |Rows |Bytes|Cost(%CPU)|Time    |
--------------------------------------------------------------------------------------
|  0|SELECTSTATEMENT           |       |    1|   97|   31  (0)|00:00:01|
|  1| TABLEACCESSBYINDEXROWID|TEST  |    1|   97|   31  (0)|00:00:01|
|* 2|  INDEXSKIPSCAN          |I_TEST|    1|      |   30  (0)|00:00:01|
--------------------------------------------------------------------------------------

PredicateInformation(identifiedbyoperationid):
---------------------------------------------------

  2-access("OBJECT_ID"=46)
      filter("OBJECT_ID"=46)

Statistics
----------------------------------------------------------
       101 recursivecalls
         0 dbblockgets
        38 consistentgets
         0 physicalreads
         0 redosize
      1610 bytessentviaSQL*Nettoclient
       523 bytesreceivedviaSQL*Netfromclient
         2 SQL*Netroundtripsto/fromclient
         3 sorts(memory)
         0 sorts(disk)
         1 rowsprocessed

--来看看这条语句全扫描的效率
SQL>select/*+full(test)*/*fromtestwhereobject_id=46;

ExecutionPlan
----------------------------------------------------------
Planhashvalue:1357081020

--------------------------------------------------------------------------
|Id |Operation        |Name|Rows |Bytes|Cost(%CPU)|Time    |
--------------------------------------------------------------------------
|  0|SELECTSTATEMENT |     |    1|   97|  282  (1)|00:00:04|
|* 1| TABLEACCESSFULL|TEST|    1|   97|  282  (1)|00:00:04|
--------------------------------------------------------------------------

PredicateInformation(identifiedbyoperationid):
---------------------------------------------------

  1-filter("OBJECT_ID"=46)

Statistics
----------------------------------------------------------
         1 recursivecalls
         0 dbblockgets
      1037 consistentgets
         0 physicalreads
         0 redosize
      1607 bytessentviaSQL*Nettoclient
       523 bytesreceivedviaSQL*Netfromclient
         2 SQL*Netroundtripsto/fromclient
         0 sorts(memory)
         0 sorts(disk)
         1 rowsprocessed

分析上面的查询可以看出,我们使用的索引中引导列有29个唯一值,也就是说在执行索引跳跃扫描的时候,分割成了29个逻辑子索引来查询,只产生了38次逻辑读;而相对全表扫描的1037次逻辑读,性能提升非常明显!

索引快速全扫描(INDEXFASTFULLSCAN)

这种访问方法在获取数据上和全表扫描相同,都是通过无序的多块读取来进行的,因此也就无法使用它来避免排序代价了;索引快速全扫描通常发生在查询列都在索引中并且索引中一列有非空约束时,当然这个条件也容易发生索引全扫描,它的存在多可用来代替全表扫描,比较数据获取不需要访问表上的数据块。

复制代码代码如下:
--依旧使用上面创建的test表
SQL>desctest
 Name                                     Null?   Type
 -----------------------------------------------------------------------------
 OWNER                                             VARCHAR2(30)
 OBJECT_NAME                                       VARCHAR2(128)
 SUBOBJECT_NAME                                    VARCHAR2(30)
 OBJECT_ID                                NOTNULLNUMBER
 DATA_OBJECT_ID                                    NUMBER
 OBJECT_TYPE                                       VARCHAR2(19)
 CREATED                                           DATE
 LAST_DDL_TIME                                     DATE
 TIMESTAMP                                         VARCHAR2(19)
 STATUS                                            VARCHAR2(7)
 TEMPORARY                                         VARCHAR2(1)
 GENERATED                                         VARCHAR2(1)
 SECONDARY                                         VARCHAR2(1)
 NAMESPACE                                         NUMBER
 EDITION_NAME                                      VARCHAR2(30)

--在object_id列上创建索引
SQL>createindexpri_inxontest(object_id);

Indexcreated.

--直接执行全表扫描
SQL>selectobject_idfromtest;

72482rowsselected.

ExecutionPlan
----------------------------------------------------------
Planhashvalue:1357081020

--------------------------------------------------------------------------
|Id |Operation        |Name|Rows |Bytes|Cost(%CPU)|Time    |
--------------------------------------------------------------------------
|  0|SELECTSTATEMENT |     |72482|  353K|  282  (1)|00:00:04|
|  1| TABLEACCESSFULL|TEST|72482|  353K|  282  (1)|00:00:04|
--------------------------------------------------------------------------

Statistics
----------------------------------------------------------
         1 recursivecalls
         0 dbblockgets
      5799 consistentgets
         0 physicalreads
         0 redosize
   1323739 bytessentviaSQL*Nettoclient
     53675 bytesreceivedviaSQL*Netfromclient
      4834 SQL*Netroundtripsto/fromclient
         0 sorts(memory)
         0 sorts(disk)
     72482 rowsprocessed

--修改object_id为notnull
SQL>altertabletestmodify(object_idnotnull);

Tablealtered.

--再次使用object_id列查询就可以看到使用了快速全扫描了
SQL>selectobject_idfromtest;

72482rowsselected.

ExecutionPlan
----------------------------------------------------------
Planhashvalue:3806735285

--------------------------------------------------------------------------------
|Id |Operation           |Name   |Rows |Bytes|Cost(%CPU)|Time    |
--------------------------------------------------------------------------------
|  0|SELECTSTATEMENT    |        |72482|  353K|   45  (0)|00:00:01|
|  1| INDEXFASTFULLSCAN|PRI_INX|72482|  353K|   45  (0)|00:00:01|
--------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
       167 recursivecalls
         0 dbblockgets
      5020 consistentgets
       161 physicalreads
         0 redosize
   1323739 bytessentviaSQL*Nettoclient
     53675 bytesreceivedviaSQL*Netfromclient
      4834 SQL*Netroundtripsto/fromclient
         4 sorts(memory)
         0 sorts(disk)
     72482 rowsprocessed

PS,这个INDEXFASTFULLSCAN的例子真是不好模拟,上面的例子弄了好久。。。。。