oracle group by placement可能导致错误结果的bug
Last week I’ve mentioned on Twitter that we ran into wrong result bug. We found workaround quickly but I’ve decided to spend some time to reproduce error and write blog post to warn you about this optimizer behavior.
Special thanks to my colleague who spotted odd results which led us to this finding.
My test (virtual) environment is:
OS: Oracle Enterprise Linux 5.8
DB: Oracle EE 11.1.0.7.12
In test I will use three tables:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
CONT Name Null Type ------- ---- ------------- CUST_ID NUMBER(38) CODE VARCHAR2(100) CUST Name Null Type ------- -------- ---------- CUST_ID NOT NULL NUMBER(38) DRAG Name Null Type ------- ---- --------- DRAG_ID NUMBER(6) |
To gather fresh statistics for the tables:
1
2
3
4
5
6
|
begin dbms_stats.gather_table_stats(ownname=> user ,tabname=> 'CONT' ,estimate_percent=>100, cascade => TRUE ); dbms_stats.gather_table_stats(ownname=> user ,tabname=> 'CUST' ,estimate_percent=>100, cascade => TRUE ); dbms_stats.gather_table_stats(ownname=> user ,tabname=> 'DRAG' ,estimate_percent=>100, cascade => TRUE ); end ; / |
More details about tables:
1
2
3
4
5
6
7
8
9
10
|
select table_name, num_rows, blocks, partitioned, last_analyzed from dba_tables where table_name in ( 'CONT' , 'CUST' , 'DRAG' ); TABLE_NAME NUM_ROWS BLOCKS PARTITIONED LAST_ANALYZED ------------ ---------- ---------- ----------- ------------------- CONT 1181949 2892 NO 04.02.2014 14:49:24 DRAG 314 5 NO 04.02.2014 14:49:25 CUST 576233 902 NO 04.02.2014 14:49:25 |
Information about indexes:
1
2
3
4
5
6
7
8
9
|
select index_name, table_name, uniqueness, distinct_keys, clustering_factor from dba_indexes where table_name in ( 'CONT' , 'CUST' , 'DRAG' ); INDEX_NAME TABLE_NAME UNIQUENESS DISTINCT_KEYS CLUSTERING_FACTOR -------------- ------------ ---------- ------------- ----------------- I_CUST_ID CONT NONUNIQUE 468738 753983 PK_CUST_ID CUST UNIQUE 576233 878 |
We have three small and simple tables with just two indexes. CUST table has primary key on “cust_id” column.
After this little introduction it is time for some tests.
I will flush buffer cache and shared pool before every query execution.
1
2
3
4
5
|
SQL> alter system flush shared_pool; System altered. SQL> alter system flush buffer_cache; System altered. |
First query execution and execution plan:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
|
select /*+ gather_plan_statistics */ count (co.code) as cnt from drag t, cust cus, cont co where 1=1 and t.drag_id = cus.cust_id and cus.cust_id = co.cust_id(+) group by t.drag_id; CNT --------------- 2 2 2 2 1 2 2 2 2 ... 303 rows |
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
|
SQL> select * from table (dbms_xplan.display_cursor( null , null , 'ALLSTATS LAST' )); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------------------------------- SQL_ID gpnrgy2vawafq, child number 0 ------------------------------------- select /*+ gather_plan_statistics */ count (co.code) as cnt from drag t, cust cus, cont co where 1=1 and t.drag_id = cus.cust_id and cus.cust_id = co.cust_id(+) group by t.drag_id Plan hash value: 3989628059 --------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E- Rows | A- Rows | A- Time | Buffers | Reads | OMem | 1Mem | Used-Mem | --------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 303 |00:00:00.62 | 3734 | 3724 | | | | | 1 | HASH GROUP BY | | 1 | 303 | 303 |00:00:00.62 | 3734 | 3724 | 1096K| 1096K| 1264K (0)| |* 2 | HASH JOIN OUTER | | 1 | 792 | 1084 |00:00:00.16 | 3734 | 3724 | 1206K| 1206K| 1244K (0)| |* 3 | HASH JOIN | | 1 | 314 | 314 |00:00:00.04 | 890 | 885 | 1452K| 1452K| 1470K (0)| | 4 | TABLE ACCESS FULL | DRAG | 1 | 314 | 314 |00:00:00.01 | 7 | 6 | | | | | 5 | TABLE ACCESS FULL | CUST | 1 | 576K| 576K|00:00:00.02 | 883 | 879 | | | | | 6 | TABLE ACCESS FULL | CONT | 1 | 1181K| 1181K|00:00:00.01 | 2844 | 2839 | | | | --------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access( "CUS" . "CUST_ID" = "CO" . "CUST_ID" ) 3 - access( "T" . "DRAG_ID" = "CUS" . "CUST_ID" ) |
Check result of the query - this is correct query result.
Now to simulate what we experienced in production.
1
2
3
4
|
SQL> alter system flush shared_pool; System altered. SQL> alter system flush buffer_cache; System altered. |
With hint I want to force PK_CUST_ID index usage because this was preferred plan in production.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
|
select /*+ gather_plan_statistics index (cus PK_CUST_ID) */ count (co.code) as cnt from drag t, cust cus, cont co where 1=1 and t.drag_id = cus.cust_id and cus.cust_id = co.cust_id(+) group by t.drag_id; CNT --------------- 0 0 0 0 0 0 0 0 0 0 ... 303 rows |
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
|
SQL> select * from table (dbms_xplan.display_cursor( null , null , 'ALLSTATS LAST' )); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 9vf9uf7mhdmdz, child number 0 ------------------------------------- select /*+ gather_plan_statistics index (cus PK_CUST_ID) */ count (co.code) as cnt from drag t, cust cus, cont co where 1=1 and t.drag_id = cus.cust_id and cus.cust_id = co.cust_id(+) group by t.drag_id Plan hash value: 3263881209 ----------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E- Rows | A- Rows | A- Time | Buffers | Reads | OMem | 1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 303 |00:00:00.70 | 3459 | 3094 | | | | | 1 | HASH GROUP BY | | 1 | 303 | 303 |00:00:00.70 | 3459 | 3094 | 934K| 934K| 1267K (0)| |* 2 | HASH JOIN OUTER | | 1 | 764 | 1046 |00:00:00.22 | 3459 | 3094 | 1134K| 1134K| 1198K (0)| | 3 | NESTED LOOPS | | 1 | 303 | 303 |00:00:02.02 | 615 | 255 | | | | | 4 | VIEW | VW_GBC_9 | 1 | 303 | 303 |00:00:00.01 | 7 | 6 | | | | | 5 | HASH GROUP BY | | 1 | 303 | 303 |00:00:00.01 | 7 | 6 | 1012K| 1012K| 1249K (0)| | 6 | TABLE ACCESS FULL | DRAG | 1 | 314 | 314 |00:00:00.01 | 7 | 6 | | | | |* 7 | INDEX UNIQUE SCAN | PK_CUST_ID | 303 | 1 | 303 |00:00:00.22 | 608 | 249 | | | | | 8 | TABLE ACCESS FULL | CONT | 1 | 1181K| 1181K|00:00:00.01 | 2844 | 2839 | | | | ----------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access( "CUS" . "CUST_ID" = "CO" . "CUST_ID" ) 7 - access( "ITEM_1" = "CUS" . "CUST_ID" ) |
Check result of the query!
Count is displaying all 0 values because it received only NULLs to count.
Other functions like max and min are also affected by this error.
Check steps 4,5 and 6 in execution plan.
Instead of quick full scan on DRAG table Oracle transformed query and created inline view using smart group-by optimization.
In 10053 trace I could easily find what Oracle was doing.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
|
SELECT /*+ INDEX ( "CUS" "PK_CUST_ID" ) */ SUM ( "VW_GBC_9" . "ITEM_2" ) "CNT" FROM ( SELECT "T" . "DRAG_ID" "ITEM_1" , COUNT ( "CO" . "CODE" ) "ITEM_2" , "T" . "DRAG_ID" "ITEM_3" FROM "ADMIN" . "DRAG" "T" WHERE 1=1 GROUP BY "T" . "DRAG_ID" , "T" . "DRAG_ID" ) "VW_GBC_9" , "ADMIN" . "CUST" "CUS" , "ADMIN" . "CONT" "CO" WHERE "VW_GBC_9" . "ITEM_1" = "CUS" . "CUST_ID" AND "CUS" . "CUST_ID" = "CO" . "CUST_ID" (+) GROUP BY "VW_GBC_9" . "ITEM_3" ; |
Quick workaround to fix this bug:
- Set "_optimizer_group_by_placement"=FALSE.
You could check in 10053 trace value of this parameter.
In my case: _optimizer_group_by_placement = true
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
|
SQL> alter session set "_optimizer_group_by_placement" = FALSE ; Session altered. select /*+ gather_plan_statistics index (cus PK_CUST_ID) */ count (co.code) as cnt from drag t, cust cus, cont co where 1=1 and t.drag_id = cus.cust_id and cus.cust_id = co.cust_id(+) group by t.drag_id; CNT ---------- 2 2 2 2 1 2 2 2 2 ... 303 rows |
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
|
SQL> select * from table (dbms_xplan.display_cursor( null , null , 'ALLSTATS LAST' )); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------------------------------- SQL_ID a91bzhvupzquh, child number 0 ------------------------------------- select /*+ gather_plan_statistics index (cus PK_CUST_ID)*/ count (co.code) as cnt from drag t, cust cus, cont co where 1=1 and t.drag_id = cus.cust_id and cus.cust_id = co.cust_id(+) group by t.drag_id Plan hash value: 2460166079 ------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E- Rows | A- Rows | A- Time | Buffers | OMem | 1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 303 |00:00:00.16 | 3481 | | | | | 1 | HASH GROUP BY | | 1 | 303 | 303 |00:00:00.16 | 3481 | 1096K| 1096K| 1232K (0)| |* 2 | HASH JOIN OUTER | | 1 | 792 | 1084 |00:00:00.01 | 3481 | 1206K| 1206K| 1529K (0)| | 3 | NESTED LOOPS | | 1 | 314 | 314 |00:00:00.01 | 637 | | | | | 4 | TABLE ACCESS FULL | DRAG | 1 | 314 | 314 |00:00:00.01 | 7 | | | | |* 5 | INDEX UNIQUE SCAN| PK_CUST_ID | 314 | 1 | 314 |00:00:00.01 | 630 | | | | | 6 | TABLE ACCESS FULL | CONT | 1 | 1181K| 1181K|00:00:00.01 | 2844 | | | | ------------------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access( "CUS" . "CUST_ID" = "CO" . "CUST_ID" ) 5 - access( "T" . "DRAG_ID" = "CUS" . "CUST_ID" ) 27 rows selected. |
Oracle Support note associated with "_optimizer_group_by_placement" parameter.
Note.8945586.8 Ext/Pub Bug 8945586 - Wrong results using GROUP BY placement:
Description
Wrong results can occur when using GROUP BY placement where the aggregate column gets pruned from select list.
I’ve even found that “_optimizer_group_by_placement” parameter was mentioned in "Oracle® Fusion Middleware Oracle WebCenter Analytics Installation and Upgrade Guide".
Oracle 11g (11.1.0.6 and above) in default or Oracle Real Application Clusters (RAC) configuration
When running Oracle 11g versions prior to 11.1.0.7.0 the Oracle system parameter _optimizer_group_by_placement must be set to false. This can either be set in the init.ora file of the respective database instances or by by issuing an ALTER SYSTEM command as follows:
SQLPLUS /nolog
CONNECT / AS SYSDBA
ALTER SYSTEM SET "_optimizer_group_by_placement"=false
group by的优化bug还是挺多的,还有比如_optimizer_aggr_groupby_elim,但是不出bug的情况下,性能提升还是非常明显的,大家一定要仔细检查结果,不要只看性能。
相关文章
- 训练 detectron2 遇到的 bug : 训练过程中无法评估测试集
- 秋天的第一个BUG!推荐5款测试员不可不知的bug管理工具!
- 禅道—禅道Bug管理工具环境搭建
- BUG:oracle 10g 10.2.0.1 linux-64位 升级到 10.2.0.5:Seed_Database.dfb 文件没有升级:会造成后续dbca或使用该文件的操作失败!
- oracle dbca 【bug】:JAVA_JIT_ENABLED=false
- FoxMail的Bug
- 印象笔记被曝严重Bug,数百万用户数据恐遭泄露!
- 写了交互给后台后来不能用?bug多多多又找不到文件效率低?工作流程帮你优化起来~~~~
- Oracle 11gR2 RAC的两个bug
- 自己开发遇到的坑IE7 图片scrollTop BUG解决方法
- 关于onmouseover和onmouseout的bug
- oracle已知会导致错误结果的bug列表(Bug Issues Known to cause Wrong Results)
- Drools 6.4.0Final版本KieScanner内存泄漏Bug
- 一步一步学习 ReactNative + Redux(3.1:bug修复)
- 禅道的使用—bug管理工具
- 【我的Android进阶之旅】解决bug:You need to use a Theme.AppCompat theme (or descendant) with this activity.