CBO对子查询有且只有三种处理方式:
- subquery unnesting
- Push Subquery
- Filter
子查询展开的最大作用我认为是让CBO对sql关联顺序拥有更多的选择,以便得到最好的执行计划。如果子查询不能展开,那么将以Filter的方式在最后一步才会被执行。
subquery unnesting:
The purpose of subquery unnesting is to inject semi- (IN, EXISTS), anti-join (NOT IN, NOT EXISTS), and scalar subqueries into the FROM clause of the containing query block, and to transform them into inline views. Some unnestings are performed as heuristic-based query transformations, and others are carried out as cost-based query transformations. The main reason for applying this query transformation is to enable all available join methods. In fact, without subquery unnesting, a subquery might have to be executed once for every row returned by the containing query block . Subquery unnesting can’t always be applied, though. For example, unnesting isn’t possible if a subquery contains some types of aggregation, or if it contains the rownum pseudocolumn. Semi- and anti-join subqueries containing set operators can only be unnested as of version 11.2. In addition, from version 12.1 onward, scalar subquery unnesting has been improved to process scalar subqueries in SELECT clauses
Unnesting a subquery can be summarized in two steps.
- The first step, as shown in the following query, is to rewrite the subquery as an inline view. Note that what follows isn’t a valid SQL statement, because the operator implementing the semi-join (s=) isn’t available in the SQL syntax (it’s used internally by the SQL engine only):
- The second step, as shown here, is to rewrite the inline view as a regular join
一般来说,对于简单子查询(比如单表的select-project-join结构)的展开属于启发式查询转换,复杂子查询(比如子查询多表关联)的展开属于非启发式查询转换。
hint和参数:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
SQL> select name from v$sql_hint where name like '%UNNEST%'; NAME ---------------------------------------------------------------- UNNEST NO_UNNEST SQL> @sp unnest -- show parameter by sp -- show hidden parameter by sp old 3: where x.indx=y.indx and ksppinm like '_%&p%' new 3: where x.indx=y.indx and ksppinm like '_%unnest%' NAME VALUE DESC ---------------------------------------- ---------- ------------------------------------------------------------------------------------------ _unnest_subquery TRUE enables unnesting of complex subqueries _optimizer_unnest_all_subqueries TRUE enables unnesting of every type of subquery _distinct_view_unnesting FALSE enables unnesting of in subquery into distinct view _optimizer_unnest_disjunctive_subq TRUE Unnesting of disjunctive subqueries (TRUE/FALSE) _optimizer_unnest_corr_set_subq TRUE Unnesting of correlated set subqueries (TRUE/FALSE) |
可以看到unnest查询转换的前提条件是_optimizer_unnest_all_subqueries必须为true,对于复杂子查询必须_unnest_subquery也为true,可以看到从11.2开始,CBO也支持集合的子查询展开由_optimizer_unnest_corr_set_subq控制,12.1支持标量子查询的子查询展开,由参数_optimizer_unnest_scalar_sq控制。
子查询展开限制:
- 子查询包含rownum
- 子查询和or操作符在同一个查询块(select … from xxx where exists(subquery) or xxx)
- 对于简单子查询_optimizer_unnest_all_subqueries必须为true
- 对于复杂子查询必须_unnest_subquery和_optimizer_unnest_all_subqueries均为true,并且需要比较转换后的cost小于转换前的cost
子查询展开转换的结果:semi-join/anti-join/inter-join
- exists/in/any => SU: Transform an ANY subquery to semi-join or distinct
- not exists/not in/all => SU: Transform ALL/NOTEXISTS subquery into a regular anti-join
简单子查询展开示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
SQL> explain plan for select * from test.t1 where exists(select 1 from test.t2 where t2.object_id=t1.object_id); Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------- Plan hash value: 4117515796 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 86347 | 8685K| 953 (1)| 00:00:12 | |* 1 | HASH JOIN | | 86347 | 8685K| 953 (1)| 00:00:12 | | 2 | SORT UNIQUE | | 86347 | 421K| 344 (1)| 00:00:05 | | 3 | TABLE ACCESS FULL| T2 | 86347 | 421K| 344 (1)| 00:00:05 | | 4 | TABLE ACCESS FULL | T1 | 86350 | 8263K| 342 (1)| 00:00:05 | ---------------------------------------------------------------------------- PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T2"."OBJECT_ID"="T1"."OBJECT_ID") 从10053可以发现,转换并未做成本对比,所以简单子查询展开为启发式查询转换。 SU: Checking validity of unnesting subquery SEL$2 (#2) SU: Passed validity checks. SU: Transforming EXISTS subquery to a join. |
执行计划中并未出现semi-join的原因是,semi-join被转换为了inter-join,outline里面可以看到该hint SEMI_TO_INNER(@”SEL$5DA710D3″ “T2″@”SEL$2”)
复杂子查询展开示例:
复杂子查询展开之后,会构造出一个名为VW_SQ的view,再与外部关联
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
SQL> explain plan for select * from test.t1 where exists(select 1 from test.t2,test.t3 where t2.object_id=t1.object_id and t2.object_id=t3.object_id); Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Plan hash value: 821440350 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 86343 | 9359K| | 1724 (1)| 00:00:21 | |* 1 | HASH JOIN RIGHT SEMI| | 86343 | 9359K| 2112K| 1724 (1)| 00:00:21 | | 2 | VIEW | VW_SQ_1 | 86343 | 1096K| | 828 (1)| 00:00:10 | |* 3 | HASH JOIN | | 86343 | 843K| 1440K| 828 (1)| 00:00:10 | | 4 | TABLE ACCESS FULL| T2 | 86347 | 421K| | 344 (1)| 00:00:05 | | 5 | TABLE ACCESS FULL| T3 | 86343 | 421K| | 344 (1)| 00:00:05 | | 6 | TABLE ACCESS FULL | T1 | 86350 | 8263K| | 342 (1)| 00:00:05 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("ITEM_1"="T1"."OBJECT_ID") 3 - access("T2"."OBJECT_ID"="T3"."OBJECT_ID") SQL> @sp hash_enable -- show parameter by sp -- show hidden parameter by sp old 3: where x.indx=y.indx and ksppinm like '_%&p%' new 3: where x.indx=y.indx and ksppinm like '_%hash_enable%' NAME VALUE DESC ---------------------------------------- ---------- ------------------------------------------------------------------------------------------ _right_outer_hash_enable TRUE Right Outer/Semi/Anti Hash Enabled |
执行计划为RIGHT SEMI,这个功能可以让子查询部分成为HASH JOIN的构造结果集,由参数_right_outer_hash_enable控制,默认为true
从10053可以发现,转换做了成本对比,所以复杂子查询展开为启发式查询转换。
1 2 3 4 5 6 7 8 9 10 11 |
SU: Checking validity of unnesting subquery SEL$2 (#2) SU: Passed validity checks, but requires costing. SU: Using search type: exhaustive SU: Starting iteration 1, state space = (2) : (1) SU: Unnesting subquery query block SEL$2 (#2)Registered qb: SEL$683B0107 0xb9cc4ba0 (SUBQ INTO VIEW FOR COMPLEX UNNEST SEL$2) SU: Finished interleaved join pred push down SU: Updated best state, Cost = 1723.59 --子查询展开成本 SU: Starting iteration 2, state space = (2) : (0) SU: Not update best state, Cost = 51873250.53 --子查询不展开成本 SU: Will unnest subquery SEL$2 (#2) --子查询展开成本小于子查询不展开成本,所以确定转换 SU: Reconstructing original query from best state. |
对于anti-join需要注意的是,对于not in的处理:
- 11G前必须连接列上有not null约束或者在sql上写明连接列is not null,CBO会对子查询展开做反连接(ANTI JOIN)
- 11G以后只需要参数_optimizer_null_aware_antijoin=true,CBO会对子查询展开做一个针对null的反连接(ANTI NA JOIN)
- 11G之后如果_optimizer_null_aware_antijoin=false,但是连接列上有not null约束或者在sql上写明连接列is not null,CBO仍然会对子查询展开做一个正常的反连接(ANTI JOIN)
12c之前对于标量子查询通常都需要改写为外连接,,12cr1提供了标量子查询的展开,scalar subquery => SU: Unnesting scalar subquery query block SEL$2 (#2),由参数_optimizer_unnest_scalar_sq控制,但并不是所有标量子查询都能做展开。
- 对于聚合操作(非count)的标量子查询展开为启发式查询展开,展开成为外连接形式
- 对于count(*)的标量子查询不允许展开,10053查看拒绝理由为SU: bypassed: Scalar subquery has null-mutating select item.
- 非聚合操作的标量子查询不允许展开,10053查看拒绝理由为SU: bypassed: Scalar subquery may return more than one row
由于篇幅就不演示标量子查询的展开了。通过mos可以看到该特性非常非常多,建议生产环境还是关闭_optimizer_unnest_scalar_sq
Push Subquery:
By default, if Oracle cannot unnest a subquery it postpones executes of that subquery to the end of the execution plan.
Historically the push_subq hint told Oracle to run any outstanding subqueries at the earliest possible moment. But in 10g this changes (for the better). Imagine I have two subqueries in the query; it is quite possible that the optimum execution plan is to run one early and the other late, but the original implementation of push_subq is an ‘all or nothing’ implementation. 10g allows you to be selective about which subqueries should be pushed.
hint:
1 2 3 4 5 6 |
SQL> select name from v$sql_hint where name like '%PUSH_SUBQ%'; NAME ---------------------------------------------------------------- PUSH_SUBQ NO_PUSH_SUBQ |
当子查询无法展开时,通常将以filter的方式执行子查询,并且子查询将在执行计划最后一步执行,子查询推入的出现将可能让子查询在无法展开的情况下优先被执行,也是为了能够得到更多可能的执行顺序,从而获得最好的执行计划。
常见的子查询适用场景:
- 子查询带有聚合函数的(MAX/MIN/SUM/…)
- 有多个子查询,有某个子查询可以过滤掉大量的数据,但是无法子查询展开,子查询推入可以选择该子查询被优先执行。
- 对于未能子查询展开的sql,通常子查询都是在最后一步执行,即多表关联完后才去执行子查询,此时如果子查询可以过滤大量数据即可减少后续连接的成本,子查询推进可以让子查询先关联后再与后续的表进行连接。多表关联附带子查询。
Filter:
当子查询不能展开,并且没有推入时,只能通过filter来执行,并且只能在执行计划的最后一步执行子查询。filter这个操作在《Cost Based Oracle Fundamental》有介绍。它的独特之处在于会在内存中维护一个了hash table,使之成为加强版的NESTED LOOP,filter的执行性能跟关联列值distinct数有关。
Oracle limits the size of the in-memory hash table (presumably to stop excessive memory consumption in unlucky cases).In 8i and 9i the limit on the size of the hash table seems to be 256 entries, in 10g it seems to be
1,024.This means the performance of a subquery filter can be affected by the number of different driving values that exist, the order in which they appear in the pass through the driving table,and the actual values. If the hash table is simply too small, or you have driving values that just happen to cause excessive collisions on the hash table, then you may execute the subquery far more frequently than is strictly necessary.
其他子查询相关的查询转换:
Subquery Coalescing:
- The purpose of subquery coalescing is to combine equivalent semi- and anti-join subqueries into a single query block. The main reason for applying this heuristic-based query transformation, which is available as of version 11.2, is to reduce the number of table accesses, and thus to reduce the number of joins.
- Coalescing two subqueries of the same type might drastically reduce the number of logical I/O as far as it can eliminate an entire table access. Coalescing two subqueries of different types might pre-empt the CBO from taking advantage of the unnesting transformation. Fortunately if you know how to coalesce two different subqueries you will know how to de-coalesce them to allow the CBO taking advantage of unnesting the subqueries with their main query blocks.
Subquery Coalescing属于启发式查询转换,意思是将多个子查询等价合并到同一个子查询中,由参数_optimizer_coalesce_subqueries控制。这个功能我感觉是为了避免sql开发人员sql写的烂设计的。。。能有效避免多余的资源消耗。
参数与hint:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
SQL> @sp coalesce_sub -- show parameter by sp -- show hidden parameter by sp old 3: where x.indx=y.indx and ksppinm like '_%&p%' new 3: where x.indx=y.indx and ksppinm like '_%coalesce_sub%' NAME VALUE DESC ---------------------------------------- ---------- ------------------------------------------------------------------------------------------ _optimizer_coalesce_subqueries TRUE consider coalescing of subqueries optimization SQL> select name from v$sql_hint where name like '%COAL%'; NAME ---------------------------------------------------------------- COALESCE_SQ NO_COALESCE_SQ |
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 |
SELECT /*+qb_name(main)*/* FROM lxy t1 WHERE EXISTS (SELECT /*+qb_name(sub1)*/1 FROM lxy_1 t2 WHERE t2.object_id = t1.object_id AND t2.OWNER='SYS') OR EXISTS (SELECT /*+qb_name(sub2)*/ 1 FROM lxy_1 t3 WHERE t3.object_id = t1.object_id AND t3.object_type='TABLE'); Plan hash value: 1212017683 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1771 | 176K| 104 (0)| 00:00:02 | |* 1 | HASH JOIN RIGHT SEMI| | 1771 | 176K| 104 (0)| 00:00:02 | | 2 | VIEW | VW_SQ_1 | 1771 | 23023 | 52 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL | LXY_1 | 1771 | 30107 | 52 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL | LXY | 14243 | 1237K| 52 (0)| 00:00:01 | -------------------------------------------------------------------------------- Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$8BBAAD22 2 - SEL$8CB19D9E / VW_SQ_1@SEL$E8701212 3 - SEL$8CB19D9E / T2@SUB1 4 - SEL$8BBAAD22 / T1@MAIN Outline Data ------------- /*+ BEGIN_OUTLINE_DATA FULL(@"SEL$8CB19D9E" "T2"@"SUB1") SWAP_JOIN_INPUTS(@"SEL$8BBAAD22" "VW_SQ_1"@"SEL$E8701212") USE_HASH(@"SEL$8BBAAD22" "VW_SQ_1"@"SEL$E8701212") LEADING(@"SEL$8BBAAD22" "T1"@"MAIN" "VW_SQ_1"@"SEL$E8701212") NO_ACCESS(@"SEL$8BBAAD22" "VW_SQ_1"@"SEL$E8701212") FULL(@"SEL$8BBAAD22" "T1"@"MAIN") OUTLINE(@"SUB1") OUTLINE(@"MAIN") OUTLINE(@"SUB2") COALESCE_SQ(@"SUB1") OUTLINE(@"SEL$4B4BED4E") OUTLINE(@"SEL$E8701212") COALESCE_SQ(@"SUB2") OUTLINE(@"SEL$5DD85A06") UNNEST(@"SEL$5DD85A06") OUTLINE_LEAF(@"SEL$8BBAAD22") OUTLINE_LEAF(@"SEL$8CB19D9E") ALL_ROWS DB_VERSION('11.2.0.4') OPTIMIZER_FEATURES_ENABLE('11.2.0.4') IGNORE_OPTIM_EMBEDDED_HINTS END_OUTLINE_DATA */ Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("ITEM_1"="T1"."OBJECT_ID") 3 - filter("T2"."OWNER"='SYS' OR "T2"."OBJECT_TYPE"='TABLE') 从outline清楚的看到CBO对子查询进行了合并,谓词条件变成"T2"."OWNER"='SYS' OR "T2"."OBJECT_TYPE"='TABLE',减少了LXY_1的扫描次数,从10053可以看到具体的转换。 转换前: Query before EXDS SQ coalescing:******* UNPARSED QUERY IS ******* SELECT /*+ QB_NAME ("MAIN") */ "T1"."OWNER" "OWNER","T1"."OBJECT_NAME" "OBJECT_NAME","T1"."SUBOBJECT_NAME" "SUBOBJECT_NAME","T1"."OBJECT_ID" "OBJECT_ID","T1"."DATA_OBJECT_ID" "DATA_OBJECT_ID","T1"."OBJECT_TYPE" "OBJECT_TYPE","T1"."CREATED" "CREATED","T1"."LAST_DDL_TIME" "LAST_DDL_TIME","T1"."TIMESTAMP" "TIMESTAMP","T1"."STATUS" "STATUS","T1"."TEMPORARY" "TEMPORARY","T1"."GENERATED" "GENERATED","T1"."SECONDARY" "SECONDARY","T1"."NAMESPACE" "NAMESPACE","T1"."EDITION_NAME" "EDITION_NAME" FROM "SYS"."LXY" "T1" WHERE EXISTS (SELECT /*+ QB_NAME ("SUB1") */ 1 "1" FROM "SYS"."LXY_1" "T2" WHERE "T2"."OBJECT_ID"="T1"."OBJECT_ID" AND "T2"."OWNER"='SYS') OR EXISTS (SELECT /*+ QB_NAME ("SUB2") */ 1 "1" FROM "SYS"."LXY_1" "T3" WHERE "T3"."OBJECT_ID"="T1"."OBJECT_ID" AND "T3"."OBJECT_TYPE"='TABLE') 转换后: Query after subquery coalescing:******* UNPARSED QUERY IS ******* SELECT /*+ QB_NAME ("MAIN") */ "T1"."OWNER" "OWNER","T1"."OBJECT_NAME" "OBJECT_NAME","T1"."SUBOBJECT_NAME" "SUBOBJECT_NAME","T1"."OBJECT_ID" "OBJECT_ID","T1"."DATA_OBJECT_ID" "DATA_OBJECT_ID","T1"."OBJECT_TYPE" "OBJECT_TYPE","T1"."CREATED" "CREATED","T1"."LAST_DDL_TIME" "LAST_DDL_TIME","T1"."TIMESTAMP" "TIMESTAMP","T1"."STATUS" "STATUS","T1"."TEMPORARY" "TEMPORARY","T1"."GENERATED" "GENERATED","T1"."SECONDARY" "SECONDARY","T1"."NAMESPACE" "NAMESPACE","T1"."EDITION_NAME" "EDITION_NAME" FROM "SYS"."LXY" "T1" WHERE EXISTS (SELECT /*+ QB_NAME ("SUB1") */ 1 "1" FROM "SYS"."LXY_1" "T2" WHERE "T2"."OBJECT_ID"="T1"."OBJECT_ID" AND ("T2"."OWNER"='SYS' OR "T2"."OBJECT_TYPE"='TABLE')) |