在oracle sql优化中,往往只关注逻辑读的消耗,而忽略了join消耗的时间,特别是高频值进行join关联出大量rows的场景。往往join消耗的时间会比扫描对象消耗的逻辑读物理读更耗时。
例如下面这个sql(11g):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
SQL Monitoring Report SQL Text ------------------------------ SELECT /*+monitor test111*/ distinct t.OBJECT_ID, t.DATA_OBJECT_ID FROM test.t t, test.t1 t1 WHERE t.object_type = t1.object_type and t.DATA_OBJECT_ID is not null ORDER BY t.DATA_OBJECT_ID Global Information ------------------------------ Status : DONE (ALL ROWS) Instance ID : 1 Session : SYS (41:31) SQL ID : dc02gzj9a9ukr SQL Execution ID : 16777216 Execution Started : 04/21/2025 13:54:55 First Refresh Time : 04/21/2025 13:54:55 Last Refresh Time : 04/21/2025 13:55:11 Duration : 16s Module/Action : sqlplus@ora19c-node1 (TNS V1-V3)/- Service : SYS$USERS Program : sqlplus@ora19c-node1 (TNS V1-V3) Fetch Calls : 601 Global Stats ================================================= | Elapsed | Cpu | Other | Fetch | Buffer | | Time(s) | Time(s) | Waits(s) | Calls | Gets | ================================================= | 16 | 16 | 0.09 | 601 | 2489 | ================================================= SQL Plan Monitoring Details (Plan Hash Value=1001474438) =================================================================================================================================== | Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Mem | Activity | Activity Detail | | | | | (Estim) | | Active(s) | Active | | (Actual) | (Max) | (%) | (# samples) | =================================================================================================================================== | 0 | SELECT STATEMENT | | | | 15 | +2 | 1 | 8995 | | | | | 1 | SORT UNIQUE | | 18M | 136K | 16 | +1 | 1 | 8995 | 499K | 100.00 | Cpu (15) | | 2 | HASH JOIN | | 18M | 740 | 15 | +2 | 1 | 32M | 2M | | | | 3 | TABLE ACCESS FULL | T | 8995 | 346 | 1 | +2 | 1 | 8995 | | | | | 4 | TABLE ACCESS FULL | T1 | 86949 | 346 | 15 | +2 | 1 | 86730 | | | | =================================================================================================================================== |
可以看到该sql在11g中,逻辑读仅仅2489,但是却消耗了16s时间,这就是因为高频值的join消耗了大量的时间。
oracle在12c版本中,对于该sql进行了逻辑改写的查询转换,partial join eval(PJE),属于一种启发式查询转换,也就是基于规则的查询转换,当满足规则时就会进行转换改写。
PJE的规则满足条件如下:
- 多表连接
- 多表连接查询sql的select list的字段全部来自同一个表
- select list需要带有distinct去重操作
PJE的改写逻辑:
会把JOIN转换成SEMI JOIN,巧妙的规避了高频值JOIN返回大量rows,类似JOIN等价改为子查询。
PJE的参数控制和hint:
由隐藏参数_optimizer_partial_join_eval控制,默认为true。hint为PARTIAL_JOIN/NO_PARTIAL_JOIN。
23ai中重新跑一次测试的sql:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
SQL Monitoring Report SQL Text ------------------------------ SELECT /*+monitor test111*/ distinct t.OBJECT_ID, t.DATA_OBJECT_ID FROM test.t t, test.t1 t1 WHERE t.object_type = t1.object_type and t.DATA_OBJECT_ID is not null ORDER BY t.DATA_OBJECT_ID Global Information ------------------------------ Status : DONE (ALL ROWS) Instance ID : 1 Session : SYS (282:43874) SQL ID : dc02gzj9a9ukr SQL Execution ID : 16777216 Execution Started : 04/21/2025 14:02:44 First Refresh Time : 04/21/2025 14:02:44 Last Refresh Time : 04/21/2025 14:02:44 Duration : .052885s Module/Action : sqlplus@db (TNS V1-V3)/- Service : vastdata Program : sqlplus@db (TNS V1-V3) Fetch Calls : 633 Global Stats ================================================= | Elapsed | Cpu | Other | Fetch | Buffer | | Time(s) | Time(s) | Waits(s) | Calls | Gets | ================================================= | 0.05 | 0.05 | 0.00 | 633 | 1845 | ================================================= SQL Plan Monitoring Details (Plan Hash Value=634468599) =================================================================================================================================== | Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Mem | Activity | Activity Detail | | | | | (Estim) | | Active(s) | Active | | (Actual) | (Max) | (%) | (# samples) | =================================================================================================================================== | 0 | SELECT STATEMENT | | | | 1 | +0 | 1 | 9478 | . | | | | 1 | SORT UNIQUE | | 9478 | 881 | 1 | +0 | 1 | 9478 | 550KB | | | | 2 | HASH JOIN SEMI | | 9478 | 880 | 1 | +0 | 1 | 9478 | 2MB | | | | 3 | TABLE ACCESS FULL | T | 9478 | 440 | 1 | +0 | 1 | 9478 | . | | | | 4 | TABLE ACCESS FULL | T1 | 73257 | 440 | 1 | +0 | 1 | 13225 | . | | | =================================================================================================================================== |
由于SEMI JOIN不需要返回所有JOIN上的行,只需要检查是否匹配上,所以JOIN消耗的时间会大幅度降低。
10053:
1 2 3 4 5 6 |
OPTIMIZER STATISTICS AND COMPUTATIONS PJE: Checking validity of partial join eval on query block SEL$1 (#1) PJE: Passed validity of partial join eval by query block SEL$1 (#1) number: [0] PJE: Partial join eval conversion for query block SEL$1 (#1). PJE: Table marked for partial join eval: T1[T1]#1 |
outline会出现PARTIAL_JOIN(@”SEL$1″ “T1″@”SEL$1”)。
该特性在国产数据库中,几乎没有厂商支持。