今天看了一下好友anbob的文章,主要内容涉及CBO的一些查询转换,主题是在开发场景中如查询分页计算多少页或统计查询,有些开发是习惯基于明细的查询SQL,而外层直接加1层汇聚查询,如select count(*) from (select ….), 但子查询中可能有一些函数或主查询根本不需要的列, 在oracle中的查询转换中如select-project-join或select list pruning, 或VIEW merge SPJ,CVM 都是为了不影响SQL结果一致性,而优化低效的SQL。
Oracle、Oceanbase、Kingbase、GaussDB、达梦数据库比较系列(二十七):子查询中的函数投影裁剪
文章讲述的内容是在内嵌视图中的函数,在做了merge view转换之后,不需要的字段会被裁剪掉(SLP)。
但是在我自己测试的时候觉得还是存在一些小缺陷。记录一下。先说结论,我个人认为SLP投影应该发生在merge view之前。
测试demo:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
SQL> select BANNER from v$version; BANNER --------------------------------------------------------------------------------------------------------------------------------- Oracle Database 23ai Enterprise Edition Release 23.0.0.0.0 - Production SQL> create table test.t as select * from dba_objects; Table created. SQL> alter table test.t add primary key(object_id); Table altered. SQL> create index test.idx1 on test.t(data_object_id); Index created. SQL> EXEC dbms_stats.gather_table_stats(ownname => 'TEST', tabname => 'T', estimate_percent => dbms_stats.auto_sample_size, CASCADE => true, method_opt => 'for all columns size repeat', no_invalidate => false); PL/SQL procedure successfully completed. |
SPJ的simple view,我就不测试了,主要测试一个带有group by的complex view。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
SQL> explain plan for select count(*) from (select data_object_id,max(last_ddl_time) from test.t where data_object_id is not null group by data_object_id); Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Plan hash value: 548494858 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 441 (1)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | | 2 | VIEW | | 6218 | | 441 (1)| 00:00:01 | | 3 | HASH GROUP BY | | 6218 | 62180 | 441 (1)| 00:00:01 | |* 4 | TABLE ACCESS FULL| T | 9478 | 94780 | 440 (1)| 00:00:01 | ----------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - filter("DATA_OBJECT_ID" IS NOT NULL) 16 rows selected. |
由于data_object_id为普通索引,oracle的索引是不存储null的,所以特意加了条件is not null。可以发现view merge并没有发生,所以就不涉及后续的SLP了。假如SLP是在VIEW MERGE之前,该sql完全可以不需要访问表T,访问DATA_OBJECT_ID上的索引即可。
拒绝view merge的理由是包含了聚合并且外部查询不包含其他表:
1 2 3 4 5 6 7 8 |
Query after VW_MRG2: qb SEL$2 (#2):******* UNPARSED QUERY IS ******* SELECT "T"."DATA_OBJECT_ID" "DATA_OBJECT_ID",MAX("T"."LAST_DDL_TIME") "MAX(LAST_DDL_TIME)" FROM "TEST"."T" "T" WHERE "T"."DATA_OBJECT_ID" IS NOT NULL GROUP BY "T"."DATA_OBJECT_ID" CVM: from$_subquery$_001[from$_subquery$_001] (view does not have rowid) has no rowid CVM: CVM bypassed on view SEL$2(#2): Outer query and view query contain aggregates, and outer query contains no other tables. Query after VW_MRG2: qb SEL$1 (#1):******* UNPARSED QUERY IS ******* SELECT COUNT(*) "COUNT(*)" FROM (SELECT "T"."DATA_OBJECT_ID" "DATA_OBJECT_ID",MAX("T"."LAST_DDL_TIME") "MAX(LAST_DDL_TIME)" FROM "TEST"."T" "T" WHERE "T"."DATA_OBJECT_ID" IS NOT NULL GROUP BY "T"."DATA_OBJECT_ID") "from$_subquery$_001" |
将group by字段换成主键时,又可以了。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
SQL> explain plan for select count(*) from (select object_id,max(last_ddl_time) from test.t group by object_id); Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Plan hash value: 1802720789 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 46 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| SYS_C008566 | 73257 | 46 (0)| 00:00:01 | ----------------------------------------------------------------------------- 9 rows selected. |
一开始进行view merge评估时同样也是判断cvm无法合并
1 2 3 4 5 6 7 8 |
Query after VW_MRG2: qb SEL$2 (#2):******* UNPARSED QUERY IS ******* SELECT "T"."OBJECT_ID" "OBJECT_ID",MAX("T"."LAST_DDL_TIME") "MAX(LAST_DDL_TIME)" FROM "TEST"."T" "T" GROUP BY "T"."OBJECT_ID" CVM: from$_subquery$_001[from$_subquery$_001] (view does not have rowid) has no rowid CVM: CVM bypassed on view SEL$2(#2): Outer query and view query contain aggregates, and outer query contains no other tables. Query after VW_MRG2: qb SEL$1 (#1):******* UNPARSED QUERY IS ******* SELECT COUNT(*) "COUNT(*)" FROM (SELECT "T"."OBJECT_ID" "OBJECT_ID",MAX("T"."LAST_DDL_TIME") "MAX(LAST_DDL_TIME)" FROM "TEST"."T" "T" GROUP BY "T"."OBJECT_ID") "from$_subquery$_001" |
但由于object_id为主键,这个时候做了一个group by消除,把group by去掉了
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
QB before group-by removal: qb SEL$2 (#2):******* UNPARSED QUERY IS ******* SELECT "T"."OBJECT_ID" "OBJECT_ID",MAX("T"."LAST_DDL_TIME") "MAX(LAST_DDL_TIME)" FROM "TEST"."T" "T" GROUP BY "T"."OBJECT_ID" QB before group-by elimination: qb SEL$2 (#2):******* UNPARSED QUERY IS ******* SELECT "T"."OBJECT_ID" "OBJECT_ID",MAX("T"."LAST_DDL_TIME") "MAX(LAST_DDL_TIME)" FROM "TEST"."T" "T" GROUP BY "T"."OBJECT_ID" Registered qb: SEL$55E1DC0E 0x862997a8 (ELIMINATION OF GROUP BY SEL$2; SEL$2) --------------------- QUERY BLOCK SIGNATURE --------------------- signature (): qb_name=SEL$55E1DC0E nbfros=1 flg=0 fro(0): flg=0 objn=151464 hint_alias="T"@"SEL$2" QB after group-by elimination: qb SEL$55E1DC0E (#2):******* UNPARSED QUERY IS ******* SELECT "T"."OBJECT_ID" "OBJECT_ID","T"."LAST_DDL_TIME" "MAX(LAST_DDL_TIME)" FROM "TEST"."T" "T" Registered qb: SEL$3D33D98B 0x862997a8 (ELIMINATION OF GROUP BY SEL$55E1DC0E; SEL$55E1DC0E) --------------------- QUERY BLOCK SIGNATURE --------------------- signature (): qb_name=SEL$3D33D98B nbfros=1 flg=0 fro(0): flg=0 objn=151464 hint_alias="T"@"SEL$2" QB after group-by removal: qb SEL$3D33D98B (#2):******* UNPARSED QUERY IS ******* SELECT "T"."OBJECT_ID" "OBJECT_ID","T"."LAST_DDL_TIME" "MAX(LAST_DDL_TIME)" FROM "TEST"."T" "T" |
之后对于SPJ简单视图做了,view merge
1 2 3 4 5 |
Query after VW_MRG2: qb SEL$3D33D98B (#2):******* UNPARSED QUERY IS ******* SELECT "T"."OBJECT_ID" "OBJECT_ID","T"."LAST_DDL_TIME" "MAX(LAST_DDL_TIME)" FROM "TEST"."T" "T" CVM: Merging SPJ view SEL$3D33D98B (#2) into SEL$1 (#1) CNT: Considering count(col) to count(*) on query block SEL$1 (#1) |
再经过SLP转换为最终sql:
1 2 |
Final query after transformations: qb SEL$13F8F429 (#1):******* UNPARSED QUERY IS ******* SELECT COUNT(*) "COUNT(*)" FROM "TEST"."T" "T" |
这只是碰巧碰到了主键的优势,能够对该聚合做group by消除。
假如SLP投影发生在merge view之前:
1 |
select count(*) from (select data_object_id from test.t where data_object_id is not null group by data_object_id); |
是否就能改写为:
1 |
select count(*) from (select data_object_id from test.t where data_object_id is not null group by data_object_id); |
执行计划也将变为:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
SQL> explain plan for select count(*) from (select data_object_id from test.t where data_object_id is not null group by data_object_id); Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Plan hash value: 1041817260 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 10 (10)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | | 2 | VIEW | VM_NWVW_0 | 6218 | | 10 (10)| 00:00:01 | | 3 | HASH GROUP BY | | 6218 | 12436 | 10 (10)| 00:00:01 | |* 4 | INDEX FAST FULL SCAN| IDX1 | 9478 | 18956 | 9 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - filter("DATA_OBJECT_ID" IS NOT NULL) 16 rows selected. |