在ORACLE里,一般认为”与NULL是等价的,除了一些特别的语法,比如有id is null,但是没有id is ”,以及”作为字符类型的特殊性,在decode等函数要求类型匹配的时候,与NULL可能不等价。本篇文章主要说明下”与NULL在SQL优化器中也有很大的区别,如果不了解这个区别,很可能SQL语句的性能,将要陷入灾难境地。此问题来源于:http://www.itpub.net/thread-1838241-1-1.html 讨论
1.示例分析
create table t1 as select * from dba_objects ; create table t2 as select * from dba_objects ; create table t3 as select * from dba_objects ; / begin for xx in 1 .. 5 loop insert into t1 select * from t1; insert into t2 select * from t2; insert into t3 select * from t3; commit; end loop; end; / create index i1 on t1(object_name); create index i2 on t2(object_name); create index i3 on t3(object_name); create index i4 on t1(subobject_name); create index i5 on t2(subobject_name); create index i6 on t3(subobject_name); –收集统计信息省略 |
针对下列语句,ORACLE对t2,t3走了FULL TABLE SCAN:
dingjun123@ORADB> show rel release 1102000100 dingjun123@ORADB> set autotrace traceonly exp Execution Plan |
看表t2、t3对应谓词的实际基数情况:
dingjun123@ORADB> SELECT COUNT(*) FROM t2 WHERE ”=’T’ OR “T2″.”OBJECT_NAME”=’T'; COUNT(*) ———- 128 1 row selected. Elapsed: 00:00:11.50 dingjun123@ORADB> SELECT COUNT(*) FROM t3 WHERE ”=’T’ OR “T3″.”SUBOBJECT_NAME”=’T'; |
CBO优化器估算的对应谓词的基数与实际的技术差别如下:
表名 | 估算基数 | 实际基数 | 差别倍数 |
T2 | 24167 | 128 | 188.80 |
T3 | 26045 | 0 | 26045 |
很显然,实际基数与估算的基数差别太大,从而优化器选择了错误的执行路径,正确的执行路径应该是走索引的。究其原因,可以分析下,
dingjun123@ORADB> SELECT COUNT(*) FROM t2 WHERE ”T2″.”OBJECT_NAME”=’T'; Elapsed: 00:00:00.00 Execution Plan ———————————————————- Plan hash value: 2583336616 ————————————————————————– | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ————————————————————————– | 0 | SELECT STATEMENT | | 1 | 24 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 24 | | | |* 2 | INDEX RANGE SCAN| I2 | 52 | 1248 | 3 (0)| 00:00:01 | ————————————————————————– Predicate Information (identified by operation id): ————————————————— 2 – access(“T2″.”OBJECT_NAME”=’T') dingjun123@ORADB> SELECT COUNT(*) FROM t2 WHERE ”=’T'; |
很显然,是由于谓词”=’T'造成了优化器的估算错误,”=’T'优化器估算100%的选择性,通过OR一合并,针对表T2的最终基数是24167,因此,优化器选择全表扫描。这很显然是错误的,”=’T',走FILTER应该转为NULL IS NOT NULL的形式,最终此分支根本无需计算才对。下面换成NULL测试:
dingjun123@ORADB> SELECT COUNT(*) FROM t2 WHERE NULL=’T'; Elapsed: 00:00:00.00 Execution Plan ———————————————————- Plan hash value: 402395414 ——————————————————————– | Id | Operation | Name | Rows | Cost (%CPU)| Time | ——————————————————————– | 0 | SELECT STATEMENT | | 1 | 0 (0)| | | 1 | SORT AGGREGATE | | 1 | | | |* 2 | FILTER | | | | | | 3 | TABLE ACCESS FULL| T2 | 2411K| 9411 (1)| 00:01:53 | ——————————————————————– Predicate Information (identified by operation id): ————————————————— 2 – filter(NULL IS NOT NULL) |
换成NULL=’T'后,FILTER自动转为NULL IS NOT NULL,显然,这是个永远不成立的条件,所以,根本无须执行ID=3的操作,最终ID=0的结果COST=0,针对这种FILTER单分支的执行计划注意,虽然子步骤3COST=9411,但是实际可能没有执行,详细的计划可以用DBMS_XPLAN.DISPLAY_CURSOR查看:
dingjun123@ORADB> alter session set statistics_level=all; Session altered. Elapsed: 00:00:00.04 dingjun123@ORADB> SELECT COUNT(*) FROM t2 WHERE NULL=’T'; dingjun123@ORADB> @display_cursor 20 rows selected. |
通过ID=3步骤的Starts=0,很清晰地看出,最终未执行ID=3的步骤。从这点上说,优化器针对”与NULL的相关计算方式还是有很大区别。
2.问题解决
通过以上分析,将”改为NULL,问题得到有效解决或者把谓词写到每个子SQL里面。当然,只要是非”的都是可以的,比如’ ‘(空格)。
dingjun123@ORADB> SELECT * 2 FROM (SELECT t1.subobject_name, t1.object_name, t1.object_type 3 FROM t1 4 UNION ALL 5 SELECT NULL, t2.object_name, t2.object_type 6 FROM t2 7 UNION ALL 8 SELECT t3.subobject_name, NULL, t3.object_type FROM t3) t 9 WHERE (t.subobject_name = ‘T’ OR t.object_name = ‘T’); 256 rows selected. Execution Plan Predicate Information (identified by operation id): Statistics |
ORACLE优化器虽然强大,但是优化器要判断的东西实在太多太复杂,导致ORACLE强大的优化器也不一定是很完善的,有各种各样的BUG,或未完善的地方,值得我们注意,但是,相信各种新的版本上,会有更多惊喜的特性。
注:以上问题,在RBO中同样存在。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29960155/viewspace-1372540/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/29960155/viewspace-1372540/