您现在的位置是：首页 > 数据库

当前栏目

多列复合索引的使用绕过微软sqlserver的一个缺陷

SQLServer 索引微软使用一个绕过缺陷多列

2023-06-13 09:14:34 时间

然而，微软sqlserver在处理这类索引时，有个重要的缺陷，那就是把本该编译成索引seek的操作编成了索引扫描，这可能导致严重性能下降

举个例子来说明问题，假设某个表T有索引(cityid,sentdate,userid),现在有个分页列表功能，要获得大于某个多列复合索引V0的若干个记录的查询，用最简单表意的方式写出来就是V>=V0,如果分解开来，就是:
cityid>@cityid0or(cityid=@cityid0and(sentdate>@sentdate0or(sentdate=@sentdate0anduserid>=@userid0))),

当你写出上述查询时，你会期待sqlserver会自动的把上述识别为V>=V0类型的边界条件，并使用indexseek操作来实施该查询。然而，微软的sqlserver(2005版)有一个重要缺陷(其他的sqlserver如何还不得知),当它遇到这样sql时，sqlserver就会采用indexscan来实施，结果是您建立好的索引根本就没有被使用，如果这个表的数据量很大，那所造成的性能下降是非常大的。
对于这个问题，我曾经提交给微软的有关人士，他们进一步要求我去一个正式的网站上去提交这个缺陷，我懒得去做。

不过，对这个缺陷，还是有个办法能够绕过去的，只要把上面给出的条件变变形，sqlserver还是能够变回到是用indexseek,而不是低性能的indexscan.具体请看我的英文原文吧(对不起了,我一旦写了中文,就不想翻成英文,反过来也一样,估计大家英文都还可以,实在不行的就看黑体部分吧,)：
Theseekpredicateoftheform"x>bookmark_of_x"isneededinpagingrelatedquery.Thecompilerhasnodifficultytoparseitcorrectlyifxisasinglecolumnindex,ortwocolumnsindex,however,ifxisathreecolumnsindexormore,thenthecompilerwillhaveahardtimetorecognizeit.Thisfailurewillresultinthattheseekpredicateendedupinresiduepredicate,whichresultsinamuchworseexecutionplan.
Toillustratethepoint,takeaexample,
CreatetableA(aint,bint,cint,dfloat,primarykey(a,b,c))
nowchecktheplanforthequery:
selectc,dfromAwhere(a>111ora=111and
(b>222orb=222andc>333))
youcanseeatablescanopisused,andtheWhereclauseendedupinresiduepredicate.
However,ifyourewritethequeryinanequivalentform:
selectc,dfromAwherea>111ora=111andb>222ora=111andb=222andc>333
Thenthecompilercanchooseanindexseekop,whichisdesired.
Theproblemis,thecompilershouldbeabletorecognizethefirstformofseekpredicateonmultiplecolumnsindex,itsavestheuserfromhavingtopayextratimetofigureoutaget-around,nottomentionthefirstformisamoreefficientformofsameexpression.
上面的问题，可以说是部分的绕过去了，但是，也有绕不过的时候，接着看下面一段:
Itlookslikethatsqlserverlacksaconseptofvectorbookmark,orvectorcomparisonorwhateveryouliketocallit.
Theworkaroundisnotaperfectworkaround.Ifsqlserverweretounderstandtheconceptofvectorbookmark,thenthefollowingtwowouldbethesameinexecutionplanandperformance:
1.selecttop(n)*fromAwherevectorIndex>=@vectorIndex
2.select*fromAwherevectorIndex>=@vectorIndexandvectorIndex<=@vectorIndexEnd
--@vectorIndexEndcorrespondstothelastrowof1.
However,testhasshownthat,thesecondstatementtakesfarmoretimethanthefirststatement,andsqlserveractuallyonlyseektothebeginingofthevectorrangeandscantotheendofthewholeIndex,insteadofstopattheendofthevectorrange.
Notonlysqlservercompilebadlywhenthevectorbookmarkhas3columns,testhasshownthatevenwithasfewas2columns,sqlsererstillcannotcorrectlyrecognizethisisactuallyavectorrange,example:
3.selecttop(100)a,b,c,dfromAwherea>60ora=60andb>20
4.selecta,b,c,dfromAwhere(a>60ora=60andb>20)and
(a<60ora=60andb<=21),

上面两个查询实质相同(表中的数据刚好如此)，并且给出同业的结果集，但是，3比4的速度要快的多，如果去看executionplan也证明3确实应当比4快.
也就是说,即使在索引vectorIndex只含两列的情况下,sqlserver也无法正确的理解范围表达式@vectorIndex0<vectorIndex<@vectorIndex1,它能把前半部分正确的解读为seek,但是,后半部分无法正确解读,导致,sqlserver会一直扫描到整个表的末尾,而不是在@vectorIndex1处停下来.
以下测试代码,有兴趣的人可以拿去自己玩:

复制代码代码如下:

CREATETABLE[dbo].[A](
[a][int]NOTNULL,
[b][int]NOTNULL,
[c][int]NOTNULL,
[d][float]NULL,
PRIMARYKEYCLUSTERED([a]ASC,[b]ASC,[c]ASC)
)
declare@aint,@bint,@cint
set@a=1
while@a<=100
begin
set@b=1
begintran
while@b<=100
begin
set@c=1
while@c<=100
begin
INSERTINTOA(a,b,c,d)
VALUES(@a,@b,@c,@a+@b+@c)
set@c=@c+1
end
set@b=@b+1
end
commit
set@a=@a+1
end
SETSTATISTICSPROFILEON
SETSTATISTICStimeON
SETSTATISTICSioON

selecttop(10)a,b,c,dfromAwhere(a>60ora=60and
(b>20orb=20andc>=31))
selecta,b,c,dfromAwhere(a>60ora=60and
(b>20orb=20andc>=31))and(a<60ora=60and
(b<20orb=20andc<=40))

selecttop(10)a,b,c,dfromAwherea>60ora=60andb>20ora=60andb=20andc>=31
selecta,b,c,dfromAwhere(a>60ora=60andb>20ora=60andb=20andc>=31)and
(a<60ora=60andb<20ora=60andb=20andc<=40)
selecttop(100)a,b,c,dfromAwherea>60ora=60andb>20
selecta,b,c,dfromAwhere(a>60ora=60andb>20)and(a<60ora=60andb<=21)
selecttop(100)a,b,c,dfromAwherea>60ora=60andb>20
selecta,b,c,dfromAwhere(a>60ora=60andb>20)and(a<60ora=60andb<=21)

猜你喜欢

【Java AWT 图形界面编程】LayoutManager 布局管理器 ③ ( BorderLayout 布局 )
平衡Oracle的负载均衡策略（oracle级别）
ORA-07452: specified resource manager plan does not exist in the data dictionary ORACLE 报错故障修复远程处理
acle突破Oracle的使用技巧（oracle使用or）
时间解决Java操作Redis中key过期时间的问题（redisjava过期）
MySQL中的自增长主键让它为你的数据做记录（c mysql自增长主键）
提升MySQL连接性能：解决慢查询问题（连接mysql慢）
《图解算法》系列学习（二）
ORA-24127: TABLESPACE parameter specified with an ACTION other than CREATE_ACTION ORACLE 报错故障修复远程处理
通用汽车投资7100万美元在加州新建设计工作室
Gitlab配置webhook趟坑全纪录&由此引发的常见环境问题排查思路与思考总结
MySQL与Redis融合，搭建新一代数据库（mysqlredis）
MySQL Error number: MY-012250; Symbol: ER_IB_MSG_425; SQLSTATE: HY000 报错故障修复远程处理
PHP和Redis的默契合作：探究性能优化（phpredis性能）
从Linux iw命令学习无线网络管理（linuxiw命令）
Oracle9i解压加载捷径解锁安装迷局（oracle9i解压）
Oracle求字符长度的简易方法（oracle求字符长度）
Ext.state.Manager.setProvider(new Ext.state.CookieProvider())详解编程语言
origin安装嵌入python_python爬虫之git的使用（origin说明）
phpgoogle或baidu分页代码

zl程序教程

当前栏目

多列复合索引的使用绕过微软sqlserver的一个缺陷

相关文章