zl程序教程

您现在的位置是:首页 >  数据库

当前栏目

postgres 并行扫描--索引的正确打开方式

索引 -- 正确 扫描 并行 Postgres 打开方式
2023-09-27 14:25:36 时间

一,并行顺序扫描

  01,了解顺序扫描

  了解并行前应该先学下顺序扫描,也就是全表扫描,每次检索都是全表跑一边,这就导致大表扫描期间就会消耗大量的CPU ,内存 磁盘I/o,这个对数据库影响还是很大的. OLTP 事务型的数据库一般都要避免这个扫描方式

  02,创建环境

CREATE TABLE test_bigl ( ID int4, NAME CHARACTER VARYING ( 32 ), create_time timestamp without TIME ZONE DEFAULT clock_TIMESTAMP ( ) );

--创建表结构

INSERT INTO test_bigl(id,name) SELECT n,n||'test' FROM generate_series(1,50000000) n; 


--插入百万级别的数据

   03, 实操了解扫描与并行扫描

先来一个操作:

kingledb=> EXPLAIN SELECT * FROM test_bigl WHERE name='test';
                                  QUERY PLAN
------------------------------------------------------------------------------
 Gather  (cost=1000.00..619228.77 rows=1 width=24)
   Workers Planned: 2
   ->  Parallel Seq Scan on test_bigl  (cost=0.00..618228.67 rows=1 width=24)
         Filter: ((name)::text = 'test'::text)
(4 rows)

查看这个加深的 能了解到这个是顺序扫描

 

kingledb=> EXPLAIN ANALYZE SELECT * FROM test_bigl WHERE name='test';
                                                          QUERY PLAN

--------------------------------------------------------------------------------------
----------------------------------------
 Gather  (cost=1000.00..619228.77 rows=1 width=24) (actual time=2957.652..2961.262 row
s=0 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on test_bigl  (cost=0.00..618228.67 rows=1 width=24) (actual
time=2951.512..2951.512 rows=0 loops=3)
         Filter: ((name)::text = 'test'::text)
         Rows Removed by Filter: 16666667
 Planning time: 0.098 ms
 Execution time: 2961.299 ms
(8 rows)

这个能查看到执行计划中两个增加的选项:
Workers Planned: 执行计划预估的并行进程数
Workers Launched: 查询实际获得的并行进程数


发现这两个都是为2
Parallel Seq Scan on test_bigl --这个代表这这个进行了并行的循序扫描
Planning time: 0.098 ms -->这个是sql生成执行计划的时间
Execution time: 2961.299 ms -->这个是sql实际执行时间
其实到这里你就发现我这个一直开启了并行度,现在我们关闭它
kingledb=> set max_parallel_workers_per_gather = 0;
SET


0 代表关闭

其他正整数代表 开启的并行度

我们再次执行下

kingledb=> EXPLAIN ANALYZE SELECT * FROM test_bigl WHERE name='test';
                                                   QUERY PLAN

--------------------------------------------------------------------------------------
---------------------------
 Seq Scan on test_bigl  (cost=0.00..982812.00 rows=1 width=24) (actual time=10028.226.
.10028.226 rows=0 loops=1)
   Filter: ((name)::text = 'test'::text)
   Rows Removed by Filter: 50000000
 Planning time: 0.237 ms
 Execution time: 10028.266 ms
(5 rows)

--可以发现上面加深颜色的都不存在了,而且执行时间慢了5倍!!!,相当恐怖

二,并行索引扫描

  01,索引扫描

    也就是在表上创建索引~~~

   02,操作观察

--创建索引
kingledb=# CREATE INDEX index_test_bigl on kingle.test_bigl USING btree (id); CREATE INDEX kingledb=#

    

--不开启并行度查看索引扫描
kingledb=# EXPLAIN ANALYZE SELECT count(name) FROM kingle.test_bigl WHERE id <100000
;
                                                                  QUERY PLAN

--------------------------------------------------------------------------------------
--------------------------------------------------------
 Aggregate  (cost=3684.05..3684.06 rows=1 width=8) (actual time=54.445..54.445 rows=1
loops=1)
   ->  Index Scan using index_test_bigl on test_bigl  (cost=0.56..3442.86 rows=96474 w
idth=12) (actual time=0.056..35.887 rows=99999 loops=1)
         Index Cond: (id < 100000)
 Planning time: 1.212 ms
 Execution time: 54.614 ms
(5 rows)
--开启并行度扫描
kingledb=# EXPLAIN ANALYZE SELECT count(name) FROM kingle.test_bigl WHERE id <1000000
;
                                     QUERY PLAN
--------------------------------------------------------------------------------------
---------------------------------------------------------------------------------
 Finalize Aggregate  (cost=30200.04..30200.05 rows=1 width=8) (actual time=191.625..19
1.625 rows=1 loops=1)
   ->  Gather  (cost=30199.62..30200.03 rows=4 width=8) (actual time=191.585..195.111
rows=5 loops=1)
         Workers Planned: 4
         Workers Launched: 4
         ->  Partial Aggregate  (cost=29199.62..29199.63 rows=1 width=8) (actual time=
178.688..178.688 rows=1 loops=5)
               ->  Parallel Index Scan using index_test_bigl on test_bigl  (cost=0.56.
.28564.97 rows=253860 width=12) (actual time=0.073..148.722 rows=200000 loops=5)
                     Index Cond: (id < 1000000)
 Planning time: 0.204 ms
 Execution time: 195.215 ms
(9 rows)

三,并行index-only扫描

  

kingledb=# EXPLAIN ANALYZE SELECT "count"(*) from kingle.test_bigl WHERE id <1000000;
                                                                                QUERY
PLAN
--------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=30200.04..30200.05 rows=1 width=8) (actual time=184.775..18
4.776 rows=1 loops=1)
   ->  Gather  (cost=30199.62..30200.03 rows=4 width=8) (actual time=181.043..184.904
rows=5 loops=1)
         Workers Planned: 4
         Workers Launched: 4
         ->  Partial Aggregate  (cost=29199.62..29199.63 rows=1 width=8) (actual time=
170.658..170.658 rows=1 loops=5)
               ->  Parallel Index Only Scan using index_test_bigl on test_bigl  (cost=
0.56..28564.97 rows=253860 width=0) (actual time=0.114..142.576 rows=200000 loops=5)
                     Index Cond: (id < 1000000)
                     Heap Fetches: 206058
 Planning time: 0.340 ms
 Execution time: 185.024 ms
(10 rows)

四,并行bitmap heap扫描

kingledb=# EXPLAIN ANALYZE SELECT count(*) from kingle.test_bigl WHERE id<1000000 or id >490000000
kingledb-# ;
                                                                           QUERY PLAN

--------------------------------------------------------------------------------------
---------------------------------------------------------------------------
 Finalize Aggregate  (cost=382523.64..382523.65 rows=1 width=8) (actual time=209.846..
209.847 rows=1 loops=1)
   ->  Gather  (cost=382523.22..382523.63 rows=4 width=8) (actual time=206.653..210.28
5 rows=5 loops=1)
         Workers Planned: 4
         Workers Launched: 4
         ->  Partial Aggregate  (cost=381523.22..381523.23 rows=1 width=8) (actual tim
e=197.044..197.044 rows=1 loops=5)
               ->  Parallel Bitmap Heap Scan on test_bigl  (cost=19268.67..380888.57 r
ows=253860 width=0) (actual time=85.352..170.740 rows=200000 loops=5)
                     Recheck Cond: ((id < 1000000) OR (id > 490000000))
                     Heap Blocks: exact=1478
                     ->  BitmapOr  (cost=19268.67..19268.67 rows=1015441 width=0) (act
ual time=91.773..91.773 rows=0 loops=1)
                           ->  Bitmap Index Scan on index_test_bigl  (cost=0.00..18756
.37 rows=1015441 width=0) (actual time=91.749..91.749 rows=999999 loops=1)
                                 Index Cond: (id < 1000000)
                           ->  Bitmap Index Scan on index_test_bigl  (cost=0.00..4.57
rows=1 width=0) (actual time=0.014..0.014 rows=0 loops=1)
                                 Index Cond: (id > 490000000)
 Planning time: 0.456 ms
 Execution time: 210.420 ms
(15 rows)

kingledb=#