[PostgreSQL] Parallel Query

Parallel query in PostgreSQL

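Parallel query is available from PostgreSQL 9.6. Because a parallel query spreads its work across multiple processes, it can speed up aggregation over large tables. Within a query, tasks such as table scans, index scans, joins, and aggregation are distributed where possible, and the parts that cannot be distributed run in a single process. The process that was already handling the query acts as the leader worker, and the dynamically created workers act as parallel workers.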

Parallel query details and conditions

Several conditions must be met before a query runs in parallel. When they are all satisfied, a parallel plan is used automatically.

Read-only processing

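Parallel query is not available for data-modifying statements such as updates, or for cursors (declare cursor). It applies to read-only processing. For create table as select ..., the select part that produces the rows can be processed in parallel.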
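One way to check whether the select part of create table as is planned in parallel is to wrap the statement in explain. This is only a minimal sketch: orders_count is a hypothetical table name, and it assumes PostgreSQL 11 or later, where this statement can use a parallel plan. Without analyze, explain only shows the plan and does not create the table.

```sql
-- orders_count is a hypothetical name used only for illustration.
-- Without ANALYZE the statement is planned but not executed.
explain (costs off)
create table orders_count as
select count(1) from public.orders;
```

A Gather node in the output would indicate that the select part is planned to run in parallel.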

Parallel safety

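Parallel safety indicates whether a function can be used in parallel processing. A parallel plan is used when the functions in the query are parallel safe.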
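As a rough way to see how a function is labelled, the pg_proc catalog has a proparallel column, where s means safe, r means restricted, and u means unsafe. The function names below are arbitrary examples picked for illustration, not ones taken from the queries in this article.

```sql
-- proparallel: 's' = safe, 'r' = restricted, 'u' = unsafe
select distinct proname, proparallel
from pg_proc
where proname in ('count', 'random', 'nextval')
order by proname;
```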

Worker processes

The number of worker processes has an upper limit, and a query cannot use more workers than that limit allows.

Start conditions

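Parallel query is a poor fit for small data. With the default settings, a table of 8MB or more (512kB or more for an index) is scanned by 2 processes (the leader plus 1 parallel worker), 24MB or more by 3, and 72MB or more by 4. The size threshold triples for each additional worker.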
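To see whether a given table is large enough for a parallel scan to be considered, its size can be compared with the configured threshold. The sketch below assumes the public.orders table used later in this article and reads the threshold from the min_parallel_table_scan_size setting.

```sql
-- Compare the on-disk size of the table with the parallel scan threshold.
select pg_size_pretty(pg_relation_size('public.orders')) as table_size,
       current_setting('min_parallel_table_scan_size') as threshold,
       pg_relation_size('public.orders')
         >= pg_size_bytes(current_setting('min_parallel_table_scan_size'))
         as large_enough;
```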

Parameter settings

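Parameter	Default	When applied	Description
max_worker_processes	8	At PostgreSQL startup	Maximum number of dynamic worker processes
max_parallel_workers	8	Any time	Maximum number of workers for parallel queries
max_parallel_workers_per_gather	2	Any time	Maximum number of workers per scan or aggregation step in a parallel query
min_parallel_table_scan_size	8MB	Any time	Minimum table size for a parallel query to be considered
min_parallel_index_scan_size	512kB	Any time	Minimum index size for a parallel query to be considered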
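The current values can be checked on a running server through the pg_settings view. This is a small sketch; the context column shows when a change takes effect (postmaster means a server restart is required).

```sql
-- List the parallel-query-related settings and when they can be changed.
select name, setting, unit, context
from pg_settings
where name in ('max_worker_processes',
               'max_parallel_workers',
               'max_parallel_workers_per_gather',
               'min_parallel_table_scan_size',
               'min_parallel_index_scan_size')
order by name;
```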

Effect of parallel query

First, look at the execution plan with parallel query turned off. It starts with a scan of the orders table, reading 32434489 rows in 1 loop. It takes 7042 ms.

set max_parallel_workers_per_gather to 0;
explain (analyze on, timing off, costs off) select count(1) from public.orders;

                       QUERY PLAN                        
---------------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   ->  Seq Scan on orders (actual rows=32434489 loops=1)
 Planning Time: 5.554 ms
 Execution Time: 7042.972 ms
(4 rows)

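Next, set the number of workers to 2, that is, 1 leader plus 2 parallel workers. It takes 4435 ms. Dividing 32434489 by 3 (1+2) gives about 10811496, so each process scans that many rows on average, and the partial results are then aggregated.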

set max_parallel_workers_per_gather to 2;
explain (analyze on, timing off, costs off) select count(1) from public.orders;
                                  QUERY PLAN                                  
------------------------------------------------------------------------------
 Finalize Aggregate (actual rows=1 loops=1)
   ->  Gather (actual rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate (actual rows=1 loops=3)
               ->  Parallel Seq Scan on orders (actual rows=10811496 loops=3)
 Planning Time: 0.409 ms
 Execution Time: 4435.346 ms
(8 rows)
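The row count on the Parallel Seq Scan line is an average per loop, so it does not show how the rows were actually split between the leader and each worker. As a sketch, adding the verbose option to the same statement should print a separate line for each worker under the parallel nodes.

```sql
set max_parallel_workers_per_gather to 2;
-- verbose adds per-worker details to the parallel plan nodes
explain (analyze on, verbose on, timing off, costs off)
select count(1) from public.orders;
```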

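Next, set the number of workers to 4. It takes 5429 ms. Dividing 32434489 by 5 (1+4) gives about 6486898, so each process scans that many rows on average, and the partial results are then aggregated.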

set max_parallel_workers_per_gather to 4;
explain (analyze on, timing off, costs off) select count(1) from public.orders;
                                 QUERY PLAN                                  
-----------------------------------------------------------------------------
 Finalize Aggregate (actual rows=1 loops=1)
   ->  Gather (actual rows=5 loops=1)
         Workers Planned: 4
         Workers Launched: 4
         ->  Partial Aggregate (actual rows=1 loops=5)
               ->  Parallel Seq Scan on orders (actual rows=6486898 loops=5)
 Planning Time: 0.118 ms
 Execution Time: 5429.966 ms
(8 rows)

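Adding workers does not necessarily make the query faster. For about 30 million rows, 2 workers appear to be enough, and going beyond that made the query less efficient.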
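The comparison can be put into numbers by dividing the serial execution time by each parallel execution time. The statement below only does arithmetic on the timings measured above.

```sql
-- Speedup relative to the non-parallel run (7042.972 ms)
select round(7042.972 / 4435.346, 2) as speedup_2_workers,
       round(7042.972 / 5429.966, 2) as speedup_4_workers;
```

This gives about 1.59 with 2 workers and about 1.30 with 4 workers, so in this measurement the extra workers reduced the gain instead of increasing it.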