15.1. How Parallel Query Works
When the optimizer determines that parallel query is the fastest
execution strategy for a particular query, it will create a query plan
that includes a Gather or Gather Merge node. Here is a simple example:
EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
                                     QUERY PLAN
-------------------------------------------------------------------------------
 Gather  (cost=1000.00..217018.43 rows=1 width=97)
   Workers Planned: 2
   ->  Parallel Seq Scan on pgbench_accounts  (cost=0.00..216018.33 rows=1 width=97)
         Filter: (filler ~~ '%x%'::text)
In all cases, the Gather or Gather Merge node will have exactly one
child plan, which is the portion of the plan that will be executed in
parallel. If the Gather or Gather Merge node is at the very top of the
plan tree, then the entire query will execute in parallel. If it is
somewhere else in the plan tree, then only the portion of the plan
below it will run in parallel. In the example above, the query accesses
only one table, so there is only one plan node other than the Gather
node itself; since that plan node is a child of the Gather node, it
will run in parallel.
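
For illustration, when parallel aggregation is used the Gather node
typically sits below a Finalize Aggregate node, so only the partial
aggregation and the scan beneath it run in parallel. The following is
only a sketch of such a plan shape (costs omitted; the actual plan
depends on the data, indexes, and configuration):

EXPLAIN SELECT count(*) FROM pgbench_accounts WHERE filler LIKE '%x%';

 Finalize Aggregate
   ->  Gather
         Workers Planned: 2
         ->  Partial Aggregate
               ->  Parallel Seq Scan on pgbench_accounts
                     Filter: (filler ~~ '%x%'::text)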
Using EXPLAIN, you can see the number of workers chosen by the planner.
When the Gather node is reached during query execution, the process
that is implementing the user's session will request a number of
background worker processes equal to the number of workers chosen by
the planner. The number of background workers that the planner will
consider using is limited to at most max_parallel_workers_per_gather.
The total number of background workers that can exist at any one time
is limited by both max_worker_processes and max_parallel_workers.
Therefore, it is possible for a parallel query to run with fewer
workers than planned, or even with no workers at all. The optimal plan
may depend on the number of workers that are available, so this can
result in poor query performance. If this occurrence is frequent,
consider increasing max_worker_processes and max_parallel_workers so
that more workers can be run simultaneously, or alternatively reducing
max_parallel_workers_per_gather so that the planner requests fewer
workers.
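
As a sketch of how these limits can be examined, the settings are
ordinary configuration parameters, and EXPLAIN ANALYZE reports both the
planned and the launched worker counts on the Gather node, so a
shortfall is easy to spot (the values below are only examples):

SHOW max_parallel_workers_per_gather;   -- e.g. 2
SHOW max_parallel_workers;              -- e.g. 8
SHOW max_worker_processes;              -- e.g. 8 (changing this requires a server restart)

-- raise the per-Gather limit for the current session only
SET max_parallel_workers_per_gather = 4;

-- in EXPLAIN ANALYZE output, compare the two counters on the Gather node:
--   Workers Planned: 4
--   Workers Launched: 2
EXPLAIN ANALYZE SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';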
Every background worker process that is successfully started for a
given parallel query will execute the parallel portion of the plan. The
leader will also execute that portion of the plan, but it has an
additional responsibility: it must also read all of the tuples
generated by the workers. When the parallel portion of the plan
generates only a small number of tuples, the leader will often behave
very much like an additional worker, speeding up query execution.
Conversely, when the parallel portion of the plan generates a large
number of tuples, the leader may be almost entirely occupied with
reading the tuples generated by the workers and performing any further
processing steps that are required by plan nodes above the level of the
Gather node or Gather Merge node. In such cases, the leader will do
very little of the work of executing the parallel portion of the plan.
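
The leader's participation in the parallel portion of the plan can also
be disabled with the parallel_leader_participation setting (on by
default). A minimal sketch of turning it off for one session, which may
help when the leader is already saturated by reading tuples from the
workers:

SET parallel_leader_participation = off;
-- with this off, the leader does not execute the parallel portion of the plan;
-- it only reads tuples from the workers and handles the nodes above the Gather
EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
RESET parallel_leader_participation;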
When the node at the top of the parallel portion of the plan is Gather
Merge rather than Gather, it indicates that each process executing the
parallel portion of the plan is producing tuples in sorted order, and
that the leader is performing an order-preserving merge. In contrast,
Gather reads tuples from the workers in whatever order is convenient,
destroying any sort order that may have existed.
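
As an illustrative sketch (plan shape only; the actual plan depends on
the data, indexes, and settings), an ORDER BY query might be
parallelized with each worker sorting its own share of the rows and
Gather Merge combining the sorted streams without destroying the order:

EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%' ORDER BY aid;

 Gather Merge
   Workers Planned: 2
   ->  Sort
         Sort Key: aid
         ->  Parallel Seq Scan on pgbench_accounts
               Filter: (filler ~~ '%x%'::text)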