# PostgreSQL 教程: 理解并行查询的成本估算

## 聚合的并行计划示例

``````CREATE TABLE testpar (id integer, str text);

INSERT INTO testpar (id, str)
SELECT i, repeat(chr(65 + mod(i, 26)), 64) as str
FROM generate_series(1, 1000000) AS s(i);

ANALYZE testpar;
``````

``````EXPLAIN SELECT count(*) FROM testpar;

QUERY PLAN
-------------------------------------------------------------------------------------------
Finalize Aggregate  (cost=16625.55..16625.56 rows=1 width=8)
->  Gather  (cost=16625.33..16625.54 rows=2 width=8)
Workers Planned: 2
->  Partial Aggregate  (cost=15625.33..15625.34 rows=1 width=8)
->  Parallel Seq Scan on testpar  (cost=0.00..14583.67 rows=416667 width=0)
(5 rows)
``````

Gather 下面的所有节点都是该计划的并行部分。它将由所有工作进程（在本例中规划了 2 个）和领导进程（除非被 parallel_leader_participation 选项禁用）来执行。Gather 节点及其上的所有节点，都由领导进程顺序执行。

``````SELECT reltuples::numeric, round(reltuples / 2.4) AS per_process
FROM pg_class WHERE relname = 'testpar';

reltuples | per_process
-----------+-------------
1000000 |      416667
(1 row)
``````

Parallel Seq Scan 节点成本的计算方式与 Seq Scan 成本大致相同。我们通过让每个进程处理更少的行来节省时间，但我们仍然需要扫描全表，因此 I/O 成本不会发生改变：

``````SELECT round((
relpages * current_setting('seq_page_cost')::real +
reltuples / 2.4 * current_setting('cpu_tuple_cost')::real
)::numeric, 2)
FROM pg_class WHERE relname = 'testpar';

round
----------
14583.67
(1 row)
``````

Partial Aggregate 节点聚合工作进程生成的所有数据（在本例中是对行进行计数）。

``````WITH t(startup_cost) AS (
SELECT 14583.67 + round((
reltuples / 2.4 * current_setting('cpu_operator_cost')::real
)::numeric, 2)
FROM pg_class WHERE relname='testpar'
)
SELECT startup_cost,
startup_cost + round((
1 * current_setting('cpu_tuple_cost')::real
)::numeric, 2) AS total_cost
FROM t;

startup_cost | total_cost
--------------+------------
15625.34 |   15625.35
(1 row)
``````

``````SELECT
15625.34 + round(
current_setting('parallel_setup_cost')::numeric,
2) AS setup_cost,
15625.35 + round(
current_setting('parallel_setup_cost')::numeric +
2 * current_setting('parallel_tuple_cost')::numeric,
2) AS total_cost;

setup_cost | total_cost
------------+------------
16625.34 |   16625.55
(1 row)
``````

Finalize Aggregate 节点将 Gather 节点收集的每部分数据连接在一起。其成本的估算方式与常规的 Aggregate 相同。启动成本包括三行的聚合成本和 Gather 节点的总成本（因为 Finalize Aggregate 需要其所有输出来进行计算）。在总成本之上附加的一点是，结果行的输出成本。

``````WITH t(startup_cost) AS (
SELECT 16625.55 + round((
3 * current_setting('cpu_operator_cost')::real
)::numeric, 2)
FROM pg_class WHERE relname = 'testpar'
)
SELECT startup_cost,
startup_cost + round((
1 * current_setting('cpu_tuple_cost')::real
)::numeric, 2) AS total_cost
FROM t;

startup_cost | total_cost
--------------+------------
16625.56 |   16625.57
(1 row)
``````

PostgreSQL 优化