子查询中的条件无法用于分区剪枝

John Doe 一月 13, 2025

摘要:在本文中,我们将学习到一个场景,在 PostgreSQL 中的某些情况下,子查询中的条件无法用于分区剪枝。

目录

介绍

最近,有一位客户发现,有一个性能不佳的查询,它在 Oracle 上执行非常快,但是一旦迁移到 PostgreSQL,查询就很慢。经过调查发现,该查询在规划阶段没有修剪分区,导致执行的时间要长得多。也就是说,该查询在规划阶段没有进行分区修剪,在执行阶段也没有进行(在 EXPLAIN 输出的计划中标注的 “never executed”,可以表明这一点)。

这个查询带有 “WHERE” 和 “JOIN” 子句,这些子句显式地指定了分区键,并且是在分区键上进行连接的,那么为什么在规划期间没有进行分区修剪呢?

测试用例设置

我们需要为下面的演示创建测试用例,此处我们使用了 employee 数据库,它是一个 PostgreSQL 的示例数据库,主要用于各种测试应用程序。

压缩包中包含 3 个转储文件:

  • employees_data.sql – 包含了数据和架构(解压后约 140MB)
  • employees_schema.sql – 仅包含数据架构
  • employees_parts.sql – 对某些表进行分区后的版本

我们这里使用的是 PostgreSQL 15。转储中没有包含有关数据库或用户名的信息,因此,请使用所需的任意所有者(用户),创建一个任意的数据库:

CREATE USER employees_user SUPERUSER;
CREATE DATABASE employees_database WITH OWNER employees_user;

并将转储插入其中:

tar -xf employees-database.tar.bz2
psql -U employees_user employees_database < employees_data.sql
psql -U employees_user employees_database < employees_parts.sql

该数据库包含了大约 300000 条员工记录,和 280 万条薪资记录。

演示

查询中的相关表都已按 “hire_date” 进行分区,如下面的 “WHERE” 子句所示,规划器应该规划分区剪枝,因为驱动表中的 “hire_date” 等于连接表中的 “hire_date”,但规划器并没有修剪分区:

EXPLAIN (ANALYZE, BUFFERS)
WITH top_emp AS (
   SELECT
       emp.id,
       emp.hire_date,
	   max(sal.amount) AS sal_amount
   FROM
       employees.employee_part emp
       INNER JOIN employees.salary sal on (sal.employee_id = emp.id)
   WHERE
       emp.hire_date = '1985-01-01'::date
   GROUP BY emp.id, emp.hire_date
)
SELECT
    dept.employee_id,
    dept.department_id,
    dept.hire_date,
	emp1.sal_amount
FROM
    top_emp emp1,
    employees.department_employee_part dept
WHERE
    emp1.hire_date = dept.hire_date and
    emp1.id = dept.employee_id;

                                                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=5030.60..10845.53 rows=1 width=25) (actual time=40.026..41.241 rows=9 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=3787
   ->  Hash Join  (cost=4030.61..9845.43 rows=1 width=25) (actual time=27.342..36.103 rows=3 loops=3)
         Hash Cond: ((dept.hire_date = emp1.hire_date) AND (dept.employee_id = emp1.id))
         Buffers: shared hit=3787
         ->  Parallel Append  (cost=0.00..5089.45 rows=138168 width=17) (actual time=0.009..21.210 rows=110534 loops=3)
               Buffers: shared hit=2448
               ->  Parallel Seq Scan on department_employee_part_p1986 dept_2  (cost=0.00..530.32 rows=23532 width=17) (actual time=0.008..5.717 rows=40005 loops=1)
                     Buffers: shared hit=295
               ->  Parallel Seq Scan on department_employee_part_p1985 dept_1  (cost=0.00..517.88 rows=22988 width=17) (actual time=0.007..5.684 rows=39080 loops=1)
                     Buffers: shared hit=288
               ->  Parallel Seq Scan on department_employee_part_p1987 dept_3  (cost=0.00..489.24 rows=21724 width=17) (actual time=0.018..5.021 rows=36930 loops=1)
                     Buffers: shared hit=272
               ->  Parallel Seq Scan on department_employee_part_p1988 dept_4  (cost=0.00..460.15 rows=20415 width=17) (actual time=0.018..5.600 rows=34705 loops=1)
                     Buffers: shared hit=256
               ->  Parallel Seq Scan on department_employee_part_p1989 dept_5  (cost=0.00..415.40 rows=18440 width=17) (actual time=0.011..1.492 rows=10449 loops=3)
                     Buffers: shared hit=231
               ->  Parallel Seq Scan on department_employee_part_p1990 dept_6  (cost=0.00..375.64 rows=16664 width=17) (actual time=0.006..1.985 rows=14164 loops=2)
                     Buffers: shared hit=209
               ->  Parallel Seq Scan on department_employee_part_p1991 dept_7  (cost=0.00..330.67 rows=14667 width=17) (actual time=0.003..3.367 rows=24934 loops=1)
                     Buffers: shared hit=184
               ->  Parallel Seq Scan on department_employee_part_p1992 dept_8  (cost=0.00..298.58 rows=13258 width=17) (actual time=0.003..2.349 rows=22539 loops=1)
                     Buffers: shared hit=166
               ->  Parallel Seq Scan on department_employee_part_p1993 dept_9  (cost=0.00..260.69 rows=11569 width=17) (actual time=0.002..1.967 rows=19667 loops=1)
                     Buffers: shared hit=145
               ->  Parallel Seq Scan on department_employee_part_p1994 dept_10  (cost=0.00..218.84 rows=9684 width=17) (actual time=0.004..1.688 rows=16463 loops=1)
                     Buffers: shared hit=122
               ->  Parallel Seq Scan on department_employee_part_p1995 dept_11  (cost=0.00..177.90 rows=7890 width=17) (actual time=0.002..1.523 rows=13413 loops=1)
                     Buffers: shared hit=99
               ->  Parallel Seq Scan on department_employee_part_p1996 dept_12  (cost=0.00..140.16 rows=6216 width=17) (actual time=0.002..1.180 rows=10568 loops=1)
                     Buffers: shared hit=78
               ->  Parallel Seq Scan on department_employee_part_p1997 dept_13  (cost=0.00..98.38 rows=4338 width=17) (actual time=0.003..0.815 rows=7375 loops=1)
                     Buffers: shared hit=55
               ->  Parallel Seq Scan on department_employee_part_p1998 dept_14  (cost=0.00..60.84 rows=2684 width=17) (actual time=0.002..0.481 rows=4562 loops=1)
                     Buffers: shared hit=34
               ->  Parallel Seq Scan on department_employee_part_p1999 dept_15  (cost=0.00..22.83 rows=983 width=17) (actual time=0.002..0.174 rows=1671 loops=1)
                     Buffers: shared hit=13
               ->  Parallel Seq Scan on department_employee_part_p2000 dept_16  (cost=0.00..1.09 rows=9 width=17) (actual time=0.008..0.011 rows=15 loops=1)
                     Buffers: shared hit=1
         ->  Hash  (cost=4029.12..4029.12 rows=99 width=20) (actual time=7.015..7.018 rows=9 loops=3)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               Buffers: shared hit=1291
               ->  Subquery Scan on emp1  (cost=0.72..4029.12 rows=99 width=20) (actual time=2.652..7.010 rows=9 loops=3)
                     Buffers: shared hit=1291
                     ->  GroupAggregate  (cost=0.72..4028.13 rows=99 width=20) (actual time=2.651..7.007 rows=9 loops=3)
                           Group Key: emp.id, emp.hire_date
                           Buffers: shared hit=1291
                           ->  Nested Loop  (cost=0.72..4018.56 rows=1144 width=20) (actual time=2.636..6.986 rows=162 loops=3)
                                 Buffers: shared hit=1291
                                 ->  Index Scan using employee_part_p1985_pkey on employee_part_p1985 emp  (cost=0.29..1310.32 rows=99 width=12) (actual time=2.611..6.888 rows=9 loops=3)
                                       Filter: (hire_date = '1985-01-01'::date)
                                       Rows Removed by Filter: 35307
                                       Buffers: shared hit=1178
                                 ->  Index Scan using idx_16991_primary on salary sal  (cost=0.43..27.24 rows=12 width=16) (actual time=0.005..0.008 rows=18 loops=27)
                                       Index Cond: (employee_id = emp.id)
                                       Buffers: shared hit=113
 Planning:
   Buffers: shared hit=9
 Planning Time: 0.976 ms
 Execution Time: 41.333 ms
(62 rows)

因此,如果分区键是相等的,并且表的分区方式完全相同,那么为什么规划器没有进行分区修剪呢?可以单击此处,在 PostgreSQL 邮件列表中找到答案。

规划器只能将条件向下推送到子查询中,它不能将条件从子查询拉入到外部查询中。

执行阶段的分区修剪只能发生在嵌套循环连接上,如果外部表上没有索引,并且规划器必须使用哈希连接,那么原始查询在执行阶段也不能进行分区修剪。

让我们在子查询中增加一个emp.birth_date的条件,来看看会发生什么情况:

EXPLAIN (ANALYZE, BUFFERS)
WITH top_emp AS (
   SELECT 
       emp.id,
       emp.hire_date,
	   max(sal.amount) AS sal_amount
   FROM
       employees.employee_part emp
       INNER JOIN employees.salary sal on (sal.employee_id = emp.id)
   WHERE
       emp.hire_date = '1985-01-01'::date
       and emp.birth_date between '1957-01-01'::date and '1970-01-01'::date
   GROUP BY emp.id, emp.hire_date
)
SELECT
    dept.employee_id,
    dept.department_id,
    dept.hire_date,
	emp1.sal_amount
FROM
    top_emp emp1,
    employees.department_employee_part dept
WHERE
    emp1.hire_date = dept.hire_date and
    emp1.id = dept.employee_id;

                                                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=2600.72..8414.19 rows=1 width=25) (actual time=3.710..3.728 rows=5 loops=1)
   Buffers: shared hit=328
   ->  HashAggregate  (cost=2600.43..2601.04 rows=61 width=20) (actual time=3.679..3.682 rows=5 loops=1)
         Group Key: emp.id, emp.hire_date
         Batches: 1  Memory Usage: 24kB
         Buffers: shared hit=313
         ->  Nested Loop  (cost=0.43..2595.14 rows=705 width=20) (actual time=1.329..3.655 rows=90 loops=1)
               Buffers: shared hit=313
               ->  Seq Scan on employee_part_p1985 emp  (cost=0.00..911.03 rows=61 width=12) (actual time=1.309..3.602 rows=5 loops=1)
                     Filter: ((birth_date >= '1957-01-01'::date) AND (birth_date <= '1970-01-01'::date) AND (hire_date = '1985-01-01'::date))
                     Rows Removed by Filter: 35311
                     Buffers: shared hit=293
               ->  Index Scan using idx_16991_primary on salary sal  (cost=0.43..27.49 rows=12 width=16) (actual time=0.005..0.007 rows=18 loops=5)
                     Index Cond: (employee_id = emp.id)
                     Buffers: shared hit=20
   ->  Append  (cost=0.29..95.13 rows=16 width=17) (actual time=0.006..0.007 rows=1 loops=5)
         Buffers: shared hit=15
         ->  Index Scan using department_employee_part_p1985_pkey on department_employee_part_p1985 dept_1  (cost=0.29..7.46 rows=1 width=17) (actual time=0.005..0.005 rows=1 loops=5)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
               Buffers: shared hit=15
         ->  Index Scan using department_employee_part_p1986_pkey on department_employee_part_p1986 dept_2  (cost=0.29..7.52 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1987_pkey on department_employee_part_p1987 dept_3  (cost=0.29..7.39 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1988_pkey on department_employee_part_p1988 dept_4  (cost=0.29..7.39 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1989_pkey on department_employee_part_p1989 dept_5  (cost=0.29..7.26 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1990_pkey on department_employee_part_p1990 dept_6  (cost=0.29..7.19 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1991_pkey on department_employee_part_p1991 dept_7  (cost=0.29..7.06 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1992_pkey on department_employee_part_p1992 dept_8  (cost=0.29..6.93 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1993_pkey on department_employee_part_p1993 dept_9  (cost=0.29..6.73 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1994_pkey on department_employee_part_p1994 dept_10  (cost=0.29..6.54 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1995_pkey on department_employee_part_p1995 dept_11  (cost=0.29..6.21 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1996_pkey on department_employee_part_p1996 dept_12  (cost=0.29..5.81 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1997_pkey on department_employee_part_p1997 dept_13  (cost=0.28..5.15 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1998_pkey on department_employee_part_p1998 dept_14  (cost=0.28..4.11 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1999_pkey on department_employee_part_p1999 dept_15  (cost=0.28..1.94 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p2000_pkey on department_employee_part_p2000 dept_16  (cost=0.14..0.35 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
 Planning:
   Buffers: shared hit=9
 Planning Time: 1.133 ms
 Execution Time: 3.977 ms
(70 rows)

原来的哈希连接变成了嵌套循环连接。在 EXPLAIN 输出的计划中标注的 “never executed” 表明,执行阶段的分区修剪已经发生了。

让我们将原始子查询中的 GROUP BY 子句去掉,看看又会发生什么情况:

EXPLAIN (ANALYZE, BUFFERS)
WITH top_emp AS (
   SELECT 
       DISTINCT
       emp.id,
       emp.hire_date
   FROM
       employees.employee_part emp
       INNER JOIN employees.salary sal on (sal.employee_id = emp.id)
   WHERE
       emp.hire_date = '1985-01-01'::date
       and emp.birth_date between '1957-01-01'::date and '1970-01-01'::date
       and sal.amount > 50000
)
SELECT
    dept.employee_id,
    dept.department_id,
    dept.hire_date
FROM
    top_emp emp1,
    employees.department_employee_part dept
WHERE
    emp1.hire_date = dept.hire_date and
    emp1.id = dept.employee_id;

                                                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=2619.67..8435.21 rows=1 width=17) (actual time=4.955..4.985 rows=5 loops=1)
   Buffers: shared hit=328
   ->  Unique  (cost=2619.38..2622.06 rows=61 width=12) (actual time=4.909..4.922 rows=5 loops=1)
         Buffers: shared hit=313
         ->  Sort  (cost=2619.38..2620.72 rows=535 width=12) (actual time=4.907..4.913 rows=87 loops=1)
               Sort Key: emp.id
               Sort Method: quicksort  Memory: 29kB
               Buffers: shared hit=313
               ->  Nested Loop  (cost=0.43..2595.14 rows=535 width=12) (actual time=1.829..4.849 rows=87 loops=1)
                     Buffers: shared hit=313
                     ->  Seq Scan on employee_part_p1985 emp  (cost=0.00..911.03 rows=61 width=12) (actual time=1.805..4.773 rows=5 loops=1)
                           Filter: ((birth_date >= '1957-01-01'::date) AND (birth_date <= '1970-01-01'::date) AND (hire_date = '1985-01-01'::date))
                           Rows Removed by Filter: 35311
                           Buffers: shared hit=293
                     ->  Index Scan using idx_16991_primary on salary sal  (cost=0.43..27.52 rows=9 width=8) (actual time=0.007..0.011 rows=17 loops=5)
                           Index Cond: (employee_id = emp.id)
                           Filter: (amount > 50000)
                           Rows Removed by Filter: 1
                           Buffers: shared hit=20
   ->  Append  (cost=0.29..95.13 rows=16 width=17) (actual time=0.009..0.009 rows=1 loops=5)
         Buffers: shared hit=15
         ->  Index Scan using department_employee_part_p1985_pkey on department_employee_part_p1985 dept_1  (cost=0.29..7.46 rows=1 width=17) (actual time=0.007..0.007 rows=1 loops=5)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
               Buffers: shared hit=15
         ->  Index Scan using department_employee_part_p1986_pkey on department_employee_part_p1986 dept_2  (cost=0.29..7.52 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1987_pkey on department_employee_part_p1987 dept_3  (cost=0.29..7.39 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1988_pkey on department_employee_part_p1988 dept_4  (cost=0.29..7.39 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1989_pkey on department_employee_part_p1989 dept_5  (cost=0.29..7.26 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1990_pkey on department_employee_part_p1990 dept_6  (cost=0.29..7.19 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1991_pkey on department_employee_part_p1991 dept_7  (cost=0.29..7.06 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1992_pkey on department_employee_part_p1992 dept_8  (cost=0.29..6.93 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1993_pkey on department_employee_part_p1993 dept_9  (cost=0.29..6.73 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1994_pkey on department_employee_part_p1994 dept_10  (cost=0.29..6.54 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1995_pkey on department_employee_part_p1995 dept_11  (cost=0.29..6.21 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1996_pkey on department_employee_part_p1996 dept_12  (cost=0.29..5.81 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1997_pkey on department_employee_part_p1997 dept_13  (cost=0.28..5.15 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1998_pkey on department_employee_part_p1998 dept_14  (cost=0.28..4.11 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p1999_pkey on department_employee_part_p1999 dept_15  (cost=0.28..1.94 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
         ->  Index Scan using department_employee_part_p2000_pkey on department_employee_part_p2000 dept_16  (cost=0.14..0.35 rows=1 width=17) (never executed)
               Index Cond: (employee_id = emp.id)
               Filter: (emp.hire_date = hire_date)
 Planning:
   Buffers: shared hit=11
 Planning Time: 1.038 ms
 Execution Time: 5.128 ms
(74 rows)

同样,原来的哈希连接变成了嵌套循环连接。在 EXPLAIN 输出的计划中标注的 “never executed” 表明,执行阶段的分区修剪已经发生了。

结论

所以,正如你所看到的,让规划器预测你想要的方式是非常重要的。虽然最终执行阶段没有完全执行,但它仍然必须查看分区的统计信息来决定要不要执行它,从而会导致一些时间浪费。另外,请记住,该测试用例非常简单。在实际的生产环境中,会有更多的分区和更多的数据,因此规划上浪费的时间要高得多,也会产生更大的影响。