PostgreSQL 18 预览: 支持异步 I/O

John Doe 四月 11, 2025

你是否介意 PostgreSQL 的双重缓冲?现在,PostgreSQL 支持异步 I/O 了。

目录

沙滩上漫步的大象

异步 I/O 简介

传统上,PostgreSQL 依赖操作系统来隐藏磁盘写入的延迟,而这一过程是同步进行的,可能会导致双重缓冲(PostgreSQL 共享缓冲区和操作系统文件缓存)。这在写入预写式日志(WAL)时最为关键,因为 PostgreSQL 必须确保更改已刷新到磁盘,并需要等待确认。

在进行测试之前,让我们从参数的角度看看有哪些新变化。新参数之一是io_method

show io_method;
 io_method 
-----------
 worker
(1 row)

其默认值为 “worker”,执行异步操作最多的工作进程数目由io_workers控制:

show io_workers;
 io_workers 
------------
 3
(1 row)

在操作系统中也可以看到这一点:

$ ps aux | grep "io worker" | grep -v grep
postgres   29732  0.0  0.1 224792  7052 ?        Ss   Apr08   0:00 postgres: pgdev: io worker 1
postgres   29733  0.0  0.2 224792  9884 ?        Ss   Apr08   0:00 postgres: pgdev: io worker 0
postgres   29734  0.0  0.1 224792  7384 ?        Ss   Apr08   0:00 postgres: pgdev: io worker 2

io_method的其他可能设置有:

  • io_uring:使用 io_uring 进行异步 I/O
  • sync:PostgreSQL 18 之前的行为,执行同步 I/O

只有当io_method设置为默认配置 “worker” 时,io_workers才会生效。

准备测试环境

和往常一样:以下只是一些基本测试。请在你自己的环境中,针对特定的业务负载进行测试,以获得有意义的数据。特别是在公有云环境中进行测试时,要注意这些数字可能无法完全反映真实情况。

我们将在运行 Debian 12 的系统上进行测试。文件系统为 ext4(默认设置):

$ grep proc /proc/cpuinfo
processor       : 0
processor       : 1

$ free -g
              total        used        free      shared  buff/cache   available
Mem:               7           0           4           0           3           7
Swap:              0           0           0

$ mount | grep 18
/dev/nvme1n1 on /db/pgsql/18 type ext4 (rw,relatime)

使用默认设置初始化 PostgreSQL:

$ /usr/pgsql/18/bin/initdb --pgdata=/db/pgsql/18/data/

更改以下的设置:

$ echo "shared_buffers='2GB'" >> /db/pgsql/18/data/postgresql.auto.conf
$ echo "checkpoint_timeout='20min'" >> /db/pgsql/18/data/postgresql.auto.conf
$ echo "random_page_cost=1.1" >> /db/pgsql/18/data/postgresql.auto.conf
$ echo "max_wal_size='8GB'" >> /db/pgsql/18/data/postgresql.auto.conf  
$ /usr/pgsql/18/bin/pg_ctl --pgdata=/db/pgsql/18/data/ -l /dev/null start

$ /usr/pgsql/18/bin/psql -c "select version()"
                               version
--------------------------------------------------------------------
 PostgreSQL 18devel on x86_64-linux, compiled by gcc-12.2.0, 64-bit
(1 row)

$ export PATH=/usr/pgsql/18/bin/:$PATH

pgbench 数据加载测试

第一个测试是数据加载。设置io_method为 “worker”,测试加载数据需要多长时间,连续进行 3 次测试。这样会生成大约 1536MB 的数据集:

$ pgbench -i -s 100
dropping old tables...
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
vacuuming...                                                                                   
creating primary keys...
done in 31.85 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 24.82 s, vacuum 0.35 s, primary keys 6.68 s).

$ pgbench -i -s 100
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...                                                                                   
creating primary keys...
done in 31.97 s (drop tables 0.24 s, create tables 0.00 s, client-side generate 25.44 s, vacuum 0.34 s, primary keys 5.93 s).

$ pgbench -i -s 100
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...                                                                                   
creating primary keys...
done in 30.72 s (drop tables 0.26 s, create tables 0.00 s, client-side generate 23.93 s, vacuum 0.55 s, primary keys 5.98 s).

使用 “sync” 进行相同的测试:

$ psql -c "alter system set io_method='sync'"
$ pg_ctl --pgdata=/db/pgsql/18/data/ restart -l /dev/null
$ psql -c "show io_method"
 io_method 
-----------
 sync
(1 row)

$ pgbench -i -s 100
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...                                                                                   
creating primary keys...
done in 20.89 s (drop tables 0.29 s, create tables 0.01 s, client-side generate 14.70 s, vacuum 0.45 s, primary keys 5.44 s).

$ pgbench -i -s 100
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...                                                                                   
creating primary keys...
done in 21.57 s (drop tables 0.20 s, create tables 0.00 s, client-side generate 16.13 s, vacuum 0.46 s, primary keys 4.77 s).

$ pgbench -i -s 100
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...                                                                                   
creating primary keys...
done in 21.44 s (drop tables 0.20 s, create tables 0.00 s, client-side generate 16.04 s, vacuum 0.52 s, primary keys 4.67 s).

最后使用 “io_uring” 进行测试:

$ psql -c "alter system set io_method='io_uring'"
ALTER SYSTEM
$ pg_ctl --pgdata=/db/pgsql/18/data/ restart -l /dev/null
waiting for server to shut down.... done
server stopped
waiting for server to start.... done
server started
$ psql -c "show io_method"
 io_method 
-----------
 io_uring
(1 row)

$ pgbench -i -s 100
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...                                                                                   
creating primary keys...
done in 20.63 s (drop tables 0.35 s, create tables 0.01 s, client-side generate 14.92 s, vacuum 0.47 s, primary keys 4.88 s).

$ pgbench -i -s 100
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...                                                                                   
creating primary keys...
done in 20.81 s (drop tables 0.29 s, create tables 0.00 s, client-side generate 14.43 s, vacuum 0.46 s, primary keys 5.63 s).

$ pgbench -i -s 100
dropping old tables...
creating tables...
generating data (client-side)...
vacuuming...                                                                                   
creating primary keys...
done in 21.11 s (drop tables 0.24 s, create tables 0.00 s, client-side generate 15.63 s, vacuum 0.53 s, primary keys 4.70 s).

“sync” 和 “io_uring” 之间没有太大区别,但对于这种类型的业务负载,“worker” 明显较慢。

pgbench 基准测试

接下来,让我们看看标准的 pgbench 基准测试结果如何。我们从 “io_uring” 开始,因为这是当前的设置:

$ pgbench --time=600 --client=2 --jobs=2
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 567989
number of failed transactions: 0 (0.000%)
latency average = 2.113 ms
initial connection time = 8.996 ms
tps = 946.659673 (without initial connection time)

$ pgbench --time=600 --client=2 --jobs=2
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 557640
number of failed transactions: 0 (0.000%)
latency average = 2.152 ms
initial connection time = 6.994 ms
tps = 929.408406 (without initial connection time)

$ pgbench --time=600 --client=2 --jobs=2
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 563613
number of failed transactions: 0 (0.000%)
latency average = 2.129 ms
initial connection time = 16.351 ms
tps = 939.378627 (without initial connection time)

使用 “worker” 进行相同的测试:

$ psql -c "alter system set io_method='worker'"
$ pg_ctl --pgdata=/db/pgsql/18/data/ restart -l /dev/null
waiting for server to shut down............. done
server stopped
waiting for server to start.... done
server started

$ pgbench --time=600 --client=2 --jobs=2
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 549176
number of failed transactions: 0 (0.000%)
latency average = 2.185 ms
initial connection time = 7.189 ms
tps = 915.301403 (without initial connection time)

$ pgbench --time=600 --client=2 --jobs=2
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 564898
number of failed transactions: 0 (0.000%)
latency average = 2.124 ms
initial connection time = 11.332 ms
tps = 941.511304 (without initial connection time)

$ pgbench --time=600 --client=2 --jobs=2
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 563041
number of failed transactions: 0 (0.000%)
latency average = 2.131 ms
initial connection time = 9.120 ms
tps = 938.412979 (without initial connection time)

最后使用 “sync” 进行相同的测试:

$ psql -c "alter system set io_method='sync'"
$ pg_ctl --pgdata=/db/pgsql/18/data/ restart -l /dev/null
waiting for server to shut down............ done
server stopped
waiting for server to start.... done
server started

$ pgbench --time=600 --client=2 --jobs=2
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 560420
number of failed transactions: 0 (0.000%)
latency average = 2.141 ms
initial connection time = 12.000 ms
tps = 934.050237 (without initial connection time)

$ pgbench --time=600 --client=2 --jobs=2
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 560077
number of failed transactions: 0 (0.000%)
latency average = 2.143 ms
initial connection time = 7.204 ms
tps = 933.469665 (without initial connection time)

$ pgbench --time=600 --client=2 --jobs=2
pgbench (18devel)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 2
number of threads: 2
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 566150
number of failed transactions: 0 (0.000%)
latency average = 2.120 ms
initial connection time = 7.579 ms
tps = 943.591451 (without initial connection time)

如你所见,无论使用哪种io_method,差异都不大。让我们给系统施加更多压力(这里仅给出测试结果的摘要):

$ pgbench --time=600 --client=10 --jobs=10
## sync
tps = 2552.785398 (without initial connection time)
tps = 2505.476064 (without initial connection time)
tps = 2542.419230 (without initial connection time)
## io_uring
tps = 2511.138931 (without initial connection time)
tps = 2529.705311 (without initial connection time)
tps = 2573.195751 (without initial connection time)
## worker
tps = 2531.657962 (without initial connection time)
tps = 2523.854335 (without initial connection time)
tps = 2515.490351 (without initial connection time)

从某些方面来看,差异并不大。最后再进行一项测试,给系统施加更大的压力:

$ pgbench --time=600 --client=20 --jobs=20
## worker
tps = 2930.268033 (without initial connection time)
tps = 2799.499964 (without initial connection time)
tps = 3033.491153 (without initial connection time)
## io_uring
tps = 2942.542882 (without initial connection time)
tps = 3061.487286 (without initial connection time)
tps = 2995.175169 (without initial connection time)
## sync
tps = 2997.654084 (without initial connection time)
tps = 2924.269626 (without initial connection time)
tps = 2753.853272 (without initial connection time)

至少就这些测试而言,io_method的三种设置之间并没有太大差异(“sync” 似乎要稍微慢一点),但仍然表现不错。对于这样一项重大的特性来说,能达到与之前相同的性能表现就已经很不错了。PostgreSQL 的性能一直在提升,而且相信在这个领域也会带来很多改进。

非常不错的体验。感谢社区的所有相关人员。

参考

提交日志:https://git.postgresql.org/pg/commitdiff/da7226993fd4b73d8b40abb7167d124eada97f2e