由 John Doe 四月 1, 2025
你在升级 PostgreSQL 大版本的时候,会担心升级时间过长吗?现在,PostgreSQL 有了一个新的 swap 升级模式。
特性提交日志
pg_upgrade:添加 “–swap” 选项以实现更快的文件传输。
这个新选项指示pg_upgrade
将数据目录从旧集群移动到新集群,然后用为新集群生成的系统表文件替换旧的系统表文件。这种模式的性能可能优于--link
、--clone
、--copy
和--copy-file-range
选项,尤其是在包含许多数据库对象(关系)的集群上。
然而,这种模式会在旧集群中产生许多垃圾文件。如果使用--sync-method=syncfs
选项,这些垃圾文件可能会延长文件刷写的同步时间。为了解决这个问题,我们建议在此模式下使用--sync-method=fsync
选项,并且pg_upgrade
会在内部使用initdb --sync-only --no-sync-data-files
命令来进行文件刷写的同步。pg_upgrade
会在传输系统表文件时对其进行刷写的同步。我们假定从旧集群传输过来的数据库文件在升级之前已经进行了同步。
这种模式也会使回滚到旧集群变得更加复杂。因此,我们建议在文件传输过程中或传输完成后出现失败的情况下,从备份中恢复数据。我们确实考虑过让pg_upgrade
学会如何为这种失败情况生成回滚脚本,但最终决定不这么做,原因是文件传输过程中出现失败的情况很少见,生成脚本的过程比较复杂,而且存在脚本被误用的可能性。
这种新模式仅限于位于同一文件系统中的集群之间的升级。经过一些努力,我们或许能够支持不同文件系统之间的升级,但是如果我们必须跨文件系统边界复制文件,这种模式可能不会带来太多好处。
它还仅限于从 10 版本或更高版本进行升级。使用交换模式从旧版本进行升级存在一些已知的障碍。例如,可见性映射文件的格式在 9.6 版本中发生了变化,序列的元组格式在 10 版本中发生了变化。实际上,交换模式在使用pg_dump
时会省略--sequence-data
选项,而是直接重用旧集群的序列数据文件。虽然要让交换模式能处理这些变化肯定是可行的(而且无论如何,我们将来可能还得处理类似的问题),但为支持从早已不受支持的版本进行升级而付出这样的努力,似乎并不值得。
讨论:https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
示例
当你想要从一个主要版本的 PostgreSQL 升级到另一个版本时,可能会选择使用 pg_upgrade(或者逻辑复制)。目前 pg_upgrade 已有几种操作模式:
- –copy:将数据文件从旧集群复制到新集群。
- –clone:进行克隆而非复制(前提是文件系统支持)。
- –copy-file-range:如果文件系统支持,使用 copy_file_range 系统调用进行高效复制。
- –link:使用硬链接而非复制文件。
哪种模式最适合你,取决于具体需求。我们通常会选择 “--link” 模式,因为它速度相当快,但只有在旧集群和新集群位于同一文件系统时才能使用。这种模式的缺点是,一旦新集群启动,就无法再使用旧集群。
在 PostgreSQL 18 中,可能会新增一个名为 “--swap” 的选项。这种模式不是链接或复制文件,而是将文件从旧集群移动到新集群,然后用新集群的系统表文件替换旧的系统表文件。之所以增加这种模式,是因为当集群中包含许多数据库对象(关系)时,它的性能可能比 “--link” 模式(以及其他模式)还要好。
我们来验证一下,通过创建两个包含许多数据库对象(关系)的 PostgreSQL 17 集群:
pgenv $ initdb --version
initdb (PostgreSQL) 17.2
pgenv $ initdb -D /var/tmp/dummy/17.2_1 --data-checksums
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
...
Success. You can now start the database server using:
pg_ctl -D /var/tmp/17.2 -l logfile start
pgenv $ echo "port=8888" >> /var/tmp/dummy/17.2_1/postgresql.auto.conf
pgenv $ pg_ctl --pgdata=/var/tmp/dummy/17.2_1 start -l /dev/null
waiting for server to start.... done
server started
pgenv $ psql -p 8888 -l
List of databases
Name | Owner | Encoding | Locale Provider | Collate | Ctype | Locale | ICU Rules | Access privileges
-----------+----------+----------+-----------------+-------------+-------------+--------+-----------+-----------------------
postgres | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | | |
template0 | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
(3 rows)
下面是一个小脚本,用于创建一些表、索引并插入一些数据:
#!/bin/bash
for i in {1..10000}; do
psql -p 8888 -c "create table t${i} ( a int, b text )"
psql -p 8888 -c "insert into t${i} select i, i::text from generate_series(1,1000) i;"
psql -p 8888 -c "create index i${i} on t${i}(a);"
done
如果对该集群运行这个脚本,我们将得到 10000 个表(每个表包含 1000 行数据)和 10000 个索引。这对于进行快速测试来说,已经有足够多的数据库对象(关系)了。
我们通过复制第一个集群来创建第二个集群:
pgenv $ mkdir /var/tmp/dummy/17.2_2/
pgenv $ pg_basebackup --port=8888 --pgdata=/var/tmp/dummy/17.2_2/ --checkpoint=fast
pgenv $ sed -i 's/8888/8889/g' /var/tmp/dummy/17.2_2/postgresql.auto.conf
现在,我们创建两个要升级到的 PostgreSQL 18 集群。其中一个使用 “–link” 模式升级,另一个使用新的 “–swap” 模式升级,我们还要停止旧的集群:
pgdev $ initdb --version
initdb (PostgreSQL) 18devel
pgdev $ initdb --pgdata=/var/tmp/dummy/18link
pgdev $ initdb --pgdata=/var/tmp/dummy/18swap
pgdev $ echo "port=9000" >> /var/tmp/dummy/18link/postgresql.auto.conf
pgdev $ echo "port=9001" >> /var/tmp/dummy/18swap/postgresql.auto.conf
pgdev $ pg_ctl --pgdata=/var/tmp/dummy/17.2_1/ stop
快速做一下升级前的检查,看看是否一切正常:
pgdev $ pg_upgrade --version
pg_upgrade (PostgreSQL) 18devel
pgdev $ export PGDATAOLD=/var/tmp/dummy/17.2_1/
pgdev $ export PGDATANEW=/var/tmp/dummy/18link/
pgdev $ export PGBINOLD=/opt/postgres/product/17/db_2/bin
pgdev $ export PGBINNEW=/opt/postgres/product/DEV/db_0/bin/
pgdev $ pg_upgrade --check
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database connection settings ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking for valid logical replication slots ok
Checking for subscription state ok
Checking data type usage ok
Checking for presence of required libraries ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for new cluster tablespace directories ok
*Clusters are compatible*
是时候使用 “–link” 模式进行测试了:
pgdev $ time pg_upgrade --link
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database connection settings ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking for valid logical replication slots ok
Checking for subscription state ok
Checking data type usage ok
Creating dump of global objects ok
Creating dump of database schemas
ok
Checking for presence of required libraries ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for new cluster tablespace directories ok
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.
Performing Upgrade
------------------
Setting locale and encoding for new cluster ok
Analyzing all rows in the new cluster ok
Freezing all rows in the new cluster ok
Deleting files from new pg_xact ok
Copying old pg_xact to new server ok
Setting oldest XID for new cluster ok
Setting next transaction ID and epoch for new cluster ok
Deleting files from new pg_multixact/offsets ok
Copying old pg_multixact/offsets to new server ok
Deleting files from new pg_multixact/members ok
Copying old pg_multixact/members to new server ok
Setting next multixact ID and offset for new cluster ok
Resetting WAL archives ok
Setting frozenxid and minmxid counters in new cluster ok
Restoring global objects in the new cluster ok
Restoring database schemas in the new cluster
ok
Adding ".old" suffix to old global/pg_control ok
If you want to start the old cluster, you will need to remove
the ".old" suffix from /var/tmp/dummy/17.2_1/global/pg_control.old.
Because "link" mode was used, the old cluster cannot be safely
started once the new cluster has been started.
Linking user relation files
ok
Setting next OID for new cluster ok
Sync data directory to disk ok
Creating script to delete old cluster ok
Checking for extension updates ok
Upgrade Complete
----------------
Some optimizer statistics may not have been transferred by pg_upgrade.
Once you start the new server, consider running:
/opt/postgres/product/DEV/db_0/bin/vacuumdb --all --analyze-in-stages --missing-stats-only
Running this script will delete the old cluster's data files:
./delete_old_cluster.sh
real 0m13.776s
user 0m0.654s
sys 0m1.536s
我们用新的 “–swap” 模式进行同样的测试:
pgdev $ export PGDATAOLD=/var/tmp/dummy/17.2_2/
pgdev $ export PGDATANEW=/var/tmp/dummy/18swap/
pgdev $ export PGBINOLD=/opt/postgres/product/17/db_2/bin
pgdev $ export PGBINNEW=/opt/postgres/product/DEV/db_0/
pgdev $ pg_upgrade --check
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database connection settings ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking for valid logical replication slots ok
Checking for subscription state ok
Checking data type usage ok
Checking for presence of required libraries ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for new cluster tablespace directories ok
*Clusters are compatible*
pgdev $ time pg_upgrade --swap
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database connection settings ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking for valid logical replication slots ok
Checking for subscription state ok
Checking data type usage ok
Creating dump of global objects ok
Creating dump of database schemas
ok
Checking for presence of required libraries ok
Checking database user is the install user ok
Checking for prepared transactions ok
Checking for new cluster tablespace directories ok
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.
Performing Upgrade
------------------
Setting locale and encoding for new cluster ok
Analyzing all rows in the new cluster ok
Freezing all rows in the new cluster ok
Deleting files from new pg_xact ok
Copying old pg_xact to new server ok
Setting oldest XID for new cluster ok
Setting next transaction ID and epoch for new cluster ok
Deleting files from new pg_multixact/offsets ok
Copying old pg_multixact/offsets to new server ok
Deleting files from new pg_multixact/members ok
Copying old pg_multixact/members to new server ok
Setting next multixact ID and offset for new cluster ok
Resetting WAL archives ok
Setting frozenxid and minmxid counters in new cluster ok
Restoring global objects in the new cluster ok
Restoring database schemas in the new cluster
ok
Adding ".old" suffix to old global/pg_control ok
Because "swap" mode was used, the old cluster can no longer be
safely started.
Swapping data directories
ok
Setting next OID for new cluster ok
Sync data directory to disk ok
Creating script to delete old cluster ok
Checking for extension updates ok
Upgrade Complete
----------------
Some optimizer statistics may not have been transferred by pg_upgrade.
Once you start the new server, consider running:
/opt/postgres/product/DEV/db_0/bin/vacuumdb --all --analyze-in-stages --missing-stats-only
Running this script will delete the old cluster's data files:
./delete_old_cluster.sh
real 0m11.426s
user 0m0.600s
sys 0m0.659s
使用 “–swap” 模式大约快了 2 秒,虽然提升不算多,但至少更快了。当然,这只是一个非常简单的测试用例,还需要进一步测试。另外请注意输出中的警告信息:
Because "swap" mode was used, the old cluster can no longer be
safely started.
Swapping data directories
这是使用这种模式的必然结果。感谢社区的所有相关人员。
参考
提交日志:https://git.postgresql.org/pg/commitdiff/626d7236b65da50423df7de035e86f273cd36b49