PostgreSQL 教程: 增量备份

十一月 29, 2024

摘要:在本教程中,您将学习如何在 PostgreSQL 中执行增量备份。

目录

PostgreSQL Incremental Backup

为什么选择增量备份?

PostgreSQL 提供了一种方法,可以在不中断现有客户端连接的情况下,从正在运行的 PostgreSQL 服务器进行全量、一致的备份。对于大型数据库,这可能需要数小时或数天才能完成,并且在此期间可能会占用大量磁盘 I/O 带宽。为了解决这个问题,通常会偶尔地进行全量备份,并且会不断地进行 WAL 归档。因此,WAL 归档会被用作细粒度的增量备份。

但是,拥有一个真正的增量备份方案,可以大大降低备份的磁盘空间要求,并降低备份和还原操作的总体复杂性。使用一个全量备份和一个增量备份,而不是一个全量备份和几千个 WAL 文件,更容易且不易出错。

要在 PostgreSQL 17 中执行增量的备份和恢复,请执行以下步骤:

WAL 摘要

PostgreSQL 17 附带了一个称为 WAL summarizer process 的新后台工作进程,该进程会在pg_wal/summaries目录下创建 WAL 文件的摘要信息。默认情况下,此进程不会启用,但需要启用后增量备份才能正常工作。要启用它,请设置新的配置参数summarize_wal

ALTER SYSTEM SET summarize_wal = 'on';
SELECT pg_reload_conf();

更改此设置不需要重启 PostgreSQL 即可生效,只需重新加载即可。

全量备份

通常的备份方式是使用pg_basebackup,进行一次全量备份。让我们全量备份到一个目录/backup/full1中:

$ pg_basebackup -D /backup/full1 -c fast -p 7000 -v
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/10000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_1182"
pg_basebackup: write-ahead log end point: 0/10000120
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed

过一会儿,等做了一些更改,让我们再做一次全量备份(到目录/backup/full2),以便与增量备份进行比较:

$ pg_basebackup -D /backup/full2 -c fast -p 7000

增量备份

此时,让我们来做一次增量备份。为此,请使用新的-i, --incremental=OLDMANIFEST选项执行 pg_basebackup。我们希望从/backup/full1中的全量备份来获得增量,因此我们使用清单/backup/full1/backup_manifest,并在/backup/incr2中创建该备份:

$ pg_basebackup -D /backup/incr2 --incremental /backup/full1/backup_manifest -c fast -p 7000 -v
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/17000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_1223"
pg_basebackup: write-ahead log end point: 0/17000120
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed

让我们比较一下所有的三个备份大小:

$ du -sh /backup/*
189M    /backup/full1
189M    /backup/full2
51M     /backup/incr2

第二次全量备份/backup/full2的大小,与初次全量备份/backup/full1相同。增量备份/backup/incr2的大小仅为同一时间点的全量备份大小的四分之一多一点。

/backup/full2/backup/incr2的内容看起来很相似,但通过查看备份标签,您可以看到备份是否是增量备份:

$ cat /backup/full2/backup_label
START WAL LOCATION: 0/15000028 (file 000000010000000000000015)
CHECKPOINT LOCATION: 0/15000080
BACKUP METHOD: streamed
BACKUP FROM: primary
START TIME: 2024-11-22 10:13:37 IST
LABEL: pg_basebackup base backup
START TIMELINE: 1

$ cat /backup/incr2/backup_label
START WAL LOCATION: 0/17000028 (file 000000010000000000000017)
CHECKPOINT LOCATION: 0/17000080
BACKUP METHOD: streamed
BACKUP FROM: primary
START TIME: 2024-11-22 10:13:56 IST
LABEL: pg_basebackup base backup
START TIMELINE: 1
INCREMENTAL FROM LSN: 0/10000028
INCREMENTAL FROM TLI: 1

第二次增量备份

在发生更多更改后,使用第一个增量备份中的backup_manifest,创建第二个增量备份:

$ pg_basebackup -D /backup/incr3 --incremental /backup/incr2/backup_manifest -c fast -p 7000

恢复增量备份

要恢复一个增量备份,请使用新的工具 pg_combinebackup。下面的命令获取第一次全量备份/backup/full1和稍后的增量备份,将它们合并,并将输出写入到目录/backup/restore3

$ pg_combinebackup -o /backup/restore3 /backup/full1 /backup/incr2 /backup/incr3

我们可以用/backup/restore3中的数据目录启动 postmaster,并看到它在备份恢复后成功启动:

2024-11-22 10:26:28.269 UTC [1443] LOG:  starting PostgreSQL 17 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2024-11-22 10:26:28.272 UTC [1443] LOG:  listening on IPv6 address "::1", port 7001
2024-11-22 10:26:28.273 UTC [1443] LOG:  listening on IPv4 address "127.0.0.1", port 7001
2024-11-22 10:26:28.277 UTC [1443] LOG:  listening on Unix socket "/tmp/.s.PGSQL.7001"
2024-11-22 10:26:28.291 UTC [1446] LOG:  database system was interrupted; last known up at 2024-11-22 10:13:56 IST
2024-11-22 10:26:28.500 UTC [1446] LOG:  starting backup recovery with redo LSN 0/17000028, checkpoint LSN 0/17000080, on timeline ID 1
2024-11-22 10:26:28.511 UTC [1446] LOG:  redo starts at 0/17000028
2024-11-22 10:26:28.515 UTC [1446] LOG:  completed backup recovery with redo LSN 0/17000028 and end LSN 0/17000120
2024-11-22 10:26:28.515 UTC [1446] LOG:  consistent recovery state reached at 0/17000120
2024-11-22 10:26:28.515 UTC [1446] LOG:  redo done at 0/17000120 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2024-11-22 10:26:28.613 UTC [1444] LOG:  checkpoint starting: end-of-recovery immediate wait
2024-11-22 10:26:28.641 UTC [1444] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.009 s, sync=0.003 s, total=0.032 s; sync files=2, longest=0.002 s, average=0.002 s; distance=16384 kB, estimate=16384 kB; lsn=0/18000028, redo lsn=0/18000028
2024-11-22 10:26:28.652 UTC [1443] LOG:  database system is ready to accept connections

考虑

  • 备份目录:确保每个备份(无论是全量备份还是增量备份)都存储在单独的目录中,以防止覆盖文件。
  • 备份监控:定期监控备份,以确保它们成功完成,并正确维护好backup_manifest文件。
  • 还原过程:熟悉还原过程,特别是当您需要使用 PostgreSQL 17 中的pg_combinebackup工具组合多个增量备份时。此工具有助于将增量备份合并到一个可用状态。

通过执行这些步骤,您可以维护一系列的增量备份,以有效地捕获对 PostgreSQL 数据库的更改。

增量备份与时间点恢复

在当前的 PostgreSQL 版本中,可以先进行全量备份,然后归档自备份以来生成的每个 WAL 文件。这允许您恢复备份,然后重放每个 WAL 文件到特定的时间点,从而重新创建当时的数据目录。这通常称为时间点恢复(PITR)。

从逻辑上讲,按时间点恢复到一个增量备份点的结果,应该与恢复在该点产生的增量备份的结果相同。但是,区别在于增量备份的大小可能比组合的 WAL 文件大小要小一点或小得多。例如,如果一行被 100 个事务更新了,则 WAL 文件将必须记录所有这 100 次更改,但增量备份仅记录数据目录内容的初始状态和最终状态之间的差异。

如果您以前将 WAL 归档用于纯粹的增量备份目的,而不是时间点恢复,那么现在可以简单地使用新的增量备份功能。如果您需要按时间点恢复,WAL 归档和重放仍然是需要采用的方案。

了解更多

PostgreSQL 管理