October 23, 2023
Summary: in this tutorial, you will learn how to enable WAL archiving in PostgreSQL.
Table of Contents
Introduction
Archiving is the concept of storing history in a safe location for recovery. Similarly, in database technologies, we see transaction logs (WALs in PostgreSQL) being archived to a remote backup server or cloud to allow a database to be recovered during disasters. As WAL segments get recycled after a certain threshold, it is important to archive them to safe storage before they are gone from the server. This also helps with Point In Time Recovery (PITR) and also in situations where a standby is lagging behind the primary (or master) and requires the recently removed WAL segments to get back to sync. In this tutorial, we shall see the steps involved in enabling archiving in PostgreSQL.
Getting ready
In PostgreSQL, archiving is not enabled by default. To enable archiving, the PostgreSQL instance needs to be restarted. So, we need to make sure to enable archiving before going live with a production PostgreSQL server. We must also ensure that we choose a piece of storage that is redundant and safe from disasters, for storing the archives.
How to do it…
Follow these steps to complete this tutorial:
1. Verify archive_mode by running the following command:
$ psql -c "show archive_mode"
2. Set archive_mode
to ON
to enable archiving using ALTER SYSTEM
:
$ psql -c "ALTER SYSTEM SET archive_mode TO 'ON'"
3. Restart PostgreSQL by running the following command:
$ pg_ctl -D $PGDATA restart -mf
4. Create the archive location and set the ownership to postgres
:
$ mkdir -p /backups/archive
$ chown postgres:postgres /backups/archive
5. Set archive_command with the appropriate shell command or a script that copies archives to a safe location. Create the archive location and set the ownership to postgres
:
$ mkdir -p /backups/archive
$ psql -c "ALTER SYSTEM SET archive_command TO 'cp %p /backups/archive/%f'"
6. Run either reload
or SIGHUP
for the changes that were made to the archive_command
parameter to take effect:
$ psql -c "select pg_reload_conf()"
How it works…
To enable WAL segments to be archived to permanent storage, follow these steps:
-
Set the archive_mode parameter to ON (requires restart).
-
Set the archive_command parameter to a shell command or a script that can send archives to an archive location.
Before setting the archiving to a location, we must validate whether archive_mode
is set to ON, as shown in Step 1. If it is not set to ON, we can set it to ON using the command shown in Step 2. As a change to archive_mode
requires a restart, we need to restart the PostgreSQL cluster using a command similar to what we saw in Step 3.
To copy WAL segments to a different location for archiving, we need to provide the appropriate shell command, which can be used to perform the copy. In my setup, I am trying to safely archive the WAL segment to a NAS mounted named /backups
. So, I could just let the archiver process issue a simple copy command to copy the WAL segment to the NAS mount point, as shown in Step 5.
Here, %p => refers to the path of the WAL segment to be archived, while %f => refers to the WAL segment’s name. These two values are substituted by PostgreSQL.
Now, to get the changes to archive_command
into effect, we must perform either a reload
or a SIGHUP
, as seen in Step 5. Once done, archiving will be in effect.
There’s more…
As an example, after setting archive_command
with the previously mentioned shell command, the command that’s executed by PostgreSQL to archive one of the WAL segments may look as follows:
cp pg_wal/00000001000000D2000000C9 /backups/archive/00000001000000D2000000C9
As you can see, it copies the WAL segment to the location specified in the shell command.