May 27, 2024
Summary: In this tutorial, you will learn how to tune autovacuum for dead tuple cleanup in PostgreSQL.
Table of Contents
The best-known autovacuum task is cleaning up of dead tuples from UPDATE
or DELETE
operations. If autovacuum cannot keep up with cleaning up dead tuples, you should follow these three tuning steps:
Make sure that nothing keeps autovacuum from reclaiming dead tuples
Sometimes, autovacuum won’t remove the dead tuples. Most often, the culprit are long running transactions. Unless you can remove these obstacles, tuning autovacuum will be useless.
If you cannot fight the problem at its root, you can use the configuration parameter idle_in_transaction_session_timeout to have PostgreSQL terminate sessions that stay “idle in transaction” for too long. That causes errors on the client side, but may be justified if you have no other way to keep your database operational. Similarly, to fight long-running queries, you can use statement_timeout.
Tuning autovacuum to run faster
If autovacuum cannot keep up with cleaning up dead tuples, the solution is to make it work faster. This may seem obvious, but many people fall into the trap of thinking that making autovacuum start earlier or run more often will solve the problem.
VACUUM
is a resource-intensive operation, so autovacuum by default operates deliberately slowly. The goal is to have it work in the background without being in the way of normal database operation. But if your workload creates lots of dead tuples, you will have to make it more aggressive:
- increase autovacuum_vacuum_cost_limit from its default value of 200 (this is the gentle method).
- reduce autovacuum_vacuum_cost_delay from its default value of 2 (in older versions: 20!) milliseconds (this is the effective method).
Setting autovacuum_vacuum_cost_delay
to 0 will make autovacuum as fast as a manual VACUUM
– that is, as fast as possible.
Since not all tables grow dead tuples at the same pace, it is usually best not to change the global setting in postgresql.conf
, but to change the setting individually for busy tables:
ALTER TABLE busy_table SET (autovacuum_vacuum_cost_delay = 1);
Partitioning a table can also help with getting the job done faster. The advantage here is you can have several autovacuum workers working on several partitions in parallel, so that the partitioned table as a whole is done faster than a single autovacuum worker could.
Change the workload so that fewer dead tuples are generated
If nothing else works, you have to see that fewer dead tuples are generated. Perhaps several UPDATE
s to a single row could be combined to a single UPDATE
?
Often you can significantly reduce the number of dead tuples by using “HOT updates”:
- set the fillfactor for the table to a value less than 100, so that
INSERT
s leave some free space in each block. - make sure that no column that you modify in the
UPDATE
is indexed.
Then any SELECT
or DML statement can clean up dead tuples, and there is less need for VACUUM
.