By John Doe January 26, 2026
Summary: Have you ever encountered a situation in PostgreSQL where multiple autovacuum processes keep running continuously, consuming most of the system's resources on VACUUM operations? In this article, we explore the high cost and uncertainty of VACUUM in PostgreSQL, along with possible solutions.
A Use Case: VACUUM and Old Row Versions
Below, we will create a table in a PostgreSQL database, insert some data into it, and then perform VACUUM and ANALYZE on the table.
-- Session 1
CREATE TABLE t_large (id integer, name text);
INSERT INTO t_large (id, name)
SELECT i, repeat('Pg', 64)
FROM generate_series(1, 1000000) AS s(i);
CREATE INDEX large_idx ON t_large (id);
VACUUM ANALYZE t_large;
After preparing the base table and initial data, let’s start a second session, begin a transaction in it, and leave that transaction idle:
-- Session 2
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT name FROM t_large WHERE id = 202510;
Next, let’s perform some update and delete operations on a portion of the data in the table:
-- Session 1
UPDATE t_large SET name = 'dummy' WHERE mod(id, 100) = 1;
DELETE FROM t_large WHERE mod(id, 100) = 50;
SELECT
C.reltuples,
S.n_dead_tup,
S.n_ins_since_vacuum,
S.n_mod_since_analyze,
S.last_vacuum
FROM pg_class C LEFT JOIN
pg_stat_all_tables S ON C.oid = S.relid
WHERE S.relname = 't_large';
reltuples | n_dead_tup | n_ins_since_vacuum | n_mod_since_analyze | last_vacuum
-----------+------------+--------------------+---------------------+------------------------------
1e+06 | 20000 | 0 | 20000 | 2026-01-23 12:17:17.96903+08
(1 row)
The above update and delete operations have generated a number of old row versions in the table. Now, let’s run a manual VACUUM on this table to observe how VACUUM handles these old row versions:
-- Session 1
VACUUM (verbose) t_large;
INFO: vacuuming "postgres.public.t_large"
INFO: finished vacuuming "postgres.public.t_large": index scans: 0
pages: 0 removed, 20409 remain, 20409 scanned (100.00% of total)
tuples: 0 removed, 1010000 remain, 20000 are dead but not yet removable
removable cutoff: 753, which was 2 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 40833 hits, 0 misses, 0 dirtied
WAL usage: 1 records, 0 full page images, 188 bytes
system usage: CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.12 s
...
SELECT
C.reltuples,
S.n_dead_tup,
S.n_ins_since_vacuum,
S.n_mod_since_analyze,
S.last_vacuum
FROM pg_class C LEFT JOIN
pg_stat_all_tables S ON C.oid = S.relid
WHERE S.relname = 't_large';
reltuples | n_dead_tup | n_ins_since_vacuum | n_mod_since_analyze | last_vacuum
-----------+------------+--------------------+---------------------+-------------------------------
990000 | 20000 | 0 | 20000 | 2026-01-23 12:18:07.681234+08
(1 row)
The above VACUUM operation scanned all 20,409 pages of the table but did not clean up any old row versions. Since the transaction in Session 2 is still in progress, the system needs to retain the old row versions to maintain their visibility for this transaction. As a result, the VACUUM operation in Session 1 is ineffective and does not clean up any garbage data.
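To identify which sessions are preventing old row versions from being removed, the standard pg_stat_activity view exposes each backend’s xmin horizon; a minimal diagnostic sketch:

```sql
-- Backends whose snapshots hold back VACUUM's removable cutoff,
-- oldest horizon first; the idle REPEATABLE READ transaction from
-- Session 2 should appear near the top
SELECT pid, state, backend_xmin, xact_start
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;
```

Committing or rolling back the offending transaction allows a subsequent VACUUM to actually remove the dead tuples.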
The same issue can occur with the regular VACUUM actions scheduled by the background autovacuum process: it frequently consumes storage I/O resources but actually fails to clean up any old row versions. Moreover, as subsequent update and delete operations continue, the old row versions generated by the system will accumulate continuously, eventually causing an “avalanche effect” on the system.
Analysis of the VACUUM Scenario
PostgreSQL adopts the Multi-Version Concurrency Control (MVCC) mechanism. When data is updated or deleted, the original records are not immediately removed but are marked as invisible. This design supports transaction rollback and snapshot reads, which leads to the generation of old row versions. For example, in an order table where the order status is frequently updated, each update will generate an old row version.
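The effect of MVCC can be observed through the hidden system columns xmin, xmax, and ctid. After an UPDATE, the visible row has a new xmin (the ID of the updating transaction) and typically a new ctid (its new physical location), while the previous version stays on disk until VACUUM removes it. A minimal sketch, reusing the t_large table from the example above:

```sql
-- Inspect the current row version's creating transaction and location
SELECT ctid, xmin, xmax, id FROM t_large WHERE id = 2;
UPDATE t_large SET name = 'updated' WHERE id = 2;
-- The visible version now shows a new xmin and, typically, a new ctid;
-- the prior version has become an old row version awaiting cleanup
SELECT ctid, xmin, xmax, id FROM t_large WHERE id = 2;
```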
There is a persistent daemon process, called the autovacuum launcher, which attempts to start one worker in each database every autovacuum_naptime seconds. If the number of tuples obsoleted since the last VACUUM exceeds a certain threshold, the table is vacuumed. The condition for autovacuum to vacuum a table is defined as:
pg_stat_all_tables.n_dead_tup >
autovacuum_vacuum_threshold +
autovacuum_vacuum_scale_factor * pg_class.reltuples
If the number of tuples inserted since the last vacuum has exceeded a certain threshold, the table is also vacuumed. The condition for autovacuum to vacuum a table is defined as:
pg_stat_all_tables.n_ins_since_vacuum >
autovacuum_vacuum_insert_threshold +
autovacuum_vacuum_insert_scale_factor * pg_class.reltuples
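With the default settings (autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2), the 1,000,000-row table in our example would only qualify for a dead-tuple vacuum once n_dead_tup exceeds 50 + 0.2 × 1,000,000 = 200,050. Both thresholds can be computed directly from the catalogs; a sketch (note this ignores any per-table reloption overrides):

```sql
-- Compare each counter against its autovacuum trigger threshold
SELECT s.relname,
       s.n_dead_tup,
       current_setting('autovacuum_vacuum_threshold')::bigint
         + current_setting('autovacuum_vacuum_scale_factor')::float8 * c.reltuples
         AS dead_tuple_threshold,
       s.n_ins_since_vacuum,
       current_setting('autovacuum_vacuum_insert_threshold')::bigint
         + current_setting('autovacuum_vacuum_insert_scale_factor')::float8 * c.reltuples
         AS insert_threshold
FROM pg_stat_all_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.relname = 't_large';
```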
In the above use case, when the background autovacuum worker processes are unable to effectively clean up old row versions, new worker processes will continuously start at intervals set by autovacuum_naptime to perform the VACUUM, resulting in up to autovacuum_max_workers number of autovacuum worker processes being busy at the same time.
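The number of autovacuum workers busy at any given moment can be checked from pg_stat_activity (the backend_type column is available since PostgreSQL 10); a sketch:

```sql
-- Currently running autovacuum workers and what they are processing
SELECT pid, datname, xact_start, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';
```

If this query continuously shows workers busy on the same tables while n_dead_tup never decreases, the system is likely stuck in the ineffective-VACUUM loop described above.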
In scenarios with large data volumes and frequent updates, the cost of cleaning up old row versions is high. This is because the cleanup process (VACUUM) needs to scan a large amount of data, determine which row versions are no longer needed, and reclaim space. To reduce the cost of old row version cleanup, the following measures can be considered:
- Execute VACUUM regularly: Schedule the timing and frequency of VACUUM sensibly (e.g., run it during off-peak business hours to minimize the impact on business operations). Also select an appropriate VACUUM method based on the table's characteristics; for instance, use regular VACUUM to clean up dead tuples, and VACUUM FULL to return disk space to the operating system (note that VACUUM FULL takes an exclusive lock and blocks all access to the table).
- Tune autovacuum parameters: Adjust parameters such as autovacuum_max_workers and autovacuum_naptime so that the autovacuum process cleans up old row versions more efficiently. For example, appropriately lowering autovacuum_max_workers and increasing autovacuum_naptime can reduce the impact of ineffective VACUUM work on the database system.
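Beyond the instance-wide settings, autovacuum behavior can also be tuned per table through storage parameters, which avoids disturbing other tables; a sketch (the values here are illustrative, not recommendations):

```sql
-- Make autovacuum trigger earlier on this hot table:
-- at 5% of rows modified instead of the default 20%
ALTER TABLE t_large SET (autovacuum_vacuum_scale_factor = 0.05,
                         autovacuum_vacuum_threshold = 1000);
```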
Conclusion
In this article, we have examined the potential avalanche effect that may occur when cleaning up old row versions in PostgreSQL, as well as possible measures to mitigate this issue.
In addition, Redrock Postgres has introduced undo logging. The system can reconstruct old row versions on demand by applying undo records to the new row versions. Furthermore, after a transaction that performs deletions or non-in-place updates commits, the space is reclaimed immediately. Therefore, after updating or deleting table data, there is no need to periodically trigger the cleanup of old row versions.
Note: In Redrock Postgres, the background autovacuum process still exists and must be relied upon to periodically ANALYZE tables and update the database’s statistics. You can set autovacuum_vacuum_threshold to an extremely large value (e.g., 2000000000) and set autovacuum_vacuum_insert_threshold to -1, to prevent autovacuum processes from performing the VACUUM work.
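Under that recommendation, the settings could be applied as follows (ALTER SYSTEM persists them to postgresql.auto.conf, and a reload makes them take effect):

```sql
ALTER SYSTEM SET autovacuum_vacuum_threshold = 2000000000;
ALTER SYSTEM SET autovacuum_vacuum_insert_threshold = -1;  -- disables insert-triggered vacuums
SELECT pg_reload_conf();
```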