PostgreSQL Storage Bloat Caused by Long Transactions

By John Doe June 9, 2026

Summary: In this article, we will explore the issue of PostgreSQL storage bloat caused by long transactions and its impact.

Table of Contents

Preparing the Base Data

Adjust the database configuration parameters (modify the configuration file postgresql.conf):

shared_buffers = 1GB
max_wal_size = 4GB
synchronous_commit = off

Note: Here, we set the synchronous_commit parameter to off to simulate a high-throughput workload supported by high-speed storage devices. This is not recommended for actual production environments.

Next, we will create a table, insert some data, and perform cleanup operations on the table data in a PostgreSQL 18.4 database.

-- Session 1
CREATE TABLE bloat (id int PRIMARY KEY, filler char(130));
INSERT INTO bloat (id, filler)
  SELECT i, sha512(random()::text::bytea)
    FROM generate_series(1, 5000000) AS s(i);

VACUUM ANALYZE bloat;

Check the space occupied by the table and its indexes:

-- Session 1
SELECT relname, pg_size_pretty(pg_relation_size(oid))
  FROM pg_class WHERE relname LIKE 'bloat%';
  relname   | pg_size_pretty
------------+----------------
 bloat      | 831 MB
 bloat_pkey | 107 MB
(2 rows)

To comparatively analyze the database bloat, we choose to use a “frequent updates” test model. Create a new test.sql file with a custom SQL script to define a high-frequency update workload:

\set rnd random(1, 5000000)

UPDATE bloat SET filler = sha512(random()::text::bytea) WHERE id = :rnd;

After preparing the base table and initial data, let’s start a session, begin a transaction within it, and keep it in an idle state:

-- Session 2
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT filler FROM bloat WHERE id = 202606;

Running Benchmark Pressure Test

Run the test using pgbench with the -f parameter to specify the custom script:

pgbench --no-vacuum -f test.sql --time=600 --client=8 --jobs=8
pgbench (18.4)
transaction type: test.sql
scaling factor: 1
query mode: simple
number of clients: 8
number of threads: 8
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 13401301
number of failed transactions: 0 (0.000%)
latency average = 0.358 ms
initial connection time = 4.734 ms
tps = 22335.334971 (without initial connection time)

Check the space occupied by the table and its indexes:

-- Session 1
SELECT relname, pg_size_pretty(pg_relation_size(oid))
  FROM pg_class WHERE relname LIKE 'bloat%';
  relname   | pg_size_pretty
------------+----------------
 bloat      | 3059 MB
 bloat_pkey | 353 MB
(2 rows)

Space Reclamation After the Long Transaction Ends

Let’s terminate the idle transaction in session two.

-- Session 2
COMMIT;

After waiting for about 2 minutes (the default value for the autovacuum_naptime parameter is 1 minute), check the space occupied by the table and its indexes:

-- Session 1
SELECT relname, pg_size_pretty(pg_relation_size(oid))
  FROM pg_class WHERE relname LIKE 'bloat%';
  relname   | pg_size_pretty
------------+----------------
 bloat      | 3059 MB
 bloat_pkey | 353 MB
(2 rows)

As we can see, after running the benchmark pressure test and autovacuum, PostgreSQL did not reclaim the space occupied by dead tuples.

Performance of Redrock Postgres

Next, we will execute the same test case in a Redrock Postgres database. Before running the benchmark stress test, let’s check the space occupied by the table and its indexes, as well as the current usage of undo segments in the database:

-- Session 1
SELECT relname, pg_size_pretty(pg_relation_size(oid))
  FROM pg_class WHERE relname LIKE 'bloat%';
  relname   | pg_size_pretty
------------+----------------
 bloat      | 766 MB
 bloat_pkey | 108 MB
(2 rows)

SELECT pg_size_pretty(sum(pg_relation_size(oid))) AS total_undo_size
  FROM pg_class WHERE relkind = 'u';
 total_undo_size
-----------------
 598 MB
(1 row)

The output shows that the undo segments in the current database have already occupied 598 MB of storage space.

As before, open a session, start a transaction within it and keep the transaction idle.

Then run the same benchmark pressure test:

pgbench --no-vacuum -f test.sql --time=600 --client=8 --jobs=8
pgbench (18.4)
transaction type: test.sql
scaling factor: 1
query mode: simple
number of clients: 8
number of threads: 8
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 20289782
number of failed transactions: 0 (0.000%)
latency average = 0.237 ms
initial connection time = 5.724 ms
tps = 33816.081613 (without initial connection time)

Let’s check the space occupied by the table and its indexes again, along with the current usage of undo segments in the database:

-- Session 1
SELECT relname, pg_size_pretty(pg_relation_size(oid))
  FROM pg_class WHERE relname LIKE 'bloat%';
  relname   | pg_size_pretty
------------+----------------
 bloat      | 766 MB
 bloat_pkey | 108 MB
(2 rows)

SELECT pg_size_pretty(sum(pg_relation_size(oid))) AS total_undo_size
  FROM pg_class WHERE relkind = 'u';
 total_undo_size
-----------------
 6777 MB
(1 row)

The output indicates that the space occupied by the table and indexes has not changed. However, due to the presence of the long transaction, the storage space occupied by the undo segments rose to 6777 MB after the benchmark stress test completed.

Next, we terminate the idle transaction in session two. After waiting for about 10 minutes, check the usage of the undo segments:

-- Session 1
SELECT pg_size_pretty(sum(pg_relation_size(oid))) AS total_undo_size
  FROM pg_class WHERE relkind = 'u';
 total_undo_size
-----------------
 947 MB
(1 row)

After the undo segment space increases due to a long transaction, Redrock Postgres is able to shrink the storage space of the undo segments in real time.

Test Conclusion

In PostgreSQL, the autovacuum background process periodically cleans up garbage (dead tuples). If you’re running very long transaction in the Repeatable Read isolation mode, be it even SELECT, it will prevent the cleanup of dead tuples generated after that transaction started. Long transactions cause the database to accumulate a large amount of garbage data, which not only occupies storage space but also triggers other issues.

When we’re speaking about Long Transactions the time is a bad measure. Having transaction in read only database open for weeks does no harm, however if database has very high update rate, say 10K+ rows are modified every second even 5 minute transaction may be considered long as it will be enough to accumulate about 3 million of row changes.

The difference in the above test results lies in how space is handled: VACUUM in PostgreSQL merely reclaims space and makes it reusable; in most cases, this extra space is not returned to the operating system but is kept within the same table for future reuse. In Redrock Postgres, the occurrence of long transactions causes the space occupied by undo segments to increase. However, once the total space occupied by the undo segments exceeds 1GB, the background maintenance process automatically shrinks them to reclaim the excess space.

Note: In the version used for this test, we have improved the reclamation efficiency of undo segment storage space. The cleanup cycle for storage space has been adjusted from the original 1 hour to the duration set by checkpoint_timeout; the cleanup threshold has been changed from triggering when a single undo segment reaches 1GB to triggering when the total space occupied by all undo segments exceeds 1GB. This improvement will be reflected in the next release of Redrock Postgres.

Preparing the Base Data

Running Benchmark Pressure Test

Space Reclamation After the Long Transaction Ends

Performance of Redrock Postgres

Test Conclusion

Search

Categories

Tags