By John Doe May 21, 2026
Summary: In this article, we will explore how to conduct table bloat testing in PostgreSQL using pgbench.
Table of Contents
Why Do We Need to Care About Table Bloat?
As is well known, PostgreSQL has an autovacuum background process that periodically cleans up garbage. However, it is only PostgreSQL’s first line of defense against table bloat, not a silver bullet. The root cause of table bloat lies in the conflict between the MVCC mechanism and business write patterns. The limitations of autovacuum (resource constraints, transaction visibility limits, and the inability to physically shrink files) dictate that it cannot completely eliminate bloat.
Autovacuum is merely a “background cleaner”; its design goal is to progressively clean up garbage without significantly impacting business performance, rather than to eradicate bloat entirely. When autovacuum’s cleanup speed cannot keep pace with the rate of garbage generation, or when it cannot clean up certain dead rows due to various restrictions, table bloat occurs.
The default configuration of autovacuum is designed for general-purpose scenarios. For business systems with frequent updates or deletions, these default settings are often too conservative. Furthermore, different tables within the database may have entirely different business models. The following issues can arise with autovacuum configurations:
- Trigger thresholds are too high: The default
autovacuum_vacuum_scale_factor=0.2means autovacuum is only triggered when dead tuples account for 20% of the table size. For a large table (e.g., 100GB), this means 20GB of dead tuples must be generated before cleanup is triggered, by which time the table is already severely bloated. - Cleanup speed is too slow: The default
autovacuum_vacuum_cost_limit=200. To avoid impacting frontend business operations, autovacuum strictly controls its own I/O consumption. For high-update workloads, this cleanup speed will fall far behind the garbage generation rate. - Insufficient worker processes: The default
autovacuum_max_workers=3. When a large number of tables in the database require vacuuming, worker processes will queue up, resulting in some tables being left uncleaned for extended periods.
Preparing the Base Data
Adjust the database configuration parameters (modify the postgresql.conf configuration file):
shared_buffers = 1GB
synchronous_commit = off
Note: We are setting the
synchronous_commitparameter tooffhere to simulate a business workload backed by high-speed storage devices. It is not recommended to do this in an actual production environment.
Create a new database (e.g., benchdb):
CREATE DATABASE benchdb;
Create the default pgbench test data tables:
pgbench -i --scale=100 benchdb
Note: The scale factor
--scale=100generates approximately 10 million records. You can adjust this value as needed.
Modify the table’s storage parameters to disable autovacuum:
ALTER TABLE pgbench_accounts SET (autovacuum_enabled = off);
Check the space occupied by the table and index:
SELECT pg_size_pretty(pg_relation_size('pgbench_accounts'));
pg_size_pretty
----------------
1281 MB
(1 row)
SELECT pg_size_pretty(pg_relation_size('pgbench_accounts_pkey'));
pg_size_pretty
----------------
214 MB
(1 row)
Writing a Custom Test Script
To properly compare and analyze the database’s bloat behavior, we have chosen to adopt a “delete + insert” test model rather than a “frequent update” model. This is because databases equipped with rollback segments support in-place updates, which behave fundamentally differently from PostgreSQL’s append-only updates.
To execute a pgbench delete + insert test, we need to utilize a custom SQL script to simulate high write volumes or high-frequency data churn scenarios, as the default TPC-B benchmark workload leans heavily toward data update operations.
Create a test.sql file and define the mixed delete and insert business workload:
\set aid random(1, 100000 * :scale) / 8 * 8 + :client_id
\set bid random(1, 1 * :scale)
\set delta random(-5000, 5000)
DELETE FROM pgbench_accounts WHERE aid = :aid;
INSERT INTO pgbench_accounts (aid, bid, abalance, filler)
VALUES (:aid, :bid, :delta, 'new row');
Run Benchmark Pressure Test
Run the test using pgbench combined with the -f parameter to specify the custom script:
pgbench --no-vacuum -f test.sql --time=600 --client=8 --jobs=8 benchdb
pgbench (18.4)
transaction type: test.sql
scaling factor: 1
query mode: simple
number of clients: 8
number of threads: 8
maximum number of tries: 1
duration: 600 s
number of transactions actually processed: 16678556
number of failed transactions: 0 (0.000%)
latency average = 0.288 ms
initial connection time = 5.231 ms
tps = 27797.816364 (without initial connection time)
Parameter Explanations:
--no-vacuum: Do not perform a data table vacuum operation before testing-f test.sql: Call the custom test script--time=600: Run the test continuously for 600 seconds--client=8: Simulate 8 concurrent clients--jobs=8: Enable 8 worker threads
Check the space occupied by the table and index:
SELECT pg_size_pretty(pg_relation_size('pgbench_accounts'));
pg_size_pretty
----------------
3416 MB
(1 row)
SELECT pg_size_pretty(pg_relation_size('pgbench_accounts_pkey'));
pg_size_pretty
----------------
216 MB
(1 row)
Summary of Test Results
After we performed the same test on Redrock Postgres, we collected the table and index sizes before and after the test run:
| Database | Initial Table Size | Initial Index Size | Post-Test Table Size | Post-Test Index Size | Transaction Throughput (TPS) |
|---|---|---|---|---|---|
| PostgreSQL | 1281 MB | 214 MB | 3416 MB | 216 MB | 27798 |
| Redrock Postgres | 1132 MB | 216 MB | 1295 MB | 216 MB | 22528 |
Through continuous observation during the testing process, it was found that the space occupied by the table in PostgreSQL continued to rise. After the test ran for 10 minutes, the table had expanded to 2.67 times its initial size. In contrast, the space occupied by the table in Redrock Postgres tends to stabilize, with the final table bloat rate controlled at 12.6%.