Freeze caused by high transaction throughput in PostgreSQL

By John Doe December 13, 2024

Summary: In this article, we’ll take a look at the freeze issue that comes with high transaction throughput in PostgreSQL, and corresponding solutions.

Table of Contents

Transaction assignments scenario: Use case

Let’s create a table and some functions for data manipulation in a PostgreSQL database.

CREATE TABLE t_large (id integer, name text);
CREATE UNIQUE INDEX large_idx ON t_large (id);

CREATE OR REPLACE FUNCTION f_base(v int)
RETURNS void AS $$
BEGIN
  INSERT INTO t_large (id, name) VALUES(v, 'xxx');
EXCEPTION
  WHEN UNIQUE_VIOLATION THEN
    RAISE WARNING 'Exception for value: %', v;
END; $$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION f_loop(b int, e int)
RETURNS void AS $$
BEGIN
  FOR i IN b .. e LOOP
    PERFORM f_base(i);
  END LOOP;
END; $$ LANGUAGE plpgsql;

Let’s insert some rows with the stored function, to see if the operation will be successful and if the transaction ID will change in each subtransaction.

SELECT f_loop(1, 4);

SELECT ctid, xmin, * FROM t_large;
 ctid  | xmin | id | name
-------+------+----+------
 (0,1) |  743 |  1 | xxx
 (0,2) |  744 |  2 | xxx
 (0,3) |  745 |  3 | xxx
 (0,4) |  746 |  4 | xxx
(4 rows)

Transaction assignments scenario: How it works

Here’s what happens to the global transaction ID in the database system, when we run the stored function in PostgreSQL:

XID Growth

  1. PostgreSQL uses subtransactions to implement the exception-handling in the stored function.
  2. Each transaction/subtransaction has a transaction ID, which is a bit like the clock time of a database system.
  3. Each tuple has two system columns, xmin and xmax, which hold the transaction/subtransaction IDs that inserted and deleted them, respectively.
  4. The maximum number of transaction IDs is around 2 billion, and after beyond that number, the multi-version concurrency control will not work.

Transaction assignments scenario: Thought

Impact of the issue

  1. The rows in all tables need to be scanned periodically for transaction ID freezing.
  2. If there are some long-live transactions in the system, the transaction ID will be exhausted and the database will shutdown.
  3. The wait events for accessing the cache in pg_xact and pg_subtrans are prone to occur, and the system performance will be worse.

Solution

  1. Add a monitoring metric and alarm, to check transaction ID exhaustion.
  2. Add a monitoring metric and alarm, to check uses of subtransactions.
  3. Add a monitoring metric and alarm, to check long running transactions.

The solution by Redrock Postgres

Redrock Postgres implemented subtransactions based on the undo record position, these subtransactions are only temporary marks in a transaction, they do not need to assign their own transaction IDs, and have no risk to the system.

Besides, in Redrock Postgres, the transaction ID type xid is 8 bytes long, and can be used almost infinitely without worrying about running out of transaction IDs.

Let’s create the same table and the same functions for operation in a Redrock Postgres database as above, insert the same amount of data with the stored function, and see if the transaction ID changes in each subtransaction.

SELECT f_loop(1, 4);

SELECT ctid, rowxid, * FROM t_large;
 ctid  |   rowxid    | id | name
-------+-------------+----+------
 (0,1) | (8,64,4988) |  1 | xxx
 (0,2) | (8,64,4988) |  2 | xxx
 (0,3) | (8,64,4988) |  3 | xxx
 (0,4) | (8,64,4988) |  4 | xxx
(4 rows)

From the above test results, after we finished inserting rows into the table t_large, the database system consumed only one global transaction ID. Subtransactions use the same transaction ID as the owning transaction block, they do not need to be assigned separate transaction IDs.