June 18, 2024
Summary: in this tutorial, you will learn what a MultiXact is, why it exists, and under what circumstances one comes into existence.
Table of Contents
Introduction to MultiXact
In PostgreSQL, it’s possible for two processes (or indeed, even three!) to lock the same row at the same time, as long as their locks don’t conflict with one another. The rules for what kinds of locks conflict with what other kinds of locks are admirably documented.
When you lock a row in PostgreSQL, what you do is, you put your transaction ID in the xmax
field of the corresponding tuple in storage. This way, anyone who comes along looking for this row will know that you already have it locked. The latecomer can then wait for the lock:
Transaction 768 | Transaction 769 | Notes |
---|---|---|
SELECT * FROM foo WHERE x = 9 FOR UPDATE; |
Transaction 768 now has a row lock. The row’s xmax field contains the value 768 . |
|
SELECT * FROM foo WHERE x = 9 FOR UPDATE; |
Transaction 769 retrieves the current row, sees that Transaction 768 already holds a lock that conflicts with the lock it wants, and waits for Transaction 768 to be over. |
But what if two processes both want to lock the same row simultaneously? For example:
Transaction 772 | Transaction 773 | Notes |
---|---|---|
SELECT * FROM foo WHERE x = 9 FOR SHARE; |
Afterward, transaction 772 has a row lock. The row’s xmax field contains the value 772 . |
|
SELECT * FROM foo WHERE x = 9 FOR SHARE; |
What happens now? |
Transaction 773
can’t just write its transaction ID into the xmax
field. That would amount to preempting Transaction 772
‘s lock… which would defeat the whole point of locking. To solve this problem, PostgreSQL creates a MultiXact. A MultiXact essentially bundles together some set of transactions so that those transactions can all lock the same row at the same time. Instead of a transaction ID, a new MultiXact ID is written to the row’s xmax
.
Transaction 772 | Transaction 773 | Notes |
---|---|---|
SELECT * FROM foo WHERE x = 9 FOR SHARE; |
||
SELECT * FROM foo WHERE x = 9 FOR SHARE; |
Now both transactions have the row locked. The row’s xmax field is set to 14 , which is a MultiXact ID. MultiXact 14 refers to transactions 772 and 773 by their transaction IDs. |
|
COMMIT; |
Transaction 773 is now over, but the row’s xmax value is still 14 . Since MultiXacts are immutable, MultiXact 14 still refers to the now-defunct Transaction 773 as well as the ongoing Transaction 772 . |
|
COMMIT; |
With both transactions over, there are no remaining locks active on the row. Its xmax value is still 14 , and will remain 14 until another process locks the row or the table is vacuumed. |
It bears repeating that MultiXacts are immutable. If transactions 104
and 108
both have row R locked as part of MultiXact 19
, and transaction 117
locks row R too, transaction 117
can’t just join MultiXact 19
. Instead, a new MultiXact with ID 20
is created, which contains 104
, 108
, and 117
.
This means that, every time an additional transaction wants to lock a row, PostgreSQL has to write the whole new MultiXact to a buffer. For large MultiXacts, the time-cost of all this reading and writing can become quite significant. Especially since access to the underlying data region is subject to a set of global lightweight locks.
Will a MultiXact not involve multiple transactions?
In practice, a MultiXact maybe get created within a single transaction. Suppose that, we were doing something like this:
BEGIN;
SELECT * FROM queue_jobs
WHERE id = 4
FOR SHARE;
SAVEPOINT foo;
SELECT * FROM queue_jobs
WHERE id = 4
FOR UPDATE;
A SAVEPOINT doesn’t technically create a new transaction (I don’t think), but nevertheless, PostgreSQL needs to keep track of the fact that the FOR UPDATE
lock was taken after the savepoint, so that that lock can be released in case of a subsequent ROLLBACK TO SAVEPOINT command. As a result, a new MultiXact gets created and its ID gets placed in the row’s xmax
field.
Summary
A Multixact ID
is an internal identifier used to support row locking by multiple transactions.
Multixact IDs are created when transactions use “SELECT … FOR UPDATE
” (or one of these lock modes: SHARE, KEY SHARE, NO KEY UPDATE) to lock and update tuples.
Multixact IDs live in the pg_multixact
directory.