June 11, 2024
Summary: In this tutorial, you will understand cost estimation for index only scan.
Table of Contents
PostgreSQL Index Only Scan
When an index contains all the data required to process a query, we call it a covering index for the query. Using a covering index, the access method can retrieve data directly from the index without a single table scan. This is called an Index Only Scan and can be used by any access method with the RETURNABLE property.
Let’s start with a little test setup:
CREATE TABLE events (event_id char(7) PRIMARY KEY, occurred_at timestamptz);
INSERT INTO events (event_id, occurred_at)
SELECT to_hex(i),
TIMESTAMP '2024-01-01 00:00:00+08' + floor(random() * 100000)::int * '1 minutes'::interval
FROM generate_series(x'1000000'::int, x'2000000'::int, 16) AS s(i);
VACUUM ANALYZE events;
The operation is represented in the plan with an Index Only Scan node:
EXPLAIN SELECT event_id FROM events WHERE event_id < '1100000';
QUERY PLAN
--------------------------------------------------------------------------------------
Index Only Scan using events_pkey on events (cost=0.43..2090.81 rows=73279 width=8)
Index Cond: (event_id < '1100000'::bpchar)
(2 rows)
The name suggests that the Index Only Scan node never queries the table, but in some cases it does. PostgreSQL indexes don’t store visibility information, so the access method returns the data from every row version that matches the query, regardless of their visibility to the current transaction. Tuple visibility is verified afterwards by the indexing mechanism.
But if the method has to scan the table for visibility anyway, how is it different from plain Index Scan? The trick is a feature called the visibility map. Each heap relation has a visibility map to keep track of which pages contain only the tuples that are known to be visible to all active transactions. Whenever an index method returns a row ID that points to a page which is flagged in the visibility map, you can tell for certain that all the data there is visible to your transaction.
Cost estimation
The Index Only Scan cost is determined by the number of table pages flagged in the visibility map. This is a collected statistic:
SELECT relpages, relallvisible
FROM pg_class WHERE relname = 'events';
relpages | relallvisible
----------+---------------
5668 | 5668
(1 row)
The only difference between the Index Scan and the Index Only Scan cost estimates is that the latter multiplies the I/O cost of the former by the fraction of the pages not present in the visibility map. (The processing cost remains the same.)
In this example every row version on every page is visible to every transaction, so the I/O cost is essentially multiplied by zero:
WITH costs(idx_cost, tbl_cost) AS (
SELECT
( SELECT round(
current_setting('random_page_cost')::real * pages +
current_setting('cpu_index_tuple_cost')::real * tuples +
current_setting('cpu_operator_cost')::real * tuples
)
FROM (
SELECT relpages * 0.0700 AS pages, reltuples * 0.0700 AS tuples
FROM pg_class WHERE relname = 'events_pkey'
) c
) AS idx_cost,
( SELECT round(
(1 - frac_visible) * -- the fraction of the pages not in the visibility map
current_setting('seq_page_cost')::real * pages +
current_setting('cpu_tuple_cost')::real * tuples
)
FROM (
SELECT relpages * 0.0700 AS pages,
reltuples * 0.0700 AS tuples,
relallvisible::real/relpages::real AS frac_visible
FROM pg_class WHERE relname = 'events'
) c
) AS tbl_cost
)
SELECT idx_cost, tbl_cost, idx_cost + tbl_cost AS total
FROM costs;
idx_cost | tbl_cost | total
----------+----------+-------
1356 | 734 | 2090
(1 row)
Any changes still present in the heap that haven’t been vacuumed increase the total plan cost (and make the plan less desirable for the optimizer). You can check the actual number of necessary heap fetches by using the EXPLAIN ANALYZE
command:
CREATE TEMP TABLE events_tmp WITH (autovacuum_enabled = off)
AS SELECT * FROM events ORDER BY event_id;
ALTER TABLE events_tmp ADD PRIMARY KEY(event_id);
ANALYZE events_tmp;
EXPLAIN (analyze, timing off, summary off)
SELECT event_id FROM events_tmp WHERE event_id < '1100000';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Index Only Scan using events_tmp_pkey on events_tmp (cost=0.43..2490.88 rows=73283 width=8) (actual rows=65536 loops=1)
Index Cond: (event_id < '1100000'::bpchar)
Heap Fetches: 65536
(3 rows)
Because vacuuming is disabled, the planner has to check the visibility of every row (Heap Fetches). After vacuuming, however, there is no need for that:
VACUUM events_tmp;
EXPLAIN (analyze, timing off, summary off)
SELECT event_id FROM events_tmp WHERE event_id < '1100000';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Index Only Scan using events_tmp_pkey on events_tmp (cost=0.43..2090.88 rows=73283 width=8) (actual rows=65536 loops=1)
Index Cond: (event_id < '1100000'::bpchar)
Heap Fetches: 0
(3 rows)