PostgreSQL Tutorial: EXPLAIN Scan Nodes

August 21, 2025

Summary: In this tutorial, you will learn various types of scan “nodes” / operations that you might see in explain plans.

Table of Contents

Sequential Scan

It looks like this:

explain analyze select * from pg_class;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=202) (actual time=0.009..0.049 rows=295 loops=1)
 Total runtime: 0.249 ms
(2 rows)

This is the simplest possible operation – PostgreSQL opens table file, and reads rows, one by one, returning them to user or to upper node in explain tree, for example to limit, as in:

explain analyze select * from pg_class limit 2;
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.07 rows=2 width=202) (actual time=0.014..0.014 rows=2 loops=1)
   ->  Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=202) (actual time=0.009..0.009 rows=2 loops=1)
 Total runtime: 0.132 ms
(3 rows)

It is important to understand that the order of returned rows is not any specific. It’s not “in order of insertion”, or “last updated first” or anything like this. Concurrent selects, updates, deletes, vacuums can modify the order of rows at any time.

Seq Scan can filter rows – that is reject some from being returned. This happens for example when you’ll add “WHERE” clause:

explain analyze select * from pg_class
  where relname ~ 'a';
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on pg_class  (cost=0.00..11.65 rows=227 width=202) (actual time=0.030..0.294 rows=229 loops=1)
   Filter: (relname ~ 'a'::text)
   Rows Removed by Filter: 66
 Total runtime: 0.379 ms
(4 rows)

As you can see now we have Filter: information. And also “Rows removed by filter” line.

Index Scan

This type of scan seems to be very straight forward, and most people understand when it is used at least in one case:

explain analyze select * from pg_class
  where oid = 1247;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Index Scan using pg_class_oid_index on pg_class  (cost=0.15..8.17 rows=1 width=202) (actual time=0.007..0.007 rows=1 loops=1)
   Index Cond: (oid = 1247::oid)
 Total runtime: 0.077 ms
(3 rows)

That is – we have index that matches the condition, so PostgreSQL does:

opens the index
in the index if finds where (in table data) there might be rows that match given condition
opens table
fetches row(s) pointed to by index
if the rows can be returned – i.e. they are visible to current session – they are returned

You can of course ask: how can a row be invisible? It might happen for rows deleted that are still in the table (haven’t been vacuumed). Or have been updated. Or were inserted, but after current transaction.

Index Scan is also used when you want some data ordered using order from index. As in here:

explain analyze select * from pg_class
  order by oid limit 10;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.15..1.67 rows=10 width=206) (actual time=0.017..0.029 rows=10 loops=1)
   ->  Index Scan using pg_class_oid_index on pg_class  (cost=0.15..44.53 rows=292 width=206) (actual time=0.014..0.026 rows=10 loops=1)
 Total runtime: 0.145 ms
(3 rows)

There is no condition here, but we can add condition easily, like this:

explain analyze select * from pg_class
  where oid > 1247 order by oid limit 10;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.15..4.03 rows=10 width=206) (actual time=0.021..0.035 rows=10 loops=1)
   ->  Index Scan using pg_class_oid_index on pg_class  (cost=0.15..37.84 rows=97 width=206) (actual time=0.017..0.031 rows=10 loops=1)
         Index Cond: (oid > 1247::oid)
 Total runtime: 0.132 ms
(4 rows)

In these cases, PostgreSQL finds starting point in index (either first row that is > 1247, or simply smallest value in index, and then returns next rows/values until Limit will be satisfied.

There is a version of Index Scan, called “Index Scan Backward” – which does the same thing, but is used for scanning in descending order:

explain analyze select * from pg_class
  where oid < 1247 order by oid desc limit 10;
                                                                   QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.15..4.03 rows=10 width=206) (actual time=0.012..0.026 rows=10 loops=1)
   ->  Index Scan Backward using pg_class_oid_index on pg_class  (cost=0.15..37.84 rows=97 width=206) (actual time=0.009..0.022 rows=10 loops=1)
         Index Cond: (oid < 1247::oid)
 Total runtime: 0.119 ms
(4 rows)

This is the same kind of operation – open index, and for every row pointed to by index, fetch row from table, just it happens not “from small to big” but “from big to small”.

Index Only Scan

Let’s create simple table:

create table test (id serial primary key, i int4);

insert into test (i)
  select random() * 1000000000
    from generate_series(1,100000);

vacuum analyze test;

This gives me table like this one:

select * from test limit 10;
 id |     i
----+-----------
  1 | 546119592
  2 | 253476978
  3 | 235791031
  4 | 654694043
  5 | 187647296
  6 | 709050245
  7 | 210316749
  8 | 348927354
  9 | 120463097
 10 |   5611946
(10 rows)

In here, I have index on id:

\d test
                         Table "public.test"
 Column |  Type   |                     Modifiers
--------+---------+---------------------------------------------------
 id     | integer | not null default nextval('test_id_seq'::regclass)
 i      | integer | 
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)

So, if some conditions are met (more on it in a bit), I can get plan like this:

explain analyze select id from test
  order by id asc limit 10;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..0.55 rows=10 width=4) (actual time=0.039..0.042 rows=10 loops=1)
   ->  Index Only Scan using test_pkey on test  (cost=0.29..2604.29 rows=100000 width=4) (actual time=0.036..0.038 rows=10 loops=1)
         Heap Fetches: 0
 Total runtime: 0.092 ms
(4 rows)

Please note the “Only” word in “Index Only Scan”. This means that PostgreSQL realized that I select only data (columns) that are in the index. And it is possible that it doesn’t need to check anything in the table files. So that it will return the data straight from index.

These scans have the ability to work way faster than normal Index Scans, because they don’t have to verify anything in table data.

The problem is that, in order to make it work, Index has to contain information that given rows are in pages, that didn’t have any changes “recently”. This means that in order to utilize Index Only Scans, you have to have your table well vacuumed. But with autovacuum running, it shouldn’t be that big of a deal.

Bitmap Index Scan

It looks like this:

create index i1 on test (i);

explain analyze select * from test where i < 100000;
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=4.37..39.99 rows=10 width=8) (actual time=0.025..0.110 rows=13 loops=1)
   Recheck Cond: (i < 100000)
   ->  Bitmap Index Scan on i1  (cost=0.00..4.37 rows=10 width=0) (actual time=0.013..0.013 rows=13 loops=1)
         Index Cond: (i < 100000)
 Total runtime: 0.154 ms
(5 rows)

Bitmap Scans are always in (at least) two nodes. First (lower level) there is Bitmap Index Scan, and then there is Bitmap Heap Scan.

How does it work?

Let’s assume your table has 100000 pages (that would be ~ 780MB). Bitmap Index Scan would create a bitmap where there would be one bit for every page in your table. So in this case, we’d get memory block of 100,000 bits ~ 12.5kB. All these bits would be set to 0. Then Bitmap Index Scan, would set some bits to 1, depending on which page in table might contain row that should be returned.

This part doesn’t touch table data at all. Just index. After it will be done – that is all pages that might contain row that should be returned will be “marked”, this bitmap is passed to upper node – Bitmap Heap Scan, which reads them in more sequential fashion.

What is the point of such operation? Well, Index Scans (normal) cause random IO – that is, pages from disk are loaded in random fashion. Which, at least on spinning disks, is slow.

Sequential scan is faster for getting single page, but on the other hand – you not always need all the pages.

Bitmap Index Scans joins the two cases when you need many rows from the table, but not all, and when the rows that you’ll be returning are not in single block (which would be the case if I did “… where id < …”). Bitmap scans have also one more interesting feature. That is - they can join two operations, two indexes, together. Like in here:

explain analyze select * from test
  where i < 5000000 or i > 950000000;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=107.36..630.60 rows=5323 width=8) (actual time=1.023..4.353 rows=5386 loops=1)
   Recheck Cond: ((i < 5000000) OR (i > 950000000))
   ->  BitmapOr  (cost=107.36..107.36 rows=5349 width=0) (actual time=0.922..0.922 rows=0 loops=1)
         ->  Bitmap Index Scan on i1  (cost=0.00..12.25 rows=527 width=0) (actual time=0.120..0.120 rows=491 loops=1)
               Index Cond: (i < 5000000)
         ->  Bitmap Index Scan on i1  (cost=0.00..92.46 rows=4822 width=0) (actual time=0.799..0.799 rows=4895 loops=1)
               Index Cond: (i > 950000000)
 Total runtime: 4.765 ms
(8 rows)

In here we see two Bitmap Index Scans (there can be more of them), which are then joined using BitmapOr.

As you remember – output of Bitmap Index Scan is a bitmap – that is memory block with some zeros and some ones. Having multiple such bitmaps means that you can easily do logical operations on it: Or, And or Not.

In here we see that two such bitmaps were joined together using Or operator, and resulting bitmap was passed to Bitmap Heap Scan which loaded appropriate rows from the table.

While in here both Index Scans use the same index, it’s not always the case. For example, let’s add quickly some more columns:

alter table test add column j int4 default random() * 1000000000;
alter table test add column h int4 default random() * 1000000000;
create index i2 on test (j);
create index i3 on test (h);

And now:

explain analyze select * from test
  where j < 50000000 and i < 50000000 and h > 950000000;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=280.76..323.61 rows=12 width=16) (actual time=2.295..2.352 rows=11 loops=1)
   Recheck Cond: ((h > 950000000) AND (j < 50000000) AND (i < 50000000))
   ->  BitmapAnd  (cost=280.76..280.76 rows=12 width=0) (actual time=2.278..2.278 rows=0 loops=1)
         ->  Bitmap Index Scan on i3  (cost=0.00..92.53 rows=4832 width=0) (actual time=0.546..0.546 rows=4938 loops=1)
               Index Cond: (h > 950000000)
         ->  Bitmap Index Scan on i2  (cost=0.00..93.76 rows=4996 width=0) (actual time=0.783..0.783 rows=5021 loops=1)
               Index Cond: (j < 50000000)
         ->  Bitmap Index Scan on i1  (cost=0.00..93.96 rows=5022 width=0) (actual time=0.798..0.798 rows=4998 loops=1)
               Index Cond: (i < 50000000)
 Total runtime: 2.428 ms
(10 rows)

Three Bitmap Index Scans, each using different index, bitmaps joined using “and” bit operation, and result fed to Bitmap Heap Scan.

In case you wonder – why is the BitmapAnd showing “Actual rows = 0” – it’s simple. This node doesn’t deal with rows at all (just bitmap of disk pages). So it can’t return any rows.

Thats about it for now – these are your possible table scans – how you get data from disk.

See more

PostgreSQL Optimization

Sequential Scan

Index Scan

Index Only Scan

Bitmap Index Scan

See more

Search

Categories

Tags