PostgreSQL Tutorial: EXPLAIN Materialize Nodes

August 27, 2025

Summary: In this tutorial, you will learn about the various types of materialize “nodes” / operations that you might see in explain plans.

Table of Contents

Sort
HashAggregate
Materialize
Unique
Result
GroupAggregate
HashSetOp

Sort

This one seems easy to understand – Sort gets records from its sub-operation and returns them sorted in the requested way.

Example:

explain analyze select * from pg_class order by relname;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Sort  (cost=22.88..23.61 rows=292 width=203) (actual time=0.230..0.253 rows=295 loops=1)
   Sort Key: relname
   Sort Method: quicksort  Memory: 103kB
   ->  Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=203) (actual time=0.007..0.048 rows=295 loops=1)
 Total runtime: 0.326 ms
(5 rows)

While it is simple, it has some cool logic inside. For starters – if the memory required for sorting would exceed work_mem, it switches to disk-based sorting:

explain analyze select random() as x from generate_series(1,14000) i order by x;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=62.33..64.83 rows=1000 width=0) (actual time=16.713..18.090 rows=14000 loops=1)
   Sort Key: (random())
   Sort Method: quicksort  Memory: 998kB
   ->  Function Scan on generate_series i  (cost=0.00..12.50 rows=1000 width=0) (actual time=2.036..4.533 rows=14000 loops=1)
 Total runtime: 18.942 ms
(5 rows)
 
explain analyze select random() as x from generate_series(1,15000) i order by x;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=62.33..64.83 rows=1000 width=0) (actual time=27.052..28.780 rows=15000 loops=1)
   Sort Key: (random())
   Sort Method: external merge  Disk: 264kB
   ->  Function Scan on generate_series i  (cost=0.00..12.50 rows=1000 width=0) (actual time=2.171..4.894 rows=15000 loops=1)
 Total runtime: 29.767 ms
(5 rows)

Please note the Sort Method change above.

To handle such cases, PostgreSQL uses temporary files stored in the $PGDATA/base/pgsql_tmp/ directory. They are, of course, removed as soon as they are no longer needed.

One additional feature is that Sort can change its method of working when it is called by a Limit operation, like here:

explain analyze select * from pg_class order by relfilenode limit 5;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Limit  (cost=15.77..15.78 rows=5 width=203) (actual time=0.119..0.120 rows=5 loops=1)
   ->  Sort  (cost=15.77..16.50 rows=292 width=203) (actual time=0.118..0.118 rows=5 loops=1)
         Sort Key: relfilenode
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=203) (actual time=0.005..0.047 rows=295 loops=1)
 Total runtime: 0.161 ms
(6 rows)

Normally, to sort a dataset, you need to process it in whole. But PostgreSQL knows that if you need only a small number of rows, it doesn't have to sort the whole dataset – it is enough to find just the first values.

In Big O notation, a general sort has complexity O(m * log(m)), but top-N has complexity O(m * log(n)) – where m is the number of rows in the table, and n is the number of returned rows. Most importantly, this kind of sort also uses much less memory (after all, it doesn't have to keep the whole sorted dataset around, just n rows), so it is less likely to spill to slow disk for temporary files.
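The difference between a full sort and a top-N sort can be sketched in Python. This is just an illustration of the idea using the standard heapq module, not PostgreSQL's actual implementation:

```python
import heapq
import random

rows = [random.random() for _ in range(14000)]

# Full sort: O(m log m), needs the whole dataset in memory at once.
full = sorted(rows)

# Top-N heapsort: O(m log n), keeps only a bounded heap of n elements
# while scanning the input -- this is what "Sort Method: top-N heapsort"
# does when a Limit sits above the Sort.
top5 = heapq.nsmallest(5, rows)

print(top5 == full[:5])
```

Both approaches return the same first five values; the top-N version simply never builds the full sorted dataset.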

HashAggregate

This operation is used basically whenever you use GROUP BY together with aggregates like sum(), avg(), min(), max() or others.

Example:

explain analyze select relkind, count(*) from pg_class group by relkind;
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=12.38..12.42 rows=4 width=1) (actual time=0.223..0.224 rows=5 loops=1)
   ->  Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=1) (actual time=0.008..0.053 rows=295 loops=1)
 Total runtime: 0.273 ms
(3 rows)

HashAggregate does something like this: for every row it gets, it finds the GROUP BY key (in this case relkind). Then it puts the row into a bucket in a hash (associative array, dictionary) designated by that key.

After all rows have been processed, it scans the hash and returns a single row per key value, performing the appropriate calculations (sum, min, avg, and so on) where necessary.

It is important to understand that HashAggregate has to consume all rows before it can return even a single one.
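The bucketing described above can be sketched in a few lines of Python – a dictionary keyed by the GROUP BY value, with a running count(*) per bucket (a simplified illustration, not PostgreSQL internals):

```python
# Rows as (relkind,) tuples, standing in for rows from pg_class.
rows = [("r",), ("i",), ("r",), ("v",), ("r",)]

buckets = {}
for (relkind,) in rows:
    # Each bucket holds the running state of its aggregate: count(*).
    buckets[relkind] = buckets.get(relkind, 0) + 1

# Only after ALL input rows are consumed can results be emitted.
for relkind, cnt in buckets.items():
    print(relkind, cnt)
```

Note that nothing can be printed until the input loop finishes – which is exactly why HashAggregate cannot stream results row by row.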

Now, if you understand this, you should see a potential problem: what about the case when there are millions of rows and the hash is too big to fit in memory? Here, again, work_mem comes into play. If the generated hash is too big, it will “spill” to disk (again, into $PGDATA/base/pgsql_tmp).

This means that a plan containing both a HashAggregate and a Sort can use up to 2 * work_mem. And such a plan is easy to get:

explain analyze select relkind, count(*) from pg_Class group by relkind order by relkind;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Sort  (cost=12.46..12.47 rows=4 width=1) (actual time=0.260..0.261 rows=5 loops=1)
   Sort Key: relkind
   Sort Method: quicksort  Memory: 25kB
   ->  HashAggregate  (cost=12.38..12.42 rows=4 width=1) (actual time=0.221..0.222 rows=5 loops=1)
         ->  Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=1) (actual time=0.006..0.044 rows=295 loops=1)
 Total runtime: 0.312 ms
(6 rows)

In reality, a single query can use many multiples of work_mem, as work_mem is a limit per operation. So, if your query uses 1000 HashAggregates and Sorts (and other work_mem-using operations), total memory usage can get pretty high.

Materialize

This operation showed up earlier, in the example for Merge Join, but it is also used in other cases.

psql has many internal commands. One of them is \dTS – which lists all system datatypes. Internally \dTS runs this query:

SELECT n.nspname as "Schema",
  pg_catalog.format_type(t.oid, NULL) AS "Name",
  pg_catalog.obj_description(t.oid, 'pg_type') as "Description"
FROM pg_catalog.pg_type t
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
WHERE (t.typrelid = 0 OR (SELECT c.relkind = 'c' FROM pg_catalog.pg_class c WHERE c.oid = t.typrelid))
  AND NOT EXISTS(SELECT 1 FROM pg_catalog.pg_type el WHERE el.oid = t.typelem AND el.typarray = t.oid)
  AND pg_catalog.pg_type_is_visible(t.oid)
ORDER BY 1, 2;

Its plan is:

                                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=2783.00..2783.16 rows=65 width=68) (actual time=3.883..3.888 rows=87 loops=1)
   Sort Key: n.nspname, (format_type(t.oid, NULL::integer))
   Sort Method: quicksort  Memory: 39kB
   ->  Nested Loop Left Join  (cost=16.32..2781.04 rows=65 width=68) (actual time=0.601..3.657 rows=87 loops=1)
         Join Filter: (n.oid = t.typnamespace)
         Rows Removed by Join Filter: 435
         ->  Hash Anti Join  (cost=16.32..2757.70 rows=65 width=8) (actual time=0.264..0.981 rows=87 loops=1)
               Hash Cond: ((t.typelem = el.oid) AND (t.oid = el.typarray))
               ->  Seq Scan on pg_type t  (cost=0.00..2740.26 rows=81 width=12) (actual time=0.012..0.662 rows=157 loops=1)
                     Filter: (pg_type_is_visible(oid) AND ((typrelid = 0::oid) OR (SubPlan 1)))
                     Rows Removed by Filter: 185
                     SubPlan 1
                       ->  Index Scan using pg_class_oid_index on pg_class c  (cost=0.15..8.17 rows=1 width=1) (actual time=0.002..0.002 rows=1 loops=98)
                             Index Cond: (oid = t.typrelid)
               ->  Hash  (cost=11.33..11.33 rows=333 width=8) (actual time=0.241..0.241 rows=342 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 14kB
                     ->  Seq Scan on pg_type el  (cost=0.00..11.33 rows=333 width=8) (actual time=0.002..0.130 rows=342 loops=1)
         ->  Materialize  (cost=0.00..1.09 rows=6 width=68) (actual time=0.000..0.001 rows=6 loops=87)
               ->  Seq Scan on pg_namespace n  (cost=0.00..1.06 rows=6 width=68) (actual time=0.002..0.003 rows=6 loops=1)
 Total runtime: 3.959 ms

The Materialize operation is called by the Nested Loop Left Join operation. We know that a Nested Loop causes the given operation to be run multiple times – in this case, 87 times.

The right side of the join is a Seq Scan on pg_namespace. So, theoretically, PostgreSQL should run a Sequential Scan on pg_namespace 87 times. Given that a single Seq Scan of this table takes 0.003ms, we could expect a total time of ~0.25ms.

But PostgreSQL is smarter than that. It realized that it would be cheaper to scan the table just once and build an in-memory representation of all the rows. Then, on every subsequent pass, it doesn't have to scan the table, check visibility information, or parse data pages – it just gets the data from memory.

Thanks to this, the total time for reading the table once, preparing the in-memory representation of the data, and scanning this representation 87 times was 0.087ms.

You might then ask: OK, but why did the merge join earlier use Materialize – it was doing just one scan? Let's recall the plan:

explain analyze select * from
  (select oid, * from pg_class order by oid) as c
  join
  (select * from pg_attribute a order by attrelid) as a
  on c.oid = a.attrelid;
                                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=23.16..268.01 rows=2273 width=410) (actual time=0.658..3.779 rows=2274 loops=1)
   Merge Cond: (pg_class.oid = a.attrelid)
   ->  Sort  (cost=22.88..23.61 rows=292 width=207) (actual time=0.624..0.655 rows=293 loops=1)
         Sort Key: pg_class.oid
         Sort Method: quicksort  Memory: 102kB
         ->  Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=207) (actual time=0.011..0.211 rows=293 loops=1)
   ->  Materialize  (cost=0.28..212.34 rows=2273 width=203) (actual time=0.028..1.264 rows=2274 loops=1)
         ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a  (cost=0.28..183.92 rows=2273 width=203) (actual time=0.015..0.752 rows=2274 loops=1)
 Total runtime: 4.009 ms
(9 rows)

Yes, it was run just once. The problem, though, is that the source of data for a Merge Join has to meet several criteria. Some are obvious (the data has to be sorted) and some are less obvious, as they are more technical (the data has to be scrollable back and forth).

Because of these not-so-obvious criteria, PostgreSQL will sometimes have to Materialize the data coming from the source (an Index Scan in our case), so that it has all the necessary features when using it.

Long story short – Materialize gets data from the underlying operation and stores it in memory (or partially in memory), so that it can be reused faster, or with additional features that the underlying operation doesn't provide.
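The caching behavior described above can be sketched as a small Python wrapper that runs its child once and replays the cached rows on every rescan. The class and function names here are made up for illustration; this is a sketch of the idea, not PostgreSQL's executor:

```python
class Materialize:
    """Scan the child once, cache its rows, replay them on rescans."""

    def __init__(self, child_factory):
        self.child_factory = child_factory
        self.cache = None

    def scan(self):
        if self.cache is None:
            # First call: actually run the underlying operation.
            self.cache = list(self.child_factory())
        # Every call (including rescans): serve rows from memory.
        return iter(self.cache)

calls = 0

def seq_scan():  # stands in for "Seq Scan on pg_namespace"
    global calls
    calls += 1
    yield from ["pg_catalog", "public", "information_schema"]

mat = Materialize(seq_scan)
for _ in range(87):          # the nested loop's 87 iterations
    rows = list(mat.scan())

print(calls)  # the underlying scan ran only once
```

Despite 87 rescans by the outer loop, the child ran a single time – the other 86 passes read the cached rows.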

Unique

The name seems to make clear what's going on here – it removes duplicate data.

This might happen, for example, when you’re doing:

select distinct field from table

But in newer PostgreSQL versions this query will usually be handled using HashAggregate.

The problem with Unique is that it requires the data to be sorted. Not because it needs the data in any particular order, but because it needs all rows with the same value to be “together”.

This makes it really cool (when it can be used) because it uses virtually no memory. It just checks whether the value in the previous row is the same as in the current one, and if so – discards the current row. That's all.
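The previous-row comparison can be sketched in Python as a generator that keeps only one value of state (a simplified illustration of the idea):

```python
def unique(sorted_rows):
    """Emit each value once, assuming equal values are adjacent."""
    prev = object()  # sentinel that compares unequal to everything
    for row in sorted_rows:
        if row != prev:      # value changed: emit it
            yield row
        prev = row           # remember only the previous row

print(list(unique(["i", "r", "r", "r", "v", "v"])))
```

Note what happens if the input is *not* sorted: `unique(["r", "i", "r"])` would emit "r" twice – which is exactly why the planner puts a Sort (or an already-ordered scan) underneath Unique.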

So, we can force its usage by pre-sorting the data:

explain select distinct relkind from
  (select relkind from pg_class order by relkind) as x;
                              QUERY PLAN
-----------------------------------------------------------------------
 Unique  (cost=22.88..27.26 rows=4 width=1)
   ->  Sort  (cost=22.88..23.61 rows=292 width=1)
         Sort Key: pg_class.relkind
         ->  Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=1)
(4 rows)

Result

This happens mostly in very simple test queries. This operation is used when your query selects some constant value (or values):

explain select 1, 2;
                QUERY PLAN                
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0)
(1 row)

Aside from test queries, it can sometimes be seen in queries that do an “insert, but only if it wouldn't be a duplicate” kind of thing:

explain insert into t (i)
  select 1
  where not exists (select * from t where i = 1);
                             QUERY PLAN
---------------------------------------------------------------------
 Insert on t  (cost=3.33..3.35 rows=1 width=4)
   ->  Result  (cost=3.33..3.34 rows=1 width=0)
         One-Time Filter: (NOT $0)
         InitPlan 1 (returns $0)
           ->  Seq Scan on t t_1  (cost=0.00..40.00 rows=12 width=0)
                 Filter: (i = 1)
(6 rows)

GroupAggregate

This is similar to the previously described HashAggregate.

The difference is that for GroupAggregate to work, the data has to be sorted on whatever column(s) you used in your GROUP BY clause.

Just like Unique, GroupAggregate uses very little memory, but it forces an ordering of the data.

Example:

explain select relkind, count(*) from
  (select relkind from pg_class order by relkind) x
  group by relkind;
                              QUERY PLAN
-----------------------------------------------------------------------
 GroupAggregate  (cost=22.88..28.03 rows=4 width=1)
   ->  Sort  (cost=22.88..23.61 rows=292 width=1)
         Sort Key: pg_class.relkind
         ->  Seq Scan on pg_class  (cost=0.00..10.92 rows=292 width=1)
(4 rows)
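Because the input is sorted, each group can be aggregated and emitted as soon as the key changes – only one group's running state is held at a time. A minimal Python sketch of this idea, using the standard itertools.groupby (which, like GroupAggregate, assumes adjacent equal keys):

```python
import itertools

# Pre-sorted relkind values, standing in for the Sort node's output.
sorted_rows = ["i", "i", "r", "r", "r", "v"]

# groupby yields one (key, group) pair per run of equal adjacent keys;
# here the aggregate is count(*), computed per group.
result = [(key, sum(1 for _ in grp))
          for key, grp in itertools.groupby(sorted_rows)]
print(result)
```

Contrast with the HashAggregate sketch earlier: no dictionary of all groups is built, so memory stays tiny – but a Sort had to run first.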

HashSetOp

This operation is used by INTERSECT/EXCEPT operations (with the optional ALL modifier).

It works by running an Append over the pair of sub-queries and then, based on the result and the optional ALL modifier, figuring out which rows should be returned.

Here we can see that, unlike UNION, these operations work on two sources of data:

explain select * from (select oid from pg_class order by oid) x
  intersect all
  select * from (select oid from pg_proc order by oid) y;
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 HashSetOp Intersect All  (cost=0.15..170.72 rows=292 width=4)
   ->  Append  (cost=0.15..163.36 rows=2941 width=4)
         ->  Subquery Scan on "*SELECT* 1"  (cost=0.15..18.37 rows=292 width=4)
               ->  Index Only Scan using pg_class_oid_index on pg_class  (cost=0.15..12.53 rows=292 width=4)
         ->  Subquery Scan on "*SELECT* 2"  (cost=0.28..145.00 rows=2649 width=4)
               ->  Index Only Scan using pg_proc_oid_index on pg_proc  (cost=0.28..92.02 rows=2649 width=4)
(6 rows)

But with three sources, we get a more complex tree:

explain select * from (select oid from pg_class order by oid) x
  intersect all
  select * from (select oid from pg_proc order by oid) y
  intersect all
  select * from (select oid from pg_database order by oid) as w;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 HashSetOp Intersect All  (cost=1.03..172.53 rows=2 width=4)
   ->  Append  (cost=1.03..171.79 rows=294 width=4)
         ->  Subquery Scan on "*SELECT* 3"  (cost=1.03..1.07 rows=2 width=4)
               ->  Sort  (cost=1.03..1.03 rows=2 width=4)
                     Sort Key: pg_database.oid
                     ->  Seq Scan on pg_database  (cost=0.00..1.02 rows=2 width=4)
         ->  Result  (cost=0.15..170.72 rows=292 width=4)
               ->  HashSetOp Intersect All  (cost=0.15..170.72 rows=292 width=4)
                     ->  Append  (cost=0.15..163.36 rows=2941 width=4)
                           ->  Subquery Scan on "*SELECT* 1"  (cost=0.15..18.37 rows=292 width=4)
                                 ->  Index Only Scan using pg_class_oid_index on pg_class  (cost=0.15..12.53 rows=292 width=4)
                           ->  Subquery Scan on "*SELECT* 2"  (cost=0.28..145.00 rows=2649 width=4)
                                 ->  Index Only Scan using pg_proc_oid_index on pg_proc  (cost=0.28..92.02 rows=2649 width=4)
(13 rows)
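The semantics HashSetOp computes for INTERSECT ALL can be sketched in Python with per-side counters: each value appears in the output min(left count, right count) times. This is an illustration of the set-operation semantics under that assumption, not PostgreSQL's actual hash table code:

```python
from collections import Counter

def intersect_all(left, right):
    """INTERSECT ALL: each value kept min(left, right) times."""
    lcnt, rcnt = Counter(left), Counter(right)
    out = []
    for value, n in lcnt.items():
        out.extend([value] * min(n, rcnt[value]))
    return out

print(sorted(intersect_all([1, 1, 2, 3], [1, 2, 2, 4])))
```

Plain INTERSECT (without ALL) would additionally deduplicate, keeping each common value at most once.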

See more

PostgreSQL Optimization