October 19, 2023
Summary: The pg_stat_statements extension exposes detailed per-query statistics that are helpful in finding slow queries that need to be optimized. In this tutorial, we will walk through some example queries that pull useful information out of pg_stat_statements.
Getting ready
To run the views or queries discussed in this tutorial, we should have pg_stat_statements successfully configured and created as an extension. Please note that every query for which pg_stat_statements collects statistics can be uniquely identified through its queryid, a hash computed from the statement's parse tree (its normalized form) that is visible as the queryid column when we query the pg_stat_statements view.
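If the extension is not yet in place, a minimal setup sketch looks like the following; it assumes superuser access and that a server restart is acceptable:
-- preload the module; note that this overwrites any existing
-- shared_preload_libraries value and only takes effect after a restart
alter system set shared_preload_libraries = 'pg_stat_statements';
-- after restarting PostgreSQL, create the extension in the target database
create extension if not exists pg_stat_statements;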
How to do it…
In the following commands, we use the left() function to trim the query text to 40 characters for better readability. Alternatively, we may switch to expanded display with \x and avoid truncating the query when viewing the statistics (a short sketch of this follows the steps below):
1. Find the queries that are writing to temp the most:
-- temp_blks_written: blocks written to temporary files
-- (for example, by sorts and hashes that spill to disk)
select queryid, left(query,40), calls,
       temp_blks_read, temp_blks_written
from pg_stat_statements
order by temp_blks_written desc;
2. Find the queries that are reading from temp the most:
-- temp_blks_read: blocks read back from temporary files
select queryid, left(query,40), calls,
       temp_blks_read, temp_blks_written
from pg_stat_statements
order by temp_blks_read desc;
3. Find the queries that are reading the most from disk due to accessing temporary tables:
-- local_blks_read: local buffer blocks read, i.e. block reads for temporary tables;
-- avg_read_per_call uses integer division, so the result is rounded down
select queryid, left(query,40), calls, local_blks_read,
       local_blks_read/calls as avg_read_per_call
from pg_stat_statements
order by local_blks_read desc;
4. Find the queries with the highest execution times:
-- mean_time and max_time are reported in milliseconds;
-- calls > 10 filters out statements that have run only a few times
select queryid, left(query,40), mean_time, max_time
from pg_stat_statements
where calls > 10
order by mean_time desc, max_time desc;
5. Find the queries that are executed the most:
-- calls: the number of times the statement was executed
select queryid, left(query,40), calls, mean_time
from pg_stat_statements
order by calls desc;
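As mentioned earlier, if we prefer to see the full query text instead of only its first 40 characters, we can switch psql to expanded display. A minimal sketch (the limit 3 is purely illustrative):
\x on
select queryid, query, calls, mean_time
from pg_stat_statements
order by mean_time desc
limit 3;
\x off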
How it works…
The following is the list of columns available through the pg_stat_statements view. From this view, we can query a lot of information that is useful for query analysis:
rockdb=# \d pg_stat_statements
                    View "public.pg_stat_statements"
       Column        |       Type       | Collation | Nullable | Default
---------------------+------------------+-----------+----------+---------
 userid              | oid              |           |          |
 dbid                | oid              |           |          |
 queryid             | bigint           |           |          |
 query               | text             |           |          |
 calls               | bigint           |           |          |
 total_time          | double precision |           |          |
 min_time            | double precision |           |          |
 max_time            | double precision |           |          |
 mean_time           | double precision |           |          |
 stddev_time         | double precision |           |          |
 rows                | bigint           |           |          |
 shared_blks_hit     | bigint           |           |          |
 shared_blks_read    | bigint           |           |          |
 shared_blks_dirtied | bigint           |           |          |
 shared_blks_written | bigint           |           |          |
 local_blks_hit      | bigint           |           |          |
 local_blks_read     | bigint           |           |          |
 local_blks_dirtied  | bigint           |           |          |
 local_blks_written  | bigint           |           |          |
 temp_blks_read      | bigint           |           |          |
 temp_blks_written   | bigint           |           |          |
 blk_read_time       | double precision |           |          |
 blk_write_time      | double precision |           |          |
Each of the queries discussed in this tutorial serves a specific purpose, but many more such queries can be constructed by combining these columns to find statements in a particular phase: for example, the queries that generate the most temp, read the most from temp, read from disk rather than from memory, have the highest execution times, or are executed the most, as seen in steps 1 to 5 (one more combination is sketched after the reset example below). At any point in time, all the collected statistics can be reset using the following function call:
rockdb=# select pg_stat_statements_reset();
 pg_stat_statements_reset
--------------------------
 
(1 row)
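As one example of combining these columns beyond steps 1 to 5, the following sketch ranks statements by how many shared-buffer blocks they had to read in from outside the buffer cache; the percentage expression and the limit 20 are just illustrative choices, not part of the recipes above:
-- shared_blks_read: blocks not found in shared buffers that had to be read in
-- (from disk or the OS cache); pct_read shows the miss share per statement
select queryid, left(query,40), calls,
       shared_blks_hit, shared_blks_read,
       round(100.0 * shared_blks_read /
             nullif(shared_blks_hit + shared_blks_read, 0), 2) as pct_read
from pg_stat_statements
order by shared_blks_read desc
limit 20;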