May 10, 2024
Summary: In this tutorial, you will learn how to log kernel resource usage of queries in PostgreSQL.
Table of Contents
Introduction
PostgreSQL includes lots of monitoring tools that allow you to see what is happening, but there are a few settings that really go into detail, e.g., log_statement_stats:
SET client_min_messages = log;
SET log_statement_stats = true;
SELECT 100;
LOG: QUERY STATISTICS
DETAIL: ! system usage stats:
1 ! 0.000000 s user, 0.000000 s system, 0.000201 s elapsed
2 ! [0.000000 s user, 0.000000 s system total]
3 ! 0/0 [0/0] filesystem blocks in/out
4 ! 0/14 [0/363] page faults/reclaims, 0 [0] swaps
5 ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
6 ! 0/0 [3/0] voluntary/involuntary context switches
?column?
----------
100
What do all those numbers mean? Well, the Linux getrusage() manual page has a clue — these are kernel resource usage statistics. The first two lines show the user and kernel (“system”) cpu time used, as well as elapsed time. (The numbers in brackets on line #2 and following are totals for this process.) Line #3 shows I/O from/to the storage device (not from the kernel cache). Line #4 covers memory pages faulted into the process address space and reclaimed. Line #5 shows signal and IPC message activity. Line #6 shows process context switches.
Example
Let’s look at some more interesting queries by populating a new table:
CREATE TABLE test (x INTEGER);
INSERT INTO test SELECT x FROM generate_series(1, 1000) t(x);
By clearing the kernel buffers with echo 3 > /proc/sys/vm/drop_caches
and restarting the server we can see reads from the storage device (via log_statement_stats) and reads from the kernel cache (via explain (buffers true)
):
EXPLAIN (ANALYZE true, BUFFERS true, TIMING false, COSTS false)
SELECT COUNT(*) FROM test;
LOG: QUERY STATISTICS
DETAIL: ! system usage stats:
! 0.000000 s user, 0.000000 s system, 0.019259 s elapsed
! [0.008000 s user, 0.000000 s system total]
! 1712/0 [5840/280] filesystem blocks in/out
! 7/94 [21/586] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 29/1 [73/3] voluntary/involuntary context switches
QUERY PLAN
---------------------------------------------------
Aggregate (actual rows=1 loops=1)
Buffers: shared read=5
-> Seq Scan on test (actual rows=1000 loops=1)
Buffers: shared read=5
Planning time: 3.673 ms
Execution time: 2.221 ms
Let’s run the query again:
EXPLAIN (ANALYZE true, BUFFERS true, TIMING false, COSTS false)
SELECT COUNT(*) FROM test;
LOG: QUERY STATISTICS
DETAIL: ! system usage stats:
! 0.000000 s user, 0.000000 s system, 0.000408 s elapsed
! [0.008000 s user, 0.000000 s system total]
! 0/0 [5840/288] filesystem blocks in/out
! 0/3 [21/591] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 0/0 [74/3] voluntary/involuntary context switches
QUERY PLAN
---------------------------------------------------
Aggregate (actual rows=1 loops=1)
Buffers: shared hit=5
-> Seq Scan on test (actual rows=1000 loops=1)
Buffers: shared hit=5
Planning time: 0.027 ms
Execution time: 0.250 ms
Notice the first count reads 1712 512-byte blocks from storage and 5 8kB blocks from the kernel cache into the shared buffer cache. (The unrecorded reads are probably system table information.) The second query shows no I/O from the storage device, and a hit on 5 8kB shared buffer pages.
Summary
As you might know, PostgreSQL executes queries in stages. Fortunately, you can get rusage statistics for each stage with log_parser_stats, log_planner_stats, and log_executor_stats. (explain (analyze true)
also shows planning and execution durations, which you can see above.)
As you can see, you can access a wealth of information about PostgreSQL and how it is using the cpus, storage, and caches. These have a large impact on how quickly queries execute.