pg_gather: SQL script for PostgreSQL health inspection

December 26, 2023

Summary: Scan PostgreSQL instance for potential problems. pg_gather is a SQL-only script leveraging the built-in features of psql.

Introduction

Scan and collect the minimal amount of data needed to identify potential problems in your PostgreSQL database, and then generate an analysis report using that data. This project provides two SQL scripts for users:

  • gather.sql: Gathers performance and configuration data from PostgreSQL databases.
  • gather_report.sql: Analyzes the collected data and generates detailed HTML reports.

Everything is SQL-only, leveraging the built-in features of psql, the command-line utility of PostgreSQL.

Highlights

  1. Secure by Open:

    Simple, Transparent, Fully auditable code.

    To ensure full transparency of what is collected, transmitted, and analyzed, we use an SQL-only data collection script and avoid programs with any control structures, thus improving the readability and auditability of the data collection. This is one reason for separating data collection and analysis.

  2. No Executables:

    No executables need to be deployed on the database host.

    Using executables in secured environments poses unacceptable risks in many highly secure environments. pg_gather requires only the standard PostgreSQL command line utility, psql, and no other libraries or executables.

  3. Authentication agnostic

    Any authentication mechanism supported by PostgreSQL works for data gathering in pg_gather, because it uses the standard psql command-line utility.

  4. Any Operating System

    Linux (32/64-bit), Sun Solaris, Apple macOS, and Microsoft Windows: pg_gather works wherever psql is available, ensuring maximum portability. (Windows users, please see the Notes section below)

  5. Architecture agnostic

    x86-64 bit, ARM, Sparc, Power, and other architectures. It works anywhere psql is available.

  6. Auditable and optionally maskable data:

    pg_gather collects data in Tab Separated Values (TSV) format, making it easy to review and audit the information before sharing it for analysis. Additional masking or trimming is also possible with simple steps.

  7. Any cloud/container/k8s:

    Works with AWS RDS, Azure, Google Cloud SQL, on-premises databases, and more. (Please see Heroku, AWS Aurora, Docker and K8s specific notes in the Notes section below)

  8. Zero failure design:

    pg_gather can generate a report from available information even if data collection is partial or fails due to permission issues, unavailable tables/views, or other reasons.

  9. Low overhead for data collection:

    By design, data collection is separate from data analysis. This allows the collected data to be analyzed on an independent system, so that analysis queries do not adversely impact critical systems. In most cases, the overhead of data collection is negligible.

  10. Small, single file data dump:

    To generate the smallest possible file, which can be further compressed by gzip for the easy transmission and storage, pg_gather avoids redundancy in the collected data as much as possible.

Data Gathering

To gather configuration and performance information, run the gather.sql script against the database using psql:

psql <connection_parameters_if_any> -X -f gather.sql > out.tsv

OR ALTERNATIVELY pipe to a compression utilty to get a compressed output as follows:

psql <connection_parameters_if_any> -X -f gather.sql | gzip > out.tsv.gz

This script may take over 20 seconds to run because it contains sleeps/delays. We recommend running the script as a privileged user (such as superuser or rds_superuser) or as an account with the pg_monitor privilege. The output file contains performance and configuration data for analysis.

Notes

  1. Heroku and similar DaaS hostings impose very high restrictions on collecting performance data. Queries on views like pg_statistics may produce errors during data collection, but these errors can be ignored.

  2. MS Windows users!, Client tools like pgAdmin include psql, which can be used to run pg_gather against local or remote databases. For example:

    "C:\Program Files\pgAdmin 4\v4\runtime\psql.exe" -h pghost -U postgres -f gather.sql > out.tsv
    
  3. AWS Aurora offers a “PostgreSQL-compatible” database. However, it is not a true PostgreSQL database, even though it looks like one. Therefore, you should do the following to the gather.sql script to replace any unapplicable lines with “NULL”.

    sed -i 's/^CASE WHEN pg_is_in_recovery().*/NULL/' gather.sql
    
  4. Docker containers of PostgreSQL may not include the curl or wget utilities necessary to download gather.sql. Therefore, it is recommended to pipe the contents of the SQL file to psql instead.

    cat gather.sql | docker exec -i <container> psql -X -f - > out.tsv
    
  5. Kubernetes environments also have similar restrictions as those mentioned for Docker. Therefore, a similar approach is suggested.

    cat gather.sql | kubectl exec -i <PGpod> -- psql -X -f - > out.tsv
    

Gathering data continuously, but Partially

To capture the details of an occasional problem, you may need to collect data multiple times. pg_gather has a special lightweight mode for continuous data gathering, which is automatically enabled when it connects to the “template1” database. You can schedule gather.sql to run every minute against the “template1” database and collect the output files into a directory.

Following is an example for scheduling in Linux/Unix systems using cron.

* * * * * psql -U postgres -d template1 -X -f /path/to/gather.sql | gzip >  /path/to/out/out-`date +\%a-\%H.\%M`.txt.gz 2>&1

When connected to the template1 database, the gather script collects only live, dynamic, performance-related information, skipping all object-specific information. This is called a partial gathering and it can be further compressed with gzip to reduce the size significantly.

Data Analysis

Importing collected data

The collected data can be imported to a PostgreSQL Instance. This creates required schema objects in the public schema of the database.

CAUTION : Avoid importing the data into critical environments/databases. A temporary PostgreSQL instance is preferable.

 psql -f gather_schema.sql -f out.tsv

Deprecated usage of sed:

sed -e '/^Pager/d; /^Tuples/d; /^Output/d; /^SELECT pg_sleep/d; /^PREPARE/d; /^\s*$/d' out.tsv | psql -f gather_schema.sql -

Generating Report

An analysis report in HTML format can be generated from the imported data as follows.

psql -X -f gather_report.sql > GatherReport.html

You may use your favourite web browser to read the report.

NOTE: PostgreSQL version 13 or above is required to generate the analysis report.

Importing Partial data

As mentioned in the previous section, partial data gathering is helpful if we schedule the gather.sql as a simple continuous monitoring tool. A separate schema with the name history can hold the imported data. A script file with the name history_schema.sql is provided for creating this schema and objects.

psql -X -f history_schema.sql

This project provides a sample imphistory.sh file which automates importing partial data from multiple files into the tables in history schema. This script can be executed from the directory which contains all the output files. Multiiple files and Wild cards are allowed. Here is an example:

$ imphistory.sh out-*.gz > log.txt

Collecting the import log file is a good practice, as shown above.

comments powered by Disqus