By John Doe, July 8, 2025
Still grepping through logs to track down database problems? Take a look at JSON-formatted logs and the tooling that can query them.
The commit message
Introduce log_destination=jsonlog.
"jsonlog" is a new value that can be added to log_destination to provide logs in the JSON format, with its output written to a file, making it the third type of destination of this kind, after "stderr" and "csvlog". This format is convenient for feeding logs to other applications.
There was previously a plugin external to core that provided this feature using the hooks in elog.c, but it had to overwrite the output of "stderr" to work, so the two could not be used at the same time. The files generated by this log format are suffixed with ".json", and use the same rotation policies as the other two formats, depending on the backend configuration.
The documentation is updated to list all the keys and values that can exist in this new format. pg_current_logfile()
also needs to be adapted to support this new option.
Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
Example
As a general rule of thumb, most uses of JSON inside the database are abuses of JSON, applied to data that should have been modeled properly instead.
Outside the database, though, JSON is genuinely useful for things like integrating data between systems.
PostgreSQL already has csvlog, but apparently not every CSV-parsing library understands CSV the way PostgreSQL writes it, for example a single log entry containing multi-line values.
So instead of hunting for a dedicated tool that can parse CSV exactly as PostgreSQL produces it, we can simply use the JSON format.
Let's see how it works.
First, change the following two parameters in the configuration file:
- logging_collector needs to be set to on.
- log_destination needs to be set to jsonlog.
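In postgresql.conf that might look like the fragment below. Note that logging_collector can only be changed with a server restart, while log_destination accepts a comma-separated list, so jsonlog can be combined with the other destinations:

```
# postgresql.conf
logging_collector = on          # changing this requires a server restart
log_destination = 'jsonlog'     # or e.g. 'stderr,jsonlog' to keep both
```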
The other logging-related settings are as follows:
select name, setting from pg_settings where name ~ '^log' order by 1;
               name                |            setting
-----------------------------------+--------------------------------
 log_autovacuum_min_duration       | 600000
 log_checkpoints                   | on
 log_connections                   | on
 log_destination                   | jsonlog
 log_directory                     | log
 log_disconnections                | on
 log_duration                      | off
 log_error_verbosity               | default
 log_executor_stats                | off
 log_file_mode                     | 0600
 log_filename                      | postgresql-%Y-%m-%d_%H%M%S.log
 logging_collector                 | on
 log_hostname                      | off
 logical_decoding_work_mem         | 65536
 log_line_prefix                   | %m %u@%d %p %r
 log_lock_waits                    | on
 log_min_duration_sample           | -1
 log_min_duration_statement        | 0
 log_min_error_statement           | error
 log_min_messages                  | warning
 log_parameter_max_length          | -1
 log_parameter_max_length_on_error | 0
 log_parser_stats                  | off
 log_planner_stats                 | off
 log_recovery_conflict_waits       | off
 log_replication_commands          | off
 log_rotation_age                  | 1440
 log_rotation_size                 | 102400
 log_startup_progress_interval     | 10000
 log_statement                     | none
 log_statement_sample_rate         | 1
 log_statement_stats               | off
 log_temp_files                    | 0
 log_timezone                      | Asia/Shanghai
 log_transaction_sample_rate       | 0
 log_truncate_on_rotation          | off
(36 rows)
After a restart, looking at the files in the log directory, there is a new file suffixed with ".json":
$ ls -l log/
total 192
-rw------- 1 redrock redrock 178194 Jun 17 12:23 postgresql-2025-06-17_121022.log
-rw------- 1 redrock redrock 11240 Jun 17 12:26 postgresql-2025-06-17_122308.json
-rw------- 1 redrock redrock 175 Jun 17 12:23 postgresql-2025-06-17_122308.log
The file's contents are not exactly easy to read directly:
$ head -n 5 log/postgresql-2025-06-17_122308.json
{"timestamp":"2025-06-17 12:23:08.437 CST","pid":1560406,"session_id":"61e5519c.17cf56","line_num":1,"session_start":"2025-06-17 12:23:08 CST","txid":0,"error_severity":"LOG","message":"ending log output to stderr","hint":"Future log output will go to log destination \"jsonlog\".","backend_type":"postmaster","query_id":0}
{"timestamp":"2025-06-17 12:23:08.437 CST","pid":1560406,"session_id":"61e5519c.17cf56","line_num":2,"session_start":"2025-06-17 12:23:08 CST","txid":0,"error_severity":"LOG","message":"starting PostgreSQL 15devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0, 64-bit","backend_type":"postmaster","query_id":0}
{"timestamp":"2025-06-17 12:23:08.437 CST","pid":1560406,"session_id":"61e5519c.17cf56","line_num":3,"session_start":"2025-06-17 12:23:08 CST","txid":0,"error_severity":"LOG","message":"listening on IPv4 address \"0.0.0.0\", port 5430","backend_type":"postmaster","query_id":0}
{"timestamp":"2025-06-17 12:23:08.437 CST","pid":1560406,"session_id":"61e5519c.17cf56","line_num":4,"session_start":"2025-06-17 12:23:08 CST","txid":0,"error_severity":"LOG","message":"listening on IPv6 address \"::\", port 5430","backend_type":"postmaster","query_id":0}
{"timestamp":"2025-06-17 12:23:08.444 CST","pid":1560406,"session_id":"61e5519c.17cf56","line_num":5,"session_start":"2025-06-17 12:23:08 CST","txid":0,"error_severity":"LOG","message":"listening on Unix socket \"/tmp/.s.PGSQL.5430\"","backend_type":"postmaster","query_id":0}
But we can use a tool like jq:
$ tail -n2 log/postgresql-2025-06-17_122308.json | jq .
{
  "timestamp": "2025-06-17 12:28:08.549 CST",
  "pid": 1560408,
  "session_id": "61e5519c.17cf58",
  "line_num": 1,
  "session_start": "2025-06-17 12:23:08 CST",
  "txid": 0,
  "error_severity": "LOG",
  "message": "checkpoint starting: time",
  "backend_type": "checkpointer",
  "query_id": 0
}
{
  "timestamp": "2025-06-17 12:28:08.574 CST",
  "pid": 1560408,
  "session_id": "61e5519c.17cf58",
  "line_num": 2,
  "session_start": "2025-06-17 12:23:08 CST",
  "txid": 0,
  "error_severity": "LOG",
  "message": "checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.007 s, sync=0.002 s, total=0.025 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB",
  "backend_type": "checkpointer",
  "query_id": 0
}
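Each entry is one JSON object per line (newline-delimited JSON), so any JSON parser can consume the file directly; jq is a convenience, not a requirement. A minimal Python sketch, using two abbreviated entries as stand-ins for a real jsonlog file:

```python
import json

# Two abbreviated jsonlog entries standing in for a real log file;
# real entries carry more keys (session_id, txid, query_id, ...).
sample = (
    '{"timestamp": "2025-06-17 12:28:08.549 CST", "pid": 1560408,'
    ' "error_severity": "LOG", "message": "checkpoint starting: time",'
    ' "backend_type": "checkpointer"}\n'
    '{"timestamp": "2025-06-17 12:28:08.574 CST", "pid": 1560408,'
    ' "error_severity": "LOG", "message": "checkpoint complete",'
    ' "backend_type": "checkpointer"}\n'
)

# One json.loads() per line is all the parsing jsonlog needs.
entries = [json.loads(line) for line in sample.splitlines()]
for e in entries:
    print(e["timestamp"], e["error_severity"], e["message"])
```

With a real file, iterate over `open('log/postgresql-2025-06-17_122308.json')` instead of `sample.splitlines()`.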
So, what does a multi-line query look like in the log?
$ printf 'select\n relname,\n relkind\nfrom\n pg_class\nlimit 1;\n' | psql -aX
select
 relname,
 relkind
from
 pg_class
limit 1;
   relname    | relkind
--------------+---------
 pg_statistic | r
(1 row)
The relevant data in the log looks like this:
$ grep relkind log/postgresql-2025-06-17_122308.json | tail -n1
{"timestamp":"2025-06-17 12:30:27.324 CST","user":"redrock","dbname":"redrock","pid":1563548,"remote_host":"[local]","session_id":"61e55353.17db9c","line_num":3,"ps":"SELECT","session_start":"2025-06-17 12:30:27 CST","vxid":"4/0","txid":0,"error_severity":"LOG","message":"duration: 0.693 ms statement: select\n relname,\n relkind\nfrom\n pg_class\nlimit 1;","application_name":"psql","backend_type":"client backend","query_id":6991982717157019333}
$ grep relkind log/postgresql-2025-06-17_122308.json | tail -n1 | jq .
{
  "timestamp": "2025-06-17 12:30:27.324 CST",
  "user": "redrock",
  "dbname": "redrock",
  "pid": 1563548,
  "remote_host": "[local]",
  "session_id": "61e55353.17db9c",
  "line_num": 3,
  "ps": "SELECT",
  "session_start": "2025-06-17 12:30:27 CST",
  "vxid": "4/0",
  "txid": 0,
  "error_severity": "LOG",
  "message": "duration: 0.693 ms statement: select\n relname,\n relkind\nfrom\n pg_class\nlimit 1;",
  "application_name": "psql",
  "backend_type": "client backend",
  "query_id": 6991982717157020000
}
$ grep relkind log/postgresql-2025-06-17_122308.json | tail -n1 | jq -r .message
duration: 0.693 ms statement: select
 relname,
 relkind
from
 pg_class
limit 1;
Very nice. Each query ends up on a single physical line, no matter how many lines it originally spanned, and standard tooling (jq) handles it perfectly.
We can also use jq to search through the logs:
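This round-trip is not jq-specific: the line breaks survive as `\n` escapes inside the `message` string, so any JSON parser restores them. A sketch in Python, using a shortened copy of the entry above:

```python
import json

# Shortened copy of the log entry above; a real entry has more keys.
line = ('{"error_severity": "LOG", "message": "duration: 0.693 ms '
        'statement: select\\n relname,\\n relkind\\nfrom\\n pg_class\\nlimit 1;"}')

entry = json.loads(line)

# The whole entry is one physical line, but the statement inside the
# message value comes back with its original line breaks.
print(entry["message"])
```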
$ jq 'select(.error_severity != "LOG" )' log/postgresql-2025-06-17_122308.json
{
  "timestamp": "2025-06-17 12:41:48.112 CST",
  "user": "depesz",
  "dbname": "depesz",
  "pid": 1561647,
  "remote_host": "[local]",
  "session_id": "61e55238.17d42f",
  "line_num": 10,
  "ps": "SELECT",
  "session_start": "2025-06-17 12:25:44 CST",
  "vxid": "3/29",
  "txid": 0,
  "error_severity": "ERROR",
  "state_code": "22012",
  "message": "division by zero",
  "statement": "select 1/0;",
  "application_name": "psql",
  "backend_type": "client backend",
  "query_id": -6975083880902025000
}
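The same filter is just as easy from any language with a JSON parser. A hedged Python sketch of the `select(.error_severity != "LOG")` query above, again with inline sample lines standing in for the log file:

```python
import json
from collections import Counter

# Inline stand-ins for jsonlog lines; with a real file, iterate over
# open('log/postgresql-2025-06-17_122308.json') instead.
lines = [
    '{"error_severity": "LOG", "message": "checkpoint starting: time"}',
    '{"error_severity": "ERROR", "state_code": "22012",'
    ' "message": "division by zero", "statement": "select 1/0;"}',
    '{"error_severity": "LOG", "message": "checkpoint complete"}',
]

entries = [json.loads(line) for line in lines]

# Equivalent of: jq 'select(.error_severity != "LOG")'
problems = [e for e in entries if e["error_severity"] != "LOG"]
for e in problems:
    print(e["error_severity"], e.get("state_code", "-"), e["message"])

# A per-severity tally, handy for a quick health overview, is one more line.
print(Counter(e["error_severity"] for e in entries))
```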
A very nice feature. Thanks to everyone in the community who worked on it.
References
Commit: https://git.postgresql.org/pg/commitdiff/dc686681e0799b12c40f44f85fc5bfd7fed4e57f