由 John Doe 四月 9, 2025
你有没有想要在 PostgreSQL 中使用列式存储表,来加速数据分析查询?
初始设置
pg_mooncake
是 PostgreSQL 中的一种列式存储扩展。这个扩展的目标是为分析工作优化 PostgreSQL,它可以将表以 Iceberg 或 Delta Lake 格式存储在本地磁盘或像 S3 这样的云存储上。
从该扩展的项目仓库下载源代码,编译并安装完成后,将该扩展添加到数据库中:
postgres=# create extension pg_mooncake;
CREATE EXTENSION
postgres=# \dx
List of installed extensions
Name | Version | Schema | Description
--------------------+---------+------------+----------------------------------------------------------------
pg_mooncake | 0.1.2 | public | Columnstore Table in Postgres
pg_stat_statements | 1.11 | public | track planning and execution statistics of all SQL statements executed
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(3 rows)
准备测试数据
为了将这种表布局与标准的 PostgreSQL 布局进行比较,我们创建两个表,其中一个使用列式存储格式:
postgres=# create table t1 ( a int, b text );
CREATE TABLE
postgres=# create table t2 ( a int, b text ) using columnstore;
CREATE TABLE
postgres=# \d+ t1
Table "public.t1"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
---------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
a | integer | | | | plain | | |
b | text | | | | extended | | |
Access method: heap
postgres=# \d+ t2
Table "public.t2"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
---------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
a | integer | | | | plain | | |
b | text | | | | extended | | |
Access method: columnstore
向这两个表中各插入一百万行数据:
insert into t1 select i, md5(i::text) from generate_series(1,1000000) i;
insert into t2 select i, md5(i::text) from generate_series(1,1000000) i;
检查物理布局
查看磁盘情况,我们会发现:
select pg_relation_filepath('t1');
pg_relation_filepath
----------------------
base/5/24715
(1 row)
select pg_relation_filepath('t2');
pg_relation_filepath
----------------------
base/5/24718
(1 row)
select pg_size_pretty ( pg_relation_size ( 't1' ));
pg_size_pretty
----------------
65 MB
(1 row)
select pg_size_pretty ( pg_relation_size ( 't2' ));
pg_size_pretty
----------------
0 bytes
(1 row)
列式存储的表显示大小为零字节,实际上磁盘上甚至没有对应 24718 的文件:
$ ls -la $PGDATA/base/5/24715
-rw-------. 1 postgres postgres 68272128 Feb 24 14:19 /db/pgdata/17/base/5/24715
$ ls -la $PGDATA/base/5/24718
ls: cannot access '/db/pgdata/17/base/5/24718': No such file or directory
相反,该表以 Parquet 文件的形式存储在这里:
$ ls -la $PGDATA/mooncake_local_tables/mooncake_postgres_t2_24708/
total 107224
drwx------. 3 postgres postgres 180 Feb 24 14:31 .
drwx------. 3 postgres postgres 40 Feb 24 14:15 ..
-rw-------. 1 postgres postgres 36596064 Feb 24 14:19 560c6efe-1226-4a76-985f-1301169bcc44.parquet
-rw-------. 1 postgres postgres 36596064 Feb 24 14:18 ca0550d6-bd84-4bf9-b8cf-6ce85a65346b.parquet
drwx------. 2 postgres postgres 4096 Feb 24 14:19 _delta_log
-rw-------. 1 postgres postgres 36596064 Feb 24 14:19 fba0eff4-3c57-4dbb-bd9b-469f6622ab92.parquet
这里有一篇更详细的文章,介绍了相关设计决策,实际上,该扩展在内部集成了 DuckDB。
测试性能表现
查看针对这两个表的查询计划,也能发现差异:
postgres=# explain (analyze) select * from t1 where a = 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..14542.43 rows=1 width=37) (actual time=1.591..41.014 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on t1 (cost=0.00..13542.33 rows=1 width=37) (actual time=0.022..11.067 rows=0 loops=3)
Filter: (a = 1)
Rows Removed by Filter: 333333
Planning Time: 0.861 ms
Execution Time: 41.086 ms
(8 rows)
postgres=# explain (analyze) select * from t2 where a = 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Custom Scan (MooncakeDuckDBScan) (cost=0.00..0.00 rows=0 width=0) (actual time=7.797..7.816 rows=1 loops=1)
DuckDB Execution Plan:
┌─────────────────────────────────────┐
│┌───────────────────────────────────┐│
││ Query Profiling Information ││
│└───────────────────────────────────┘│
└─────────────────────────────────────┘
EXPLAIN ANALYZE SELECT a, b FROM pgmooncake.public.t2 WHERE (a = 1)
┌────────────────────────────────────────────────┐
│┌──────────────────────────────────────────────┐│
││ Total Time: 0.0043s ││
│└──────────────────────────────────────────────┘│
└────────────────────────────────────────────────┘
┌───────────────────────────┐
│ QUERY │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ EXPLAIN_ANALYZE │
│ ──────────────────── │
│ 0 Rows │
│ (0.00s) │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ TABLE_SCAN │
│ ──────────────────── │
│ Function: │
│ COLUMNSTORE_SCAN │
│ │
│ Projections: │
│ a │
│ b │
│ │
│ Filters: │
│ a=1 AND a IS NOT NULL │
│ │
│ 1 Rows │
│ (0.00s) │
└───────────────────────────┘
Planning Time: 1.693 ms
Execution Time: 8.584 ms
(43 rows)
和通常的列式数据存储一样,当数据在列的基础上能够很好地压缩时,这种存储方式效果最佳,例如:
truncate t1,t2;
insert into t1 select 1, md5(i::text) from generate_series(1,1000000) i;
Time: 852.812 ms
insert into t2 select 1, md5(i::text) from generate_series(1,1000000) i;
Time: 243.532 ms
向 t2 表插入数据的速度始终比向普通表插入数据的速度快(多重复几次插入操作就能有直观感受)。在读取压缩效果好的列时也是如此(同样,多重复几次查询操作就能体会到):
select count(a) from t1;
count
---------
2000000
(1 row)
Time: 60.463 ms
select count(a) from t2;
count
---------
2000000
(1 row)
Time: 10.272 ms
如果你有相关使用场景,pg_mooncake
可能是一个不错的选择。