PostgreSQL Tutorial: The regexp_match Function

April 29, 2025

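Summary: PostgreSQL's regexp_match() function matches a regular expression against a string and returns the first match as a text array, or NULL if there is no match.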

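Introduction

regexp_matches() and regexp_match() are two similar string functions that support regular expression matching directly in a PostgreSQL database.

regexp_match() returns only the first match, so it has no need to return a set, which makes it more convenient for simple cases. The differences between the two functions are subtle, though, and are easier to understand through a few examples.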

Test data

You can create a small table with some test data in a local PostgreSQL database, as follows:

redrock=# CREATE TABLE patterns (value text);
CREATE TABLE
redrock=# INSERT INTO patterns VALUES ('foo'), ('bar'), ('foobar'), ('foo1barfoo2bar');
INSERT 0 4
redrock=# SELECT * FROM patterns;
     value
----------------
 foo
 bar
 foobar
 foo1barfoo2bar
(4 rows)

`regexp_matches()`

Because regexp_matches() was the first of the two to be supported in PostgreSQL, let's look at how it works and why regexp_match() was added later, in PostgreSQL 10.

redrock=# SELECT
redrock-#     value
redrock-#     , regexp_matches(value, 'foo\d?', 'g')
redrock-#     , regexp_matches(value, 'foo\d?', 'g') is null AS is_null
redrock-# FROM patterns;
     value      | regexp_matches | is_null
----------------+----------------+---------
 foo            | {foo}          | f
 foobar         | {foo}          | f
 foo1barfoo2bar | {foo1}         | f
 foo1barfoo2bar | {foo2}         | f
(4 rows)

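Observations:

The example uses the g flag because, as the documentation notes, "if you only want the first match, it's easier and more efficient to use regexp_match()."
One row is returned for each of the values "foo" and "foobar"; two rows are returned for "foo1barfoo2bar", one per match.
Because the value "bar" does not match the pattern, no row is returned for it.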
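
If you need every match but also want to keep the rows that do not match, one possible approach is to call regexp_matches() in the FROM clause with LEFT JOIN LATERAL. This is a minimal sketch that is not part of the original examples; the aliases p, m, and groups are illustrative names.

```sql
-- Sketch: keep non-matching rows while still expanding every match.
-- Assumes the patterns table created above; p, m and groups are
-- illustrative aliases, not names from the original queries.
SELECT
    p.value
    , m.groups
    , m.groups IS NULL AS is_null
FROM patterns AS p
LEFT JOIN LATERAL regexp_matches(p.value, 'foo\d?', 'g') AS m(groups)
    ON true;
```

With the test data above, "bar" should appear once with a NULL in the groups column, while "foo1barfoo2bar" should still appear once per match.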

`regexp_match()`

For simpler cases, regexp_match() can help. Here is the same query (without the g flag, which does not apply to regexp_match()):

redrock=# SELECT
redrock-#     value
redrock-#     , regexp_match(value, 'foo\d?')
redrock-#     , regexp_match(value, 'foo\d?') is null AS is_null
redrock-# FROM patterns;
     value      | regexp_match | is_null
----------------+--------------+---------
 foo            | {foo}        | f
 bar            |              | t
 foobar         | {foo}        | f
 foo1barfoo2bar | {foo1}       | f
(4 rows)

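Observations:

If the pattern does not match, regexp_match() returns NULL instead of excluding the row from the result set.
Only the first match in the value "foo1barfoo2bar" is returned; if you want both matches, you need a single pattern that matches both (for example, (foo\d?).+(foo\d?) in this case).
The number of rows in the result set matches the source table.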
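
The two-group pattern mentioned above can be tried directly. The sketch below also shows array subscripting as one way to pull a plain text value out of the returned array; the column aliases both_matches and first_match_text are illustrative names.

```sql
-- Sketch: one pattern with two capture groups, plus array subscripting.
-- Assumes the patterns table created above; the column aliases are
-- illustrative only.
SELECT
    value
    -- Each parenthesized group becomes one array element;
    -- NULL when the whole pattern does not match.
    , regexp_match(value, '(foo\d?).+(foo\d?)') AS both_matches
    -- Subscripting the result gives text instead of text[].
    , (regexp_match(value, 'foo\d?'))[1] AS first_match_text
FROM patterns;
```

Only "foo1barfoo2bar" contains two occurrences of the pattern, so both_matches should be {foo1,foo2} for that row and NULL for the others.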

Overall, regexp_match() behaves more intuitively and is easier to reason about, so it should be the preferred choice for regular expression matching in PostgreSQL.

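If you are using a PostgreSQL version earlier than 10, the documentation notes that you can put regexp_matches() in a sub-select to include non-matching rows in the result, for example: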

redrock=# SELECT
redrock-#     value
redrock-#     , (SELECT regexp_matches(value, '(foo\d?)')) AS regexp_matches
redrock-#     , (SELECT regexp_matches(value, '(foo\d?)')) IS NULL AS is_null
redrock-# FROM patterns;
     value      | regexp_matches | is_null
----------------+----------------+---------
 foo            | {foo}          | f
 bar            |                | t
 foobar         | {foo}          | f
 foo1barfoo2bar | {foo1}         | f
(4 rows)

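Note that the g flag should not be used in this form, because it can return multiple rows. If that happens, you will get the error "more than one row returned by a subquery used as an expression".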
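
If you do want all matches in this sub-select style, one option is to wrap the sub-select in an ARRAY(...) constructor so that it yields a single array value per row. This is a sketch under that assumption rather than something the documentation prescribes for this case; the aliases t, m, and all_matches are illustrative.

```sql
-- Sketch: collect every match into one array per source row.
-- Assumes the patterns table created above; t, m and all_matches are
-- illustrative aliases.
SELECT
    value
    , ARRAY(
        -- With no capture groups, each returned array holds the whole
        -- match as its only element, so take element 1.
        SELECT t.m[1]
        FROM regexp_matches(value, 'foo\d?', 'g') AS t(m)
      ) AS all_matches
FROM patterns;
```

Unlike the scalar sub-select, a row with no match gets an empty array here rather than NULL, so check for emptiness instead of using IS NULL.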

`regexp_substr()`

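Another function, regexp_substr(), was added in PostgreSQL 15. In simple cases like the examples here, where the regular expression contains a single pattern, and provided you are on PostgreSQL 15 or later, this function returns the matched substring or NULL as plain text, whereas regexp_match() returns an array or NULL.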

redrock=# SELECT
redrock-#     value
redrock-#     , regexp_substr(value, 'foo\d?')
redrock-#     , regexp_substr(value, 'foo\d?') is null AS is_null
redrock-# FROM patterns;
     value      | regexp_substr | is_null
----------------+---------------+---------
 foo            | foo           | f
 bar            |               | t
 foobar         | foo           | f
 foo1barfoo2bar | foo1          | f
(4 rows)

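Observations:

Like regexp_match(), this function returns the matched substring, or NULL if there is no match.
Unlike regexp_match(), the result is not an array (in the output above, it is not wrapped in { and }).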

When the regular expression contains only a single pattern and you are on PostgreSQL 15 or later, regexp_substr() is a good choice.
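
regexp_substr() also accepts optional start and N arguments, which can select a later match without returning a set. The sketch below asks for the second match, searching from the first character; the column alias second_match is an illustrative name.

```sql
-- Sketch: pick the second occurrence of the pattern (PostgreSQL 15+).
-- Arguments after the pattern: start position 1, occurrence N = 2.
-- Assumes the patterns table created above.
SELECT
    value
    , regexp_substr(value, 'foo\d?', 1, 2) AS second_match
FROM patterns;
```

With the test data above, only "foo1barfoo2bar" has a second occurrence, so that row should return foo2 and the others NULL.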

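Conclusion

regexp_matches(), regexp_match(), and regexp_substr() are distinct and powerful pattern matching functions in PostgreSQL. In general, it is best to start with the newest function and go back to the older regexp_matches() only if the newer ones cannot meet your needs. Even so, knowing the differences between them helps when writing queries, and hopefully this article has clarified some of the subtleties.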

Learn more

PostgreSQL Tutorial: String Functions

PostgreSQL Documentation: String Functions and Operators
