April 29, 2025
Summary: The PostgreSQL regexp_match()
function matches a regular expression against a string and returns matched substrings.
Table of Contents
Introduction
regexp_matches()
and regexp_match()
are two similar string functions that support regular expression matching directly in the PostgreSQL database.
regexp_match()
only returns results from the first match so it does not need to return a set, making it easier to use for simple cases. The differences between these two functions are nuanced, however, and they are easiest to understand with some examples.
Test data
You can create a short table with some test data in a local PostgreSQL database like so:
redrock=# CREATE TABLE patterns (value text);
CREATE TABLE
redrock=# INSERT INTO patterns VALUES ('foo'), ('bar'), ('foobar'), ('foo1barfoo2bar');
INSERT 0 4
redrock=# SELECT * FROM patterns;
value
----------------
foo
bar
foobar
foo1barfoo2bar
(4 rows)
regexp_matches()
Since regexp_matches()
was added first to PostgreSQL, let’s see how it works and perhaps why its companion regexp_match()
was added later on.
redrock=# SELECT
redrock-# value
redrock-# , regexp_matches(value, 'foo\d?', 'g')
redrock-# , regexp_matches(value, 'foo\d?', 'g') is null AS is_null
redrock-# FROM patterns;
value | regexp_matches | is_null
----------------+----------------+---------
foo | {foo} | f
foobar | {foo} | f
foo1barfoo2bar | {foo1} | f
foo1barfoo2bar | {foo2} | f
(4 rows)
Observations:
- The example uses the
g
flag, since according to the documentation, “if you only want the first match, it’s easier and more efficient to use regexp_match().” - One row each is returned for the values “foo” and “foobar”, whereas two rows are returned, one for each match, in the value “foo1barfoo2bar”.
- No row is returned for the value “bar” since it doesn’t match the pattern.
regexp_match()
regexp_match()
may be helpful for simpler use cases. Here is the same query (without the inapplicable g
flag):
redrock=# SELECT
redrock-# value
redrock-# , regexp_match(value, 'foo\d?')
redrock-# , regexp_match(value, 'foo\d?') is null AS is_null
redrock-# FROM patterns;
value | regexp_match | is_null
----------------+--------------+---------
foo | {foo} | f
bar | | t
foobar | {foo} | f
foo1barfoo2bar | {foo1} | f
(4 rows)
Observations:
regexp_match()
returns aNULL
if the pattern doesn’t match, instead of excluding the row from the result set.- Only the first matching pattern in the value “foo1barfoo2bar” is returned; if you wanted to match both values, you would need to specify a single pattern that matched both values (for example,
(foo\d?).+(foo\d?)
in this example). - The number of rows in the result set matches the source table.
In general, I find regexp_match()
more intuitive and easier to reason about, so it’s my first choice for regular expression matching in PostgreSQL.
If you’re using a version of PostgreSQL older than 10, the documentation notes that regexp_matches()
can be placed in a sub-select to include non-matching rows in the result, for example:
redrock=# SELECT
redrock-# value
redrock-# , (SELECT regexp_matches(value, '(foo\d?)')) AS regexp_matches
redrock-# , (SELECT regexp_matches(value, '(foo\d?)')) IS NULL AS is_null
redrock-# FROM patterns;
value | regexp_matches | is_null
----------------+----------------+---------
foo | {foo} | f
bar | | t
foobar | {foo} | f
foo1barfoo2bar | {foo1} | f
(4 rows)
Note, the g
flag should not be used in this form as it might return more than one row. If it does, you’ll receive the error, “more than one row returned by a subquery used as an expression.”
regexp_substr()
Another function, regexp_substr()
, was added in PostgreSQL 15. In the simple case where (as in these examples) there is only one pattern in the regular expression, if you are using PostgreSQL 15 or above, this function returns the matched pattern or NULL
, not an array or NULL
as with regexp_match()
.
redrock=# SELECT
redrock-# value
redrock-# , regexp_substr(value, 'foo\d?')
redrock-# , regexp_substr(value, 'foo\d?') is null AS is_null
redrock-# FROM patterns;
value | regexp_substr | is_null
----------------+---------------+---------
foo | foo | f
bar | | t
foobar | foo | f
foo1barfoo2bar | foo1 | f
(4 rows)
Observations:
- This function returns the matched pattern or
NULL
, likeregexp_match()
. - Unlike
regexp_match()
, the result is not an array (it does not appear within{
and}
above).
In cases where there is only one pattern in the regular expression, if you are using PostgreSQL 15 or above, regexp_substr()
is a good choice.
Conclusion
regexp_matches()
, regexp_match()
, and regexp_substr()
are unique and powerful functions for pattern matching in PostgreSQL. As a general rule, it may be helpful to start with the newest function available to you, and work backwards to the original regexp_matches()
in case the newer functions do not meet your needs. Nonetheless, it’s helpful to understand the differences between them when writing queries, and I hope this post has helped explain some of the nuances.
See more
PostgreSQL Tutorial: String Functions
PostgreSQL Documentation: String Functions and Operators