By John Doe July 4, 2023
When using PostgreSQL functions, you need to be careful not to end up with dangling functions, since the dependency between functions is not well maintained in the PostgreSQL system catalog. The following example shows how you can end up with a dangling function:
CREATE OR REPLACE FUNCTION dep_func (INT) RETURNS INT AS $$ BEGIN RETURN $1 + 2; END; $$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION test_func(INT) RETURNS INT AS $$ BEGIN RETURN dep_func($1); END; $$LANGUAGE plpgsql; DROP FUNCTION dep_func(int);
test_func function is dangling. When this function is invoked, an error will be raised, as follows:
SELECT test_func (5); ERROR: function dep_func(integer) does not exist LINE 1: SELECT dep_func($1) HINT: No function matches ...
How are functions stored in PostgreSQL?
Since functions are stored as strings in PostgreSQL, and there is no limit to the languages a function could be written in, PostgreSQL cannot analyze the code to add dependencies.
In the general case, it is impossible to find function’s dependent objects; but a limited (restricted-domain) solution is perfectly doable, and might prove adequate for your needs.
- Fails (false negative) if name of dependent object is specified as a quoted identifier.
- Fails (false negative) if name of dependent object is passed as argument.
- Fails (false negative) if name of dependent object is read from a relation at runtime.
- Fails (false negative) if name of dependent object is assembled from tokens.
- Fails (false positive) if name of dependent object is present just as literal.
- Fails (false positive) if name of dependent object is present in a multi-line comment.
- Does not account for function overloading.
- Does not account for functions invoked via triggers.
- Does not account for functions invoked per query-rewrite rules.
- Does not account for effects of query rewriting rules.
- Knows nothing about functions written in non-interpreted procedural languages like C.
Finding direct function dependencies
WITH routines AS ( SELECT DISTINCT(lower(proname)) AS name, --#0 string_to_array( --#4 regexp_replace( --#3 regexp_replace( --#2 lower(pg_get_functiondef(oid)) --#1 , '--.*?\n', '', 'g') , '\W+', ' ', 'g') , ' ') AS objnames FROM pg_proc WHERE proowner = to_regrole(current_role) AND prokind != 'a' ), routine_names AS ( SELECT DISTINCT(lower(proname)) AS name FROM pg_proc WHERE proowner = to_regrole(current_role) ORDER BY 1 ), relation_names AS ( SELECT DISTINCT(lower(relname)) AS name FROM pg_class WHERE relowner = to_regrole(current_role) ORDER BY 1 ) SELECT name AS "Your routine", array_remove( --#8 array( --#7 SELECT unnest(objnames) --#5 INTERSECT --#6 SELECT name FROM routine_names ), name) AS "Dependent routines", array( SELECT unnest(objnames) INTERSECT SELECT name FROM relation_names ) AS "Dependent relations" FROM routines;
How It Works
- #0 Collect names of all the routines which could be callers. We cannot handle overloaded functions correctly anyway, so just
DISTINCTto save trouble later on; SQL is case-insensitive apart from quoted identifiers which we are not bothering with anyway, so we just
lower()to simplify comparison later.
pg_get_functiondef()fetches complete text of the CREATE FUNCTION or CREATE PROCEDURE command. Again,
- #2 Strip single-line comments. Note the lazy (non-greedy)
*?quantifier: the usual
*quantifier, if used here, would remove the first single-line comment plus all subsequent lines!
- #3 Replace all characters other than letters and digits and
_, with a space. Note the
+quantifier: it ensures that 2+ contiguous removed characters are replaced by just 1 space.
- #4 Split by spaces into an array; this array contains bits of SQL syntax, literals, numbers, and identifiers including routine names.
- #5 Unnest the array into a rowset.
INTERSECTwith routine names; result will consist of routine names only.
- #7 Convert rowset into an array.
- #8 Since input was complete text of a
CREATE FUNCTION f ...command, extracted routine names will obviously contain
fitself; so we remove it with