PostgreSQL Tutorial: Convert Strings to Arrays

October 15, 2024

Summary: In this tutorial, you will learn how to use the PostgreSQL string_to_array() function to convert strings to arrays.

Table of Contents

Example Data

Here’s a table to load the data into.

CREATE TABLE weather_data (
    station text,
    temps text
);

For this example, it will be easier to just INSERT the data directly. The PostgreSQL COPY feature will work in a similar way.

INSERT INTO weather_data VALUES
('Station North','-1,-4,-14,-15,-16,-15,-12,-9,-3,0,1,2'),
('Station West','2,4,5,6,9,10,15,16,13,12,10,9,5,3,1'),
('Station East','5,3,2,4,5,6,9,10,15,16,13,12,10,9,5,4,2,1'),
('Station South','12,18,22,25,29,30,33,31,30,29,28,25,24,23,14');

Strings to Arrays

With the data in the table, the next question is: what to do with that silly comma-separated list of temperatures? First, make it more usable by converting it to an array with the string_to_array(string,separator) function.

Split to array

SELECT
	station,
	string_to_array(temps,',') AS array
FROM weather_data;

Query Result

    station    |                     array
---------------+------------------------------------------------
 Station North | {-1,-4,-14,-15,-16,-15,-12,-9,-3,0,1,2}
 Station West  | {2,4,5,6,9,10,15,16,13,12,10,9,5,3,1}
 Station East  | {5,3,2,4,5,6,9,10,15,16,13,12,10,9,5,4,2,1}
 Station South | {12,18,22,25,29,30,33,31,30,29,28,25,24,23,14}

Having an array instead of a string doesn’t look much more useful, but we can show that in fact we now have structured data by doing “array-only” things to the data, like returning the array length.

Split to array, analyze array

SELECT
	station,
	cardinality(string_to_array(temps,',')) AS array_size
FROM weather_data;

Query Result

    station    | array_size
---------------+------------
 Station North |         12
 Station West  |         15
 Station East  |         18
 Station South |         15

Expanding and Analyzing the Array

However, by far the most fun you can have with an array like this is to unnest(array) it! The unnest(array) function is a “set returning function” which means it can return more than one row. How does that work? All the other parts of the incoming row are duplicated, so that each row has a full collection of data, like this.

Split to array, unnest

SELECT
	station,
	unnest(string_to_array(temps,',')) AS temps
FROM weather_data;

Query Result

    station    | temps
---------------+-------
 Station North | -1
 Station North | -4
 Station North | -14
 Station North | -15
 Station North | -16
 Station North | -15
 Station North | -12
 Station North | -9
 Station North | -3
 Station North | 0
 Station North | 1
 Station North | 2
 Station West  | 2
 Station West  | 4
 Station West  | 5
 Station West  | 6
 Station West  | 9
 Station West  | 10
 Station West  | 15
 Station West  | 16
 Station West  | 13
 Station West  | 12
 Station West  | 10
 Station West  | 9
 Station West  | 5
 Station West  | 3
 Station West  | 1
 Station East  | 5
 Station East  | 3
 Station East  | 2
 Station East  | 4
 Station East  | 5
 Station East  | 6
 Station East  | 9
 Station East  | 10
 Station East  | 15
 Station East  | 16
 Station East  | 13
 Station East  | 12
 Station East  | 10
 Station East  | 9
 Station East  | 5
 Station East  | 4
 Station East  | 2
 Station East  | 1
 Station South | 12
 Station South | 18
 Station South | 22
 Station South | 25
 Station South | 29
 Station South | 30
 Station South | 33
 Station South | 31
 Station South | 30
 Station South | 29
 Station South | 28
 Station South | 25
 Station South | 24
 Station South | 23
 Station South | 14

The data now looks a lot like something we might get by joining tables together in a standard data model, and we can actually do standard analytical things now, like figure out the temperature range at each station.

Split to array, unnest and analyze temp

WITH unnested_data AS (
	SELECT
		station,
		unnest(string_to_array(temps,',')) AS temps
	FROM weather_data
)
SELECT
	station,
	max(temps) AS max_temp,
	min(temps) AS min_temp
FROM unnested_data
GROUP BY station;

Query Result

    station    | max_temp | min_temp
---------------+----------+----------
 Station North | 2        | -1
 Station West  | 9        | 1
 Station East  | 9        | 1
 Station South | 33       | 12
(4 rows)

Finally, for completeness, if you want to keep your associated tables in a string, but just don’t like commas, here’s how to split and re-join your data, using a new delimiter.

Split to array, join to string

SELECT
	station,
	array_to_string(string_to_array(temps,','),'|') AS temps
FROM weather_data;

Query Result

    station    |                    temps
---------------+----------------------------------------------
 Station North | -1|-4|-14|-15|-16|-15|-12|-9|-3|0|1|2
 Station West  | 2|4|5|6|9|10|15|16|13|12|10|9|5|3|1
 Station East  | 5|3|2|4|5|6|9|10|15|16|13|12|10|9|5|4|2|1
 Station South | 12|18|22|25|29|30|33|31|30|29|28|25|24|23|14
(4 rows)

See more

PostgreSQL Tutorial: String Functions

PostgreSQL Documentation: String Functions and Operators