PostgreSQL Tutorial: LATERAL JOIN

December 9, 2023

Summary: In this article, we are going to see how the SQL LATERAL JOIN works, and how we can use it to cross-reference rows from a subquery with rows in the outer table and build compound result sets.

Table of Contents

Database table

Let’s assume we have the following blog database table storing the blogs hosted by our platform:

SQL LATERAL JOIN blog table

And, we have two blogs currently hosted:

| id | created_on | title               | url                                |
|----|------------|---------------------|------------------------------------|
| 1  | 2013-09-30 | PostgreSQL Blog     | https://www.rockdata.net/blog/     |
| 2  | 2017-01-22 | PostgreSQL Tutorial | https://www.rockdata.net/tutorial/ |

Getting the report without using the SQL LATERAL JOIN

We need to build a report that extracts the following data from the blog table:

  • the blog id
  • the blog age, in years
  • the date for the next blog anniversary
  • the number of days remaining until the next anniversary.

Calculating the blog age using date interval functions

The blog age needs to be calculated by subtracting the blog creation date from the current date.

The date of the next blog anniversary can be calculated by incrementing the age in years and adding it to the blog creation date.

The number of days until the next anniversary can be calculated by extracting the number of days from the interval given by the next blog anniversary and the current date.

To get the result, you can use the following query:

SELECT
  b.id as blog_id,
  extract(
    YEAR FROM age(now(), b.created_on)
  ) AS age_in_years,
  date(
    created_on + (
      extract(YEAR FROM age(now(), b.created_on)) + 1
    ) * interval '1 year'
  ) AS next_anniversary,
  date(
    created_on + (
      extract(YEAR FROM age(now(), b.created_on)) + 1
    ) * interval '1 year'
  ) - date(now()) AS days_to_next_anniversary
FROM blog b
ORDER BY blog_id

And, you are going to get the expected result:

| blog_id | age_in_years | next_anniversary | days_to_next_anniversary |
|---------|--------------|------------------|--------------------------|
| 1       | 7            | 2021-09-30       | 295                      |
| 2       | 3            | 2021-01-22       | 44                       |

As you can see, the age_in_years has to be defined three times because you need it when calculating the next_anniversary and days_to_next_anniversary values.

And, that’s exactly where LATERAL JOIN can help us.

Getting the report using the SQL LATERAL JOIN

LATERAL JOIN allows us to reuse the age_in_years value and just pass it further when calculating the next_anniversary and days_to_next_anniversary values.

For instance, the previous SQL query can be rewritten like this:

SELECT
  b.id as blog_id,
  age_in_years,
  date(
    created_on + (age_in_years + 1) * interval '1 year'
  ) AS next_anniversary,
  date(
    created_on + (age_in_years + 1) * interval '1 year'
  ) - date(now()) AS days_to_next_anniversary
FROM blog b
CROSS JOIN LATERAL (
  SELECT
    cast(
      extract(YEAR FROM age(now(), b.created_on)) AS int
    ) AS age_in_years
) AS t
ORDER BY blog_id

And, the age_in_years value can be calculated once and reused for the next_anniversary and days_to_next_anniversary computations:

| blog_id | age_in_years | next_anniversary | days_to_next_anniversary |
|---------|--------------|------------------|--------------------------|
| 1       | 7            | 2021-09-30       | 295                      |
| 2       | 3            | 2021-01-22       | 44                       |

Much better, right?

The age_in_years is calculated for every record of the blog table. So, it works like a correlated subquery, but the subquery records are joined with the primary table and, for this reason, we can reference the columns produced by the subquery.