SQL LATERAL JOIN – A Beginner’s Guide

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Hypersistence Optimizer is that tool!

Introduction

In this article, we are going to see how the SQL LATERAL JOIN works, and how we can use it to cross-reference rows from a subquery with rows in the outer table and build compound result sets.

A LATERAL join can be used either explicitly, as we will see in this article, or implicitly as it’s the case for the MySQL JSON_TABLE function.

Database table

Let’s assume we have the following blog database table storing the blogs hosted by our platform:

SQL LATERAL JOIN blog table

And, we have two blogs currently hosted:

| id | created_on | title                | url                      |
|----|------------|----------------------|--------------------------|
| 1  | 2013-09-30 | Vlad Mihalcea's Blog | https://vladmihalcea.com |
| 2  | 2017-01-22 | Hypersistence        | https://hypersistence.io |

Getting the report without using the SQL LATERAL JOIN

We need to build a report that extracts the following data from the blog table:

  • the blog id
  • the blog age, in years
  • the date for the next blog anniversary
  • the number of days remaining until the next anniversary.

Calculating the blog age using date interval functions

The blog age needs to be calculated by subtracting the blog creation date from the current date.

The date of the next blog anniversary can be calculated by incrementing the age in years and adding it to the blog creation date.

The number of days until the next anniversary can be calculated by extracting the number of days from the interval given by the next blog anniversary and the current date.

Depending on the relational database you are using, you can do that in the following ways.

For PostgreSQL, you can use the following query:

SELECT
  b.id as blog_id,
  extract(
    YEAR FROM age(now(), b.created_on)
  ) AS age_in_years,
  date(
    created_on + (
      extract(YEAR FROM age(now(), b.created_on)) + 1
    ) * interval '1 year'
  ) AS next_anniversary,
  date(
    created_on + (
      extract(YEAR FROM age(now(), b.created_on)) + 1
    ) * interval '1 year'
  ) - date(now()) AS days_to_next_anniversary
FROM blog b
ORDER BY blog_id

And, you are going to get the expected result:

| blog_id | age_in_years | next_anniversary | days_to_next_anniversary |
|---------|--------------|------------------|--------------------------|
| 1       | 7            | 2021-09-30       | 295                      |
| 2       | 3            | 2021-01-22       | 44                       |

If you’re using MySQL, then you have to execute the following SQL query:

SELECT
  b.id as blog_id,
  TIMESTAMPDIFF(
    YEAR, 
    b.created_on, CURRENT_TIMESTAMP()
  ) AS age_in_years,
  DATE_ADD(
    created_on,
    INTERVAL(
      TIMESTAMPDIFF(
        YEAR, 
        b.created_on, CURRENT_TIMESTAMP()
      ) + 1
    ) YEAR
  ) AS next_anniversary,
  TIMESTAMPDIFF(
    DAY,
    CURRENT_TIMESTAMP(),
    DATE_ADD(
      created_on,
      INTERVAL(
        TIMESTAMPDIFF(
          YEAR, 
          b.created_on, 
          CURRENT_TIMESTAMP()
        ) + 1
      ) YEAR
    )
  ) AS days_to_next_anniversary
FROM blog b
ORDER BY blog_id

As you can see, the age_in_years has to be defined three times because you need it when calculating the next_anniversary and days_to_next_anniversary values.

And, that’s exactly where LATERAL JOIN can help us.

Getting the report using the SQL LATERAL JOIN

The following relational database systems support the LATERAL JOIN syntax:

  • Oracle since 12c
  • PostgreSQL since 9.3
  • MySQL since 8.0.14

SQL Server can emulate the LATERAL JOIN using CROSS APPLY and OUTER APPLY.

LATERAL JOIN allows us to reuse the age_in_years value and just pass it further when calculating the next_anniversary and days_to_next_anniversary values.

For instance, the previous PostgreSQL query can be rewritten like this:

SELECT
  b.id as blog_id,
  age_in_years,
  date(
    created_on + (age_in_years + 1) * interval '1 year'
  ) AS next_anniversary,
  date(
    created_on + (age_in_years + 1) * interval '1 year'
  ) - date(now()) AS days_to_next_anniversary
FROM blog b
CROSS JOIN LATERAL (
  SELECT
    cast(
      extract(YEAR FROM age(now(), b.created_on)) AS int
    ) AS age_in_years
) AS t
ORDER BY blog_id

And, the age_in_years value can be calculated one and reused for the next_anniversary and days_to_next_anniversary computations:

| blog_id | age_in_years | next_anniversary | days_to_next_anniversary |
|---------|--------------|------------------|--------------------------|
| 1       | 7            | 2021-09-30       | 295                      |
| 2       | 3            | 2021-01-22       | 44                       |

And, for MySQL, the previous query can be rewritten to use the LATERAL JOIN, as follows:

SELECT
  b.id as blog_id,
  age_in_years,
    DATE_ADD(
    created_on,
    INTERVAL (age_in_years + 1) YEAR
  ) AS next_anniversary,
  TIMESTAMPDIFF(
    DAY,
    CURRENT_TIMESTAMP(),
    DATE_ADD(
      created_on,
      INTERVAL (age_in_years + 1) YEAR
    )
  ) + 1 AS days_to_next_anniversary
FROM blog b
CROSS JOIN LATERAL (
  SELECT
    TIMESTAMPDIFF(
      YEAR,
      b.created_on,
      CURRENT_TIMESTAMP()
    ) AS age_in_years
  ) AS t
ORDER BY blog_id

Much better, right?

The age_in_years is calculated for every record of the blog table. So, it works like a correlated subquery, but the subquery records are joined with the primary table and, for this reason, we can reference the columns produced by the subquery.

Online Workshops

If you enjoyed this article, I bet you are going to love my upcoming Online Workshops.

Conclusion

LATERAL JOIN is a very useful feature. It allows you to encapsulate a given computation in a subquery and reuse it in the outer query.

Unlike joining directly with a Derived Table, LATERAL JOIN is evaluated for every record in the primary table, and not just once.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.