Imagine having a tool that can automatically detect JPA and Hibernate performance issues.
Hypersistence Optimizer is that tool!
Introduction
In this article, we are going to see how the SQL LATERAL JOIN works, and how we can use it to cross-reference rows from a subquery with rows in the outer table and build compound result sets.
A LATERAL join can be used either explicitly, as we will see in this article, or implicitly as it’s the case for the MySQL JSON_TABLE function.
Database table
Let’s assume we have the following blog database table storing the blogs hosted by our platform:
And, we have two blogs currently hosted:
| id | created_on | title | url |
|----|------------|----------------------|--------------------------|
| 1 | 2013-09-30 | Vlad Mihalcea's Blog | https://vladmihalcea.com |
| 2 | 2017-01-22 | Hypersistence | https://hypersistence.io |
Getting the report without using the SQL LATERAL JOIN
We need to build a report that extracts the following data from the blog table:
the blog id
the blog age, in years
the date for the next blog anniversary
the number of days remaining until the next anniversary.
Calculating the blog age using date interval functions
The blog age needs to be calculated by subtracting the blog creation date from the current date.
The date of the next blog anniversary can be calculated by incrementing the age in years and adding it to the blog creation date.
The number of days until the next anniversary can be calculated by extracting the number of days from the interval given by the next blog anniversary and the current date.
Depending on the relational database you are using, you can do that in the following ways.
For PostgreSQL, you can use the following query:
SELECT
b.id as blog_id,
extract(
YEAR FROM age(now(), b.created_on)
) AS age_in_years,
date(
created_on + (
extract(YEAR FROM age(now(), b.created_on)) + 1
) * interval '1 year'
) AS next_anniversary,
date(
created_on + (
extract(YEAR FROM age(now(), b.created_on)) + 1
) * interval '1 year'
) - date(now()) AS days_to_next_anniversary
FROM blog b
ORDER BY blog_id
If you’re using MySQL, then you have to execute the following SQL query:
SELECT
b.id as blog_id,
TIMESTAMPDIFF(
YEAR,
b.created_on, CURRENT_TIMESTAMP()
) AS age_in_years,
DATE_ADD(
created_on,
INTERVAL(
TIMESTAMPDIFF(
YEAR,
b.created_on, CURRENT_TIMESTAMP()
) + 1
) YEAR
) AS next_anniversary,
TIMESTAMPDIFF(
DAY,
CURRENT_TIMESTAMP(),
DATE_ADD(
created_on,
INTERVAL(
TIMESTAMPDIFF(
YEAR,
b.created_on,
CURRENT_TIMESTAMP()
) + 1
) YEAR
)
) AS days_to_next_anniversary
FROM blog b
ORDER BY blog_id
As you can see, the age_in_years has to be defined three times because you need it when calculating the next_anniversary and days_to_next_anniversary values.
And, that’s exactly where LATERAL JOIN can help us.
Getting the report using the SQL LATERAL JOIN
The following relational database systems support the LATERAL JOIN syntax:
Oracle since 12c
PostgreSQL since 9.3
MySQL since 8.0.14
SQL Server can emulate the LATERAL JOIN using CROSS APPLY and OUTER APPLY.
LATERAL JOIN allows us to reuse the age_in_years value and just pass it further when calculating the next_anniversary and days_to_next_anniversary values.
For instance, the previous PostgreSQL query can be rewritten like this:
SELECT
b.id as blog_id,
age_in_years,
date(
created_on + (age_in_years + 1) * interval '1 year'
) AS next_anniversary,
date(
created_on + (age_in_years + 1) * interval '1 year'
) - date(now()) AS days_to_next_anniversary
FROM blog b
CROSS JOIN LATERAL (
SELECT
cast(
extract(YEAR FROM age(now(), b.created_on)) AS int
) AS age_in_years
) AS t
ORDER BY blog_id
And, the age_in_years value can be calculated one and reused for the next_anniversary and days_to_next_anniversary computations:
And, for MySQL, the previous query can be rewritten to use the LATERAL JOIN, as follows:
SELECT
b.id as blog_id,
age_in_years,
DATE_ADD(
created_on,
INTERVAL (age_in_years + 1) YEAR
) AS next_anniversary,
TIMESTAMPDIFF(
DAY,
CURRENT_TIMESTAMP(),
DATE_ADD(
created_on,
INTERVAL (age_in_years + 1) YEAR
)
) + 1 AS days_to_next_anniversary
FROM blog b
CROSS JOIN LATERAL (
SELECT
TIMESTAMPDIFF(
YEAR,
b.created_on,
CURRENT_TIMESTAMP()
) AS age_in_years
) AS t
ORDER BY blog_id
Much better, right?
The age_in_years is calculated for every record of the blog table. So, it works like a correlated subquery, but the subquery records are joined with the primary table and, for this reason, we can reference the columns produced by the subquery.
Online Workshops
If you enjoyed this article, I bet you are going to love my upcoming 4-day Online Workshop!