SQL LATERAL JOIN – A Beginner’s Guide
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In this article, we are going to see how the SQL LATERAL JOIN works, and how we can use it to cross-reference rows from a subquery with rows in the outer table and build compound result sets.
A LATERAL JOIN can be used either explicitly, as we will see in this article, or implicitly as it’s the case for the MySQL JSON_TABLE function.
Database table
Let’s assume we have the following blog
database table storing the blogs hosted by our platform:
And, we have two blogs currently hosted:
| id | created_on | title | url | |----|------------|----------------------|--------------------------| | 1 | 2013-09-30 | Vlad Mihalcea's Blog | https://vladmihalcea.com | | 2 | 2017-01-22 | Hypersistence | https://hypersistence.io |
Getting the report without using the SQL LATERAL JOIN
We need to build a report that extracts the following data from the blog
table:
- the blog id
- the blog age, in years
- the date for the next blog anniversary
- the number of days remaining until the next anniversary.
Calculating the blog age using date interval functions
The blog age needs to be calculated by subtracting the blog creation date from the current date.
The date of the next blog anniversary can be calculated by incrementing the age in years and adding it to the blog creation date.
The number of days until the next anniversary can be calculated by extracting the number of days from the interval given by the next blog anniversary and the current date.
Depending on the relational database you are using, you can do that in the following ways.
For PostgreSQL, you can use the following query:
SELECT b.id as blog_id, extract( YEAR FROM age(now(), b.created_on) ) AS age_in_years, date( created_on + ( extract(YEAR FROM age(now(), b.created_on)) + 1 ) * interval '1 year' ) AS next_anniversary, date( created_on + ( extract(YEAR FROM age(now(), b.created_on)) + 1 ) * interval '1 year' ) - date(now()) AS days_to_next_anniversary FROM blog b ORDER BY blog_id
And, you are going to get the expected result:
| blog_id | age_in_years | next_anniversary | days_to_next_anniversary | |---------|--------------|------------------|--------------------------| | 1 | 7 | 2021-09-30 | 295 | | 2 | 3 | 2021-01-22 | 44 |
If you’re using MySQL, then you have to execute the following SQL query:
SELECT b.id as blog_id, TIMESTAMPDIFF( YEAR, b.created_on, CURRENT_TIMESTAMP() ) AS age_in_years, DATE_ADD( created_on, INTERVAL( TIMESTAMPDIFF( YEAR, b.created_on, CURRENT_TIMESTAMP() ) + 1 ) YEAR ) AS next_anniversary, TIMESTAMPDIFF( DAY, CURRENT_TIMESTAMP(), DATE_ADD( created_on, INTERVAL( TIMESTAMPDIFF( YEAR, b.created_on, CURRENT_TIMESTAMP() ) + 1 ) YEAR ) ) AS days_to_next_anniversary FROM blog b ORDER BY blog_id
As you can see, the age_in_years
has to be defined three times because you need it when calculating the next_anniversary
and days_to_next_anniversary
values.
And, that’s exactly where LATERAL JOIN can help us.
Getting the report using the SQL LATERAL JOIN
The following relational database systems support the LATERAL JOIN
syntax:
- Oracle since 12c
- PostgreSQL since 9.3
- MySQL since 8.0.14
SQL Server can emulate the LATERAL JOIN
using CROSS APPLY
and OUTER APPLY
.
LATERAL JOIN allows us to reuse the age_in_years
value and just pass it further when calculating the next_anniversary
and days_to_next_anniversary
values.
For instance, the previous PostgreSQL query can be rewritten like this:
SELECT b.id as blog_id, age_in_years, date( created_on + (age_in_years + 1) * interval '1 year' ) AS next_anniversary, date( created_on + (age_in_years + 1) * interval '1 year' ) - date(now()) AS days_to_next_anniversary FROM blog b CROSS JOIN LATERAL ( SELECT cast( extract(YEAR FROM age(now(), b.created_on)) AS int ) AS age_in_years ) AS t ORDER BY blog_id
And, the age_in_years
value can be calculated one and reused for the next_anniversary
and days_to_next_anniversary
computations:
| blog_id | age_in_years | next_anniversary | days_to_next_anniversary | |---------|--------------|------------------|--------------------------| | 1 | 7 | 2021-09-30 | 295 | | 2 | 3 | 2021-01-22 | 44 |
And, for MySQL, the previous query can be rewritten to use the LATERAL JOIN, as follows:
SELECT b.id as blog_id, age_in_years, DATE_ADD( created_on, INTERVAL (age_in_years + 1) YEAR ) AS next_anniversary, TIMESTAMPDIFF( DAY, CURRENT_TIMESTAMP(), DATE_ADD( created_on, INTERVAL (age_in_years + 1) YEAR ) ) + 1 AS days_to_next_anniversary FROM blog b CROSS JOIN LATERAL ( SELECT TIMESTAMPDIFF( YEAR, b.created_on, CURRENT_TIMESTAMP() ) AS age_in_years ) AS t ORDER BY blog_id
Much better, right?
The age_in_years
is calculated for every record of the blog
table. So, it works like a correlated subquery, but the subquery records are joined with the primary table and, for this reason, we can reference the columns produced by the subquery.
If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
LATERAL JOIN is a very useful feature. It allows you to encapsulate a given computation in a subquery and reuse it in the outer query.
Unlike joining directly with a Derived Table, LATERAL JOIN is evaluated for every record in the primary table, and not just once.
