SQL LEFT JOIN – A Beginner’s Guide

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

In this article, we are going to see how LEFT JOIN works in SQL, and how we can use it to associate rows belonging to different tables and build compound result sets.

Unlike JOIN or INNER JOIN, LEFT JOIN is an OUTER JOIN. So, LEFT JOIN and LEFT OUTER JOIN are equivalent. Other OUTER JOIN types are RIGHT OUTER JOIN and FULL OUTER JOIN. However, since the SQL OUTER keyword is superfluous, it’s usually omitted.

Database tables

To demonstrate how LEFT JOIN works, we are going to use the following two tables, post and post_comment, which form a one-to-many table relationship via the post_id Foreign Key column in the post_comment table that references the id Primary Key column in the post table:

The post and post_comment tables - SQL LEFT JOIN

The post table has the following 3 rows:

| id | title     |
|----|-----------|
| 1  | Java      |
| 2  | Hibernate |
| 3  | JPA       |

and the post_comment table has the following 3 records:

| id | review    | post_id |
|----|-----------|---------|
| 1  | Good      | 1       |
| 2  | Excellent | 1       |
| 3  | Awesome   | 2       |

So, the first record in the post table has two associated child records in the post_comment. The second record in the post table has one child record in the post_comment while the third record in the post table has no associated child.

SQL LEFT JOIN

In SQL, the JOIN clause allows us to associate rows that belong to different tables. For example, CROSS JOIN creates a Cartesian Product that contains all possible combinations of rows between the two joining tables.

Even if CROSS JOIN is useful in certain situations, we usually want to associate tables based on a specific condition. Just like JOIN or INNER JOIN, LEFT JOIN provides an ON clause that defines how the two tables are to be associated.

For instance, if we execute the following SQL LEFT JOIN query:

SELECT
   p.id AS post_id,
   p.title AS post_title,
   pc.review AS review
FROM post p
LEFT JOIN post_comment pc ON pc.post_id = p.id
ORDER BY p.id, pc.id

We get the following result set:

| post_id | post_title | review    |
|---------|------------|-----------|
| 1       | Java       | Good      |
| 1       | Java       | Excellent |
| 2       | Hibernate  | Awesome   |
| 3       | JPA        |           |

The LEFT JOIN clause takes all rows on the left side of the JOIN condition (e.g., post table in our case) and tries to find any matching record on the right side of the JOIN condition.

The first row of the post table has two associated post_comment rows, so the LEFT JOIN clause will combine the first post record with the two associated child post_comment rows, as illustrated by the highlighted records in the query result set:

| post_id | post_title | review    |
|---------|------------|-----------|
| 1       | Java       | Good      |
| 1       | Java       | Excellent |
| 2       | Hibernate  | Awesome   |
| 3       | JPA        |           |

The second row of the post table has only one associated post_comment row, so the LEFT JOIN clause will combine the second post record with its associated child post_comment row, as illustrated by the highlighted record in the query result set:

| post_id | post_title | review    |
|---------|------------|-----------|
| 1       | Java       | Good      |
| 1       | Java       | Excellent |
| 2       | Hibernate  | Awesome   |
| 3       | JPA        |           |

The third row of the post table has no associated post_comment row, so the LEFT JOIN clause will combine the third post record with a virtual row where all post_comment columns are NULL, as illustrated by the highlighted record in the query result set:

| post_id | post_title | review    |
|---------|------------|-----------|
| 1       | Java       | Good      |
| 1       | Java       | Excellent |
| 2       | Hibernate  | Awesome   |
| 3       | JPA        |           |

SQL ANTI JOIN

In relational algebra, a SEMI JOIN (⋉) between two relations, L and R, is defined as the set of all tuples in L for which there is a tuple in R that is equal based o the common attributes of the two relations.

And, the ANTI JOIN between two relations, L and R, is defined like this:

L ▷ R = L − (L ⋉ R)

The proper way of implementing a SEMI JOIN is via the EXISTS or IN clauses, while the ANTI JOIN can be implemented using NOT EXISTS or NOT IN.

So, to get all post rows that have no associated post_comment record, we can use the following SQL query:

SELECT
   p.id AS post_id,
   p.title AS post_title
FROM post p
WHERE NOT EXISTS (
   SELECT 1
   FROM post_comment
   WHERE post_id = p.id
)
ORDER BY p.id

which renders the expected result set:

| post_id | post_title |
|---------|------------|
| 3       | JPA        |

For more details about the EXISTS and NOT EXISTS clauses, check out this article.

However, there are many devs who try to emulate the ANTI JOIN operator using LEFT JOIN, like this:

SELECT
   p.id AS post_id,
   p.title AS post_title
FROM post p
LEFT JOIN post_comment pc ON pc.post_id = p.id
WHERE pc.id IS NULL
ORDER BY p.id

However, even if the two queries generate the same result set, depending on the database engine, the LEFT JOIN alternative might be less efficient than the NOT EXISTS query.

Line INNER JOIN, the OUTER JOIN is useful when the query projection is built out of columns belonging to both tables that are joined. ON the other hand, SEMI JOIN and ANTI JOIN return only the left-side table, so there is no need to combine the records of the left and right-side tables.

I'm running an online workshop on the 20-21 and 23-24 of November about High-Performance Java Persistence.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

LEFT JOIN allows us to fetch records from the left side table even if there is no associated record on the right side table that match the join condition.

However, when there is a match between the left and right-side tables, LEFT JOIN allows us to build a compound projection that includes columns from the two joining tables.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.