How to write EXISTS subqueries with JPA and Hibernate

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

In this article, I’m going to show you how to write EXISTS subqueries with JPA and Hibernate.

EXISTS subqueries are very useful as they allow you to implement SemiJoins. Unfortunately, many application developers are not aware of SemiJoins, and they end up emulating it using EquiJoins (e.g., INNER JOIN) at the cost of query performance.

Domain Model

Let’s assume we are using the following Post and PostComment entities:

Post and PostComment entities

The Post entity is the parent and the PostComment is the child entity since the PostComment references the parent via its post property.

Fetching parent entities while filtering by child entities

Let’s assume we want to fetch all Post entities that have a PostComent with a score greater than 10. Most developers would mistakenly use the following query:

List<Post> posts = entityManager.createQuery("""
    select distinct p
    from PostComment pc
    join pc.post p
    where pc.score > :minScore
    order by p.id
    """, Post.class)
.setParameter("minScore", 10)
.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)
.getResultList();

This query executes a join between the post and post_comment just for the sake of filtering the post records. Since the projection contains only the Post entity, the JOIN is not needed in this case. Instead, a SemiJoin should be used to filter the Post entity records.

The HINT_PASS_DISTINCT_THROUGH is used to prevent the DISTINCT keyword to be passed to the underlying SQL query as deduplication is done for Java Object references, not SQL table records. Check out this article for more details about this topic.

EXISTS subqueries with JPQL

As I explained in this article, an EXISTS subquery is a much better alternative. Therefore, we can achieve our goal using the following JPQL query:

List<Post> posts = entityManager.createQuery("""
    select p
    from Post p
    where exists (
       select 1
       from PostComment pc
       where
          pc.post = p and
          pc.score > :minScore
    )
    order by p.id
    """, Post.class)
.setParameter("minScore", 10)
.getResultList();

When running the JPQL query above, Hibernate generates the following SQL query:

SELECT 
    p.id AS id1_0_, 
    p.title AS title2_0_
FROM post p
WHERE EXISTS (
    SELECT 1
    FROM post_comment pc
    WHERE 
        pc.post_id=p.id AND 
        pc.score > ?
)
ORDER BY p.id

The advantage of this query is that the SemiJoin does not need to join all post and post_comment records since as soon as a post_comment is found to match the filtering criteria (e.g., pc.score > ?), the EXISTS clause returns true and the query proceeds to the next post record.

EXISTS subqueries with Criteria API

If you want to build the entity query dynamically, then you can use a Criteria API since, like JPQL, it supports subquery filtering.

The previous JPQL query can be rewritten to a Criteria API query, like this:

CriteriaBuilder builder = entityManager.getCriteriaBuilder();

CriteriaQuery<Post> query = builder.createQuery(Post.class);
Root<Post> p = query.from(Post.class);

ParameterExpression<Integer> minScore = builder.parameter(Integer.class);
Subquery<Integer> subQuery = query.subquery(Integer.class);
Root<PostComment> pc = subQuery.from(PostComment.class);
subQuery
    .select(builder.literal(1))
    .where(
        builder.equal(pc.get(PostComment_.POST), p),
        builder.gt(pc.get(PostComment_.SCORE), minScore)
    );

query.where(builder.exists(subQuery));

List<Post> posts = entityManager.createQuery(query)
    .setParameter(minScore, 10)
    .getResultList();

The Criteria API query above generates the very same SQL query the previous JPQL query has generated.

EXISTS subqueries with Blaze Persistence

If you are not a big fan of Criteria API, then there’s a much better alternative to building dynamic entity queries. Blaze Persistence allows you to write dynamic queries that, not only are more readable, but they are also more powerful since you can use LATERAL JOIN, Derived Tables, Common Table Expressions, or Window Functions.

The previous Criteria API query can be rewritten using Criteria API, like this:

final String POST_ALIAS = "p";
final String POST_COMMENT_ALIAS = "pc";

List<Post> posts = cbf.create(entityManager, Post.class)
    .from(Post.class, POST_ALIAS)
    .whereExists()
        .from(PostComment.class, POST_COMMENT_ALIAS)
        .select("1")
        .where(PostComment_.POST).eqExpression(POST_ALIAS)
        .where(PostComment_.SCORE).gtExpression(":minScore")
    .end()
    .select(POST_ALIAS)
    .setParameter("minScore", 10)
    .getResultList();

When executing the Blaze Persistence query above, Hibernate will generate the same SQL statement that was generated by the aforementioned JPQL or Criteria API queries.

Awesome, right?

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

SemiJoins are very useful for filtering, and you should prefer them to EquiJoins when the query projection doesn’t contain any of the joined columns.

In SQL, SemiJoins are expressed using EXISTS subqueries, and this feature is not limited to native SQL queries as you can use EXISTS in your JPA and Hibernate entity queries with both JPQL and Criteria API, and Blaze Persistence queries.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.