How to write EXISTS subqueries with JPA and Hibernate
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In this article, I’m going to show you how to write EXISTS subqueries with JPA and Hibernate.
EXISTS subqueries are very useful as they allow you to implement SemiJoins. Unfortunately, many application developers are not aware of SemiJoins, and they end up emulating it using EquiJoins (e.g., INNER JOIN) at the cost of query performance.
Domain Model
Let’s assume we are using the following Post
and PostComment
entities:
The Post
entity is the parent and the PostComment
is the child entity since the PostComment
references the parent via its post
property.
Fetching parent entities while filtering by child entities
Let’s assume we want to fetch all Post
entities that have a PostComent
with a score greater than 10. Most developers would mistakenly use the following query:
List<Post> posts = entityManager.createQuery(""" select distinct p from PostComment pc join pc.post p where pc.score > :minScore order by p.id """, Post.class) .setParameter("minScore", 10) .setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false) .getResultList();
This query executes a join between the post
and post_comment
just for the sake of filtering the post
records. Since the projection contains only the Post
entity, the JOIN is not needed in this case. Instead, a SemiJoin should be used to filter the Post
entity records.
The
HINT_PASS_DISTINCT_THROUGH
is used to prevent theDISTINCT
keyword to be passed to the underlying SQL query as deduplication is done for Java Object references, not SQL table records. Check out this article for more details about this topic.
EXISTS subqueries with JPQL
As I explained in this article, an EXISTS subquery is a much better alternative. Therefore, we can achieve our goal using the following JPQL query:
List<Post> posts = entityManager.createQuery(""" select p from Post p where exists ( select 1 from PostComment pc where pc.post = p and pc.score > :minScore ) order by p.id """, Post.class) .setParameter("minScore", 10) .getResultList();
When running the JPQL query above, Hibernate generates the following SQL query:
SELECT p.id AS id1_0_, p.title AS title2_0_ FROM post p WHERE EXISTS ( SELECT 1 FROM post_comment pc WHERE pc.post_id=p.id AND pc.score > ? ) ORDER BY p.id
The advantage of this query is that the SemiJoin does not need to join all post
and post_comment
records since as soon as a post_comment
is found to match the filtering criteria (e.g., pc.score > ?
), the EXISTS
clause returns true
and the query proceeds to the next post
record.
EXISTS subqueries with Criteria API
If you want to build the entity query dynamically, then you can use a Criteria API since, like JPQL, it supports subquery filtering.
The previous JPQL query can be rewritten to a Criteria API query, like this:
CriteriaBuilder builder = entityManager.getCriteriaBuilder(); CriteriaQuery<Post> query = builder.createQuery(Post.class); Root<Post> p = query.from(Post.class); ParameterExpression<Integer> minScore = builder.parameter(Integer.class); Subquery<Integer> subQuery = query.subquery(Integer.class); Root<PostComment> pc = subQuery.from(PostComment.class); subQuery .select(builder.literal(1)) .where( builder.equal(pc.get(PostComment_.POST), p), builder.gt(pc.get(PostComment_.SCORE), minScore) ); query.where(builder.exists(subQuery)); List<Post> posts = entityManager.createQuery(query) .setParameter(minScore, 10) .getResultList();
The Criteria API query above generates the very same SQL query the previous JPQL query has generated.
EXISTS subqueries with Blaze Persistence
If you are not a big fan of Criteria API, then there’s a much better alternative to building dynamic entity queries. Blaze Persistence allows you to write dynamic queries that, not only are more readable, but they are also more powerful since you can use LATERAL JOIN, Derived Tables, Common Table Expressions, or Window Functions.
The previous Criteria API query can be rewritten using Criteria API, like this:
final String POST_ALIAS = "p"; final String POST_COMMENT_ALIAS = "pc"; List<Post> posts = cbf.create(entityManager, Post.class) .from(Post.class, POST_ALIAS) .whereExists() .from(PostComment.class, POST_COMMENT_ALIAS) .select("1") .where(PostComment_.POST).eqExpression(POST_ALIAS) .where(PostComment_.SCORE).gtExpression(":minScore") .end() .select(POST_ALIAS) .setParameter("minScore", 10) .getResultList();
When executing the Blaze Persistence query above, Hibernate will generate the same SQL statement that was generated by the aforementioned JPQL or Criteria API queries.
Awesome, right?
If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
SemiJoins are very useful for filtering, and you should prefer them to EquiJoins when the query projection doesn’t contain any of the joined columns.
In SQL, SemiJoins are expressed using EXISTS subqueries, and this feature is not limited to native SQL queries as you can use EXISTS in your JPA and Hibernate entity queries with both JPQL and Criteria API, and Blaze Persistence queries.
