How to write EXISTS subqueries with JPA and Hibernate
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
In this article, I’m going to show you how to write EXISTS subqueries with JPA and Hibernate.
EXISTS subqueries are very useful as they allow you to implement SemiJoins. Unfortunately, many application developers are not aware of SemiJoins, and they end up emulating it using EquiJoins (e.g., INNER JOIN) at the cost of query performance.
Let’s assume we are using the following
Post entity is the parent and the
PostComment is the child entity since the
PostComment references the parent via its
Fetching parent entities while filtering by child entities
Let’s assume we want to fetch all
Post entities that have a
PostComent with a score greater than 10. Most developers would mistakenly use the following query:
List<Post> posts = entityManager.createQuery(""" select distinct p from PostComment pc join pc.post p where pc.score > :minScore order by p.id """, Post.class) .setParameter("minScore", 10) .setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false) .getResultList();
This query executes a join between the
post_comment just for the sake of filtering the
post records. Since the projection contains only the
Post entity, the JOIN is not needed in this case. Instead, a SemiJoin should be used to filter the
Post entity records.
HINT_PASS_DISTINCT_THROUGHis used to prevent the
DISTINCTkeyword to be passed to the underlying SQL query as deduplication is done for Java Object references, not SQL table records. Check out this article for more details about this topic.
EXISTS subqueries with JPQL
As I explained in this article, an EXISTS subquery is a much better alternative. Therefore, we can achieve our goal using the following JPQL query:
List<Post> posts = entityManager.createQuery(""" select p from Post p where exists ( select 1 from PostComment pc where pc.post = p and pc.score > :minScore ) order by p.id """, Post.class) .setParameter("minScore", 10) .getResultList();
When running the JPQL query above, Hibernate generates the following SQL query:
SELECT p.id AS id1_0_, p.title AS title2_0_ FROM post p WHERE EXISTS ( SELECT 1 FROM post_comment pc WHERE pc.post_id=p.id AND pc.score > ? ) ORDER BY p.id
The advantage of this query is that the SemiJoin does not need to join all
post_comment records since as soon as a
post_comment is found to match the filtering criteria (e.g.,
pc.score > ?), the
EXISTS clause returns
true and the query proceeds to the next
EXISTS subqueries with Criteria API
If you want to build the entity query dynamically, then you can use a Criteria API since, like JPQL, it supports subquery filtering.
The previous JPQL query can be rewritten to a Criteria API query, like this:
CriteriaBuilder builder = entityManager.getCriteriaBuilder(); CriteriaQuery<Post> query = builder.createQuery(Post.class); Root<Post> p = query.from(Post.class); ParameterExpression<Integer> minScore = builder.parameter(Integer.class); Subquery<Integer> subQuery = query.subquery(Integer.class); Root<PostComment> pc = subQuery.from(PostComment.class); subQuery .select(builder.literal(1)) .where( builder.equal(pc.get(PostComment_.POST), p), builder.gt(pc.get(PostComment_.SCORE), minScore) ); query.where(builder.exists(subQuery)); List<Post> posts = entityManager.createQuery(query) .setParameter(minScore, 10) .getResultList();
The Criteria API query above generates the very same SQL query the previous JPQL query has generated.
EXISTS subqueries with Blaze Persistence
If you are not a big fan of Criteria API, then there’s a much better alternative to building dynamic entity queries. Blaze Persistence allows you to write dynamic queries that, not only are more readable, but they are also more powerful since you can use LATERAL JOIN, Derived Tables, Common Table Expressions, or Window Functions.
The previous Criteria API query can be rewritten using Criteria API, like this:
final String POST_ALIAS = "p"; final String POST_COMMENT_ALIAS = "pc"; List<Post> posts = cbf.create(entityManager, Post.class) .from(Post.class, POST_ALIAS) .whereExists() .from(PostComment.class, POST_COMMENT_ALIAS) .select("1") .where(PostComment_.POST).eqExpression(POST_ALIAS) .where(PostComment_.SCORE).gtExpression(":minScore") .end() .select(POST_ALIAS) .setParameter("minScore", 10) .getResultList();
When executing the Blaze Persistence query above, Hibernate will generate the same SQL statement that was generated by the aforementioned JPQL or Criteria API queries.
SemiJoins are very useful for filtering, and you should prefer them to EquiJoins when the query projection doesn’t contain any of the joined columns.
In SQL, SemiJoins are expressed using EXISTS subqueries, and this feature is not limited to native SQL queries as you can use EXISTS in your JPA and Hibernate entity queries with both JPQL and Criteria API, and Blaze Persistence queries.