How to optimize JPQL and Criteria API query plans with Hibernate Statistics
Are you struggling with performance issues in your Spring, Jakarta EE, or Java EE application?
What if there were a tool that could automatically detect what caused performance issues in your JPA and Hibernate data access layer?
Wouldn’t it be awesome to have such a tool to watch your application and prevent performance issues during development, long before they affect production systems?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, Micronaut, or Play Framework.
So, rather than fixing performance issues in your production system on a Saturday night, you are better off using Hypersistence Optimizer to help you prevent those issues so that you can spend your time on the things that you love!
Introduction
Every entity query, be it JPQL or Criteria API, needs to be parsed and compiled to an AST (Abstract Syntax Tree) in order to generate the associated SQL query. The entity query compilation takes time, as explained in this article so Hibernate provides a QueryPlanCache to store already-compiled plans.
Starting with Hibernate 5.4, the Hibernate Statistics mechanism allows you to monitor the Query Plan Cache and this article will show you how to take advantage of this feature to speed up IN query performance.
For an introduction about the Hibernate Statistics mechanism, check out this article.
How to optimize JPQL and Criteria API query plans with #Hibernate Statistics @vlad_mihalcea https://t.co/bHfkKKCGOn pic.twitter.com/1k2PUeVpb7
— Java (@java) November 9, 2018
Domain Model
Assuming we have the following Post entity:
Which is mapped like this:
@Entity(name = "Post")
@Table(name = "post")
public class Post {
@Id
private Integer id;
private String title;
//Getters and setters omitted for brevity
}
JPQL IN query
Now, let’s see what happens when Hibernate executes a JPQL query which contains an IN clause:
SessionFactory sessionFactory = entityManagerFactory()
.unwrap(SessionFactory.class);
Statistics statistics = sessionFactory.getStatistics();
statistics.clear();
doInJPA(entityManager -> {
List<Post> posts = entityManager.createQuery(
"select p " +
"from Post p " +
"where p.id in :ids", Post.class)
.setParameter("ids", Arrays.asList(1, 2, 3))
.getResultList();
});
for (String query : statistics.getQueries()) {
LOGGER.info("Executed query: {}", query);
}
To check the executed entity queries, we can use the Statistics#getQueries method. However, by default, Hibernate does not collect any statistics, so we need to enable this feature by setting the hibernate.generate_statistics configuration property to true.
<property name="hibernate.generate_statistics" value="true"/>
Now, when executing the JPQL query above, the following queries are going to be logged:
-- Executed query: select p from Post p where p.id in :ids -- Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2)
The first query is the one we supplied, while the second one is the expanded query since JDBC does not support binding an array of values as individual IN query parameter values.
Criteria API IN query
Now, let’s translate the previous JPQL query to its Criteria API equivalent:
SessionFactory sessionFactory = entityManagerFactory()
.unwrap(SessionFactory.class);
Statistics statistics = sessionFactory.getStatistics();
statistics.clear();
doInJPA(entityManager -> {
CriteriaBuilder builder = entityManager.getCriteriaBuilder();
CriteriaQuery<Post> criteria = builder.createQuery(Post.class);
Root<Post> fromPost = criteria.from(Post.class);
criteria.where(
builder.in(
fromPost.get("id")).value(Arrays.asList(1, 2, 3)
)
);
List<Post> posts = entityManager
.createQuery(criteria)
.getResultList();
});
for (String query : statistics.getQueries()) {
LOGGER.info("Executed query: {}", query);
}
When executing the Criteria API query, Hibernate logs the following executing entity queries:
-- Executed query: select generatedAlias0 from Post as generatedAlias0 where generatedAlias0.id in (:param0) -- Executed query: select generatedAlias0 from Post as generatedAlias0 where generatedAlias0.id in (:param0_0, :param0_1, :param0_2)
So, every entity query, be it JPQL or Criteria API, will have to be expanded if it contains an IN clause.
Query plan cache and IN clause expansion
Now, if the number of IN clause parameter varies, Hibernate will have to compile a significant number of entity queries, as illustrated by the following example:
SessionFactory sessionFactory = entityManagerFactory()
.unwrap(SessionFactory.class);
Statistics statistics = sessionFactory.getStatistics();
statistics.clear();
doInJPA(entityManager -> {
for (int i = 1; i < 16; i++) {
getPostByIds(
entityManager,
IntStream
.range(1, i + 1)
.boxed()
.toArray(Integer[]::new)
);
}
});
assertEquals(
16L,
statistics.getQueryPlanCacheMissCount()
);
for (String query : statistics.getQueries()) {
LOGGER.info("Executed query: {}", query);
}
When executing the test case above, Hibernate generates the following output:
--Executed query: select p from Post p where p.id in :ids --Executed query: select p from Post p where p.id in (:ids_0) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7, :ids_8) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7, :ids_8, :ids_9) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7, :ids_8, :ids_9, :ids_10) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7, :ids_8, :ids_9, :ids_10, :ids_11) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7, :ids_8, :ids_9, :ids_10, :ids_11, :ids_12) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7, :ids_8, :ids_9, :ids_10, :ids_11, :ids_12, :ids_13) --Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7, :ids_8, :ids_9, :ids_10, :ids_11, :ids_12, :ids_13, :ids_14)
So, although the same JPQL is executed multiple times because of IN clause expansion,
Hibernate generates 15 JPQL queries that need to be parsed and compiled separately.
The statistics.getQueryPlanCacheMissCount() returns a value of 16 which includes the original JPQL query and the 15 JPQL queries resulted after IN clause expansion. For this reason, the Query Plan Cache does not help in this situation.
IN clause parameter padding
As already explained in this article, Hibernate offers the IN clause parameter padding feature which can help us reduce the Query Plan Cache miss count we saw previously.
To enable IN clause parameter padding, you need to enable the following configuration property:
<property name="hibernate.query.in_clause_parameter_padding" value="true"/>
After enabling the hibernate.query.in_clause_parameter_padding property,
we can expect just 6 IN clause query expansions:
Statistics statistics = sessionFactory.getStatistics();
statistics.clear();
doInJPA(entityManager -> {
for (int i = 1; i < 16; i++) {
getPostByIds(
entityManager,
IntStream
.range(1, i + 1)
.boxed()
.toArray(Integer[]::new)
);
}
});
assertEquals(
6L,
statistics.getQueryPlanCacheMissCount()
);
for (String query : statistics.getQueries()) {
LOGGER.info("Executed query: {}", query);
}
When running the test case above, we can indeed see that only 6 query expansions took place:
-- Executed query: select p from Post p where p.id in :ids -- Executed query: select p from Post p where p.id in (:ids_0) -- Executed query: select p from Post p where p.id in (:ids_0, :ids_1) -- Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3) -- Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7) -- Executed query: select p from Post p where p.id in (:ids_0, :ids_1, :ids_2, :ids_3, :ids_4, :ids_5, :ids_6, :ids_7, :ids_8, :ids_9, :ids_10, :ids_11, :ids_12, :ids_13, :ids_14, :ids_15)
Cool, right?
If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
So, when JPQL or Criteria API entity queries are executed with variable numbers of IN clause bind parameter values, you can improve application performance by enabling the hibernate.query.in_clause_parameter_padding configuration property.
Also, thanks to the Query Plan Cache statistics included in Hibernate 5.4 Statistics, you can get a better understanding of what Hibernate does behind the scenes.



