How to improve statement caching efficiency with IN clause parameter padding

(Last Updated On: May 30, 2018)

Introduction

Recently, I stumbled on the following Twitter thread:

This jOOQ feature is indeed really useful since it reduces the number of SQL statements that have to be generated when varying the IN clause parameters dynamically.

Starting with Hibernate ORM 5.2.18, it’s now possible to use IN clause parameter padding so that you can improve SQL Statement Caching efficiency.

In this article, I’m going to explain how this new mechanism works and why you should definitely consider it when using a relational database system which supports Execution Plan caching.

Default behavior

Let’s assume we have the following JPA entity:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    private Integer id;

    private String title;

    //Getters and setters omitted for brevity
}

And, let’s say we want to load multiple Post entities by their identifiers using the following JPA entity query:

List<Post> getPostByIds(
        EntityManager entityManager, 
        Integer... ids) {
    return entityManager.createQuery(
        "select p " +
        "from Post p " +
        "where p.id in :ids", Post.class)
    .setParameter("ids", Arrays.asList(ids))
    .getResultList();
}

When running the following test case:

assertEquals(
    3, 
    getPostByIds(entityManager, 1, 2, 3).size()
);

assertEquals(
    4, 
    getPostByIds(entityManager, 1, 2, 3, 4).size()
);

assertEquals(
    5, 
    getPostByIds(entityManager, 1, 2, 3, 4, 5).size()
);

assertEquals(
    6, 
    getPostByIds(entityManager, 1, 2, 3, 4, 5, 6).size()
);

Hibernate will execute the following SQL statements:

Query:["
    SELECT  p.id AS id1_0_, p.title AS title2_0_
    FROM    post p
    WHERE   p.id IN (? , ? , ?)
"], 
Params:[
    1, 2, 3
]

Query:["
    SELECT  p.id AS id1_0_, p.title AS title2_0_
    FROM    post p
    WHERE   p.id IN (?, ?, ?, ?)
"], 
Params:[
    1, 2, 3, 4
]

Query:["
    SELECT  p.id AS id1_0_, p.title AS title2_0_
    FROM    post p
    WHERE   p.id IN (? , ? , ? , ? , ?)
"], 
Params:[
    1, 2, 3, 4, 5
]

Query:["
    SELECT  p.id AS id1_0_, p.title AS title2_0_
    FROM    post p
    WHERE   p.id IN (? , ? , ? , ? , ? , ?)
"], 
Params:[
    1, 2, 3, 4, 5, 6
]

Each invocation generates a new SQL statement because the IN query clause requires a different number of bind parameters.

Consequently, if the underlying relational database provides an Execution Plan cache, these 4 SQL statements will require 4 different Execution Plans.

Therefore, in order to reuse an already generated Execution Plan, we need to use the same SQL statement String value for multiple combinations of IN clause bind parameters.
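
To make the problem concrete, here is a minimal sketch that builds the SQL text for different numbers of IN clause parameters and counts how many distinct statement strings result. The class and method names (InClauseStatements, inQuery, distinctStatements) are hypothetical, and the SQL is a simplified version of what Hibernate generates. The sketch assumes that a statement or plan cache is keyed by the SQL string, which is the premise of this section.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Hibernate.
final class InClauseStatements {

    // Builds a simplified SQL string with one "?" per bind parameter.
    static String inQuery(int parameterCount) {
        String placeholders = String.join(
            ", ", Collections.nCopies(parameterCount, "?"));
        return "SELECT p.id, p.title FROM post p WHERE p.id IN ("
            + placeholders + ")";
    }

    // Counts the distinct SQL strings produced by the given parameter counts.
    // Assumption: a cache keyed by SQL text needs one entry per distinct string.
    static int distinctStatements(int... parameterCounts) {
        Set<String> statements = new LinkedHashSet<>();
        for (int count : parameterCounts) {
            statements.add(inQuery(count));
        }
        return statements.size();
    }

    public static void main(String[] args) {
        // The four calls from the test case use 3, 4, 5 and 6 identifiers.
        System.out.println(distinctStatements(3, 4, 5, 6)); // prints 4
    }
}
```

Every distinct parameter count yields a distinct string, so none of the four statements can share a cached entry with another.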

IN clause parameter padding

If you enable the hibernate.query.in_clause_parameter_padding Hibernate configuration property:

<property
    name="hibernate.query.in_clause_parameter_padding"
    value="true"
/>

And rerun the previous test case, Hibernate will generate the following SQL queries:

Query:["
    SELECT  p.id AS id1_0_, p.title AS title2_0_
    FROM    post p
    WHERE   p.id IN (?, ?, ?, ?)
"], 
Params:[
    1, 2, 3, 3
]

Query:["
    SELECT  p.id AS id1_0_, p.title AS title2_0_
    FROM    post p
    WHERE   p.id IN (?, ?, ?, ?)
"], 
Params:[
    1, 2, 3, 4
]

Query:["
    SELECT  p.id AS id1_0_, p.title AS title2_0_
    FROM    post p
    WHERE   p.id IN (? , ? , ? , ? , ? , ? , ? , ?)
"], 
Params:[
    1, 2, 3, 4, 5, 5, 5, 5
]

Query:["
    SELECT  p.id AS id1_0_, p.title AS title2_0_
    FROM    post p
    WHERE   p.id IN (? , ? , ? , ? , ? , ? , ? , ?)
"], 
Params:[
    1, 2, 3, 4, 5, 6, 6, 6
]

Therefore, this time, only 2 Execution Plans are needed since the first two queries share one SQL statement with 4 bind parameters, and the last two share another one with 8 bind parameters.

This is possible because Hibernate now pads the bind parameter list up to the next power of 2, repeating the last value to fill the extra slots. So, for 3 and 4 parameters, 4 bind parameters are used. For 5 and 6 parameters, 8 bind parameters are used.
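
The padding rule can be sketched in a few lines of plain Java. This is not Hibernate's internal code, just an illustration of the behavior visible in the logged statements above: the list grows to the next power of 2, and the last value is repeated to fill the extra slots. The names InClausePadding, paddedSize, and pad are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; Hibernate's own implementation may differ.
final class InClausePadding {

    // Returns the smallest power of 2 that is greater than or equal to size.
    static int paddedSize(int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be positive");
        }
        int padded = 1;
        while (padded < size) {
            padded <<= 1;
        }
        return padded;
    }

    // Returns a copy of the values, padded by repeating the last element.
    static <T> List<T> pad(List<T> values) {
        List<T> padded = new ArrayList<>(values);
        if (values.isEmpty()) {
            return padded; // nothing to repeat
        }
        int target = paddedSize(values.size());
        T last = values.get(values.size() - 1);
        while (padded.size() < target) {
            padded.add(last);
        }
        return padded;
    }

    public static void main(String[] args) {
        System.out.println(pad(Arrays.asList(1, 2, 3)));       // [1, 2, 3, 3]
        System.out.println(pad(Arrays.asList(1, 2, 3, 4, 5))); // [1, 2, 3, 4, 5, 5, 5, 5]
    }
}
```

Repeating an existing value is safe for an IN predicate because duplicates do not change which rows match.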

Cool, right?


Conclusion

If you’re using Oracle or SQL Server, then you can benefit from Execution Plan caching. The IN clause parameter padding feature increases the chance of reusing an already generated Execution Plan, especially when using a large number of IN clause parameters.
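
To get a rough feel for the benefit with larger parameter lists, the following sketch counts how many distinct IN clause sizes remain after power-of-2 padding. It assumes padding applies to every list size from 1 upward and that each distinct size maps to one SQL statement. PaddingBenefit and its methods are hypothetical names used only for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative arithmetic only; it does not call Hibernate.
final class PaddingBenefit {

    // Smallest power of 2 greater than or equal to n, for n >= 1.
    static int nextPowerOfTwo(int n) {
        int highest = Integer.highestOneBit(n);
        return highest == n ? n : highest << 1;
    }

    // Number of distinct padded sizes for lists of 1..maxParameters values.
    static int distinctPaddedSizes(int maxParameters) {
        Set<Integer> sizes = new HashSet<>();
        for (int n = 1; n <= maxParameters; n++) {
            sizes.add(nextPowerOfTwo(n));
        }
        return sizes.size();
    }

    public static void main(String[] args) {
        // Without padding, lists of 1..100 values need 100 distinct statements.
        System.out.println(distinctPaddedSizes(100));  // prints 8
        System.out.println(distinctPaddedSizes(1000)); // prints 11
    }
}
```

Under these assumptions, the number of distinct statements grows logarithmically with the maximum list size instead of linearly.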


15 Comments on “How to improve statement caching efficiency with IN clause parameter padding”

  1. Do I get that right that this only applies to jpql/hql queries, but neither to legacy criteria nor to jpa criteria? At least my smoke tests indicate that.

    • It applies to the Criteria API too, as long as you bind the values as parameters, not as literals.

      • I see, just tested this and it worked (JPA criteria). Correct me if I’m wrong, but there isn’t any possibility to bind parameters to the legacy criteria API (we have several spots where we still use this), so this is another reason to migrate to JPA criteria.

      • The legacy Criteria API has been deprecated for a very long time. If it does not support passing parameters, you might want to switch to the JPA Criteria API.

  2. We are currently evaluating a different approach. We’re using a slightly different query:

    SELECT p.* FROM post p WHERE p.ID = ANY(?)

    And pass in a JDBC array.

    There are some downsides with this approach:
    – not all databases support arrays
    – Oracle does not support anonymous arrays, a vendor API is required to create arrays
    – the standard syntax works only with Oracle 18c, a different syntax is needed for Oracle 12c

  3. Is this an ‘upcoming’ feature, as 5.2.18 does not exist?

      • If you upgraded to 5.3, you’d have this feature. There’s no timeline for 5.2 since it’s not a priority right now.
