How to avoid the Hibernate Query Cache N+1 issue

(Last Updated On: April 24, 2018)

Introduction

I recently answered this question on the Hibernate forum, and since it’s a very good one, I decided to turn it into an article.

In this post, we will describe how the N+1 query issue is generated when using the second-level Hibernate Query Cache.

Domain Model

Assuming we have the following Domain Model classes:

Which are mapped as follows:

@Entity(name = "Post")
@Table(name = "post")
@org.hibernate.annotations.Cache(
    usage = CacheConcurrencyStrategy.READ_WRITE
)
public class Post {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    //Getters and setters omitted for brevity
}

@Entity(name = "PostComment")
@Table(name = "post_comment")
@org.hibernate.annotations.Cache(
    usage = CacheConcurrencyStrategy.READ_WRITE
)
public class PostComment {

    @Id
    @GeneratedValue
    private Long id;

    @ManyToOne(fetch = FetchType.LAZY)
    private Post post;

    private String review;

    //Getters and setters omitted for brevity
}

So, both the Post and PostComment entities are cacheable and use the READ_WRITE CacheConcurrencyStrategy.

Query Cache

To enable the Query Cache, we need to enable the second-level cache as well. Therefore, the following configuration properties must be supplied:

<property 
    name="hibernate.cache.use_second_level_cache" 
    value="true" 
/>

<property 
    name="hibernate.cache.use_query_cache" 
    value="true" 
/>

<property 
    name="hibernate.cache.region.factory_class" 
    value="ehcache" 
/>

Although we have enabled the Query Cache, it does not automatically apply to any query and we need to explicitly tell Hibernate which queries are to be cached. To do so, you need to use the org.hibernate.cacheable query hint as illustrated by the following example:

public List<PostComment> getLatestPostComments(
        EntityManager entityManager) {
    return entityManager.createQuery(
        "select pc " +
        "from PostComment pc " +
        "order by pc.post.id desc", PostComment.class)
    .setMaxResults(10)
    .setHint(QueryHints.HINT_CACHEABLE, true)
    .getResultList();
}

Now, if we call the getLatestPostComments twice, we can see that the result is fetched from the cache the second time we execute this method.

Therefore, when executing this test case:

printCacheRegionStatistics(
    StandardQueryCache.class.getName()
);
assertEquals(
    3, 
    getLatestPostComments(entityManager).size()
);

printCacheRegionStatistics(
    StandardQueryCache.class.getName()
);
assertEquals(
    3, 
    getLatestPostComments(entityManager).size()
);

Hibernate generates the following output:

Region: org.hibernate.cache.internal.StandardQueryCache,
Statistics: SecondLevelCacheStatistics[
    hitCount=0,
    missCount=0,
    putCount=0,
    elementCountInMemory=0,
    elementCountOnDisk=0,
    sizeInMemory=0
],
Entries: {}

-- Checking cached query results in region: org.hibernate.cache.internal.StandardQueryCache
-- Query results were not found in cache

SELECT pc.id AS id1_1_,
       pc.post_id AS post_id3_1_,
       pc.review AS review2_1_
FROM post_comment pc
ORDER BY pc.post_id DESC
LIMIT 10

-- Caching query results in region: org.hibernate.cache.internal.StandardQueryCache; timestamp=6244549098291200

Region: org.hibernate.cache.internal.StandardQueryCache,
Statistics: SecondLevelCacheStatistics[
    hitCount=0,
    missCount=1,
    putCount=1,
    elementCountInMemory=1,
    elementCountOnDisk=0,
    sizeInMemory=776
],
Entries: {
sql: select pc.id as id1_1_, pc.post_id as post_id3_1_, pc.review as review2_1_ from post_comment pc order by pc.post_id desc; parameters: ; 
named parameters: {}; 
max rows: 10; 
transformer: org.hibernate.transform.CacheableResultTransformer@110f2=[
    6244549098291200, 
    4, 
    3, 
    2
]}

-- Checking cached query results in region: org.hibernate.cache.internal.StandardQueryCache
-- Checking query spaces are up-to-date: [post_comment]
-- [post_comment] last update timestamp: 6244549098266628, result set timestamp: 6244549098291200
-- Returning cached query results

As you can see in the log, only the first call has executed the SQL query as the second one used the cached result set.

N+1 query issue

Now, let’s see what happens if we evict all PostComment entities prior to running the second call to the getLatestPostComments method.

doInJPA(entityManager -> {
    entityManager
    .getEntityManagerFactory()
    .getCache()
    .evict(PostComment.class);
});

doInJPA(entityManager -> {
    assertEquals(
        3, 
        getLatestPostComments(entityManager).size()
    );
});

When running the test case above, Hibernate generates the following output:

-- Checking cached query results in region: org.hibernate.cache.internal.StandardQueryCache
-- Checking query spaces are up-to-date: [post_comment]
-- [post_comment] last update timestamp: 6244574473195524, result set timestamp: 6244574473207808
-- Returning cached query results

SELECT pc.id AS id1_1_0_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_
FROM post_comment pc
WHERE pc.id = 4

SELECT pc.id AS id1_1_0_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_
FROM post_comment pc
WHERE pc.id = 3

SELECT pc.id AS id1_1_0_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_
FROM post_comment pc
WHERE pc.id = 2

As you can see in the logs, even if the entity identifiers were fetched from the Query Cache, because the entities are not found in the second-level cache, the PostComment entities are fetched using SQL queries.

If the Query Cache result contains N entity identifiers, N secondary queries will be executed, which might be actually worse than executing the query that we have previously cached.

This is a typical N+1 query issue, just that the first query is being served from the cache while the N ones hit the database.

As I explained before, you can detect all N+1 query issues using my db-util unit test assert mechanism and fix this problem long before deploying into production.

Avoiding the issue

To avoid this issue, you have to make sure that the cached entity is stored in the second-level cache.

. Make sure that the PostComment entity is cacheable, meaning that you have annotated dit with the Hibernate-specific @Cache annotation. Although JPA defines the @Cacheable annotation, that’s not enough since Hibernate needs to know what CacheConcurrencycStrategy you want to use for the entity in question.
. Also, make sure that the Ehcache timeToIdleSeconds or the equivalent TTL(Time to live) setting of the other second-level cache providers is greater for entities than for the Query Cache. This will ensure that entities will stay in the cache longer than the query cache result set which only stores the entity identifiers.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

While the Hibernate Query Cache can help you off-load the database server, you have to be very careful when caching entities because, unlike DTO projections, the result set is not served entirely from the Query Cache, being dependent on the second-level cache entity regions as well.

That’s why monitoring is very important when you enable the Hibernate second-level cache and its associated Query Cache.

Subscribe to our Newsletter

* indicates required
10 000 readers have found this blog worth following!

If you subscribe to my newsletter, you'll get:
  • A free sample of my Video Course about running Integration tests at warp-speed using Docker and tmpfs
  • 3 chapters from my book, High-Performance Java Persistence, 
  • a 10% discount coupon for my book. 
Get the most out of your persistence layer!

Advertisements

4 thoughts on “How to avoid the Hibernate Query Cache N+1 issue

  1. Hi Vlad,

    I have created an assertion library “datasource-assert” a while ago.
    (Internally it is using datasource-proxy).

    It has assertions for AssertJ, Hamcrest, and regular assertEquals.

    N+1 query(or query count) can be asserted like this:

    assertThat(dataSource).hasSelectCount(3);
    

    Project: https://github.com/ttddyy/datasource-assert

    Document: https://ttddyy.github.io/datasource-assert/docs/current/user-guide/index.html

    Thanks,

  2. Great post, Vlad!

    Query Cache is not that useful without Second Level Cache. Indeed, using Query Cache without caching entities properly leads to the worst scenario: N+1 Selects.

  3. First off, thanks for the explanation and sharing the interesting post. But I don’t understand below two statements:

    if we evict all PostComment entities prior to running the second call to the getLatestPostComments

    So once we evict PostComment entities, I believe, the subsequent query would still result in N+1 queries. How would having Hibernate-specific @Cache annotation help?

    Although JPA defines the @Cacheable annotation, that’s not enough since Hibernate needs to know what CacheConcurrencycStrategy you want to use for the entity in question.

    Why is @Cacheable not enough and how does having Hibernate CacheConcurrencyStrategy help?

    Could you please elaborate more on the above items.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.