The best way to initialize LAZY entity and collection proxies with JPA and Hibernate

Introduction

In this article, we are going to see the best way to initialize LAZY proxies and collections when using JPA and Hibernate.

I decided to write this article because there are way too many resources available on the Internet that mislead the reader into using awkward and inefficient practices.

Domain Model

Let’s assume we have a parent Post entity which has a bidirectional @OneToMany association with the PostComment child entity.

The Post entity is mapped like this:

@Entity(name = "Post")
@Table(name = "post")
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Post {

    @Id
    private Long id;

    private String title;

    @OneToMany(
        mappedBy = "post",
        cascade = CascadeType.ALL,
        orphanRemoval = true
    )
    @Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
    private List<PostComment> comments = new ArrayList<>();

    public Long getId() {
        return id;
    }

    public Post setId(Long id) {
        this.id = id;
        return this;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }

    public List<PostComment> getComments() {
        return comments;
    }

    public void addComment(PostComment comment) {
        comments.add(comment);
        comment.setPost(this);
    }

    public void removeComment(PostComment comment) {
        comments.remove(comment);
        comment.setPost(null);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Post)) return false;
        return id != null && id.equals(((Post) o).getId());
    }

    @Override
    public int hashCode() {
        return getClass().hashCode();
    }
}

There are several aspects of the Post entity mapping that are worth explaining:

  • The Post entity uses the READ_WRITE second-level cache concurrency strategy which works in write-through mode.
  • The setters follow a Fluent-style API which is supported by Hibernate.
  • Because the @OneToMany association is bidirectional, we provide the add/remove utility methods to ensure that both sides of the association are kept in sync. Failing to synchronize both ends of a bidirectional association can cause very hard-to-track issues.
  • The hashCode method returns a constant value since the entity identifier is used for equality checks. This is a technique I introduced 2 years ago since, previously, it was thought that you cannot use the entity identifier when comparing JPA entity logical equivalence.
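
To see why the constant hashCode matters, here is a minimal plain-Java sketch. It assumes a hypothetical Book class that copies the equals and hashCode style of the Post entity, without any JPA annotations, and it simulates the identifier being assigned after the object was already added to a HashSet:

```java
import java.util.HashSet;
import java.util.Set;

public class EntityEqualityDemo {

    // Hypothetical stand-in for an entity; it only mirrors the equals/hashCode style above.
    static class Book {
        private Long id;

        Long getId() { return id; }
        void setId(Long id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Book)) return false;
            return id != null && id.equals(((Book) o).getId());
        }

        @Override
        public int hashCode() {
            // Constant per class, so the hash does not change when the id is assigned.
            return getClass().hashCode();
        }
    }

    // Returns true if the object is still found in the set after its id is assigned.
    static boolean survivesIdAssignment() {
        Set<Book> books = new HashSet<>();
        Book book = new Book();   // id is still null, like a transient entity
        books.add(book);
        book.setId(1L);           // simulates the id being assigned later on
        return books.contains(book);
    }

    // Two distinct instances with the same id are logically equal.
    static boolean sameIdMeansEqual() {
        Book a = new Book();
        a.setId(1L);
        Book b = new Book();
        b.setId(1L);
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    // Two instances without an id are never equal to each other.
    static boolean instancesWithoutIdDiffer() {
        return !new Book().equals(new Book());
    }

    public static void main(String[] args) {
        System.out.println(survivesIdAssignment());
        System.out.println(sameIdMeansEqual());
        System.out.println(instancesWithoutIdDiffer());
    }
}
```

Had hashCode been derived from the id, the object would have been stored in the bucket computed for a null id and could be missed by the lookup once the id changed.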

The PostComment entity is mapped like this:

@Entity(name = "PostComment")
@Table(name = "post_comment")
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class PostComment {

    @Id
    private Long id;

    private String review;

    @ManyToOne(fetch = FetchType.LAZY)
    private Post post;

    public Long getId() {
        return id;
    }

    public PostComment setId(Long id) {
        this.id = id;
        return this;
    }

    public String getReview() {
        return review;
    }

    public PostComment setReview(String review) {
        this.review = review;
        return this;
    }

    public Post getPost() {
        return post;
    }

    public PostComment setPost(Post post) {
        this.post = post;
        return this;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PostComment)) return false;
        return id != null && id.equals(((PostComment) o).id);
    }

    @Override
    public int hashCode() {
        return getClass().hashCode();
    }

    @Override
    public String toString() {
        return "PostComment{" +
                "id=" + id +
                ", review='" + review + '\'' +
                '}';
    }
}

Notice that the fetch strategy of the @ManyToOne association is set to FetchType.LAZY because, by default, @ManyToOne and @OneToOne associations are fetched eagerly, and this can lead to N+1 query issues among other performance issues. For more details, check out this article.

Using the Hibernate initialize without the second-level cache

A lazy-loaded entity or a collection is substituted by a Proxy prior to fetching the entity or the collection. The Proxy can be initialized by accessing any entity property or collection element or by using the Hibernate.initialize method.
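
The following plain-Java sketch models that behavior under simplifying assumptions. LazyPost, its loader function, and the loadCount counter are hypothetical names made up for this illustration. Hibernate generates its real proxy classes at runtime, so this is not how Hibernate implements proxies. It only shows the idea of deferring the load until a non-identifier property is accessed or until initialization is requested explicitly:

```java
import java.util.function.Function;

public class LazyProxyDemo {

    // Minimal stand-in for the Post entity (hypothetical, no JPA annotations).
    static class Post {
        final Long id;
        final String title;

        Post(Long id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Simplified model of a lazy proxy: it knows the identifier up front
    // and loads the real object only when more than the id is needed.
    static class LazyPost {
        private final Long id;
        private final Function<Long, Post> loader;
        private Post target;   // null until the proxy is initialized

        LazyPost(Long id, Function<Long, Post> loader) {
            this.id = id;
            this.loader = loader;
        }

        // Answered by the proxy itself, no load required in this model.
        Long getId() { return id; }

        boolean isInitialized() { return target != null; }

        // Explicit initialization, the role Hibernate.initialize plays for real proxies.
        void initialize() {
            if (target == null) {
                target = loader.apply(id);
            }
        }

        // Accessing a non-identifier property initializes the proxy implicitly.
        String getTitle() {
            initialize();
            return target.title;
        }
    }

    // Counts how many times the loader (the stand-in for a SQL query) runs.
    static int loadCount = 0;

    static LazyPost newProxy() {
        loadCount = 0;
        return new LazyPost(1L, id -> {
            loadCount++;
            return new Post(id, "High-Performance Java Persistence");
        });
    }

    public static void main(String[] args) {
        LazyPost post = newProxy();
        System.out.println(post.getId() + " initialized=" + post.isInitialized());
        post.initialize();
        System.out.println(post.getTitle() + " loads=" + loadCount);
    }
}
```

In this model, the loader runs at most once, no matter how many times the title is read afterward.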

Now, let’s consider the following example:

LOGGER.info("Clear the second-level cache");

entityManager.getEntityManagerFactory().getCache().evictAll();

LOGGER.info("Loading a PostComment");

PostComment comment = entityManager.find(
    PostComment.class,
    1L
);

assertEquals(
    "A must read!",
    comment.getReview()
);

Post post = comment.getPost();

LOGGER.info("Post entity class: {}", post.getClass().getName());

Hibernate.initialize(post);

assertEquals(
    "High-Performance Java Persistence",
    post.getTitle()
);

First, we clear the second-level cache to simulate an application that does not use it. Keep in mind that, unless you explicitly enable the second-level cache and configure a provider, Hibernate is not going to use it.

When running this test case, Hibernate executes the following SQL statements:

-- Clear the second-level cache

-- Evicting entity cache: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post
-- Evicting entity cache: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment

-- Loading a PostComment

SELECT pc.id AS id1_1_0_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_
FROM   post_comment pc
WHERE  pc.id=1

-- Post entity class: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post$HibernateProxy$5LVxadxF

SELECT p.id AS id1_0_0_,
       p.title AS title2_0_0_
FROM   post p
WHERE  p.id=1

We can see that the second-level cache was properly evicted and that, after fetching the PostComment entity, the post entity is represented by a HibernateProxy instance which only contains the Post entity identifier that was retrieved from the post_id column of the post_comment database table row.

Now, due to the call to the Hibernate.initialize method, a secondary SQL query is executed to fetch the Post entity, and that’s not very efficient and can lead to N+1 query issues.

So, if you’re not using the second-level cache, it’s not a good idea to fetch lazy associations using secondary SQL queries either by traversing them or using the Hibernate.initialize method.

In the previous case, the PostComment should be fetched along with its post association using the JOIN FETCH JPQL directive.

LOGGER.info("Clear the second-level cache");

entityManager.getEntityManagerFactory().getCache().evictAll();

LOGGER.info("Loading a PostComment");

PostComment comment = entityManager.createQuery(
    "select pc " +
    "from PostComment pc " +
    "join fetch pc.post " +
    "where pc.id = :id", PostComment.class)
.setParameter("id", 1L)
.getSingleResult();

assertEquals(
    "A must read!",
    comment.getReview()
);

Post post = comment.getPost();

LOGGER.info("Post entity class: {}", post.getClass().getName());

assertEquals(
    "High-Performance Java Persistence",
    post.getTitle()
);

This time, Hibernate executes a single SQL statement, and we no longer risk bumping into N+1 query issues:

-- Clear the second-level cache

-- Evicting entity cache: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post
-- Evicting entity cache: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment

-- Loading a PostComment

SELECT pc.id AS id1_1_0_,
       p.id AS id1_0_1_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_,
       p.title AS title2_0_1_
FROM   post_comment pc
INNER JOIN post p ON pc.post_id=p.id
WHERE  pc.id=1

-- Post entity class: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post

Notice that the Post entity class is not a HibernateProxy anymore because the post association is fetched at query time and initialized as a POJO.
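
To make the N+1 risk discussed in this section concrete, here is a small plain-Java model that counts statements. FakeDatabase and its methods are hypothetical and do not use Hibernate at all. Each method merely stands in for one SQL statement, and every comment is assumed to belong to a different post, so that each lazy traversal needs its own secondary query:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NPlusOneDemo {

    // Hypothetical in-memory "database" that counts the statements it executes.
    static class FakeDatabase {
        private final Map<Long, String> postTitles = new LinkedHashMap<>();   // post id -> title
        private final Map<Long, Long> commentToPost = new LinkedHashMap<>();  // comment id -> post id
        int statementCount = 0;

        FakeDatabase(int rows) {
            for (long i = 1; i <= rows; i++) {
                postTitles.put(i, "Post " + i);
                commentToPost.put(i, i);   // each comment belongs to a different post
            }
        }

        // Models: SELECT ... FROM post_comment
        List<Long> selectCommentPostIds() {
            statementCount++;
            return new ArrayList<>(commentToPost.values());
        }

        // Models: SELECT ... FROM post WHERE id = ?
        String selectPostTitle(long postId) {
            statementCount++;
            return postTitles.get(postId);
        }

        // Models: SELECT ... FROM post_comment pc INNER JOIN post p ON pc.post_id = p.id
        List<String> selectCommentsJoinPosts() {
            statementCount++;
            List<String> titles = new ArrayList<>();
            for (long postId : commentToPost.values()) {
                titles.add(postTitles.get(postId));
            }
            return titles;
        }
    }

    // One statement for the comments plus one per comment for its post.
    static int statementsWithLazyTraversal(FakeDatabase db) {
        db.statementCount = 0;
        for (long postId : db.selectCommentPostIds()) {
            db.selectPostTitle(postId);
        }
        return db.statementCount;
    }

    // A single joined statement returns the comments together with their posts.
    static int statementsWithJoin(FakeDatabase db) {
        db.statementCount = 0;
        db.selectCommentsJoinPosts();
        return db.statementCount;
    }

    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase(3);
        System.out.println("lazy traversal: " + statementsWithLazyTraversal(db));
        System.out.println("join: " + statementsWithJoin(db));
    }
}
```

With three comments, the lazy traversal issues four statements in this model, while the joined variant issues one, and the gap grows linearly with the number of comments.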

Using the Hibernate initialize with the second-level cache

So, to see when the Hibernate.initialize is really worth using, you need to use the second-level cache:

LOGGER.info("Loading a PostComment");

PostComment comment = entityManager.find(
    PostComment.class,
    1L
);

assertEquals(
    "A must read!",
    comment.getReview()
);

Post post = comment.getPost();

LOGGER.info("Post entity class: {}", post.getClass().getName());

Hibernate.initialize(post);

assertEquals(
    "High-Performance Java Persistence",
    post.getTitle()
);

This time, we are no longer evicting the second-level cache regions, and, since we are using the READ_WRITE cache concurrency strategy, the entities are cached right after they get persisted. Hence, no SQL query needs to be executed when running the test case above:

-- Loading a PostComment

-- Cache hit : 
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment`, 
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment#1`

-- Proxy class: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post$HibernateProxy$rnxGtvMK

-- Cache hit : 
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post`, 
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post#1`

Both the PostComment and the post association are fetched from the second-level cache as illustrated by the Cache hit log messages.

So, if you are using the second-level cache, it’s fine to use the Hibernate.initialize method to fetch extra associations that you need to fulfill your business use case. In this case, even if you have N+1 cache calls, each call should run very quickly since the second-level cache is configured properly and data is returned from memory.

The Hibernate.initialize method can be used for collections as well. Now, because second-level cache collections are read-through, meaning that they are stored in the cache the first time they get loaded, when running the following test case:

LOGGER.info("Loading a Post");

Post post = entityManager.find(
    Post.class,
    1L
);

List<PostComment> comments = post.getComments();

LOGGER.info("Collection class: {}", comments.getClass().getName());

Hibernate.initialize(comments);

LOGGER.info("Post comments: {}", comments);

Hibernate executes an SQL query to load the PostComment collection:

-- Loading a Post

-- Cache hit : 
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post`, 
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post#1`

-- Collection class: org.hibernate.collection.internal.PersistentBag

-- Cache hit, but item is unreadable/invalid : 
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post.comments`, 
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post.comments#1`

SELECT pc.post_id AS post_id3_1_0_,
       pc.id AS id1_1_0_,
       pc.id AS id1_1_1_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_
FROM   post_comment pc
WHERE  pc.post_id=1

-- Post comments: [
    PostComment{id=1, review='A must read!'}, 
    PostComment{id=2, review='Awesome!'}, 
    PostComment{id=3, review='5 stars'}
]

However, if the PostComment collection is already cached:

doInJPA(entityManager -> {
    Post post = entityManager.find(Post.class, 1L);

    assertEquals(3, post.getComments().size());
});

When running the previous test case, Hibernate can fetch all data from the cache only:

-- Loading a Post

-- Cache hit : 
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post`, 
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post#1`

-- Collection class: org.hibernate.collection.internal.PersistentBag

-- Cache hit : 
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post.comments`, 
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post.comments#1`

-- Cache hit : 
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment`, 
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment#1`

-- Cache hit : 
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment`, 
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment#2`

-- Cache hit : 
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment`, 
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment#3`
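
The difference between the write-through entity caching and the read-through collection caching described in this section can be sketched with a small plain-Java model. CachedStore and its methods are hypothetical names, and the model ignores everything a real cache provider does beyond storing and looking up values, so treat it as an illustration of the two modes only:

```java
import java.util.HashMap;
import java.util.Map;

public class CacheModesDemo {

    // Hypothetical store that pairs a "database" map with a "cache" map.
    static class CachedStore {
        private final Map<String, String> database = new HashMap<>();
        private final Map<String, String> cache = new HashMap<>();
        int databaseReads = 0;

        // Write-through: the value reaches the database and the cache in the same operation.
        void writeThrough(String key, String value) {
            database.put(key, value);
            cache.put(key, value);
        }

        // The write bypasses the cache, so the cache gets populated on the first read instead.
        void writeDatabaseOnly(String key, String value) {
            database.put(key, value);
        }

        // Read-through: on a cache miss, load from the database and cache the result.
        String read(String key) {
            String cached = cache.get(key);
            if (cached != null) {
                return cached;
            }
            databaseReads++;
            String loaded = database.get(key);
            if (loaded != null) {
                cache.put(key, loaded);
            }
            return loaded;
        }
    }

    public static void main(String[] args) {
        CachedStore store = new CachedStore();

        // Entity stored in write-through mode: the first read is already a cache hit.
        store.writeThrough("Post#1", "High-Performance Java Persistence");
        store.read("Post#1");
        System.out.println("database reads after entity read: " + store.databaseReads);

        // Collection stored in read-through mode: the first read goes to the database.
        store.writeDatabaseOnly("Post.comments#1", "[1, 2, 3]");
        store.read("Post.comments#1");
        store.read("Post.comments#1");
        System.out.println("database reads after two collection reads: " + store.databaseReads);
    }
}
```

In this model, only the first collection read touches the database, which matches the two log outputs above: one SQL query the first time the collection is loaded, and cache hits afterward.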

Conclusion

The Hibernate.initialize method is useful when loading a Proxy entity or collection that’s stored in the second-level cache. If the underlying entity or collection is not cached, then loading the Proxy with a secondary SQL query is less efficient than loading the lazy association from the very beginning using a JOIN FETCH directive.
