The best way to initialize LAZY entity and collection proxies with JPA and Hibernate
Introduction
In this article, we are going to see the best way to initialize LAZY proxies and collections when using JPA and Hibernate.
I decided to write this article because there are way too many resources available on the Internet that mislead the reader into using awkward and inefficient practices.
Domain Model
Let’s assume we have a parent Post entity which has a bidirectional @OneToMany association with the PostComment child entity.
The Post entity is mapped like this:
@Entity(name = "Post")
@Table(name = "post")
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Post {

    @Id
    private Long id;

    private String title;

    @OneToMany(
        mappedBy = "post",
        cascade = CascadeType.ALL,
        orphanRemoval = true
    )
    @Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
    private List<PostComment> comments = new ArrayList<>();

    public Long getId() {
        return id;
    }

    public Post setId(Long id) {
        this.id = id;
        return this;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }

    public List<PostComment> getComments() {
        return comments;
    }

    public void addComment(PostComment comment) {
        comments.add(comment);
        comment.setPost(this);
    }

    public void removeComment(PostComment comment) {
        comments.remove(comment);
        comment.setPost(null);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Post)) return false;
        return id != null && id.equals(((Post) o).getId());
    }

    @Override
    public int hashCode() {
        return getClass().hashCode();
    }
}
There are several aspects of the Post entity mapping that are worth explaining:
- The Post entity uses the READ_WRITE second-level cache concurrency strategy, which works in write-through mode.
- The setters follow a Fluent-style API, which is supported by Hibernate.
- Because the @OneToMany association is bidirectional, we provide the add/remove utility methods to ensure that both sides of the association are kept in sync. Failing to synchronize both ends of a bidirectional association can cause very hard-to-track issues.
- The hashCode method returns a constant value since the entity identifier is used for equality checks. This is a technique I introduced 2 years ago since, previously, it was thought that you cannot use the entity identifier when comparing JPA entity logical equivalence.
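The last two points can be checked with plain Java. The following is only a sketch: the Post and PostComment classes are simplified stand-ins without JPA annotations, and EntityEqualityDemo is a made-up class for this illustration. It shows that a constant hashCode keeps an entity findable in a HashSet after its identifier gets assigned, and that the add/remove helpers update both sides of the association.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-in for the Post entity (JPA annotations omitted on purpose).
class Post {
    private Long id;
    private final List<PostComment> comments = new ArrayList<>();

    Long getId() { return id; }
    Post setId(Long id) { this.id = id; return this; }
    List<PostComment> getComments() { return comments; }

    // Keep both sides of the bidirectional association in sync.
    void addComment(PostComment comment) {
        comments.add(comment);
        comment.setPost(this);
    }

    void removeComment(PostComment comment) {
        comments.remove(comment);
        comment.setPost(null);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Post)) return false;
        return id != null && id.equals(((Post) o).getId());
    }

    // Constant per class, so the hash does not change when the id is assigned.
    @Override
    public int hashCode() { return getClass().hashCode(); }
}

// Simplified stand-in for the PostComment entity.
class PostComment {
    private Post post;
    Post getPost() { return post; }
    PostComment setPost(Post post) { this.post = post; return this; }
}

class EntityEqualityDemo {
    // Adds a Post without an id to a Set, then simulates id assignment at persist time.
    static boolean foundAfterIdAssignment() {
        Post post = new Post();
        Set<Post> posts = new HashSet<>();
        posts.add(post);
        post.setId(1L);
        return posts.contains(post);
    }

    // Verifies that addComment and removeComment update both ends of the association.
    static boolean bothSidesInSync() {
        Post post = new Post();
        PostComment comment = new PostComment();
        post.addComment(comment);
        boolean linked = comment.getPost() == post
            && post.getComments().contains(comment);
        post.removeComment(comment);
        boolean unlinked = comment.getPost() == null
            && post.getComments().isEmpty();
        return linked && unlinked;
    }

    public static void main(String[] args) {
        System.out.println("Found after id assignment: " + foundAfterIdAssignment());
        System.out.println("Both sides in sync: " + bothSidesInSync());
    }
}
```

Had hashCode been derived from the id, the entity would land in a different hash bucket once the identifier is assigned, and the Set lookup could fail.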
The PostComment entity is mapped like this:
@Entity(name = "PostComment")
@Table(name = "post_comment")
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class PostComment {

    @Id
    private Long id;

    private String review;

    @ManyToOne(fetch = FetchType.LAZY)
    private Post post;

    public Long getId() {
        return id;
    }

    public PostComment setId(Long id) {
        this.id = id;
        return this;
    }

    public String getReview() {
        return review;
    }

    public PostComment setReview(String review) {
        this.review = review;
        return this;
    }

    public Post getPost() {
        return post;
    }

    public PostComment setPost(Post post) {
        this.post = post;
        return this;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PostComment)) return false;
        return id != null && id.equals(((PostComment) o).id);
    }

    @Override
    public int hashCode() {
        return getClass().hashCode();
    }

    @Override
    public String toString() {
        return "PostComment{" +
            "id=" + id +
            ", review='" + review + '\'' +
            '}';
    }
}
Notice that the fetch strategy of the @ManyToOne association is set to FetchType.LAZY because, by default, @ManyToOne and @OneToOne associations are fetched eagerly, and this can lead to N+1 query issues among other performance issues. For more details, check out this article.
Using the Hibernate initialize without the second-level cache
A lazy-loaded entity or a collection is substituted by a Proxy prior to fetching the entity or the collection. The Proxy can be initialized by accessing any entity property or collection element or by using the Hibernate.initialize method.
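To make the Proxy behavior more tangible, here is a minimal plain-Java sketch. LazyReference, its loader, and LazyReferenceDemo are hypothetical names, not Hibernate classes, and this is not how Hibernate implements its proxies (Hibernate generates a subclass of the entity at runtime). The sketch only mimics the observable behavior: the identifier is known up front, the state is loaded on first access or on an explicit initialize call, and it is loaded at most once.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongFunction;

// Hypothetical stand-in for a lazy Proxy: it holds the identifier
// and defers loading the target until the state is actually needed.
class LazyReference<T> {
    private final long id;
    private final LongFunction<T> loader; // simulates the secondary SQL query
    private T target;
    private boolean initialized;

    LazyReference(long id, LongFunction<T> loader) {
        this.id = id;
        this.loader = loader;
    }

    // The identifier is available without loading anything.
    long getId() { return id; }

    boolean isInitialized() { return initialized; }

    // Plays the role of Hibernate.initialize: loads the target at most once.
    void initialize() {
        if (!initialized) {
            target = loader.apply(id);
            initialized = true;
        }
    }

    // Accessing the state initializes the reference on demand.
    T get() {
        initialize();
        return target;
    }
}

class LazyReferenceDemo {
    // Counts how many times the loader runs across several accesses.
    static int loadsAfterRepeatedAccess() {
        AtomicInteger loads = new AtomicInteger();
        LazyReference<String> post = new LazyReference<>(1L, id -> {
            loads.incrementAndGet();
            return "High-Performance Java Persistence";
        });
        post.getId();      // does not trigger the loader
        post.initialize(); // triggers the loader
        post.get();        // already initialized, no extra load
        post.get();
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("Loader invocations: " + loadsAfterRepeatedAccess());
    }
}
```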
Now, let’s consider the following example:
LOGGER.info("Clear the second-level cache");

entityManager.getEntityManagerFactory().getCache().evictAll();

LOGGER.info("Loading a PostComment");

PostComment comment = entityManager.find(
    PostComment.class,
    1L
);

assertEquals(
    "A must read!",
    comment.getReview()
);

Post post = comment.getPost();

LOGGER.info("Post entity class: {}", post.getClass().getName());

Hibernate.initialize(post);

assertEquals(
    "High-Performance Java Persistence",
    post.getTitle()
);
First, we evict the second-level cache to simulate an application that does not use it. By default, unless you explicitly enable the second-level cache and configure a provider, Hibernate is not going to use the second-level cache.
When running this test case, Hibernate executes the following SQL statements:
-- Clear the second-level cache

-- Evicting entity cache: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post
-- Evicting entity cache: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment

-- Loading a PostComment

SELECT pc.id AS id1_1_0_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_
FROM   post_comment pc
WHERE  pc.id=1

-- Post entity class: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post$HibernateProxy$5LVxadxF

SELECT p.id AS id1_0_0_,
       p.title AS title2_0_0_
FROM   post p
WHERE  p.id=1
We can see that the second-level cache was properly evicted and that, after fetching the PostComment entity, the post entity is represented by a HibernateProxy instance which only contains the Post entity identifier that was retrieved from the post_id column of the post_comment database table row.
Now, due to the call to the Hibernate.initialize method, a secondary SQL query is executed to fetch the Post entity, and that’s not very efficient and can lead to N+1 query issues.
So, if you’re not using the second-level cache, it’s not a good idea to fetch lazy associations using secondary SQL queries, either by traversing them or by using the Hibernate.initialize method.
In the previous case, the PostComment should be fetched along with its post association using the JOIN FETCH JPQL directive.
LOGGER.info("Clear the second-level cache");

entityManager.getEntityManagerFactory().getCache().evictAll();

LOGGER.info("Loading a PostComment");

PostComment comment = entityManager.createQuery(
    "select pc " +
    "from PostComment pc " +
    "join fetch pc.post " +
    "where pc.id = :id", PostComment.class)
.setParameter("id", 1L)
.getSingleResult();

assertEquals(
    "A must read!",
    comment.getReview()
);

Post post = comment.getPost();

LOGGER.info("Post entity class: {}", post.getClass().getName());

assertEquals(
    "High-Performance Java Persistence",
    post.getTitle()
);
This time, Hibernate executes a single SQL statement, and we no longer risk bumping into N+1 query issues:
-- Clear the second-level cache

-- Evicting entity cache: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post
-- Evicting entity cache: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment

-- Loading a PostComment

SELECT pc.id AS id1_1_0_,
       p.id AS id1_0_1_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_,
       p.title AS title2_0_1_
FROM   post_comment pc
INNER JOIN post p ON pc.post_id=p.id
WHERE  pc.id=1

-- Post entity class: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post
Notice that the Post entity class is not a HibernateProxy anymore because the post association is fetched at query time and initialized as a POJO.
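To see how the secondary queries add up when more than one PostComment is loaded, here is a small plain-Java simulation. It is only a sketch: FakeDatabase and NPlusOneDemo are hypothetical classes that count statements against in-memory maps, and the data set assumes that every comment belongs to a different post, so no two lookups target the same row.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the database that counts "executed" statements.
class FakeDatabase {
    private final Map<Long, Long> commentToPost = new LinkedHashMap<>(); // comment id -> post id
    private final Map<Long, String> postTitles = new LinkedHashMap<>();  // post id -> title
    private int statementCount;

    // Assumption: each comment belongs to a distinct post.
    FakeDatabase(int commentCount) {
        for (long i = 1; i <= commentCount; i++) {
            postTitles.put(i, "Post " + i);
            commentToPost.put(i, i);
        }
    }

    int getStatementCount() { return statementCount; }

    // Simulates: SELECT ... FROM post_comment
    List<Long> selectCommentPostIds() {
        statementCount++;
        return new ArrayList<>(commentToPost.values());
    }

    // Simulates: SELECT ... FROM post WHERE id = ?
    String selectPostTitle(long postId) {
        statementCount++;
        return postTitles.get(postId);
    }

    // Simulates: SELECT ... FROM post_comment INNER JOIN post ...
    List<String> selectPostTitlesJoined() {
        statementCount++;
        List<String> titles = new ArrayList<>();
        for (Long postId : commentToPost.values()) {
            titles.add(postTitles.get(postId));
        }
        return titles;
    }
}

class NPlusOneDemo {
    // One statement for the comments, plus one secondary statement per post.
    static int lazyTraversalStatements(int commentCount) {
        FakeDatabase db = new FakeDatabase(commentCount);
        for (Long postId : db.selectCommentPostIds()) {
            db.selectPostTitle(postId);
        }
        return db.getStatementCount();
    }

    // A single joined statement fetches the comments along with their posts.
    static int joinFetchStatements(int commentCount) {
        FakeDatabase db = new FakeDatabase(commentCount);
        db.selectPostTitlesJoined();
        return db.getStatementCount();
    }

    public static void main(String[] args) {
        System.out.println("Lazy traversal: " + lazyTraversalStatements(10));
        System.out.println("Join fetch: " + joinFetchStatements(10));
    }
}
```

With N comments, the lazy traversal issues N+1 statements, while the joined variant always issues one, no matter how many comments are loaded.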
Using the Hibernate initialize with the second-level cache
So, to see when Hibernate.initialize is really worth using, you need to use the second-level cache:
LOGGER.info("Loading a PostComment");

PostComment comment = entityManager.find(
    PostComment.class,
    1L
);

assertEquals(
    "A must read!",
    comment.getReview()
);

Post post = comment.getPost();

LOGGER.info("Post entity class: {}", post.getClass().getName());

Hibernate.initialize(post);

assertEquals(
    "High-Performance Java Persistence",
    post.getTitle()
);
This time, we are no longer evicting the second-level cache regions, and, since we are using the READ_WRITE cache concurrency strategy, the entities are cached right after they get persisted. Hence, no SQL query needs to be executed when running the test case above:
-- Loading a PostComment

-- Cache hit :
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment`,
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment#1`

-- Proxy class: com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post$HibernateProxy$rnxGtvMK

-- Cache hit :
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post`,
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post#1`
Both the PostComment and the post association are fetched from the second-level cache, as illustrated by the Cache hit log messages.
So, if you are using the second-level cache, it’s fine to use Hibernate.initialize to fetch extra associations that you need to fulfill your business use case. In this case, even if you have N+1 cache calls, each call should run very quickly since the second-level cache is configured properly and data is returned from memory.
The Hibernate.initialize method can be used for collections as well. Second-level cache collections are read-through, meaning that they are stored in the cache the first time they get loaded. So, when running the following test case:
LOGGER.info("Loading a Post");

Post post = entityManager.find(
    Post.class,
    1L
);

List<PostComment> comments = post.getComments();

LOGGER.info("Collection class: {}", comments.getClass().getName());

Hibernate.initialize(comments);

LOGGER.info("Post comments: {}", comments);
Hibernate executes an SQL query to load the PostComment collection:
-- Loading a Post

-- Cache hit :
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post`,
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post#1`

-- Collection class: org.hibernate.collection.internal.PersistentBag

-- Cache hit, but item is unreadable/invalid :
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post.comments`,
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post.comments#1`

SELECT pc.post_id AS post_id3_1_0_,
       pc.id AS id1_1_0_,
       pc.id AS id1_1_1_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_
FROM   post_comment pc
WHERE  pc.post_id=1

-- Post comments: [
    PostComment{id=1, review='A must read!'},
    PostComment{id=2, review='Awesome!'},
    PostComment{id=3, review='5 stars'}
]
However, if the PostComment collection is already cached, for instance, because it was loaded by a previous transaction:
doInJPA(entityManager -> {
    Post post = entityManager.find(Post.class, 1L);

    assertEquals(3, post.getComments().size());
});
Then, when running the previous Hibernate.initialize test case, Hibernate can fetch all the data from the cache alone:
-- Loading a Post

-- Cache hit :
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post`,
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post#1`

-- Collection class: org.hibernate.collection.internal.PersistentBag

-- Cache hit :
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post.comments`,
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$Post.comments#1`

-- Cache hit :
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment`,
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment#1`

-- Cache hit :
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment`,
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment#2`

-- Cache hit :
region = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment`,
key = `com.vladmihalcea.book.hpjp.hibernate.fetching.HibernateInitializeTest$PostComment#3`
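The difference between the write-through entity cache and the read-through collection cache can be illustrated with a plain-Java sketch. CacheRegion and CacheModesDemo are hypothetical classes made up for this illustration and do not reflect Hibernate's internal cache implementation. The sketch only reproduces the behavior seen in the logs above: an entity cached at persist time is a hit on its first read, while a collection needs one database load before subsequent reads become hits.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory cache region that counts hits and database loads.
class CacheRegion<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private int hits;
    private int databaseLoads;

    int getHits() { return hits; }
    int getDatabaseLoads() { return databaseLoads; }

    // Write-through: the entry is cached at the time it is written.
    void put(K key, V value) {
        entries.put(key, value);
    }

    // Read-through: on a miss, load from the database and cache the result.
    V get(K key, Function<K, V> databaseLoader) {
        V cached = entries.get(key);
        if (cached != null) {
            hits++;
            return cached;
        }
        databaseLoads++;
        V loaded = databaseLoader.apply(key);
        entries.put(key, loaded);
        return loaded;
    }
}

class CacheModesDemo {
    // Entity region: cached when persisted, so the first read is already a hit.
    static int entityLoadsAfterPersistAndRead() {
        CacheRegion<Long, String> postRegion = new CacheRegion<>();
        postRegion.put(1L, "High-Performance Java Persistence");
        postRegion.get(1L, id -> "loaded from the database");
        return postRegion.getDatabaseLoads();
    }

    // Collection region: the first read loads and caches, the second read hits.
    // Returns {databaseLoads, hits}.
    static int[] collectionLoadsAndHitsAfterTwoReads() {
        CacheRegion<Long, List<Long>> commentsRegion = new CacheRegion<>();
        Function<Long, List<Long>> query = postId -> Arrays.asList(1L, 2L, 3L);
        commentsRegion.get(1L, query);
        commentsRegion.get(1L, query);
        return new int[] {
            commentsRegion.getDatabaseLoads(),
            commentsRegion.getHits()
        };
    }

    public static void main(String[] args) {
        System.out.println("Entity loads: " + entityLoadsAfterPersistAndRead());
        int[] result = collectionLoadsAndHitsAfterTwoReads();
        System.out.println("Collection loads: " + result[0] + ", hits: " + result[1]);
    }
}
```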
Conclusion
The Hibernate.initialize method is useful when loading a Proxy entity or collection that’s stored in the second-level cache. If the underlying entity or collection is not cached, then loading the Proxy with a secondary SQL query is less efficient than loading the lazy association from the very beginning using a JOIN FETCH directive.
