The JPA and Hibernate second-level cache
Introduction
In this article, I’m going to explain how the JPA and Hibernate second-level cache mechanism works and why it is very important when it comes to improving the performance of your data access layer.
JPA and Hibernate entity first-level and second-level cache
As I explained in this article, JPA and Hibernate feature a first-level cache as well. However, the first-level cache is bound to the currently executing Thread, so the cached entities cannot be shared by multiple concurrent requests.
On the other hand, the second-level cache is designed to be used by multiple concurrent requests, therefore increasing the likelihood of getting a cache hit.
When fetching a JPA entity:
Post post = entityManager.find(Post.class, 1L);
A Hibernate LoadEntityEvent is triggered, which is handled by the DefaultLoadEventListener like this:
First, Hibernate checks whether the first-level cache (a.k.a JPA EntityManager, Hibernate Session, or Persistence Context) already contains the entity, and if it does, the managed entity is returned.
If the JPA entity is not found in the first-level cache, Hibernate will check the second-level cache if it’s enabled.
If the entity cannot be fetched from the first or second-level cache, Hibernate will load it from the database using an SQL query. The JDBC ResultSet from the entity loading query is transformed into a Java Object[] that’s known as the entity loaded state.
The loaded state array is stored in the first-level cache along with the managed entity in order to help the Hibernate dirty checking mechanism discover if an entity has been modified.
However, the very same entity loaded state is also what’s being loaded from the JPA and Hibernate second-level cache when bypassing the database.
The JPA and Hibernate second-level cache is the cache of the entity loaded state array, not of the actual entity object reference.
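The lookup order described above can be sketched in plain Java. This is only an illustrative model, not Hibernate’s actual implementation: PostSketch, DatabaseSketch, and SessionSketch are hypothetical names, and simple maps stand in for the caches and the database. It shows that the shared cache holds the loaded state array, while each session assembles its own entity instance from that array.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical entity used only for this sketch; not a real JPA mapping. */
class PostSketch {
    final Long id;
    String title;

    PostSketch(Long id, String title) {
        this.id = id;
        this.title = title;
    }
}

/** Stand-in for the database: one row of column values per identifier. */
class DatabaseSketch {
    final Map<Long, Object[]> rows = new ConcurrentHashMap<>();
    final AtomicInteger queries = new AtomicInteger();

    Object[] select(Long id) {
        queries.incrementAndGet();
        Object[] row = rows.get(id);
        return row == null ? null : row.clone();
    }
}

/** Models one Persistence Context; illustrative only, not Hibernate's code. */
class SessionSketch {
    // First-level cache: private to this session.
    private final Map<Long, PostSketch> managedEntities = new HashMap<>();
    private final Map<Long, Object[]> loadedStates = new HashMap<>();
    // Second-level cache: shared by every session, holds loaded state only.
    private final Map<Long, Object[]> secondLevelCache;
    private final DatabaseSketch database;

    SessionSketch(Map<Long, Object[]> secondLevelCache, DatabaseSketch database) {
        this.secondLevelCache = secondLevelCache;
        this.database = database;
    }

    PostSketch find(Long id) {
        PostSketch managed = managedEntities.get(id);
        if (managed != null) {
            return managed; // 1. first-level cache hit
        }
        Object[] state = secondLevelCache.get(id); // 2. second-level cache
        if (state == null) {
            state = database.select(id); // 3. database query
            if (state == null) {
                return null;
            }
            secondLevelCache.put(id, state.clone());
        }
        // Each session assembles its own entity instance from the state array.
        PostSketch entity = new PostSketch(id, (String) state[0]);
        managedEntities.put(id, entity);
        loadedStates.put(id, state.clone());
        return entity;
    }

    /** Dirty check: compare current property values with the loaded state. */
    boolean isDirty(Long id) {
        PostSketch entity = managedEntities.get(id);
        return entity != null
            && !Arrays.equals(new Object[] {entity.title}, loadedStates.get(id));
    }
}
```

In this model, two sessions that share the same second-level cache map get different PostSketch instances for the same identifier, yet the database is queried only once.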
Why use the JPA and Hibernate second-level cache
Now that you have seen how the second-level cache works when fetching entities, you might wonder why not fetch the entity directly from the database.
Scaling read-only transactions can be done fairly easily by adding more Replica nodes. However, that does not work for the Primary node since it can only be scaled vertically.
And that’s where the second-level cache comes into play. For read-write database transactions that need to be executed on the Primary node, the second-level cache can help you reduce the query load by directing it to the strongly consistent second-level cache.
The JPA and Hibernate second-level cache can help you speed up read-write transactions by offloading the read traffic from the Primary node and serving it from the cache.
Scaling the JPA and Hibernate second-level cache
Traditionally, the second-level cache was stored in the memory of the application, and that was problematic for several reasons.
First, the application memory is limited, so the volume of data that can be cached is limited as well.
Second, when traffic increases and we want to start new application nodes to handle the extra traffic, the new nodes would start with a cold cache, making the problem even worse as they incur a spike in database load until the cache is populated with data.
To address these issues, it’s better to run the cache as a distributed system, such as Redis. This way, the amount of cached data is no longer limited by the memory of a single node, since sharding can be used to split the data among multiple nodes.
And, when a new application node is added by the auto-scaler, the new node will load data from the same distributed cache. Hence, there’s no cold cache issue anymore.
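To make the cold cache effect concrete, here is a minimal simulation. It is a sketch under stated assumptions: AppNodeSketch and ScalingSketch are hypothetical names, an in-process map stands in for the distributed cache (there is no real Redis client here), and every cache miss counts as one database load.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical application node that reads through a cache. */
class AppNodeSketch {
    private final Map<Long, String> cache;
    private final AtomicInteger databaseLoads;

    AppNodeSketch(Map<Long, String> cache, AtomicInteger databaseLoads) {
        this.cache = cache;
        this.databaseLoads = databaseLoads;
    }

    String read(long id) {
        // On a cache miss, "load" the row from the database and cache it.
        return cache.computeIfAbsent(id, key -> {
            databaseLoads.incrementAndGet();
            return "row-" + key;
        });
    }
}

class ScalingSketch {
    /**
     * Warms up one node, then starts a second node and returns how many
     * database loads the second node causes while reading the same keys.
     */
    static int loadsCausedByNewNode(boolean sharedCache, int hotKeys) {
        AtomicInteger databaseLoads = new AtomicInteger();
        // Assumption: this map plays the role of the distributed cache.
        Map<Long, String> distributed = new ConcurrentHashMap<>();

        AppNodeSketch warmNode = new AppNodeSketch(
            sharedCache ? distributed : new ConcurrentHashMap<>(), databaseLoads);
        for (long id = 1; id <= hotKeys; id++) {
            warmNode.read(id);
        }
        int loadsBeforeScaling = databaseLoads.get();

        AppNodeSketch newNode = new AppNodeSketch(
            sharedCache ? distributed : new ConcurrentHashMap<>(), databaseLoads);
        for (long id = 1; id <= hotKeys; id++) {
            newNode.read(id);
        }
        return databaseLoads.get() - loadsBeforeScaling;
    }
}
```

With per-node caches, the new node has to reload every hot key from the database. With the shared cache, it causes no additional database loads in this model.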
JPA and Hibernate second-level cache options
There are several things that can be stored by the JPA and Hibernate second-level cache:
- entity loaded state
- collection entity identifiers
- query results for both entities and DTO projections
- the associated entity identifier for a given natural identifier
So, the second-level cache is not limited to fetching entities only.
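The four kinds of cached data listed above can be pictured as four lookup tables. The following is a simplified model rather than Hibernate’s API: CacheRegionsSketch and its field names are hypothetical, Post and PostComment are assumed example entities, and the query key is reduced to a plain String.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative model of the cached data kinds; not Hibernate's API. */
class CacheRegionsSketch {
    // Entity loaded state, one table per (assumed) entity type.
    final Map<Long, Object[]> postStates = new ConcurrentHashMap<>();
    final Map<Long, Object[]> commentStates = new ConcurrentHashMap<>();
    // Collection entry: owner Post id -> identifiers of its PostComment entities.
    final Map<Long, List<Long>> postCommentIds = new ConcurrentHashMap<>();
    // Query results: query plus parameters (simplified to a String) -> rows.
    final Map<String, List<Object[]>> queryResults = new ConcurrentHashMap<>();
    // Natural identifier (an assumed slug) -> Post entity identifier.
    final Map<String, Long> postIdBySlug = new ConcurrentHashMap<>();

    /** Resolves the natural id to the entity id, then to the loaded state. */
    Object[] findPostBySlug(String slug) {
        Long id = postIdBySlug.get(slug);
        return id == null ? null : postStates.get(id);
    }

    /** Resolves a cached collection by looking up each member's loaded state. */
    List<Object[]> findComments(Long postId) {
        List<Long> ids = postCommentIds.get(postId);
        if (ids == null) {
            return null; // the collection itself is not cached
        }
        List<Object[]> states = new ArrayList<>();
        for (Long id : ids) {
            Object[] state = commentStates.get(id);
            if (state != null) {
                states.add(state);
            }
        }
        return states;
    }
}
```

Note that the collection and natural identifier tables store only identifiers, so in this model they are useful only when the referenced entity state is cached as well.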
Conclusion
The JPA and Hibernate second-level cache is very useful when having to scale read-write transactions. Because the second-level cache is designed to be strongly consistent, you don’t have to worry that stale data is going to be served from the cache.
Moreover, you don’t have to worry about keeping track of database modifications in order to schedule cache updates either, because this is done transparently by Hibernate for you.
