The JPA and Hibernate second-level cache
In this article, I’m going to explain how the JPA and Hibernate second-level cache mechanism works and why it is very important when it comes to improving the performance of your data access layer.
JPA and Hibernate entity first-level and second-level cache
As I explained in this article, JPA and Hibernate feature a first-level cache as well. However, the first-level cache is bound to the currently executing Thread, so the cached entities cannot be shared by multiple concurrent requests.
On the other hand, the second-level cache is designed to be used by multiple concurrent requests, therefore increasing the likelihood of getting a cache hit.
When fetching a JPA entity:
Post post = entityManager.find(Post.class, 1L);
A LoadEntityEvent is triggered, which is handled by the DefaultLoadEventListener like this:
First, Hibernate checks whether the first-level cache (a.k.a. the JPA EntityManager, Hibernate Session, or Persistence Context) already contains the entity, and if it does, the managed entity is returned.
If the JPA entity is not found in the first-level cache, Hibernate will check the second-level cache if it’s enabled.
If the entity cannot be fetched from the first or second-level cache, Hibernate will load it from the database using an SQL query. The JDBC ResultSet from the entity loading query is transformed into a Java Object[] array that’s known as the entity loaded state.
The loaded state array is stored in the first-level cache along with the managed entity in order to help the Hibernate dirty checking mechanism discover if an entity has been modified.
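The role the loaded state plays in dirty checking can be illustrated with a minimal sketch. This is plain Java and not Hibernate’s actual implementation; the Post class, its fields, and the DirtyCheckSketch helper are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical entity, used only for illustration.
class Post {
    Long id;
    String title;
    int likes;

    Post(Long id, String title, int likes) {
        this.id = id;
        this.title = title;
        this.likes = likes;
    }
}

// Simplified stand-in for a dirty checking mechanism (an assumption, not Hibernate code).
class DirtyCheckSketch {
    static final String[] PROPERTY_NAMES = {"title", "likes"};

    // Snapshot of the property values, taken when the entity is loaded.
    static Object[] loadedState(Post post) {
        return new Object[] {post.title, post.likes};
    }

    // Compare the current values against the snapshot, property by property.
    static List<String> dirtyProperties(Post post, Object[] loadedState) {
        Object[] current = loadedState(post);
        List<String> dirty = new ArrayList<>();
        for (int i = 0; i < current.length; i++) {
            if (!Objects.equals(current[i], loadedState[i])) {
                dirty.add(PROPERTY_NAMES[i]);
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        Post post = new Post(1L, "Second-level cache", 0);
        Object[] snapshot = loadedState(post);
        post.title = "Second-level cache explained";
        System.out.println(dirtyProperties(post, snapshot)); // prints [title]
    }
}
```

Because the snapshot holds the values as they were read, any property whose current value differs from it is a candidate for an UPDATE at flush time.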
However, the very same entity loaded state is also what’s being loaded from the JPA and Hibernate second-level cache when bypassing the database.
The JPA and Hibernate second-level cache is the cache of the entity loaded state array, not of the actual entity object reference.
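The lookup order described above can be modeled in a few lines of plain Java. This is a conceptual sketch under simplifying assumptions (no transactions, no cache concurrency strategy, a map standing in for the database), and SessionSketch, SecondLevelCacheSketch, and DatabaseSketch are hypothetical names, not Hibernate classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical entity, used only for illustration.
class Post {
    final Long id;
    String title;

    Post(Long id, String title) {
        this.id = id;
        this.title = title;
    }
}

// Shared by all sessions: stores the loaded state array, not entity instances.
class SecondLevelCacheSketch {
    final Map<Long, Object[]> entries = new ConcurrentHashMap<>();
}

// Stand-in for the database; counts how many "queries" were executed.
class DatabaseSketch {
    final Map<Long, Object[]> rows = new HashMap<>();
    int queryCount = 0;

    Object[] selectById(Long id) {
        queryCount++;
        Object[] row = rows.get(id);
        return row == null ? null : row.clone();
    }
}

// One instance per request, playing the role of the first-level cache.
class SessionSketch {
    private final Map<Long, Post> firstLevelCache = new HashMap<>();
    private final SecondLevelCacheSketch secondLevelCache;
    private final DatabaseSketch database;

    SessionSketch(SecondLevelCacheSketch secondLevelCache, DatabaseSketch database) {
        this.secondLevelCache = secondLevelCache;
        this.database = database;
    }

    Post find(Long id) {
        // 1. The first-level cache returns the already managed entity.
        Post managed = firstLevelCache.get(id);
        if (managed != null) {
            return managed;
        }
        // 2. The second-level cache returns the loaded state array.
        Object[] loadedState = secondLevelCache.entries.get(id);
        if (loadedState == null) {
            // 3. Fall back to the database and cache the loaded state.
            loadedState = database.selectById(id);
            if (loadedState == null) {
                return null;
            }
            secondLevelCache.entries.put(id, loadedState.clone());
        }
        // A new entity instance is built from the loaded state for this session.
        Post post = new Post(id, (String) loadedState[0]);
        firstLevelCache.put(id, post);
        return post;
    }

    public static void main(String[] args) {
        DatabaseSketch database = new DatabaseSketch();
        database.rows.put(1L, new Object[] {"Second-level cache"});
        SecondLevelCacheSketch cache = new SecondLevelCacheSketch();

        SessionSketch firstSession = new SessionSketch(cache, database);
        SessionSketch secondSession = new SessionSketch(cache, database);

        Post first = firstSession.find(1L);
        Post again = firstSession.find(1L);
        Post other = secondSession.find(1L);

        System.out.println(first == again);      // prints true
        System.out.println(first == other);      // prints false
        System.out.println(database.queryCount); // prints 1
    }
}
```

The second session gets its own Post instance, yet no extra query is executed, since the entity is rebuilt from the shared loaded state array.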
Why use the JPA and Hibernate second-level cache
Now that you have seen how the second-level cache works when fetching entities, you might wonder why you shouldn’t just fetch the entity directly from the database.
The answer has to do with scaling. Read-write database transactions need to be executed on the Primary node, and that’s where the second-level cache comes into play. It can help you reduce the query load on the Primary node by directing reads to the strongly consistent second-level cache.
The JPA and Hibernate second-level cache can help you speed up read-write transactions by offloading the read traffic from the Primary node and serving it from the cache.
Scaling the JPA and Hibernate second-level cache
Traditionally, the second-level cache was stored in the memory of the application, and that was problematic for several reasons.
First, the application memory is limited, so the volume of data that can be cached is limited as well.
Second, when traffic increases and we want to start new application nodes to handle the extra traffic, the new nodes start with a cold cache, making the problem even worse as they incur a spike in database load until the cache is populated with data.
To address this issue, it’s better to have the cache running as a distributed system, like Redis. This way, the amount of cached data is not limited by the memory size on a single node since sharding can be used to split the data among multiple nodes.
And, when a new application node is added by the auto-scaler, the new node will load data from the same distributed cache. Hence, there’s no cold cache issue anymore.
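As a rough illustration, this is what the Hibernate settings for a Redis-backed second-level cache might look like. The sketch assumes the Redisson Hibernate integration is used as the cache provider, which this article does not prescribe. The region factory class name and the hibernate.cache.redisson.config property are taken from that integration and should be checked against the version you use. The redisson.yaml file name and the SecondLevelCacheSettings class are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper that collects the cache-related settings.
class SecondLevelCacheSettings {

    static Properties redisBackedCache() {
        Properties properties = new Properties();
        // Standard Hibernate switch that enables the second-level cache.
        properties.setProperty("hibernate.cache.use_second_level_cache", "true");
        // Assumption: Redisson's region factory bridges Hibernate and Redis.
        properties.setProperty(
            "hibernate.cache.region.factory_class",
            "org.redisson.hibernate.RedissonRegionFactory"
        );
        // Assumption: a redisson.yaml file describing the Redis nodes is on the classpath.
        properties.setProperty("hibernate.cache.redisson.config", "redisson.yaml");
        return properties;
    }

    public static void main(String[] args) {
        redisBackedCache().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Since Properties is a Map, the result could be passed to Persistence.createEntityManagerFactory along with the persistence unit name, so every application node ends up pointing at the same distributed cache.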
JPA and Hibernate second-level cache options
There are several things that can be stored by the JPA and Hibernate second-level cache:
- entity loaded state
- collection entity identifiers
- query results for both entities and DTO projections
- the associated entity identifier for a given natural identifier
So, the second-level cache is not limited to fetching entities only.
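To make the list above more concrete, here is a conceptual model of what three of these cache regions hold. It’s a plain Java sketch, not Hibernate’s internal data structures: the CacheRegionsSketch class, the region names, and the post and comment example data are all hypothetical, and the query cache is left out for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual model only; the real regions are managed by the cache provider.
class CacheRegionsSketch {
    // Entity regions: entity identifier -> loaded state array.
    final Map<Long, Object[]> postRegion = new HashMap<>();
    final Map<Long, Object[]> commentRegion = new HashMap<>();
    // Collection region: owner identifier -> identifiers of the child entities.
    final Map<Long, List<Long>> postCommentsRegion = new HashMap<>();
    // Natural id region: natural identifier -> entity identifier.
    final Map<String, Long> postSlugRegion = new HashMap<>();

    // The natural id is resolved to the entity identifier first,
    // and only then is the loaded state fetched from the entity region.
    Object[] postBySlug(String slug) {
        Long postId = postSlugRegion.get(slug);
        return postId == null ? null : postRegion.get(postId);
    }

    // The collection entry yields only identifiers; each one is resolved
    // against the entity region (a miss would mean a database query in practice).
    List<Object[]> commentsOf(Long postId) {
        List<Object[]> states = new ArrayList<>();
        for (Long commentId : postCommentsRegion.getOrDefault(postId, List.of())) {
            Object[] state = commentRegion.get(commentId);
            if (state != null) {
                states.add(state);
            }
        }
        return states;
    }

    public static void main(String[] args) {
        CacheRegionsSketch regions = new CacheRegionsSketch();
        regions.postRegion.put(1L, new Object[] {"Second-level cache"});
        regions.postSlugRegion.put("second-level-cache", 1L);
        regions.commentRegion.put(10L, new Object[] {"Great article"});
        regions.commentRegion.put(11L, new Object[] {"Thanks"});
        regions.postCommentsRegion.put(1L, List.of(10L, 11L));

        System.out.println(regions.postBySlug("second-level-cache")[0]); // prints Second-level cache
        System.out.println(regions.commentsOf(1L).size());               // prints 2
    }
}
```

The takeaway is that collection and natural id entries only store identifiers, so they are most effective when the referenced entities are cached as well.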
The JPA and Hibernate second-level cache is very useful when having to scale read-write transactions. Because the second-level cache is designed to be strongly consistent, you don’t have to worry that stale data is going to be served from the cache.
Moreover, you don’t have to worry about keeping track of database modifications in order to schedule cache updates either, because this is done transparently by Hibernate for you.