How to customize Hibernate dirty checking mechanism
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In my previous article I described the Hibernate automatic dirty checking mechanism. While you should always prefer it, there might be times when you want to add your own custom dirtiness detection strategy.
Custom dirty checking strategies
Hibernate offers the following customization mechanisms:
The anatomy of Hibernate dirty checking mechanism
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
The persistence context enqueues entity state transitions that get translated to database statements upon flushing. For managed entities, Hibernate can auto-detect incoming changes and schedule SQL UPDATES on our behalf. This mechanism is called automatic dirty checking.
The default dirty checking strategy
By default Hibernate checks all managed entity properties. Every time an entity is loaded, Hibernate makes an additional copy of all entity property values. At flush time, every managed entity property is matched against the loading-time snapshot value:
So the number of individual dirty checks is given by the following formula:
where
n = The number of managed entities
p = The number of properties of a given entity
Even if only one property of a single entity has ever changed, Hibernate will still check all managed entities. For a large number of managed entities, the default dirty checking mechanism may have a significant CPU and memory footprint. Since the initial entity snapshot is held separately, the persistence context requires twice as much memory as all managed entities would normally occupy.
How does AUTO flush strategy work in JPA and Hibernate
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
The Hibernate AUTO flush mode behaves differently whether you are bootstrapping Hibernate via JPA or using the stand-alone mechanism.
When using JPA, the AUTO flush mode causes all queries (JPQL, Criteria API, and native SQL) to trigger a flush prior to the query execution. However, this is not the case when bootstrapping Hibernate using the native API.
Not all queries trigger a Session flush
Many would assume that Hibernate always flushes the Session before any executing query. While this might have been a more intuitive approach, and probably closer to the JPA’s AUTO FlushModeType, Hibernate tries to optimize that. If the currently executed query is not going to hit the pending SQL INSERT/UPDATE/DELETE statements then the flush is not strictly required.
As stated in the reference documentation, the AUTO flush strategy may sometimes synchronize the current persistence context prior to a query execution. It would have been more intuitive if the framework authors had chosen to name it FlushMode.SOMETIMES.
A beginner’s guide to flush strategies in JPA and Hibernate
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In my previous post I introduced the entity state transitions Object-relational mapping paradigm.
All managed entity state transitions are translated to associated database statements when the current Persistence Context gets flushed. Hibernate’s flush behavior is not always as obvious as one might think.
Write-behind
Hibernate tries to defer the Persistence Context flushing up until the last possible moment. This strategy has been traditionally known as transactional write-behind.
The write-behind is more related to Hibernate flushing rather than any logical or physical transaction. During a transaction, the flush may occur multiple times.
The flushed changes are visible only for the current database transaction. Until the current transaction is committed, no change is visible by other concurrent transactions.
The persistence context, also known as the first level cache, acts as a buffer between the current entity state transitions and the database.
In caching theory, the write-behind synchronization requires that all changes happen against the cache, whose responsibility is to eventually synchronize with the backing store.
A beginner’s guide to entity state transitions with JPA and Hibernate
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
Hibernate shifts the developer mindset from SQL statements to entity state transitions. Once an entity is actively managed by Hibernate, all changes are going to be automatically propagated to the database.
Manipulating domain model entities (along with their associations) is much easier than writing and maintaining SQL statements. Without an ORM tool, adding a new column requires modifying all associated INSERT/UPDATE statements.
But Hibernate is no silver bullet either. Hibernate doesn’t free us from ever worrying about the actual executed SQL statements. Controlling Hibernate is not as straightforward as one might think and it’s mandatory to check all SQL statements Hibernate executes on our behalf.
Hibernate pooled and pooled-lo identifier generators
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In this post, we’ll uncover a sequence identifier generator combining identifier assignment efficiency and interoperability with other external systems (concurrently accessing the underlying database system).
Traditionally there have been two sequence identifier strategies to choose from: sequence
and seqhilo
.
The sequence
identifier, always hitting the database for every new value assignment. Even with database sequence preallocation, we have a significant database round-trip cost.
The seqhilo
identifier, using the hilo
algorithm. This generator calculates some identifier values in-memory, therefore reducing the database round-trip calls. The problem with this optimization technique is that the current database sequence value no longer reflects the current highest in-memory generated value.
The database sequence is used as a bucket number, making it difficult for other systems to interoperate with the database table in question. Other applications must know the inner-workings of the hilo
identifier strategy to properly generate non-clashing identifiers.
A beginner’s guide to Hibernate enhanced identifier generators
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
JPA identifier generators
JPA defines the following identifier strategies:
Strategy | Description |
---|---|
AUTO | The persistence provider picks the most appropriate identifier strategy supported by the underlying database |
IDENTITY | Identifiers are assigned by a database IDENTITY column |
SEQUENCE | The persistence provider uses a database sequence for generating identifiers |
TABLE | The persistence provider uses a separate database table to emulate a sequence object |
In my previous post I exampled the pros and cons of all these surrogate identifier strategies.
How do Identity, Sequence, and Table (sequence-like) generators work in JPA and Hibernate
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In my previous post I talked about different database identifier strategies. This post will compare the most common surrogate primary key strategies:
- IDENTITY
- SEQUENCE
- TABLE (SEQUENCE)
IDENTITY
The IDENTITY type (included in the SQL:2003 standard) is supported by:
The IDENTITY generator allows an integer/bigint column to be auto-incremented on demand. The increment process happens outside of the current running transaction, so a roll-back may end-up discarding already assigned values (value gaps may happen).
The increment process is very efficient since it uses a database internal lightweight locking mechanism as opposed to the more heavyweight transactional course-grain locks.
The only drawback is that we can’t know the newly assigned value prior to executing the INSERT statement. This restriction is hindering the transactional write-behind strategy adopted by Hibernate. For this reason, Hibernates cannot use JDBC batching when persisting entities that are using the IDENTITY generator.
Hibernate and UUID identifiers
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In this article, we are going to see how the UUID entity attributes are persisted when using JPA and Hibernate, for both assigned and auto-generated identifiers.
In my previous post I talked about UUID surrogate keys and the use cases when there are more appropriate than the more common auto-incrementing identifiers.
A UUID database type
There are several ways to represent a 128-bit UUID, and whenever in doubt I like to resort to Stack Exchange for an expert advice.
Because table identifiers are usually indexed, the more compact the database type the less space will the index require. From the most efficient to the least, here are our options:
- Some databases (PostgreSQL, SQL Server) offer a dedicated UUID storage type
- Otherwise we can store the bits as a byte array (e.g. RAW(16) in Oracle or the standard BINARY(16) type)
- Alternatively we can use 2 bigint (64-bit) columns, but a composite identifier is less efficient than a single column one
- We can store the hex value in a CHAR(36) column (e.g 32 hex values and 4 dashes), but this will take the most amount of space, hence it’s the least efficient alternative
Hibernate offers many identifier strategies to choose from and for UUID identifiers we have three options:
- the assigned generator accompanied by the application logic UUID generation
- the hexadecimal “uuid” string generator
- the more flexible “uuid2” generator, allowing us to use java.lang.UUID, a 16 byte array or a hexadecimal String value
The hi/lo algorithm
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In my previous post I talked about various database identifier strategies, you need to be aware of when designing the database model. We concluded that database sequences are very convenient because they are both flexible and efficient for most use cases.
But even with cached sequences, the application requires a database round-trip for every new the sequence value. If your applications demand a high number of insert operations per transaction, the sequence allocation may be optimized with a hi/lo algorithm.
The hi/lo algorithm
The hi/lo algorithms split the sequences domain into “hi” groups. A “hi” value is assigned synchronously. Every “hi” group is given a maximum number of “lo” entries, that can by assigned off-line without worrying about concurrent duplicate entries.
- The “hi” token is assigned by the database, and two concurrent calls are guaranteed to see unique consecutive values
- Once a “hi” token is retrieved we only need the “incrementSize” (the number of “lo” entries)
- The identifiers range is given by the following formula:
and the “lo” value will be taken from:
starting from
- When all “lo” values are used, a new “hi” value is fetched and the cycle continues
Here you can have an example of two concurrent transactions, each one inserting multiple entities: