How to migrate the hilo Hibernate identifier optimizer to the pooled strategy
Introduction
In this article, I’m going to show you how to migrate from the legacy hilo sequence-based identifier optimizer to the pooled Hibernate strategy.
I decided to write this article after having a discussion with Gerd Aschemann on Twitter about addressing the HHH-13783 Hibernate issue.
Default sequence identifier generator
Let’s assume we have the following Post entity, which uses the post_sequence database sequence generator to generate the entity identifiers automatically upon persist.
    @Entity(name = "Post")
    @Table(name = "post")
    public class Post {

        @Id
        @GeneratedValue(
            strategy = GenerationType.SEQUENCE,
            generator = "post_sequence"
        )
        @SequenceGenerator(
            name = "post_sequence",
            sequenceName = "post_sequence",
            allocationSize = 1
        )
        private Long id;

        private String title;

        //Getters and setters omitted for brevity
    }
Now, when inserting 4 Post entities:
    for (int i = 0; i < 4; i++) {
        Post post = new Post();
        post.setTitle(
            String.format(
                "High-Performance Java Persistence, Part %d",
                i + 1
            )
        );
        entityManager.persist(post);
    }
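Hibernate generates the following SQL statements. Note that CALL NEXT VALUE FOR is the HSQLDB sequence call syntax; on PostgreSQL, the equivalent call is SELECT nextval('post_sequence'):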
    CALL NEXT VALUE FOR post_sequence;
    CALL NEXT VALUE FOR post_sequence;
    CALL NEXT VALUE FOR post_sequence;
    CALL NEXT VALUE FOR post_sequence;

    -- Flushing the Persistence Context

    INSERT INTO post (title, id)
    VALUES ('High-Performance Java Persistence, Part 1', 1)

    INSERT INTO post (title, id)
    VALUES ('High-Performance Java Persistence, Part 2', 2)

    INSERT INTO post (title, id)
    VALUES ('High-Performance Java Persistence, Part 3', 3)

    INSERT INTO post (title, id)
    VALUES ('High-Performance Java Persistence, Part 4', 4)
When the persist method is called for each Post entity, Hibernate calls the post_sequence database sequence to generate the entity identifier value, which is needed for building the key under which the entity is going to be associated with the currently running Persistence Context (i.e., the first-level cache).
When flush is called by Hibernate prior to committing the database transaction, the Post entities are inserted in the database using the previously allocated identifier values.
Now, if we know that we are usually persisting more than one Post entity, then we could reduce the number of database sequence calls by generating multiple identifier values for a single database sequence value. And that’s exactly the use case for the sequence-based identifier optimizers.
Hilo optimizer
As I explained in this article, the Hilo optimizer works as illustrated by the following diagram:
With a single database sequence call, we can generate multiple identifier values in the application. The database sequence value represents the hi value while the lo value is incremented from 0 to the allocationSize value for each particular hi value.
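To make the hi/lo arithmetic concrete, here is a minimal sketch in plain Java. It is a simplified model, not Hibernate's actual implementation: the HiloSketch class and its fields are hypothetical names, the lambda stands in for the database sequence call, and lo is assumed to take the values 0 to incrementSize - 1 for each hi value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Simplified model of the hilo arithmetic; not Hibernate's actual implementation.
final class HiloSketch {

    private final LongSupplier sequence; // stands in for the database sequence call
    private final int incrementSize;     // plays the role of increment_size
    private long next;                   // next identifier to hand out
    private long upperLimit;             // first value outside the current range
    private boolean initialized;

    HiloSketch(LongSupplier sequence, int incrementSize) {
        this.sequence = sequence;
        this.incrementSize = incrementSize;
    }

    long generate() {
        if (!initialized || next >= upperLimit) {
            long hi = sequence.getAsLong();    // one database roundtrip
            upperLimit = hi * incrementSize + 1;
            next = upperLimit - incrementSize; // (hi - 1) * incrementSize + 1
            initialized = true;
        }
        return next++;
    }

    public static void main(String[] args) {
        // Assumed sequence: starts at 1 and increments by 1.
        long[] calls = {0};
        HiloSketch optimizer = new HiloSketch(() -> ++calls[0], 3);

        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            ids.add(optimizer.generate());
        }
        // prints "[1, 2, 3, 4] generated with 2 sequence calls"
        System.out.println(ids + " generated with " + calls[0] + " sequence calls");
    }
}
```

Note that the sequence values themselves (1 and 2) say nothing about the identifiers in use unless the reader also knows the increment size.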
So, let’s change the Post entity identifier to use the hilo generator:
    @Id
    @GeneratedValue(
        strategy = GenerationType.SEQUENCE,
        generator = "post_sequence"
    )
    @GenericGenerator(
        name = "post_sequence",
        strategy = "sequence",
        parameters = {
            @Parameter(name = "sequence_name", value = "post_sequence"),
            @Parameter(name = "initial_value", value = "1"),
            @Parameter(name = "increment_size", value = "3"),
            @Parameter(name = "optimizer", value = "hilo")
        }
    )
    private Long id;
When persisting the same 4 Post entities we created before, Hibernate is going to execute the following SQL statements for the hilo optimizer:
    CALL NEXT VALUE FOR post_sequence;
    CALL NEXT VALUE FOR post_sequence;

    -- Flushing the Persistence Context

    INSERT INTO post (title, id)
    VALUES ('High-Performance Java Persistence, Part 1', 1)

    INSERT INTO post (title, id)
    VALUES ('High-Performance Java Persistence, Part 2', 2)

    INSERT INTO post (title, id)
    VALUES ('High-Performance Java Persistence, Part 3', 3)

    INSERT INTO post (title, id)
    VALUES ('High-Performance Java Persistence, Part 4', 4)
So, only 2 database sequence calls were executed, as the first 3 Post entities used the first database sequence value of 1 to generate the entity identifiers with the values 1, 2, and 3. For the 4th Post entity, Hibernate needed a new database sequence call, and for the hi value of 2, Hibernate could generate the entity identifier values 4, 5, and 6.
However, the problem with hilo is that the database sequence value is not included in the boundaries of the generated entity identifiers. So, a third-party client, which might be unaware of the hilo strategy we are using, would not know what value to use for the next identifier, as the database sequence values have to be multiplied by the allocationSize. This is exactly the reason Hibernate introduced the pooled and pooled-lo optimizers.
Pooled optimizer
Starting with Hibernate 5, the pooled optimizer is the default sequence-based strategy used by Hibernate when the JPA entity identifier uses an allocationSize that’s greater than 1.
For this reason, using the pooled optimizer only requires providing the allocationSize via the @SequenceGenerator JPA annotation:
    @Id
    @GeneratedValue(
        strategy = GenerationType.SEQUENCE,
        generator = "post_sequence"
    )
    @SequenceGenerator(
        name = "post_sequence",
        sequenceName = "post_sequence",
        allocationSize = 3
    )
    private Long id;
As I explained in this article, the pooled optimizer works as illustrated by the following diagram:
So, when persisting the same 4 Post entities, Hibernate executes the same SQL statements that the hilo optimizer generated. However, this time, the post_sequence database sequence uses an INCREMENT BY step that’s equal to the allocationSize attribute of the @SequenceGenerator annotation:
CREATE SEQUENCE post_sequence START 1 INCREMENT 3
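The pooled arithmetic can be sketched in plain Java as well. This is a simplified model based on my understanding of how the pooled optimizer behaves, not Hibernate's actual code: the PooledSketch class is a hypothetical name, the lambda imitates a sequence created with START 1 INCREMENT 3, and the extra sequence call on first use of a fresh sequence is an assumption of the model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Simplified model of the pooled arithmetic; not Hibernate's actual implementation.
final class PooledSketch {

    private final LongSupplier sequence; // stands in for the database sequence call
    private final int incrementSize;     // plays the role of allocationSize
    private final long initialValue;     // the START value of the sequence
    private long next;                   // next identifier to hand out
    private long hi;                     // last identifier of the current pool
    private boolean initialized;

    PooledSketch(LongSupplier sequence, int incrementSize, long initialValue) {
        this.sequence = sequence;
        this.incrementSize = incrementSize;
        this.initialValue = initialValue;
    }

    long generate() {
        if (!initialized) {
            long value = sequence.getAsLong();
            if (value == initialValue) {
                // Fresh sequence (assumption): the first value opens the pool,
                // and a second call provides its upper bound.
                next = value;
                hi = sequence.getAsLong();
            } else {
                // The sequence value is the upper bound of the pool.
                hi = value;
                next = hi - (incrementSize - 1);
            }
            initialized = true;
        } else if (next > hi) {
            hi = sequence.getAsLong();
            next = hi - (incrementSize - 1);
        }
        return next++;
    }

    public static void main(String[] args) {
        // state[0] = last sequence value, state[1] = number of sequence calls
        long[] state = {1 - 3, 0};
        LongSupplier postSequence = () -> {
            state[1]++;
            state[0] += 3;
            return state[0];
        };
        PooledSketch optimizer = new PooledSketch(postSequence, 3, 1);

        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            ids.add(optimizer.generate());
        }
        // prints "[1, 2, 3, 4] generated with 2 sequence calls"
        System.out.println(ids + " generated with " + state[1] + " sequence calls");
    }
}
```

Unlike hilo, every value returned by the sequence (1, 4, 7, and so on) is itself a boundary of the identifiers being generated, so a client that calls the sequence directly gets a value that no pool will hand out again.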
Migrate from the Hibernate hilo to pooled optimizer
So, we cannot just change the JPA annotations when migrating from the legacy hilo to the more interoperable pooled optimizer. We also need to change the underlying database sequence.
If we change only the entity mapping and leave the database sequence as it is, Hibernate is going to throw the following MappingException:
    javax.persistence.PersistenceException: [PersistenceUnit: ]
        Unable to build Hibernate SessionFactory

    Caused by: org.hibernate.MappingException:
        Could not instantiate id generator
        [entity-name=com.vladmihalcea.book.hpjp.hibernate.identifier.Post]

    Caused by: org.hibernate.MappingException:
        The increment size of the [post_sequence] sequence is set to [3]
        in the entity mapping while the associated database sequence
        increment size is [1].
Luckily, this can be done very easily with just 2 SQL statements that need to run prior to bootstrapping Hibernate. Usually, this is done via migration scripts which are run by a tool like Flyway:
    SELECT setval('post_sequence', (SELECT MAX(id) FROM post) + 1);

    ALTER SEQUENCE post_sequence INCREMENT BY 3;
Note that these 2 SQL statements, which change the database sequence according to the pooled optimizer requirements, were written for PostgreSQL. For other relational database systems, you need to modify those statements to match the database-specific DDL syntax of the RDBMS used by your application.
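To see why the sequence has to be moved past the maximum identifier before changing the increment, the following sketch works through the arithmetic for the example in this article. It is only an illustration under stated assumptions: the MigrationCheck class and its methods are hypothetical, the pool is assumed to span from the sequence value minus (increment - 1) up to the sequence value, and the next sequence value is assumed to be the last value plus the increment, which is how PostgreSQL behaves for a sequence that has already been called.

```java
// Illustrative arithmetic only; the class and method names are hypothetical.
final class MigrationCheck {

    // Value the next sequence call returns for an already-called sequence.
    static long nextVal(long lastValue, int incrementSize) {
        return lastValue + incrementSize;
    }

    // Lowest identifier of the pool derived from a sequence value.
    static long poolStart(long sequenceValue, int incrementSize) {
        return sequenceValue - (incrementSize - 1);
    }

    // True if the first pool after the migration overlaps existing identifiers.
    static boolean collides(long lastValue, int incrementSize, long maxId) {
        return poolStart(nextVal(lastValue, incrementSize), incrementSize) <= maxId;
    }

    public static void main(String[] args) {
        int incrementSize = 3;
        long maxId = 4;         // 4 Post entities were persisted with hilo
        long hiloLastValue = 2; // hilo consumed the sequence values 1 and 2

        // Only ALTER SEQUENCE: next value is 5, so the pool is 3..5.
        System.out.println(collides(hiloLastValue, incrementSize, maxId)); // prints "true"

        // After setval to MAX(id) + 1 = 5: next value is 8, so the pool is 6..8.
        System.out.println(collides(maxId + 1, incrementSize, maxId));     // prints "false"
    }
}
```

Under these assumptions, the first pool after the migration always starts at MAX(id) + 2, so it cannot overlap the identifiers that hilo has already assigned.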
That’s it! Now, you can use the pooled optimizer instead of the hilo one, and everything should work like a charm.
Conclusion
While the hilo optimizer can optimize the number of database sequence calls, you should favor using the pooled or pooled-lo optimizers as they are interoperable with third-party systems or clients that might be unaware of the hilo strategy used by the application logic.
So, when migrating from hilo to pooled, besides updating the JPA entity identifier mapping, you need to change the database sequence so that it starts from a value that’s greater than the maximum table Primary Key value, and change the sequence increment step to match the allocationSize attribute.
