Hibernate Batch Sequence Generator
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In this article, I’m going to introduce the Hibernate Batch Sequence Generator implementation that’s provided by Philippe Marschall to the Hypersistence Utils project
Maven Dependency
The Hibernate Batch Sequence Generator is available on Maven Central, so the first thing we need to do is add the Hypersistence Utils dependency. For instance, if you are using Maven, then you need to add the following dependency to your project pom.xml
configuration file:
For Hibernate 6:
<dependency> <groupId>io.hypersistence</groupId> <artifactId>hypersistence-utils-hibernate-60</artifactId> <version>${hypersistence-utils.version}</version> </dependency>
Or for Hibernate 5.5 and 5.4:
<dependency> <groupId>io.hypersistence</groupId> <artifactId>hypersistence-utils-hibernate-55</artifactId> <version>${hypersistence-utils.version}</version> </dependency>
For older Hibernate ORM versions, you can use the hypersistence-utils-hibernate-52
or hypersistence-utils-hibernate-5
. The Hypersistence Utils documentation provides more details about which dependency you should use based on the Hibernate ORM version used by your project.
Domain Model
Let’s assume we have the following post
database table:
The id
column is the Primary Key, and to generate monotonically increasing identifiers, there is the following post_sequence
database schema created by the DBA:
CREATE SEQUENCE post_sequence INCREMENT BY 1 START WITH 1 CACHE 5
JPA Entity Model
We now want to map the post
database table to the following Post
entity class:
And, for this reason, we are going to use the following JPA entity mapping:
@Entity(name = "Post") @Table(name = "post") public class Post { @Id @GeneratedValue( strategy = GenerationType.SEQUENCE, generator = "post_sequence" ) @SequenceGenerator( name = "post_sequence", allocationSize = 1 ) private Long id; private String title; public Long getId() { return id; } public Post setId(Long id) { this.id = id; return this; } public String getTitle() { return title; } public Post setTitle(String title) { this.title = title; return this; } }
Post Batch Inserts
We have a task that requires us to insert multiple Post
entities, and to speed up the process we are going to enable the automatic JDBC batch inserts using the following Hibernate properties:
<property name="hibernate.jdbc.batch_size" value="5"/>
The hibernate.jdbc.batch_size
sets the default JDBC PreparedStatement
batch size to the value of 5, meaning that we can insert up to 5 post
table rows using a single database INSERT statement that has 5 associated sets of bind parameters values.
The Post
entity batch processing task is implemented like this:
for (int i = 1; i <= POST_SIZE; i++) { if(i % BATCH_SIZE == 0) { EntityTransaction entityTransaction = entityManager.getTransaction(); entityTransaction.commit(); entityTransaction.begin(); entityManager.clear(); } entityManager.persist( new Post() .setTitle( String.format( "High-Performance Java Persistence, Chapter %d", i ) ) ); }
As I explained in this article, it’s good practice to commit the database transaction after processing a batch to avoid long-running transactions and reduce the impact of a failure on the final batch process outcome.
When inserting 10 Post
entities, we see that Hibernate generates the following statements:
SELECT nextval('post_sequence') SELECT nextval('post_sequence') SELECT nextval('post_sequence') SELECT nextval('post_sequence') SELECT nextval('post_sequence') INSERT INTO post (title, id) VALUES (High-Performance Java Persistence, Chapter 1, 1), (High-Performance Java Persistence, Chapter 2, 2), (High-Performance Java Persistence, Chapter 3, 3), (High-Performance Java Persistence, Chapter 4, 4), (High-Performance Java Persistence, Chapter 5, 5) SELECT nextval('post_sequence') SELECT nextval('post_sequence') SELECT nextval('post_sequence') SELECT nextval('post_sequence') SELECT nextval('post_sequence') INSERT INTO post (title, id) VALUES (High-Performance Java Persistence, Chapter 6, 6), (High-Performance Java Persistence, Chapter 7, 7), (High-Performance Java Persistence, Chapter 8, 8), (High-Performance Java Persistence, Chapter 9, 9), (High-Performance Java Persistence, Chapter 10, 10)
As expected, the UPDATE statements are batched properly. However, for every persisted entity, there’s a database sequence call roundtrip that gets executed.
Hibernate Sequence Optimizers
Hibernate provides several sequence optimizers that aim to reduce the number of sequence calls:
Because the post_sequence
has an increment size of 1, we cannot use the pooled
and pooled_lo
optimizers, which require the increment size to match the number of entity identifiers that could be generated with a single sequence call.
And, while the hilo
optimizer can use one post_sequence
call to generate multiple entity identifiers, the generated post
identifier values will be greater than the current sequence number, making it difficult for other systems t figure out what value to generate when inserting a post
table record.
Hibernate Batch Sequence Generator
It was a cold winter morning, and I was in Zürich ready to run my High-Performance Java Persistence training for a Swiss company. That’s when I met Philippe Marschall, who is the author of the clever Hibernate Batch Sequence Generator.
Three years later, during an online workshop, one of my students mentioned the Hibernate Batch Sequence Generator, and, when checking it out, I saw that it was developed by Philippe Marschall. What a small world!
To use the Hibernate Batch Sequence Generator, we need to change the Post
entity identifier mapping like this:
@Id @GeneratedValue( strategy = GenerationType.SEQUENCE, generator = "post_sequence" ) @GenericGenerator( name = "post_sequence", strategy = "io.hypersistence.utils.hibernate.id.BatchSequenceGenerator", parameters = { @Parameter(name = "sequence", value = "post_sequence"), @Parameter(name = "fetch_size", value = "5") } ) private Long id;
With the new identifier generator in place, let’s see what SQL statements are generated by our batch processing task:
WITH RECURSIVE t(n, level_num) AS ( SELECT nextval ('post_sequence') AS n, 1 AS level_num UNION ALL SELECT nextval ('post_sequence') AS n, level_num + 1 AS level_num FROM t WHERE level_num < 5 ) SELECT n FROM t INSERT INTO post (title, id) VALUES (High-Performance Java Persistence, Chapter 1, 1), (High-Performance Java Persistence, Chapter 2, 2), (High-Performance Java Persistence, Chapter 3, 3), (High-Performance Java Persistence, Chapter 4, 4), (High-Performance Java Persistence, Chapter 5, 5) WITH RECURSIVE t(n, level_num) AS ( SELECT nextval ('post_sequence') AS n, 1 AS level_num UNION ALL SELECT nextval ('post_sequence') AS n, level_num + 1 AS level_num FROM t WHERE level_num < 5 ) SELECT n FROM t INSERT INTO post (title, id) VALUES (High-Performance Java Persistence, Chapter 6, 6), (High-Performance Java Persistence, Chapter 7, 7), (High-Performance Java Persistence, Chapter 8, 8), (High-Performance Java Persistence, Chapter 9, 9), (High-Performance Java Persistence, Chapter 10, 10)
The WITH RECURSIVE
query calls the database sequence 5 times, according to the fetch_size
attribute we configured when mapping the Post
entity identifier.
Because the post_sequence
has a CACHE
value of 5
, PostgreSQL preallocates and stores in memory 5
sequence values so the WITH RECURSIVE
query is going to execute very fast.
Awesome, right?
If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
The Hibernate Batch Sequence Generator is a very clever implementation and provides several benefits. Since the database sequence uses an increment size of 1, we can use different implementations for front-end and batch processor applications.
The front-end application can use a Post
identifier mapping that doesn’t use any optimizer since users create a single Post
entity at a time.
Batch processors can use a Post
identifier mapping that uses the Hibernate Batch Sequence Generator instead. This is possible since both options work with a database sequence that has the same increment size.
So, this approach allows us to reduce the number of sequence values that would otherwise get lost by using the default pooled
and pooled-lo
Hibernate optimizers on the front end.
Another advantage is that we can use more compact identifier column values. So, if you have an entity that will never get close to 65535
records, we can use a smallint
column and save a lot of space for Primary Keys, Foreign Key, and database indexes as well.
