Hibernate Batch Sequence Generator

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

In this article, I’m going to introduce the Hibernate Batch Sequence Generator implementation that’s provided by Philippe Marschall to the Hypersistence Utils project

Maven Dependency

The Hibernate Batch Sequence Generator is available on Maven Central, so the first thing we need to do is add the Hypersistence Utils dependency. For instance, if you are using Maven, then you need to add the following dependency to your project pom.xml configuration file:

For Hibernate 6:

<dependency>
    <groupId>io.hypersistence</groupId>
    <artifactId>hypersistence-utils-hibernate-60</artifactId>
    <version>${hypersistence-utils.version}</version>
</dependency>

Or for Hibernate 5.5 and 5.4:

<dependency>
    <groupId>io.hypersistence</groupId>
    <artifactId>hypersistence-utils-hibernate-55</artifactId>
    <version>${hypersistence-utils.version}</version>
</dependency>

For older Hibernate ORM versions, you can use the hypersistence-utils-hibernate-52 or hypersistence-utils-hibernate-5. The Hypersistence Utils documentation provides more details about which dependency you should use based on the Hibernate ORM version used by your project.

Domain Model

Let’s assume we have the following post database table:

Post table bigint id

The id column is the Primary Key, and to generate monotonically increasing identifiers, there is the following post_sequence database schema created by the DBA:

CREATE SEQUENCE post_sequence
INCREMENT BY 1
START WITH 1
CACHE 5

JPA Entity Model

We now want to map the post database table to the following Post entity class:

Post entity

And, for this reason, we are going to use the following JPA entity mapping:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue(
        strategy = GenerationType.SEQUENCE,
        generator = "post_sequence"
    )
    @SequenceGenerator(
        name = "post_sequence", 
        allocationSize = 1
    )
    private Long id;

    private String title;

    public Long getId() {
        return id;
    }

    public Post setId(Long id) {
        this.id = id;
        return this;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }
}

Post Batch Inserts

We have a task that requires us to insert multiple Post entities, and to speed up the process we are going to enable the automatic JDBC batch inserts using the following Hibernate properties:

<property name="hibernate.jdbc.batch_size" value="5"/>

The hibernate.jdbc.batch_size sets the default JDBC PreparedStatement batch size to the value of 5, meaning that we can insert up to 5 post table rows using a single database INSERT statement that has 5 associated sets of bind parameters values.

The Post entity batch processing task is implemented like this:

for (int i = 1; i <= POST_SIZE; i++) {
    if(i % BATCH_SIZE == 0) {
        EntityTransaction entityTransaction = entityManager.getTransaction();
        entityTransaction.commit();
        entityTransaction.begin();
        
        entityManager.clear();
    }
    entityManager.persist(
        new Post()
            .setTitle(
                String.format(
                    "High-Performance Java Persistence, Chapter %d",
                    i
                )
            )
    );
}

As I explained in this article, it’s good practice to commit the database transaction after processing a batch to avoid long-running transactions and reduce the impact of a failure on the final batch process outcome.

When inserting 10 Post entities, we see that Hibernate generates the following statements:

SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')

INSERT INTO post 
    (title, id) 
VALUES 
    (High-Performance Java Persistence, Chapter 1, 1), 
    (High-Performance Java Persistence, Chapter 2, 2), 
    (High-Performance Java Persistence, Chapter 3, 3), 
    (High-Performance Java Persistence, Chapter 4, 4), 
    (High-Performance Java Persistence, Chapter 5, 5)
    
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')

INSERT INTO post 
    (title, id) 
VALUES 
    (High-Performance Java Persistence, Chapter 6, 6), 
    (High-Performance Java Persistence, Chapter 7, 7), 
    (High-Performance Java Persistence, Chapter 8, 8), 
    (High-Performance Java Persistence, Chapter 9, 9), 
    (High-Performance Java Persistence, Chapter 10, 10)

As expected, the UPDATE statements are batched properly. However, for every persisted entity, there’s a database sequence call roundtrip that gets executed.

Hibernate Sequence Optimizers

Hibernate provides several sequence optimizers that aim to reduce the number of sequence calls:

Because the post_sequence has an increment size of 1, we cannot use the pooled and pooled_lo optimizers, which require the increment size to match the number of entity identifiers that could be generated with a single sequence call.

And, while the hilo optimizer can use one post_sequence call to generate multiple entity identifiers, the generated post identifier values will be greater than the current sequence number, making it difficult for other systems t figure out what value to generate when inserting a post table record.

Hibernate Batch Sequence Generator

It was a cold winter morning, and I was in Zürich ready to run my High-Performance Java Persistence training for a Swiss company. That’s when I met Philippe Marschall, who is the author of the clever Hibernate Batch Sequence Generator.

Three years later, during an online workshop, one of my students mentioned the Hibernate Batch Sequence Generator, and, when checking it out, I saw that it was developed by Philippe Marschall. What a small world!

To use the Hibernate Batch Sequence Generator, we need to change the Post entity identifier mapping like this:

@Id
@GeneratedValue(
    strategy = GenerationType.SEQUENCE,
    generator = "post_sequence"
)
@GenericGenerator(
    name = "post_sequence",
    strategy = "io.hypersistence.utils.hibernate.id.BatchSequenceGenerator",
    parameters = {
        @Parameter(name = "sequence", value = "post_sequence"),
        @Parameter(name = "fetch_size", value = "5")
    }
)
private Long id;

With the new identifier generator in place, let’s see what SQL statements are generated by our batch processing task:

WITH RECURSIVE t(n, level_num) AS (
    SELECT 
        nextval ('post_sequence') AS n, 
        1 AS level_num 
    UNION ALL 
    SELECT 
        nextval ('post_sequence') AS n, 
        level_num + 1 AS level_num 
    FROM t 
    WHERE level_num < 5
) 
SELECT n FROM t

INSERT INTO post 
    (title, id) 
VALUES 
    (High-Performance Java Persistence, Chapter 1, 1), 
    (High-Performance Java Persistence, Chapter 2, 2), 
    (High-Performance Java Persistence, Chapter 3, 3), 
    (High-Performance Java Persistence, Chapter 4, 4), 
    (High-Performance Java Persistence, Chapter 5, 5)

WITH RECURSIVE t(n, level_num) AS (
    SELECT 
        nextval ('post_sequence') AS n, 
        1 AS level_num 
    UNION ALL 
    SELECT 
        nextval ('post_sequence') AS n, 
        level_num + 1 AS level_num 
    FROM t 
    WHERE level_num < 5
) 
SELECT n FROM t

INSERT INTO post 
    (title, id) 
VALUES 
    (High-Performance Java Persistence, Chapter 6, 6), 
    (High-Performance Java Persistence, Chapter 7, 7), 
    (High-Performance Java Persistence, Chapter 8, 8), 
    (High-Performance Java Persistence, Chapter 9, 9), 
    (High-Performance Java Persistence, Chapter 10, 10)

The WITH RECURSIVE query calls the database sequence 5 times, according to the fetch_size attribute we configured when mapping the Post entity identifier.

Because the post_sequence has a CACHE value of 5, PostgreSQL preallocates and stores in memory 5 sequence values so the WITH RECURSIVE query is going to execute very fast.

Awesome, right?

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Seize the deal! 40% discount. Seize the deal! 40% discount. Seize the deal! 40% discount.

Conclusion

The Hibernate Batch Sequence Generator is a very clever implementation and provides several benefits. Since the database sequence uses an increment size of 1, we can use different implementations for front-end and batch processor applications.

The front-end application can use a Post identifier mapping that doesn’t use any optimizer since users create a single Post entity at a time.

Batch processors can use a Post identifier mapping that uses the Hibernate Batch Sequence Generator instead. This is possible since both options work with a database sequence that has the same increment size.

So, this approach allows us to reduce the number of sequence values that would otherwise get lost by using the default pooled and pooled-lo Hibernate optimizers on the front end.

Another advantage is that we can use more compact identifier column values. So, if you have an entity that will never get close to 65535 records, we can use a smallint column and save a lot of space for Primary Keys, Foreign Key, and database indexes as well.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.