Hibernate Batch Sequence Generator

If you are trading Stocks and Crypto using Revolut, then you are going to love RevoGain!

Introduction

In this article, I’m going to introduce the Hibernate Batch Sequence Generator implementation that’s provided by Philippe Marschall to the Hibernate Types project

Maven Dependency

The Hibernate Batch Sequence Generator is available on Maven Central, so the first thing we need to do is add the Hibernate Types dependency. For instance, if you are using Maven, then you need to add the following dependency into your project pom.xml configuration file:

<dependency>
    <groupId>com.vladmihalcea</groupId>
    <artifactId>hibernate-types-55</artifactId>
    <version>${hibernate-types.version}</version>
</dependency>

For older Hibernate ORM versions, you can use the hibernate-types-5, hibernate-types-43, or hibernate-types-4 dependencies. The Hibernate Types project documentation provides more details about which dependency you should use based on the Hibernate ORM version used by your project.

Domain Model

Let’s assume we have the following post database table:

Post table bigint id

The id column is the Primary Key, and to generate monotonically increasing identifiers, there is the following post_sequence database schema created by the DBA:

CREATE SEQUENCE post_sequence
INCREMENT BY 1
START WITH 1
CACHE 5

JPA Entity Model

We now want to map the post database table to the following Post entity class:

Post entity

And, for this reason, we are going to use the following JPA entity mapping:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue(
        strategy = GenerationType.SEQUENCE,
        generator = "post_sequence"
    )
    @SequenceGenerator(
        name = "post_sequence", 
        allocationSize = 1
    )
    private Long id;

    private String title;

    public Long getId() {
        return id;
    }

    public Post setId(Long id) {
        this.id = id;
        return this;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }
}

Post Batch Inserts

We have a task that requires us to insert multiple Post entities, and to speed up the process we are going to enable the automatic JDBC batch inserts using the following Hibernate properties:

<property name="hibernate.jdbc.batch_size" value="5"/>

The hibernate.jdbc.batch_size sets the default JDBC PreparedStatement batch size to the value of 5, meaning that we can insert up to 5 post table rows using a single database INSERT statement that has 5 associated sets of bind parameters values.

The Post entity batch processing task is implemented like this:

for (int i = 1; i <= POST_SIZE; i++) {
    if(i % BATCH_SIZE == 0) {
        EntityTransaction entityTransaction = entityManager.getTransaction();
        entityTransaction.commit();
        entityTransaction.begin();
        
        entityManager.clear();
    }
    entityManager.persist(
        new Post()
            .setTitle(
                String.format(
                    "High-Performance Java Persistence, Chapter %d",
                    i
                )
            )
    );
}

As I explained in this article, it’s good practice to commit the database transaction after processing a batch to avoid long-running transactions and reduce the impact of a failure on the final batch process outcome.

When inserting 10 Post entities, we see that Hibernate generates the following statements:

SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')

INSERT INTO post 
    (title, id) 
VALUES 
    (High-Performance Java Persistence, Chapter 1, 1), 
    (High-Performance Java Persistence, Chapter 2, 2), 
    (High-Performance Java Persistence, Chapter 3, 3), 
    (High-Performance Java Persistence, Chapter 4, 4), 
    (High-Performance Java Persistence, Chapter 5, 5)
    
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')
SELECT nextval('post_sequence')

INSERT INTO post 
    (title, id) 
VALUES 
    (High-Performance Java Persistence, Chapter 6, 6), 
    (High-Performance Java Persistence, Chapter 7, 7), 
    (High-Performance Java Persistence, Chapter 8, 8), 
    (High-Performance Java Persistence, Chapter 9, 9), 
    (High-Performance Java Persistence, Chapter 10, 10)

As expected, the UPDATE statements are batched properly. However, for every persisted entity, there’s a database sequence call roundtrip that gets executed.

Hibernate Sequence Optimizers

Hibernate provides several sequence optimizers that aim to reduce the number of sequence calls:

Because the post_sequence has an increment size of 1, we cannot use the pooled and pooled_lo optimizers, which require the increment size to match the number of entity identifiers that could be generated with a single sequence call.

And, while the hilo optimizer can use one post_sequence call to generate multiple entity identifiers, the generated post identifier values will be greater than the current sequence number, making it difficult for other systems t figure out what value to generate when inserting a post table record.

Hibernate Batch Sequence Generator

It was a cold winter morning, and I was in Zürich ready to run my High-Performance Java Persistence training for a Swiss company. That’s when I met Philippe Marschall, who is the author of the clever Hibernate Batch Sequence Generator.

Three years later, during an online workshop, one of my students mentioned the Hibernate Batch Sequence Generator, and, when checking it out, I saw it’s developed by Philippe Marschall. What a small world!

To use the Hibernate Batch Sequence Generator, we need to change the Post entity identifier mapping like this:

@Id
@GeneratedValue(
    strategy = GenerationType.SEQUENCE,
    generator = "post_sequence"
)
@GenericGenerator(
    name = "post_sequence",
    strategy = "com.vladmihalcea.hibernate.id.BatchSequenceGenerator",
    parameters = {
        @Parameter(name = "sequence", value = "post_sequence"),
        @Parameter(name = "fetch_size", value = "5")
    }
)
private Long id;

With the new identifier generator in place, let’s see what SQL statements are generated by our batch processing task:

WITH RECURSIVE t(n, level_num) AS (
    SELECT 
        nextval ('post_sequence') AS n, 
        1 AS level_num 
    UNION ALL 
    SELECT 
        nextval ('post_sequence') AS n, 
        level_num + 1 AS level_num 
    FROM t 
    WHERE level_num < 5
) 
SELECT n FROM t

INSERT INTO post 
    (title, id) 
VALUES 
    (High-Performance Java Persistence, Chapter 1, 1), 
    (High-Performance Java Persistence, Chapter 2, 2), 
    (High-Performance Java Persistence, Chapter 3, 3), 
    (High-Performance Java Persistence, Chapter 4, 4), 
    (High-Performance Java Persistence, Chapter 5, 5)

WITH RECURSIVE t(n, level_num) AS (
    SELECT 
        nextval ('post_sequence') AS n, 
        1 AS level_num 
    UNION ALL 
    SELECT 
        nextval ('post_sequence') AS n, 
        level_num + 1 AS level_num 
    FROM t 
    WHERE level_num < 5
) 
SELECT n FROM t

INSERT INTO post 
    (title, id) 
VALUES 
    (High-Performance Java Persistence, Chapter 6, 6), 
    (High-Performance Java Persistence, Chapter 7, 7), 
    (High-Performance Java Persistence, Chapter 8, 8), 
    (High-Performance Java Persistence, Chapter 9, 9), 
    (High-Performance Java Persistence, Chapter 10, 10)

The WITH RECURSIVE query calls the database sequence 5 times, according to the fetch_size attribute we configured when mapping the Post entity identifier.

Because the post_sequence has a CACHE value of 5, PostgreSQL preallocates and stores in memory 5 sequence values so the WITH RECURSIVE query is going to execute very fast.

Awesome, right?

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

The Hibernate Batch Sequence Generator is a very clever implementation and provides several benefits. Since the database sequence uses an increment size of 1, we can use different implementations for front-end and batch processor applications.

The front-end application can use a Post identifier mapping that doesn’t use any optimizer since users create a single Post entity at a time.

Batch processors can use a Post identifier mapping that uses the Hibernate Batch Sequence Generator instead. This is possible since both options work with a database sequence that has the same increment size.

So, this approach allows us to reduce the number of sequence values that would get otherwise lost by using the default pooled and pooled-lo Hibernate optimizers on the front-end.

Another advantage is that we can use more compact identifier column values. So, if you have an entity that will never get close to 65535 records, we can use a smallint column and save a lot of space for Primary Keys, Foreign Key, and database indexes as well.

Transactions and Concurrency Control eBook

2 Comments on “Hibernate Batch Sequence Generator

  1. Very nice article, Vlad.

    I liked the batch sequence generator, it’s a very clever solution created by Philippe Marschall. Thanks for sharing it and adding it to Hibernate Types project.

    I have 2 points I’d like to comment on:

    I thought that just setting allocationSize to any value bigger than 1 would be enough to avoid many roundtrips to the database. But I think we’d have to set the INCREMENT BY parameter in the sequence as well, right?
    Just out of curiosity, did you know that the CACHE parameter in sequence works differently in Oracle and PostgreSQL? Maybe it can be an issue if the batch processing logic uses different sessions/connections between the batches. What do you think? https://seiler.us/2018-10-02-sequence-caching-oracle-vs-postgresql/

    thanks again, Vlad!

    • The allocationSize in the JPA sequence generator must match the DB sequence increment by value. Otherwise, it will not work with the default pooled optimizer.

      I didn’t know about the CACHE parameter behavior. Thanks for the tip. I’ll study it further.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.