How to generate JPA entity identifier values using a database sequence

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

In this article, I’m going to show you how you can generate entity identifier values using a database sequence when using JPA and Hibernate.

Using a database sequence is the most efficient Hibernate identifier generation strategy, as it allows you to take advantage of the automatic JDBC batching mechanism.

JPA entity identifier annotations

The JPA specification defines the following annotations and Enum values we can use to customize the entity identifier generation process:

JPA SequenceGenerator annotations

The @Id annotation is mandatory for entities, and it must be mapped to a table column that has a unique constraint. Most often, the @Id annotation is mapped to the Primary Key table column.

Without specifying a @GeneratedValue annotation, entity identifiers must be assigned manually. If the entity can use a natural identifier assigned by a third-party, then you don’t need to use the @GeneratedValue since the identifiers will be set manually prior to persisting the entity.

The GenerationType Enum defines four entity identifier generation strategies:

  • IDENTITY allows using a table identity column, like the MySQL AUTO_INCREMENT. For JPA and Hibernate, you should prefer using SEQUENCE if the relational database supports it because Hibernate cannot use automatic JDBC batching when persisting entities using the IDENTITY generator.
  • SEQUENCE allows using a database sequence object to generate identifier values. This is the best generation strategy when using JPA and Hibernate.
  • TABLE emulates the database sequence generator using a separate table. As I explained in this article, this is a terrible strategy, and you shouldn’t use it.
  • AUTO picks any of the previous strategy based on the underlying database capabilities.

The @SequenceGenerator annotation allows you to customize the database sequence generation process.

JPA @GeneratedValue entity identifier with SEQUENCE strategy

Let’s consider we have the following Post entity, which uses the @GeneratedValue annotation on the @Id attribute with the strategy of SEQUENCE.

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue(
        strategy = GenerationType.SEQUENCE
    )
    private Long id;

    private String title;

    public Long getId() {
        return id;
    }

    public Post setId(Long id) {
        this.id = id;
        return this;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }
}

If we generate the database schema using the hbm2ddl tool, then we can see what database schema is expected by Hibernate for this Post entity mapping:

CREATE SEQUENCE hibernate_sequence START 1 INCREMENT 1

CREATE TABLE post (
    id INT8 NOT NULL, 
    title VARCHAR(255), 
    PRIMARY KEY (id)
)

Notice that, by default, the hibernate_sequence is used for all entities using the SEQUENCE identifier generation strategy without an explicit database sequence name.

When persisting 5 Post entities:

for (int i = 1; i <= 5; i++) {
    entityManager.persist(
        new Post().setTitle(
            String.format(
                "High-Performance Java Persistence, Part %d",
                i
            )
        )
    );
}

Hibernate generates the following SQL statements:

SELECT nextval('hibernate_sequence')
SELECT nextval('hibernate_sequence')
SELECT nextval('hibernate_sequence')
SELECT nextval('hibernate_sequence')
SELECT nextval('hibernate_sequence')

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 1', 1)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 2', 2)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 3', 3)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 4', 4)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 5', 5)

Notice that the hibernate_sequence was called five times since, by default, no sequence call optimizer is used. The sequence is called by the persist method because the entity identifier is needed in order to create the key under which the entity is being stored in the Persistence Context.

On the other hand, the INSERT statements are executed when flushing the Persistence Context, as the persist method only schedules the EntityInsertAction.

JPA @SequenceGenerator

If you want to customize the sequence generation process, you need to use the JPA specification @SequenceGenerator annotation.

For instance, if we want to specify a custom database sequence name, like seq_post, which uses an increment step of 5, then we can use the following @SequenceGenerator configuration:

@Id
@GeneratedValue(
    strategy = GenerationType.SEQUENCE,
    generator = "seq_post"
)
@SequenceGenerator(
    name = "seq_post",
    allocationSize = 5
)
private Long id;

The generator attribute of the @GeneratedValue annotation references the name attribute of the @SequenceGenerator annotation.

The name attribute of the @SequenceGenerator is also used to reference the database sequence object that’s called to get the new entity identifiers.

The allocationSize is used to instruct the JPA provider the number of values that can be allocated by the application using a single database sequence call.

So, for the aforementioned mapping, Hibernate generates the following database schema:

CREATE SEQUENCE seq_post START 1 INCREMENT 5

CREATE TABLE post (
    id INT8 NOT NULL, 
    title VARCHAR(255), 
    PRIMARY KEY (id)
)

Notice that the sequence name matches the name attribute of the @SequenceGenerator annotation, and the sequence increment step is 5. This means that the sequence will allocate the values of 1, 6, 11, 16, and so on. For this reason, Hibernate can allocate the values of 2, 3, 4, 5 without needing to call the database sequence, therefore reducing the number of database round trips.

When persisting the same 5 Post entities, Hibernate generates the following SQL statements:

SELECT nextval('seq_post')
SELECT nextval('seq_post')

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 1', 1)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 2', 2)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 3', 3)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 4', 4)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 5', 5)

As I explained in this article, when the allocationSize is greater than 1, Hibernate uses the pooled optimizer.

When using the pooled optimizer, the database sequence value represents the highest value that can be allocated by Hibernate when persisting an entity.

The first sequence call gives the value of 1, so the first Post entity gets that value. Now, when persisting the second Post entity, Hibernate needs to call the sequence again, and it will get the value of 6, so it can generate the identifier values of 2, 3, 4, 5, and 6 without needing any other database sequence call.

I'm running an online workshop on the 11th of October about High-Performance SQL.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

Using a database sequence object to generate JPA entity identifier values is the best strategy since it allows you to benefit from automatic batch inserts.

To reduce the number of sequence calls, Hibernate uses sequence optimizer algorithms, like pooled or pooled-lo. The pooled optimizer is the default strategy when using an allocationSize that’s greater than 1 in the @SequenceGenerator annotation.

Transactions and Concurrency Control eBook

2 Comments on “How to generate JPA entity identifier values using a database sequence

  1. Hello,

    do you maybe have a suggestion on how to determine allocationSize parameter? In our use-case we have around 100k entries per hour, but the quarter hour marks take the majority.

    Gaps in the sequence don’t worry us too much. If I set it to 10000 would that be reasonable?

    Thanks

    • If you use the Batch Sequence Generator, you could use a sequence with an increment step of 1, while just your batching processor uses a mapping where the allocation is 1000.

      I wouldn’t set a very large allocation size without using the Batch Sequence Generator as other external systems using the same sequence may induce large gaps in the sequence allocation.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.