How to generate JPA entity identifier values using a database sequence
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In this article, I’m going to show you how you can generate entity identifier values using a database sequence when using JPA and Hibernate.
Using a database sequence is the most efficient Hibernate identifier generation strategy, as it allows you to take advantage of the automatic JDBC batching mechanism.
JPA entity identifier annotations
The JPA specification defines the following annotations and Enum values we can use to customize the entity identifier generation process:
The @Id
annotation is mandatory for entities, and it must be mapped to a table column that has a unique constraint. Most often, the @Id
annotation is mapped to the Primary Key table column.
Without specifying a @GeneratedValue
annotation, entity identifiers must be assigned manually. If the entity can use a natural identifier assigned by a third-party, then you don’t need to use the @GeneratedValue
since the identifiers will be set manually prior to persisting the entity.
The GenerationType
Enum defines four entity identifier generation strategies:
IDENTITY
allows using a table identity column, like the MySQLAUTO_INCREMENT
. For JPA and Hibernate, you should prefer usingSEQUENCE
if the relational database supports it because Hibernate cannot use automatic JDBC batching when persisting entities using theIDENTITY
generator.SEQUENCE
allows using a database sequence object to generate identifier values. This is the best generation strategy when using JPA and Hibernate.TABLE
emulates the database sequence generator using a separate table. As I explained in this article, this is a terrible strategy, and you shouldn’t use it.AUTO
picks any of the previous strategy based on the underlying database capabilities.
The @SequenceGenerator
annotation allows you to customize the database sequence generation process.
JPA @GeneratedValue entity identifier with SEQUENCE strategy
Let’s consider we have the following Post
entity, which uses the @GeneratedValue
annotation on the @Id
attribute with the strategy of SEQUENCE
.
@Entity(name = "Post") @Table(name = "post") public class Post { @Id @GeneratedValue( strategy = GenerationType.SEQUENCE ) private Long id; private String title; public Long getId() { return id; } public Post setId(Long id) { this.id = id; return this; } public String getTitle() { return title; } public Post setTitle(String title) { this.title = title; return this; } }
If we generate the database schema using the hbm2ddl
tool, then we can see what database schema is expected by Hibernate for this Post
entity mapping:
CREATE SEQUENCE hibernate_sequence START 1 INCREMENT 1 CREATE TABLE post ( id INT8 NOT NULL, title VARCHAR(255), PRIMARY KEY (id) )
Notice that, by default, the hibernate_sequence
is used for all entities using the SEQUENCE
identifier generation strategy without an explicit database sequence name.
When persisting 5 Post
entities:
for (int i = 1; i <= 5; i++) { entityManager.persist( new Post().setTitle( String.format( "High-Performance Java Persistence, Part %d", i ) ) ); }
Hibernate generates the following SQL statements:
SELECT nextval('hibernate_sequence') SELECT nextval('hibernate_sequence') SELECT nextval('hibernate_sequence') SELECT nextval('hibernate_sequence') SELECT nextval('hibernate_sequence') INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 1', 1) INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 2', 2) INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 3', 3) INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 4', 4) INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 5', 5)
Notice that the hibernate_sequence
was called five times since, by default, no sequence call optimizer is used. The sequence is called by the persist
method because the entity identifier is needed in order to create the key under which the entity is being stored in the Persistence Context.
On the other hand, the INSERT
statements are executed when flushing the Persistence Context, as the persist
method only schedules the EntityInsertAction
.
JPA @SequenceGenerator
If you want to customize the sequence generation process, you need to use the JPA specification @SequenceGenerator
annotation.
For instance, if we want to specify a custom database sequence name, like seq_post
, which uses an increment step of 5, then we can use the following @SequenceGenerator
configuration:
@Id @GeneratedValue( strategy = GenerationType.SEQUENCE, generator = "seq_post" ) @SequenceGenerator( name = "seq_post", allocationSize = 5 ) private Long id;
The generator
attribute of the @GeneratedValue
annotation references the name
attribute of the @SequenceGenerator
annotation.
The name
attribute of the @SequenceGenerator
is also used to reference the database sequence object that’s called to get the new entity identifiers.
The allocationSize
is used to instruct the JPA provider the number of values that can be allocated by the application using a single database sequence call.
So, for the aforementioned mapping, Hibernate generates the following database schema:
CREATE SEQUENCE seq_post START 1 INCREMENT 5 CREATE TABLE post ( id INT8 NOT NULL, title VARCHAR(255), PRIMARY KEY (id) )
Notice that the sequence name matches the name
attribute of the @SequenceGenerator
annotation, and the sequence increment step is 5. This means that the sequence will allocate the values of 1, 6, 11, 16, and so on. For this reason, Hibernate can allocate the values of 2, 3, 4, 5 without needing to call the database sequence, therefore reducing the number of database round trips.
When persisting the same 5 Post
entities, Hibernate generates the following SQL statements:
SELECT nextval('seq_post') SELECT nextval('seq_post') INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 1', 1) INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 2', 2) INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 3', 3) INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 4', 4) INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence, Part 5', 5)
As I explained in this article, when the allocationSize
is greater than 1, Hibernate uses the pooled
optimizer.
When using the pooled
optimizer, the database sequence value represents the highest value that can be allocated by Hibernate when persisting an entity.
The first sequence call gives the value of 1, so the first Post
entity gets that value. Now, when persisting the second Post
entity, Hibernate needs to call the sequence again, and it will get the value of 6, so it can generate the identifier values of 2, 3, 4, 5, and 6 without needing any other database sequence call.
I'm running an online workshop on the 11th of October about High-Performance SQL.If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
Using a database sequence object to generate JPA entity identifier values is the best strategy since it allows you to benefit from automatic batch inserts.
To reduce the number of sequence calls, Hibernate uses sequence optimizer algorithms, like pooled
or pooled-lo
. The pooled
optimizer is the default strategy when using an allocationSize
that’s greater than 1
in the @SequenceGenerator
annotation.

Hello,
do you maybe have a suggestion on how to determine allocationSize parameter? In our use-case we have around 100k entries per hour, but the quarter hour marks take the majority.
Gaps in the sequence don’t worry us too much. If I set it to 10000 would that be reasonable?
Thanks
If you use the Batch Sequence Generator, you could use a sequence with an increment step of
1
, while just your batching processor uses a mapping where the allocation is1000
.I wouldn’t set a very large allocation size without using the Batch Sequence Generator as other external systems using the same sequence may induce large gaps in the sequence allocation.