PostgreSQL SERIAL or IDENTITY column and Hibernate IDENTITY generator
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
When using PostgreSQL, it’s tempting to use a SERIAL or BIGSERIAL column type to auto-increment Primary Keys.
PostgreSQL 10 also added support for IDENTITY, which behaves in the same way as the legacy SERIAL or BIGSERIAL type.
This article will show you that SERIAL, BIGSERIAL, and IDENTITY are not a very good idea when using JPA and Hibernate.
SERIAL or BIGSERIAL
If you’ve been using MySQL, you know that AUTO_INCREMENT is a very popular choice. When migrating to PostgreSQL, you will notice that SERIAL or BIGSERIAL column types can be used just like AUTO_INCREMENT
in MySQL.
SERIAL is an auto-incremented integer column that takes 4 bytes while BIGSERIAL is an auto-incremented bigint column taking 8 bytes. Behind the scenes, PostgreSQL will use a sequence generator to generate the SERIAL column values upon inserting a new ROW.
Domain model
Now, assuming we have the following post
table:
CREATE TABLE post ( id SERIAL NOT NULL, title VARCHAR(255), PRIMARY KEY (id) )
For this table, PostgreSQL creates a sequence called post_id_seq
that is associated with the id
SERIAL column.
So, when inserting a post
row, the INSERT statement can simply omit the id
column:
INSERT INTO post (title) VALUES ('High-Performance Java Persistence')
The id
column is also the Primary Key of the post
table, and it uses a SERIAL column type. The id
column will be automatically be assigned the next value of the underlying post_id_seq
sequence generator.
To map the post
table, we need a Post
entity class that looks as follows:
The Post
entity id
property uses the GenerationType.IDENTITY
generator because the SERIAL
type acts as AUTO_INCREMENTED column.
@Entity(name = "Post") @Table(name = "post") public class Post { @Id @GeneratedValue(strategy = GenerationType.IDENTITY) private Integer id; private String title; //Getters and setters omitted for brevity }
Now, to see how the post_id_seq
is used, consider the following test case:
Post post1 = new Post(); post1.setTitle( "High-Performance Java Persistence, Part 1" ); entityManager.persist(post1); Post post2 = new Post(); post2.setTitle( "High-Performance Java Persistence, Part 2" ); entityManager.persist(post2); entityManager.flush(); assertEquals( 2, ( (Number) entityManager .createNativeQuery( "select currval('post_id_seq')") .getSingleResult() ).intValue() );
After inserting 2 Post
entities and flushing the Persistence Context, the current value of the post_id_seq
database sequence is going to be 2, and the next sequence value is going to be 3.
JDBC batch inserts
As convenient as it might look, using the IDENTITY
with Hibernate is not without issues.
If we enable JDBC batching:
<property name="hibernate.jdbc.batch_size" value="5"/>
And persist 3 Post
entities:
for (int i = 0; i < 3; i++) { Post post = new Post(); post.setTitle( String.format( "High-Performance Java Persistence, Part %d", i + 1 ) ); entityManager.persist(post); }
Hibernate is going to generate the following SQL INSERT statements:
INSERT INTO post (title) VALUES ('High-Performance Java Persistence, Part 1') INSERT INTO post (title) VALUES ('High-Performance Java Persistence, Part 2') INSERT INTO post (title) VALUES ('High-Performance Java Persistence, Part 3')
So, batching is going to be disabled when inserting entities.
This is because, when persisting the entity, Hibernate needs to know the entity identifier in order to generate the key under which the entity is stored in the currently running Persistence Context.
To know the identifier, Hibernate needs to execute the INSERT statement, so by the time the Persistence Context is flushed, all inserts have been already executed. Therefore, Hibernate can no longer batch the INSERT statements.
If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
Although convenient, and even suggested in many PostgreSQL books, the SERIAL and BIGSERIAL column types are not a very good choice when using JPA and Hibernate. Using a SEQUENCE
generator is a better alternative since the identifier can be generated prior to executing the INSERT statement.
Behind the scenes, the SERIAL and BIGSERIAL column types use a database sequence anyway, so the only difference is that the SEQUENCE
generator calls the sequence is a separate database roundtrip. However, this can also be optimized with the pooled and pooled-lo optimizers.
If the database server is close to the application servers and networking is fast, the extra database roundtrip is not going to be a performance bottleneck. For all these reasons, you should prefer using the SEQUENCE
generator over IDENTITY
no matter if you use PostgreSQL, Oracle, or SQL Server.
