Hibernate and UUID identifiers
Introduction
In this article, we are going to see how the UUID entity attributes are persisted when using JPA and Hibernate, for both assigned and auto-generated identifiers.
In my previous post, I talked about UUID surrogate keys and the use cases when they are more appropriate than the more common auto-incrementing identifiers.
A UUID database type
There are several ways to represent a 128-bit UUID, and whenever in doubt I like to resort to Stack Exchange for expert advice.
Because table identifiers are usually indexed, the more compact the database type, the less space the index will require. From the most efficient to the least, here are our options:
- Some databases (PostgreSQL, SQL Server) offer a dedicated UUID storage type
- Otherwise we can store the bits as a byte array (e.g. RAW(16) in Oracle or the standard BINARY(16) type)
- Alternatively we can use 2 bigint (64-bit) columns, but a composite identifier is less efficient than a single column one
- We can store the hex value in a CHAR(36) column (i.e., 32 hex digits and 4 dashes), but this takes the most space, hence it’s the least efficient alternative
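To make the size difference concrete, here is a minimal sketch that converts a java.util.UUID to its 16-byte binary form and compares it with the 36-character text form. The class and method names (UuidStorageSizes, toBytes, fromBytes) are made up for this illustration, and the big-endian byte order is an assumption that suits BINARY(16) or RAW(16) columns, not something mandated by a particular database.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper, not part of JPA or Hibernate.
class UuidStorageSizes {

    // Packs the two 64-bit halves of the UUID into a 16-byte array
    // (most significant bits first).
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
            .putLong(uuid.getMostSignificantBits())
            .putLong(uuid.getLeastSignificantBits())
            .array();
    }

    // Rebuilds the UUID from its 16-byte form.
    static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        byte[] binary = toBytes(uuid);
        String text = uuid.toString();

        System.out.println("binary form: " + binary.length + " bytes");      // 16
        System.out.println("text form: " + text.length() + " characters");   // 36
        System.out.println("round trip ok: " + uuid.equals(fromBytes(binary)));
    }
}
```

The text form needs more than twice the space of the binary form, and that overhead is repeated in every index entry that contains the identifier.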
Hibernate offers many identifier strategies to choose from and for UUID identifiers we have three options:
- the assigned generator, with the UUID being generated by the application logic
- the hexadecimal “uuid” string generator
- the more flexible “uuid2” generator, allowing us to use java.util.UUID, a 16 byte array or a hexadecimal String value
The Hibernate UUID assigned generator
The assigned generator allows the application logic to control the entity identifier generation process. If we simply omit the identifier generator definition, Hibernate falls back to the assigned generator. This example uses a BINARY(16) column type, since the target database is HSQLDB.
    @Entity(name = "Post")
    @Table(name = "post")
    public class Post {

        @Id
        @Column(columnDefinition = "BINARY(16)")
        private UUID id = UUID.randomUUID();

        private String title;

        public UUID getId() {
            return id;
        }

        public Post setId(UUID id) {
            this.id = id;
            return this;
        }

        public String getTitle() {
            return title;
        }

        public Post setTitle(String title) {
            this.title = title;
            return this;
        }
    }
Persisting an Entity:
entityManager.persist( new Post().setTitle("High-Performance Java Persistence") );
Generates exactly one INSERT statement:
INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', [72, 101, 87, -123, -35, 18, 65, -21, -84, -90, 83, -104, -112, -41, -62, -54] )
Let’s see what happens when issuing a merge instead:
entityManager.merge( new Post().setTitle("High-Performance Java Persistence") );
We get both a SELECT and an INSERT this time:
    SELECT p.id as id1_0_0_, p.title as title2_0_0_
    FROM post p
    WHERE p.id = [32, -57, 116, 87, 106, 104, 76, -95, -102, 25, -74, 119, 30, -50, -12, -84]

    INSERT INTO post ( title, id )
    VALUES ( 'High-Performance Java Persistence', [32, -57, 116, 87, 106, 104, 76, -95, -102, 25, -74, 119, 30, -50, -12, -84] )
The persist method takes a transient entity and attaches it to the current persistence context. If an entity with the same identifier is already attached, or if the entity being passed is detached, an exception is thrown.
The merge operation will copy the current object state into an existing persisted entity. This operation works for both transient and detached entities, but for transient entities persist is much more efficient than the merge operation.
For assigned identifiers, a merge will always require an SQL SELECT since Hibernate cannot know if there is already a persisted entity having the same identifier. For other identifier generators, Hibernate looks for a null identifier to figure out if the entity is in the transient state.
If you’re using the Spring Data SimpleJpaRepository#save(S entity) method, then you need to be careful when using entities with assigned identifiers.
The save method is implemented as follows:
    @Transactional
    public <S extends T> S save(S entity) {
        if (entityInformation.isNew(entity)) {
            em.persist(entity);
            return entity;
        } else {
            return em.merge(entity);
        }
    }
For assigned identifiers, the identifier is already non-null when save is called, so this method might pick merge instead of persist if the entity doesn’t also supply a @Version property, therefore triggering a SELECT prior to executing the INSERT statement for every newly inserted entity.
Check out the Spring documentation for more details about the best way to use the save method provided by the JpaRepository.
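One approach described in the Spring Data documentation is to let the entity implement the Persistable interface, so that the entity itself reports whether it is new. The sketch below only models that decision in plain Java: the Persistable interface declared here is a local stand-in that mirrors the two methods of org.springframework.data.domain.Persistable, and AssignedIdPost, markNotNew and SaveDecision are hypothetical names. JPA annotations are left out so that the example compiles on its own.

```java
import java.util.UUID;

// Local stand-in mirroring Spring Data's Persistable (getId and isNew).
interface Persistable<ID> {
    ID getId();
    boolean isNew();
}

// Hypothetical entity with an application-assigned UUID identifier.
class AssignedIdPost implements Persistable<UUID> {

    private final UUID id = UUID.randomUUID();

    // In a real entity this flag would be @Transient, since it is not persisted.
    private boolean isNew = true;

    @Override
    public UUID getId() {
        return id;
    }

    @Override
    public boolean isNew() {
        return isNew;
    }

    // In a real entity this would typically run in @PostPersist and @PostLoad callbacks.
    void markNotNew() {
        this.isNew = false;
    }
}

class SaveDecision {

    // Models the branch taken by save: persist for new entities, merge otherwise.
    static String chooseOperation(Persistable<?> entity) {
        return entity.isNew() ? "persist" : "merge";
    }

    public static void main(String[] args) {
        AssignedIdPost post = new AssignedIdPost();
        System.out.println(chooseOperation(post)); // prints "persist"

        post.markNotNew();
        System.out.println(chooseOperation(post)); // prints "merge"
    }
}
```

With this approach, the non-null assigned identifier no longer influences the decision, so a newly created entity goes through persist and avoids the extra SELECT.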
The auto-generated Hibernate UUID identifiers
This time, we won’t assign the identifier ourselves but have Hibernate generate it on our behalf. When a null identifier is encountered, Hibernate assumes a transient entity, for which it generates a new identifier value. Therefore, the merge operation won’t require a SELECT query prior to inserting a transient entity.
The UUIDHexGenerator
The UUID hex generator is the oldest UUID identifier generator and it’s registered under the “uuid” type. It generates a 32-character hexadecimal String value (an optional separator can also be used) having the following pattern: 8{sep}8{sep}4{sep}8{sep}4.
This generator is not IETF RFC 4122 compliant, since the RFC uses the 8-4-4-4-12 digit representation.
    @Entity(name = "Post")
    @Table(name = "post")
    public class Post {

        @Id
        @GeneratedValue(generator = "uuid")
        @GenericGenerator(name = "uuid", strategy = "uuid")
        @Column(columnDefinition = "CHAR(32)")
        private String id;

        private String title;

        public String getId() {
            return id;
        }

        public Post setId(String id) {
            this.id = id;
            return this;
        }

        public String getTitle() {
            return title;
        }

        public Post setTitle(String title) {
            this.title = title;
            return this;
        }
    }
When persisting the Post entity:
entityManager.persist( new Post().setTitle("High-Performance Java Persistence") );
Hibernate generates the following SQL INSERT statement:
INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', 8a80cb8172c0e9ff0172c0ea02e40000 )
And, when merging a transient Post entity:
entityManager.merge( new Post().setTitle("High-Performance Java Persistence") );
Hibernate generates a single SQL INSERT statement without needing a SELECT query:
INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', 8a80cb8172c0e9ff0172c0ea03030001 )
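The two identifiers logged above can be split according to the 8{sep}8{sep}4{sep}8{sep}4 pattern to see which parts change between consecutive values. In this sketch, the UuidHexSegments class is hypothetical, and the segment names (ip, jvm, hiTime, loTime, count) are an assumption about what each part of the generated value encodes, so treat them as labels for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that slices a 32-character "uuid" generator value.
class UuidHexSegments {

    // Segment names are assumptions; the widths follow the 8-8-4-8-4 pattern.
    private static final String[] NAMES = {"ip", "jvm", "hiTime", "loTime", "count"};
    private static final int[] WIDTHS = {8, 8, 4, 8, 4};

    static Map<String, String> split(String id) {
        if (id.length() != 32) {
            throw new IllegalArgumentException("Expected 32 hex characters, got " + id.length());
        }
        Map<String, String> segments = new LinkedHashMap<>();
        int offset = 0;
        for (int i = 0; i < WIDTHS.length; i++) {
            segments.put(NAMES[i], id.substring(offset, offset + WIDTHS[i]));
            offset += WIDTHS[i];
        }
        return segments;
    }

    public static void main(String[] args) {
        // The two identifier values taken from the INSERT statements in this article.
        System.out.println(split("8a80cb8172c0e9ff0172c0ea02e40000"));
        // {ip=8a80cb81, jvm=72c0e9ff, hiTime=0172, loTime=c0ea02e4, count=0000}
        System.out.println(split("8a80cb8172c0e9ff0172c0ea03030001"));
        // {ip=8a80cb81, jvm=72c0e9ff, hiTime=0172, loTime=c0ea0303, count=0001}
    }
}
```

Only the last two segments differ between the two values, which shows that consecutive identifiers share a long common prefix and are nothing like the random layout of an RFC 4122 version 4 UUID.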
The UUIDGenerator
The newer UUID generator is IETF RFC 4122 compliant (variant 2) and it offers pluggable generation strategies. It’s registered under the uuid2 type and it offers a broader type range to choose from:
- java.util.UUID
- a 16 byte array
- a hexadecimal String value
Because the uuid2 generator is the default strategy used by Hibernate for UUID identifiers, you don’t need to declare it explicitly. If the entity identifier is of the UUID type and the entity identifier uses the @GeneratedValue annotation, then the uuid2 generator strategy is going to be used:
    @Entity(name = "Post")
    @Table(name = "post")
    public class Post {

        @Id
        @GeneratedValue
        @Column(columnDefinition = "BINARY(16)")
        private UUID id;

        private String title;

        public UUID getId() {
            return id;
        }

        public Post setId(UUID id) {
            this.id = id;
            return this;
        }

        public String getTitle() {
            return title;
        }

        public Post setTitle(String title) {
            this.title = title;
            return this;
        }
    }
Let’s see what happens when persisting or merging a transient entity. When persisting the Post entity:
entityManager.persist( new Post().setTitle("High-Performance Java Persistence") );
Hibernate generates the following SQL INSERT statement:
INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', [90, 17, 87, -73, -69, 81, 77, -47, -102, 110, 74, -4, 85, -74, -24, -95] )
And, when merging a transient Post entity:
entityManager.merge( new Post().setTitle("High-Performance Java Persistence") );
Hibernate generates a single SQL INSERT statement:
INSERT INTO post ( title, id ) VALUES ( 'High-Performance Java Persistence', [-38, 35, 2, -55, 65, -127, 70, -51, -68, -34, 117, 111, -40, 4, -26, 63] )
Both SQL INSERT statements bind the identifier as a byte array because the @Id column is defined as BINARY(16).
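The logged byte arrays can be turned back into their canonical text form to check the RFC 4122 claim. The sketch below decodes the bytes from the persist example, assuming the 16 bytes are written most significant bits first; LoggedUuidDecoder and fromBytes are hypothetical names used only for this illustration.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper that decodes a logged BINARY(16) identifier.
class LoggedUuidDecoder {

    // Assumes big-endian order: the first 8 bytes are the most significant bits.
    static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    public static void main(String[] args) {
        // Bytes copied from the INSERT statement generated by persist.
        byte[] logged = {
            90, 17, 87, -73, -69, 81, 77, -47,
            -102, 110, 74, -4, 85, -74, -24, -95
        };

        UUID uuid = fromBytes(logged);
        System.out.println(uuid);           // 5a1157b7-bb51-4dd1-9a6e-4afc55b6e8a1
        System.out.println(uuid.version()); // 4 (randomly generated)
        System.out.println(uuid.variant()); // 2 (the RFC 4122 layout)
    }
}
```

The decoded value uses the 8-4-4-4-12 representation and reports variant 2, which matches what the article says about the uuid2 generator.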
Conclusion
While you can use a UUID entity identifier with JPA and Hibernate, it’s not always the right choice. First of all, a UUID requires 128 bits, and this storage overhead is amplified by the Foreign Key columns that reference it. Since Primary Key and Foreign Key columns are usually indexed, the extra storage requirement will impact indexes as well.
That’s the reason why numerical entity identifiers are usually a much better option, especially when being generated by a database sequence. Or, if you need to generate the identifiers in the application, then you are better off using a time-sorted TSID instead, as explained in this article.
