Hibernate and UUID identifiers

Introduction

In this article, we are going to see how the UUID entity attributes are persisted when using JPA and Hibernate, for both assigned and auto-generated identifiers.

In my previous post, I talked about UUID surrogate keys and the use cases in which they are more appropriate than the more common auto-incrementing identifiers.

A UUID database type

There are several ways to represent a 128-bit UUID, and whenever in doubt, I like to resort to Stack Exchange for expert advice.

Because table identifiers are usually indexed, the more compact the database type, the less space the index will require. From the most efficient to the least, here are our options:

  1. Some databases (PostgreSQL, SQL Server) offer a dedicated UUID storage type
  2. Otherwise we can store the bits as a byte array (e.g. RAW(16) in Oracle or the standard BINARY(16) type)
  3. Alternatively, we can use 2 bigint (64-bit) columns, but a composite identifier is less efficient than a single-column one
  4. We can store the hex value in a CHAR(36) column (i.e., 32 hex digits and 4 dashes), but this takes the most space, hence it’s the least efficient alternative
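
To make the size differences concrete, here is a minimal sketch that relies only on the JDK and produces the three application-side representations of a single UUID. The class and method names (UuidRepresentations, toBytes, toLongs, toText) are made up for illustration and are not part of Hibernate.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidRepresentations {

    // Option 2: the 128 bits packed into 16 bytes, most significant byte first
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
            .putLong(uuid.getMostSignificantBits())
            .putLong(uuid.getLeastSignificantBits())
            .array();
    }

    // Option 3: two 64-bit values, one per bigint column
    static long[] toLongs(UUID uuid) {
        return new long[] {
            uuid.getMostSignificantBits(),
            uuid.getLeastSignificantBits()
        };
    }

    // Option 4: the canonical text form, 32 hex digits plus 4 dashes
    static String toText(UUID uuid) {
        return uuid.toString();
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        System.out.println(toBytes(uuid).length + " bytes as a byte array");
        System.out.println(toLongs(uuid).length + " longs as two bigint columns");
        System.out.println(toText(uuid).length() + " characters as text");
    }
}
```

The byte array and the pair of longs carry the same 128 bits, while the text form needs 36 characters to express them.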

Hibernate offers many identifier strategies to choose from and for UUID identifiers we have three options:

  • the assigned generator accompanied by the application logic UUID generation
  • the hexadecimal “uuid” string generator
  • the more flexible “uuid2” generator, allowing us to use java.util.UUID, a 16-byte array, or a hexadecimal String value

The Hibernate UUID assigned generator

The assigned generator allows the application logic to control the entity identifier generation process. If we simply omit the identifier generator definition, Hibernate falls back to the assigned identifier strategy. This example uses a BINARY(16) column type, since the target database is HSQLDB.

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @Column(columnDefinition = "BINARY(16)")
    private UUID id = UUID.randomUUID();

    private String title;

    public UUID getId() {
        return id;
    }

    public Post setId(UUID id) {
        this.id = id;
        return this;
    }

    public String getTitle() {
        return title;
    }

    public Post setTitle(String title) {
        this.title = title;
        return this;
    }
}

Persisting an Entity:

entityManager.persist(
    new Post().setTitle("High-Performance Java Persistence")
);

Generates exactly one INSERT statement:

INSERT INTO post (
    title, 
    id
) 
VALUES (
    'High-Performance Java Persistence', 
    [72, 101, 87, -123, -35, 18, 65, -21, -84, -90, 83, -104, -112, -41, -62, -54]
)
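
The bracketed value in the log is the 16-byte identifier printed as signed Java bytes. As a rough sketch, assuming the bytes are written most significant byte first, they can be turned back into the familiar text form with the JDK alone. LoggedUuidDecoder and fromBytes are hypothetical names used only for this illustration.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class LoggedUuidDecoder {

    // Rebuilds a UUID from 16 bytes, assuming the most significant byte comes first
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("A UUID needs exactly 16 bytes");
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        // The first long holds the most significant bits, the second the least significant bits
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    public static void main(String[] args) {
        // Values copied from the INSERT statement logged above
        byte[] logged = {
            72, 101, 87, -123, -35, 18, 65, -21,
            -84, -90, 83, -104, -112, -41, -62, -54
        };
        // Prints 48655785-dd12-41eb-aca6-539890d7c2ca
        System.out.println(fromBytes(logged));
    }
}
```

The third group of the decoded value starts with 4, which matches the version 4 (random) UUIDs produced by UUID.randomUUID().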

Let’s see what happens when issuing a merge instead:

entityManager.merge(
    new Post().setTitle("High-Performance Java Persistence")
);

We get both a SELECT and an INSERT this time:

SELECT 
    p.id as id1_0_0_, 
    p.title as title2_0_0_ 
FROM 
    post p 
WHERE 
    p.id = [32, -57, 116, 87, 106, 104, 76, -95, -102, 25, -74, 119, 30, -50, -12, -84]

INSERT INTO post (
    title, 
    id
) 
VALUES (
    'High-Performance Java Persistence', 
    [32, -57, 116, 87, 106, 104, 76, -95, -102, 25, -74, 119, 30, -50, -12, -84]
)

The persist method takes a transient entity and attaches it to the current persistence context. If an entity with the same identifier is already attached, or if the given entity is detached, an exception is thrown.

The merge operation will copy the current object state into an existing persisted entity. This operation works for both transient and detached entities, but for transient entities persist is much more efficient than the merge operation.

For assigned identifiers, a merge will always require an SQL SELECT since Hibernate cannot know if there is already a persisted entity having the same identifier. For other identifier generators, Hibernate looks for a null identifier to figure out if the entity is in the transient state.

If you’re using the Spring Data SimpleJpaRepository#save(S entity) method, then you need to be careful when using entities with assigned identifiers.

The save method is implemented as follows:

@Transactional
public <S extends T> S save(S entity) {
    if (entityInformation.isNew(entity)) {
        em.persist(entity);
        return entity;
    } else {
        return em.merge(entity);
    }
}

For assigned identifiers, this method might pick merge instead of persist if the entity doesn’t also supply a @Version property, therefore triggering a SELECT prior to executing the INSERT statement for every newly inserted entity.

Check out the Spring documentation for more details about the best way to use the save method provided by the JpaRepository.
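
One common way to avoid the extra SELECT is to let the entity itself report whether it is new. Spring Data offers the Persistable interface for this purpose. The sketch below does not depend on Spring or JPA, so it uses a local stand-in interface with the same two methods (getId and isNew) and a simplified chooseOperation method that only mimics the branch inside save. All names in it (IsNewSketch, NewAware, chooseOperation, markNotNew) are hypothetical.

```java
import java.util.UUID;

final class IsNewSketch {

    // Local stand-in with the same shape as Spring Data's Persistable<ID>
    interface NewAware<ID> {
        ID getId();
        boolean isNew();
    }

    // Entity with an assigned identifier that tracks its own "new" state
    static final class Post implements NewAware<UUID> {
        private final UUID id = UUID.randomUUID();
        // In a real entity this flag would not be persisted and would be
        // flipped from lifecycle callbacks after the entity is persisted or loaded
        private boolean newEntity = true;

        @Override
        public UUID getId() {
            return id;
        }

        @Override
        public boolean isNew() {
            return newEntity;
        }

        void markNotNew() {
            newEntity = false;
        }
    }

    // Simplified mimic of the branch in save(): persist for new entities, merge otherwise
    static String chooseOperation(NewAware<?> entity) {
        return entity.isNew() ? "persist" : "merge";
    }

    public static void main(String[] args) {
        Post post = new Post();
        // The identifier is already assigned, yet the entity reports itself as new
        System.out.println(chooseOperation(post));
        post.markNotNew();
        System.out.println(chooseOperation(post));
    }
}
```

With this approach, the decision no longer depends on the identifier being null, so a freshly created entity with an assigned UUID can still take the persist path.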

The auto-generated Hibernate UUID identifiers

This time, we won’t assign the identifier ourselves but have Hibernate generate it on our behalf. When a null identifier is encountered, Hibernate assumes a transient entity, for which it generates a new identifier value. As a result, the merge operation won’t require a SELECT query prior to inserting a transient entity.

The UUIDHexGenerator

The UUID hex generator is the oldest UUID identifier generator and it’s registered under the “uuid” type. It generates a UUID string of 32 hexadecimal digits, optionally split by a separator, following the pattern 8{sep}8{sep}4{sep}8{sep}4.

This generator is not compliant with IETF RFC 4122, which uses the 8-4-4-4-12 digit representation.

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue(generator = "uuid")
    @GenericGenerator(name = "uuid", strategy = "uuid")
    @Column(columnDefinition = "CHAR(32)")
    private String id;
}

When persisting the Post entity:

entityManager.persist(
    new Post().setTitle("High-Performance Java Persistence")
);

Hibernate generates the following SQL INSERT statement:

INSERT INTO post (
    title, 
    id
) 
VALUES (
    'High-Performance Java Persistence', 
    8a80cb8172c0e9ff0172c0ea02e40000
)

And, when merging a transient Post entity:

entityManager.merge(
    new Post().setTitle("High-Performance Java Persistence")
);

Hibernate generates a single SQL INSERT statement without needing a SELECT query:

INSERT INTO post (
    title, 
    id
) 
VALUES (
    'High-Performance Java Persistence', 
    8a80cb8172c0e9ff0172c0ea03030001
)
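
To see why these values are not RFC 4122 identifiers, we can regroup the 32 hex digits into the 8-4-4-4-12 layout and ask the JDK for the version and variant fields. This is only an illustrative sketch: the legacy generator never claims that layout, and LegacyHexCheck and regroup are hypothetical names.

```java
import java.util.UUID;

final class LegacyHexCheck {

    // Regroups 32 hex digits into the 8-4-4-4-12 layout so the JDK can parse them
    static UUID regroup(String hex32) {
        if (hex32.length() != 32) {
            throw new IllegalArgumentException("Expected 32 hex digits");
        }
        return UUID.fromString(
            hex32.substring(0, 8) + "-" +
            hex32.substring(8, 12) + "-" +
            hex32.substring(12, 16) + "-" +
            hex32.substring(16, 20) + "-" +
            hex32.substring(20)
        );
    }

    public static void main(String[] args) {
        // Identifier copied from the first INSERT statement above
        UUID legacy = regroup("8a80cb8172c0e9ff0172c0ea02e40000");
        UUID random = UUID.randomUUID();

        System.out.println("legacy: version=" + legacy.version() + ", variant=" + legacy.variant());
        System.out.println("random: version=" + random.version() + ", variant=" + random.variant());
    }
}
```

The regrouped legacy value reports version 14 and variant 0, whereas a value from UUID.randomUUID() reports version 4 and variant 2.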

The UUIDGenerator

The newer UUID generator is IETF RFC 4122 compliant (variant 2) and it offers pluggable generation strategies. It’s registered under the uuid2 type and it supports a broader range of identifier types: java.util.UUID, a 16-byte array, or a hexadecimal String value.

Because the uuid2 generator is the default strategy Hibernate uses for UUID identifiers, you don’t need to declare it explicitly. If the entity identifier is of the UUID type and is annotated with @GeneratedValue, then the uuid2 generator strategy is going to be used:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue
    @Column(columnDefinition = "BINARY(16)")
    private UUID id;
}

Persisting or merging a transient entity:

When persisting the Post entity:

entityManager.persist(
    new Post().setTitle("High-Performance Java Persistence")
);

Hibernate generates the following SQL INSERT statement:

INSERT INTO post (
    title, 
    id
) 
VALUES (
    'High-Performance Java Persistence', 
    [90, 17, 87, -73, -69, 81, 77, -47, -102, 110, 74, -4, 85, -74, -24, -95]
)

And, when merging a transient Post entity:

entityManager.merge(
    new Post().setTitle("High-Performance Java Persistence")
);

Hibernate generates a single SQL INSERT statement:

INSERT INTO post (
    title, 
    id
) 
VALUES (
    'High-Performance Java Persistence', 
    [-38, 35, 2, -55, 65, -127, 70, -51, -68, -34, 117, 111, -40, 4, -26, 63]
)

These SQL INSERT statements bind the identifier as a byte array because we configured the @Id column definition as BINARY(16).

Conclusion

While you can use a UUID entity identifier with JPA and Hibernate, it’s not always the right choice. First of all, a UUID requires 128 bits, twice the size of a 64-bit bigint, and this overhead is amplified by every Foreign Key column that references it. Since Primary Key and Foreign Key columns are usually indexed, the extra storage requirement will impact indexes as well.
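
As a back-of-the-envelope illustration of that amplification, the following sketch counts only the raw bytes taken by key values in one parent table and one child table that references it. The row counts are hypothetical, and the estimate deliberately ignores index structures, row overhead, and the child table's own Primary Key. KeyStorageEstimate and keyBytes are made-up names.

```java
final class KeyStorageEstimate {

    // Raw bytes spent on key values alone: one primary key value per parent row
    // plus one foreign key value per child row
    static long keyBytes(long parentRows, long childRows, int keySizeInBytes) {
        return (parentRows + childRows) * keySizeInBytes;
    }

    public static void main(String[] args) {
        long posts = 1_000_000L;      // hypothetical parent row count
        long comments = 10_000_000L;  // hypothetical child rows, each holding one foreign key

        // 16 bytes per UUID value versus 8 bytes per bigint value
        System.out.println("UUID keys:   " + keyBytes(posts, comments, 16) + " bytes");
        System.out.println("bigint keys: " + keyBytes(posts, comments, 8) + " bytes");
    }
}
```

For these assumed row counts, the UUID keys take 176,000,000 bytes compared to 88,000,000 bytes for bigint keys, before any index on those columns is taken into account.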

That’s the reason why numerical entity identifiers are usually a much better option, especially when being generated by a database sequence. Or, if you need to generate the identifiers in the application, then you are better off using a time-sorted TSID instead, as explained in this article.