Why should not use the AUTO JPA GenerationType with MySQL and Hibernate

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

As I already mentioned, you should never use the TABLE identifier generator since it does not scale properly. In this post, I’ll show you why you should not rely on the AUTO GenerationType strategy if you’re Hibernate application uses MySQL.

Prior to Hibernate 5

On Hibernate 4, if you had the following entity mapping:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;

    private String title;

    public Post() {}

    public Post(String title) {
        this.title = title;
    }
}

When persisting 3 Post entities:

for ( int i = 1; i <= 3; i++ ) {
    entityManager.persist( 
        new Post( 
            String.format( 
                "High-Performance Java Persistence, Part %d", i
            ) 
        ) 
    );
}

Hibernate would generate the following insert statements:

INSERT INTO post (title) 
VALUES ('High-Performance Java Persistence, Part 1')

INSERT INTO post (title) 
VALUES ('High-Performance Java Persistence, Part 2')

INSERT INTO post (title) 
VALUES ('High-Performance Java Persistence, Part 3')

That’s great! Hibernate used the IDENTITY column to generate the entity identifier which is the only reasonable option for MySQL.

Hibernate 5

If you run the same unit test on Hibernate 5, you’ll get the following SQL statements:

SELECT next_val as id_val 
FROM hibernate_sequence FOR UPDATE

UPDATE hibernate_sequence 
SET next_val= 2 where next_val=1

SELECT next_val as id_val 
FROM hibernate_sequence FOR UPDATE

UPDATE hibernate_sequence 
SET next_val= 3 where next_val=1

SELECT next_val as id_val 
FROM hibernate_sequence FOR UPDATE

UPDATE hibernate_sequence 
SET next_val= 4 where next_val=3

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 1', 1)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 2', 2)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence, Part 3', 3)

What’s just happened? Well, Hibernate picks the TABLE generator instead of IDENTITY when the underlying database does not support sequences. However, TABLE generator is not a good choice. Check out the HHH-11014 Jira issue for more details related to this behavior change.

How to fix it?

The fix is extremely easy. You just need to use the native identifier instead:

@Id
@GeneratedValue(
    strategy= GenerationType.AUTO, 
    generator="native"
)
@GenericGenerator(
    name = "native", 
    strategy = "native"
)
private Long id;

Now, when running the previous test case, Hibernate uses the IDENTITY column instead:

INSERT INTO post (title) 
VALUES ('High-Performance Java Persistence, Part 1')

INSERT INTO post (title) 
VALUES ('High-Performance Java Persistence, Part 2')

INSERT INTO post (title) 
VALUES ('High-Performance Java Persistence, Part 3')

If you want to use a portable solution that manages to customize the SEQUENCE generator while still allowing you to pick the IDENTITY generator for MySQL, then check out this article.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

JPA portability is a myth! In reality, you have to know the details of the underlying JPA provider if you want a high-performance enterprise application.

FREE EBOOK

15 Comments on “Why should not use the AUTO JPA GenerationType with MySQL and Hibernate

  1. Hi Vlad,

    If we want to use IDENTITY, why don’t we just write GenerationType.IDENTITY instead of using native identifier,

    Thanks.

    • You could do that too. But the native generator will work even if the DB does not support IDENTITY columns.

  2. Hi Vladmihalcea,
    I always follow your articles on JPA,Hibernate and Persitence, great articles with detail explanation. I always rely on these articles in need, after going through the article you mentioned when we use @Id and @GeneratedValue with strategy GenerationType.Auto it gets next value from hibernate_sequence.

    SELECT next_val as id_val
    FROM hibernate_sequence FOR UPDATE

    UPDATE hibernate_sequence
    SET next_val= 2 where next_val=1

    SELECT next_val as id_val
    FROM hibernate_sequence FOR UPDATE

    UPDATE hibernate_sequence
    SET next_val= 3 where next_val=1

    How does it work on Cluster with multiple nodes deployed mapping to same database, if my application is low latency concurrent application how the other nodes get the updated value i.e. the next value saved at DB. Does it always try to fire a query for each update, I might believe strategy is not good fit on cluster but want to understand how this Auto works on cluster

    Thanks

  3. My 2c: This whole discussion is only relevant if you are still using database-generated keys (being it via identity columns, sequences or (yuck!) keys gen tables). I think we are passed that.

    The benefits in 2019 of using client-side generated keys (for example UUIDs) are simply too many and the issues that previously existed with such strategy have been solved.

    • UUIDs take a lot of space and amplify this issue with every Foreign Key. Not to mention the extra work done by B+Tree clustered index on simple INSERT. So, UUIDs are not without issues.

      • Yeah, well, I still disagree. 🙂

        I regret that I used the term ‘UUID’. What is really meant is an UUID-like sequential value. (Technically not an UUID). Think of it as a byte array with no meaning. These sequential IDs have absolutely no issues wrt clustered indexes and performance …exactly because they are sequential.

        Rather than using the UUID word I like better to refer to the concept as ‘client-side generated keys’ because it is broader and better captures the idea.

        Yes, a 16 byte key would take up double the space of a traditional numeric key which is 8 bytes (bigint). True. And yes this will be amplified with foreign keys. But disk space consumption here is a small penalty to pay compared to the benefits. You may disagree.

        And, there are also client-side generated keys smaller than 16 bytes. For example you have the MongoDB id which is only 12 bytes or the Twitter Snowflake ID which is 8 bytes. These id types are also sequential at least on the first half of the bytes, which is enough not to screw up INSERT performance. However, clearly, the lower you go on the byte count the more you increase the risk of a collision. But should a collision ever occur, they are not hard to survive: just retry the action. (the user doesn’t have to notice, you could even let Hibernate handle it behind the programmer’s back)

        Bottom line: In most systems you can pretty much guarantee that there are less than say 1000 nodes (hosts, whatever) which will attempt concurrent INSERT operations. With such usage pattern you can go as low as 8 bytes for the key and a collision will still be a 10,000 year event. And if you go as low as 8 bytes then clearly the argument about space consumption – when compared to a numeric key – is no longer relevant.

        Time to rethink primary key generation. The concept of a numeric key generated by the database server is no longer the only kid on the block. Client-side generated keys isn’t for everyone, but for the databases where the job of the database is to serve a single application, then they are very good fit.

        Thanks for the blog, btw. I learned a lot of JPA/Hibernate stuff from reading it.

      • The database sequence with the Hibernate optimizers already render a good performance. Client-side UUID generation makes sense for multi-master replication, but that’s way less prevalent than the primary-follower typical replication.

  4. Great you bring me the solution to my incosnsitencies in my db.
    Do this annotation changes only concerns MySql – DB`s?

    • This behaviour is for all DBs not supporting sequences.

  5. Hi after using the below generation type
    @Id
    @Column(name=”id”)
    @GeneratedValue(strategy=GenerationType.AUTO,generator=”native”)

    I am getting org.hibernate.AnnotationException: Unknown Id.generator: native

    I am using hibenrate 5 ,mysql 5.7.

    Please help

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.