How to implement equals and hashCode using the JPA entity identifier (Primary Key)

(Last Updated On: March 6, 2018)

Introduction

As previously explained, using the JPA entity business key for equals and hashCode is always the best choice. However, not all entities feature a unique business key, so we need to use another database column that is also unique, like the primary key.

But using the entity identifier for equality is very challenging, and this post is going to show you how you can use it without issues.

Test harness

When it comes to implementing equals and hashCode, there is one and only one rule you should have in mind:

Equals and hashCode must behave consistently across all entity state transitions.

To test the effectiveness of an equals and hashCode implementation, the following test can be used:

protected <T extends Identifiable<? extends Serializable>> 
    void assertEqualityConstraints(Class<T> clazz, T entity) {
    
    Set<T> tuples = new HashSet<>();

    assertFalse(tuples.contains(entity));
    tuples.add(entity);
    assertTrue(tuples.contains(entity));

    doInJPA(entityManager -> {
        entityManager.persist(entity);
        entityManager.flush();
        assertTrue("The entity is found after it's persisted",
            tuples.contains(entity));
    });

    //The entity is found after the entity is detached
    assertTrue(tuples.contains(entity));

    doInJPA(entityManager -> {
        T _entity = entityManager.merge(entity);
        assertTrue("The entity is found after it's merged",
            tuples.contains(_entity));
    });

    doInJPA(entityManager -> {
        entityManager.unwrap(Session.class).update(entity);
        assertTrue("The entity is found after it's reattached",
            tuples.contains(entity));
    });

    doInJPA(entityManager -> {
        T _entity = entityManager.find(clazz, entity.getId());
        assertTrue("The entity is found after it's loaded " +
                   "in another Persistence Context",
            tuples.contains(_entity));
    });

    executeSync(() -> {
        doInJPA(entityManager -> {
            T _entity = entityManager.find(clazz, entity.getId());
            assertTrue("The entity is found after it's loaded " +
                       "in another Persistence Context and " +
                       "in another thread",
                tuples.contains(_entity));
        });
    });
}
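
The test above relies on an Identifiable type that is not shown in this post. A minimal sketch of what such an interface could look like is given below. The shape is an assumption inferred from the entity.getId() call and the Serializable bound in the method signature, and SampleEntity is a hypothetical class used only for illustration:

```java
import java.io.Serializable;

// Assumed shape of the Identifiable contract used by assertEqualityConstraints:
// the only thing the test needs is access to the entity identifier.
interface Identifiable<T extends Serializable> {
    T getId();
}

// Hypothetical implementation, not a JPA entity, shown only to illustrate the contract.
class SampleEntity implements Identifiable<Long> {

    private final Long id;

    SampleEntity(Long id) {
        this.id = id;
    }

    @Override
    public Long getId() {
        return id;
    }
}
```

The doInJPA and executeSync helpers belong to the test base class and are not sketched here.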

Natural id

The first use case to test is the natural id mapping. Considering the following entity:

@Entity
public class Book implements Identifiable<Long> {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @NaturalId
    private String isbn;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Book)) return false;
        Book book = (Book) o;
        return Objects.equals(getIsbn(), book.getIsbn());
    }

    @Override
    public int hashCode() {
        return Objects.hash(getIsbn());
    }

    //Getters and setters omitted for brevity
}

The isbn property is also a @NaturalId, therefore, it should be unique and not nullable. Both equals and hashCode use the isbn property in their implementations.

For more details about the @NaturalId annotation, check out this article.

When running the following test case:

Book book = new Book();
book.setTitle("High-Performance Java Persistence");
book.setIsbn("123-456-7890");

assertEqualityConstraints(Book.class, book);

Everything works fine, as expected.

Default java.lang.Object equals and hashCode

What if our entity does not have any column that can be used as a @NaturalId? The first urge is to not define your own implementations of equals and hashCode, like in the following example:

@Entity(name = "Book")
public class Book implements Identifiable<Long> {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    //Getters and setters omitted for brevity
}

However, when testing this implementation:

Book book = new Book();
book.setTitle("High-Performance Java Persistence");

assertEqualityConstraints(Book.class, book);

The test fails with the following error:

java.lang.AssertionError: The entity is found after it's merged

The original entity is not equal to the one returned by the merge method because the default equals implementation compares object references, and merge returns a different object instance.
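
The same effect can be reproduced without a database. The following is a minimal sketch, assuming a hypothetical PlainBook class that, like the entity above, inherits equals and hashCode from java.lang.Object. The copyOf method is only a stand-in for what merge does with a detached entity, which is returning another instance carrying the same state:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an entity that does not override equals or hashCode.
class PlainBook {

    final Long id;
    final String title;

    PlainBook(Long id, String title) {
        this.id = id;
        this.title = title;
    }
}

class DefaultEqualityDemo {

    // Simulates merge on a detached entity: a different instance with the same state.
    static PlainBook copyOf(PlainBook source) {
        return new PlainBook(source.id, source.title);
    }

    // Returns whether the copy can be found in a Set holding the original.
    static boolean foundAfterCopy() {
        PlainBook original = new PlainBook(1L, "High-Performance Java Persistence");

        Set<PlainBook> tuples = new HashSet<>();
        tuples.add(original);

        // Reference equality: the copy is never equal to the original.
        return tuples.contains(copyOf(original));
    }

    public static void main(String[] args) {
        System.out.println(foundAfterCopy()); // prints "false"
    }
}
```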

Using the entity identifier for equals and hashCode

So if the default equals and hashCode is no good either, then let’s use the entity identifier for our custom implementation. Let’s just use our IDE to generate the equals and hashCode and see how it works:

@Entity
public class Book implements Identifiable<Long> {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Book)) return false;
        Book book = (Book) o;
        return Objects.equals(getId(), book.getId());
    }

    @Override
    public int hashCode() {
        return Objects.hash(getId());
    }

    //Getters and setters omitted for brevity
}

When running the previous test case, the test fails with the following error:

java.lang.AssertionError: The entity is found after it's persisted

When the entity was first stored in the Set, the identifier was null. After the entity was persisted, the identifier was assigned an automatically generated value, hence the hashCode differs. For this reason, the entity cannot be found in the Set after it got persisted.
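
To see the hashCode change in isolation, here is a minimal sketch in plain Java. IdHashedBook is a hypothetical class that mirrors the IDE-generated equals and hashCode above, and setting the id field by hand simulates the identifier being generated when the entity is persisted:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class mirroring the IDE-generated, id-based equals and hashCode.
class IdHashedBook {

    Long id;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof IdHashedBook)) return false;
        IdHashedBook other = (IdHashedBook) o;
        return Objects.equals(id, other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id);
    }
}

class MutableHashCodeDemo {

    // Returns whether the very same object is still found after its id changes.
    static boolean foundAfterIdAssigned() {
        IdHashedBook book = new IdHashedBook();

        Set<IdHashedBook> tuples = new HashSet<>();
        tuples.add(book); // stored using the hash of a null id

        book.id = 1L;     // simulates the id generated on persist

        // The lookup now uses the hash of 1L, which no longer matches the stored hash.
        return tuples.contains(book);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned()); // prints "false"
    }
}
```

Note that the Set still holds the object. It is only the lookup that fails, because the HashSet searches using the new hash value.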

Fixing the entity identifier equals and hashCode

To address the previous issue, there is only one solution: the hashCode should always return the same value:

@Entity
public class Book implements Identifiable<Long> {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Book)) return false;
        Book book = (Book) o;
        return id != null && id.equals(book.id);
    }

    @Override
    public int hashCode() {
        return 31;
    }

    //Getters and setters omitted for brevity
}

Also, when the entity identifier is null, we can guarantee equality only for the same object references. Otherwise, no transient object is equal to any other transient or persisted object. That’s why the identifier equality check is done only if the current Object identifier is not null.

With this implementation, the equals and hashCode test runs fine for all entity state transitions. It works because the hashCode value never changes, so the entity always stays in the same bucket, and we can rely on the java.lang.Object reference equality as long as the identifier is null.
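
The behavior described above can be checked with a minimal sketch in plain Java. ConstantHashBook is a hypothetical class that copies the equals and hashCode logic of the fixed entity, and assigning the id field by hand simulates persisting the entity:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class copying the fixed equals and hashCode logic.
class ConstantHashBook {

    Long id;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ConstantHashBook)) return false;
        ConstantHashBook other = (ConstantHashBook) o;
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return 31;
    }
}

class ConstantHashCodeDemo {

    // The same object is still found after its id is assigned.
    static boolean foundAfterIdAssigned() {
        ConstantHashBook book = new ConstantHashBook();
        Set<ConstantHashBook> tuples = new HashSet<>();
        tuples.add(book);
        book.id = 1L; // simulates the id generated on persist
        return tuples.contains(book);
    }

    // Another instance with the same id, like one loaded in another
    // Persistence Context, is found as well.
    static boolean foundByOtherInstanceWithSameId() {
        ConstantHashBook book = new ConstantHashBook();
        Set<ConstantHashBook> tuples = new HashSet<>();
        tuples.add(book);
        book.id = 1L;

        ConstantHashBook loaded = new ConstantHashBook();
        loaded.id = 1L;
        return tuples.contains(loaded);
    }

    // Two transient instances (null id) are never equal to each other.
    static boolean transientInstancesAreEqual() {
        return new ConstantHashBook().equals(new ConstantHashBook());
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned());           // prints "true"
        System.out.println(foundByOtherInstanceWithSameId()); // prints "true"
        System.out.println(transientInstancesAreEqual());     // prints "false"
    }
}
```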


Conclusion

The entity identifier can be used for equals and hashCode, but only if the hashCode returns the same value all the time. This might sound like a terrible thing to do since it defeats the purpose of using multiple buckets in a HashSet or HashMap.

However, for performance reasons, you should always limit the number of entities that are stored in a collection. You should never fetch thousands of entities in a @OneToMany Set because the performance penalty on the database side is multiple orders of magnitude higher than using a single hashed bucket.

All tests are available on GitHub.


17 thoughts on “How to implement equals and hashCode using the JPA entity identifier (Primary Key)”

  1. I copied your code into my spring data spring boot app and I only get nulls in all the audit fields.

    I copied the @Embeddable Audit object as is, and I added this line to my entity:
    @Embedded
    private Audit audit = new Audit();

    Here is the code I took it from: https://github.com/vladmihalcea/high-performance-java-persistence/blob/04c5e1920adbc9f6c0affeda42781284f4abe42d/core/src/test/java/com/vladmihalcea/book/hpjp/hibernate/mapping/embeddable/EmbeddableEntityListenerTest.java

    Have you or anyone else tried this code in Spring?

    1. Try to debug it and see why the code is not called when using Spring and why the callbacks work fine in the test.

  2. What about the approach to not use a constant hash code like 31, but instead to use the hash code of the class, sanitized by any proxy generated suffixes (e.g. _$$….) to achieve a better bucket distribution?

      1. I’m using this as a default implementation in an inheritance. The question is if I might have missed any case in a proxy generation tool that might change the whole class and not only append a generation suffix?

    1. I withdraw my question. I’ve performed some performance tests on an application where we work with collections that have up to 2500 (small) entries. Using a constant hash code here makes the calculation about 5-10 times slower than using a hash code based on the database identifier, accepting the drawback that mixed transition state collections won’t work.

      1. 5-10 times slower does not tell how much time it takes in absolute figures. If the constant hashCode makes the calculations take 10 microseconds while the “broken” hashCode takes only 2 milliseconds, chances are that this is not the next bottleneck in your application as long as the fastest queries take in the millisecond range (2 or 3 orders of magnitude more time).

      2. In this particular application, it indeed is. Just deleting the hashCode implementation without changing anything else and rerunning the sample setup again causes the big difference. In the application, everything is recalculated from scratch on a change for consistency reasons and due to the high complexity of the calculation, so it happens that several 100s of entities are recreated on one transaction and put on a set. I’m quite sure that this could be optimized to reduce the hash code impact, but this is out of scope.

  3. Hi, beautifully explained article!

    I’m anyway puzzled somehow by what seems to me a contradictory conclusion comparing this article with the previous one (https://vladmihalcea.com/hibernate-facts-equals-and-hashcode/).

    The previous one concludes this:

    “We can’t use an auto-incrementing database id for comparing objects since the transient and the attached object versions won’t be equal to each other.”

    but in this article you are using the id inside equals so after persist the transient and the attached object versions won’t be equal to each other which is bad, right?

    Could you please clarify this?

    Thank you

    1. Good point, I rephrased it like this:

      We can’t use an auto-incrementing database id in the hashCode method since the transient and the attached object versions will no longer be located in the same hashed bucket.

  4. Using the hashCode() method to always return a constant value isn’t much of a solution, since Sets and Map keys will perform poorly.

    Can you talk about using a GUID generator in the java code to assign a PK before making the entity persistent? I’ve seen that done, and it seems to work well. The downside is it will only work with GUIDs as PKs.

    1. Actually, they will not. Entities are not in-memory objects. They have a high cost to be retrieved from the DB. By the time the constant hashCode becomes your bottleneck, you had to fetch millions of records which already took a lot of time.

      My video course covers the GUID topic in great detail. Enjoy watching the videos.

      1. Anyways, if I use UUID as an Id and if I generate UUID before persisting an entity, using equals and hashCode of the Primary Key is a good idea, right?
