The best way to implement equals, hashCode, and toString with JPA and Hibernate

(Last Updated On: October 9, 2018)

Bytecode enhancement and toString

Last week, Mark Struberg, who is an Apache Software Foundation member and OpenJPA contributor, made the following statement:

Basically, he says that implementing toString is bad from a performance perspective. Well, that might be the case in OpenJPA, but in Hibernate things are a little bit different. Hibernate does not use bytecode enhancement by default.

Therefore, the toString method can use any basic entity attributes (that are needed to identify a certain entity in logs) as long as the basic attributes are fetched when the entity is loaded from the database.
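
As a minimal sketch of this idea, the hypothetical Product class below builds its toString from basic attributes only and leaves the association out. The class, its fields, and the demo are assumptions made for illustration; the JPA annotations are omitted so that the snippet compiles without a persistence provider.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class standing in for a JPA entity (annotations omitted).
class Product {

    private final Long id;      // basic attribute, fetched with the entity
    private final String name;  // basic attribute, fetched with the entity

    // Stands in for an association that could be lazy loaded.
    private final List<String> reviews = new ArrayList<>();

    Product(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    List<String> getReviews() {
        return reviews;
    }

    // Only basic attributes are referenced here, so logging the object
    // never needs to touch the association.
    @Override
    public String toString() {
        return "Product{id=" + id + ", name='" + name + "'}";
    }
}

class ToStringDemo {
    public static void main(String[] args) {
        Product product = new Product(1L, "Keyboard");
        product.getReviews().add("Great keys");
        System.out.println(product);
    }
}
```

The output identifies the entity in a log line without depending on anything that might not have been loaded.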

Nevertheless, Hibernate allows attributes to be lazy loaded, but even then, bytecode enhancement is not necessarily the best approach. Using subentities might be a better alternative, and it does not even require bytecode enhancement.

Equals and hashCode

Unfortunately, Mark continues this discussion with this very misleading statement about equals and hashCode:

This statement is wrong, as this post will demonstrate in great detail.

Equality contract

According to the Java specification, a good equals implementation must have the following properties:

  1. reflexive
  2. symmetric
  3. transitive
  4. consistent

The first three are rather intuitive, but ensuring consistency in the context of JPA and Hibernate entities is usually the biggest challenge for developers.
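
The four properties can be written down as a small check. This is only a sketch: the Isbn value class and the EqualsContractCheck helper are hypothetical names introduced for illustration, and the consistency check here only covers repeated calls without any state change in between, which is the easy part.

```java
import java.util.Objects;

// Hypothetical immutable value class used only to exercise the contract.
final class Isbn {

    private final String value;

    Isbn(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Isbn)) return false;
        return Objects.equals(value, ((Isbn) o).value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(value);
    }
}

class EqualsContractCheck {

    // Returns true when the four properties hold for the given objects.
    static boolean holdsFor(Object a, Object b, Object c) {
        boolean reflexive = a.equals(a);
        boolean symmetric = a.equals(b) == b.equals(a);
        // If a equals b and b equals c, then a must equal c.
        boolean transitive = !(a.equals(b) && b.equals(c)) || a.equals(c);
        // Repeated calls must give the same answer when nothing changed.
        boolean consistent = a.equals(b) == a.equals(b)
                && a.hashCode() == a.hashCode();
        return reflexive && symmetric && transitive && consistent;
    }

    public static void main(String[] args) {
        Isbn first = new Isbn("978-9730228236");
        Isbn second = new Isbn("978-9730228236");
        Isbn third = new Isbn("978-9730228236");
        System.out.println(holdsFor(first, second, third));
    }
}
```

For an immutable value like this one, consistency comes for free. For an entity, the state does change (the identifier gets assigned, the entity gets detached and merged), and that is where the check becomes interesting.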

As already explained, equals and hashCode must behave consistently across all entity state transitions.

Identifier types

From an equality contract perspective, the identifiers can be split into two categories:

  • Assigned identifiers
  • Database-generated identifiers

Assigned identifiers

Assigned identifiers are allocated prior to flushing the Persistence Context, and we can further split them into two subcategories:

  • Natural identifiers
  • Database-agnostic UUIDs

Natural identifiers are assigned by a third-party authority, like a book ISBN.

Database-agnostic UUIDs are generated outside of the database, for example by calling the java.util.UUID#randomUUID method.

Both natural identifiers and database-agnostic UUIDs have the luxury of being known when the entity gets persisted. For this reason, it is safe to use them in the equals and hashCode implementation:

@Entity(name = "Book")
@Table(name = "book")
public class Book 
    implements Identifiable<Long> {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    // The business key, known before the entity is persisted
    @NaturalId
    private String isbn;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Book)) return false;
        Book book = (Book) o;
        return Objects.equals(getIsbn(), book.getIsbn());
    }

    @Override
    public int hashCode() {
        return Objects.hash(getIsbn());
    }

    //Getters and setters omitted for brevity
}

For more details about the @NaturalId annotation, check out this article.
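
The same reasoning applies to database-agnostic UUIDs. As a sketch, assume a hypothetical Tag class whose identifier is generated in the application when the object is created. The class name and fields are made up for this example, and the JPA annotations are left out so that it runs standalone.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.UUID;

// Hypothetical class standing in for an entity (annotations omitted).
class Tag {

    // Assigned in the application at construction time, before any flush.
    private final UUID id = UUID.randomUUID();

    private final String name;

    Tag(String name) {
        this.name = name;
    }

    UUID getId() {
        return id;
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Tag)) return false;
        return Objects.equals(getId(), ((Tag) o).getId());
    }

    @Override
    public int hashCode() {
        return Objects.hash(getId());
    }
}

class AssignedUuidDemo {

    // The identifier never changes after construction, so the hash is stable.
    static boolean foundAfterAdd() {
        Set<Tag> tags = new HashSet<>();
        Tag tag = new Tag("jpa");
        tags.add(tag);
        return tags.contains(tag);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterAdd());
    }
}
```

Because the identifier exists from the very first moment, there is no state transition during which the hash code could change.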

Database-generated identifiers

The database-generated identifiers are a different story. Because the identifier is assigned by the database at flush time, the consistency guarantee breaks if we implement equals and hashCode based on the identifier, the way we did for assigned identifiers.

This issue was detailed in my article, How to implement equals and hashCode using the entity identifier (primary key).
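
To make the problem concrete, here is a sketch using a hypothetical NaivePost class whose hashCode depends on the identifier. No database is involved: the setId call simulates the identifier being assigned at flush time, and all names are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class that hashes on a late-assigned identifier.
class NaivePost {

    private Long id; // null until the simulated flush

    void setId(Long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NaivePost)) return false;
        return Objects.equals(id, ((NaivePost) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id); // changes once the id is assigned
    }
}

class BrokenConsistencyDemo {

    static boolean foundAfterIdAssignment() {
        Set<NaivePost> posts = new HashSet<>();
        NaivePost post = new NaivePost();
        posts.add(post);   // stored under the hash of a null id
        post.setId(1L);    // simulates the id being assigned at flush time
        return posts.contains(post); // looked up under a different hash
    }

    public static void main(String[] args) {
        // Prints false: the very same object can no longer be found.
        System.out.println(foundAfterIdAssignment());
    }
}
```

The object is still in the Set, but the lookup uses the new hash code, so the Set cannot find it anymore.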

Therefore, whenever you have a database-generated identifier, a synthetic key (be it a numeric identifier or a database UUID type), you have to use the following equals and hashCode implementation:

@Entity(name = "Post")
@Table(name = "post")
public class Post implements Identifiable<Long> {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    public Post() {}

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Post)) return false;
        // Transient entities (null id) are only equal to themselves
        return id != null && id.equals(((Post) o).id);
    }

    @Override
    public int hashCode() {
        // Constant, so it never changes when the id gets assigned
        return 31;
    }

    //Getters and setters omitted for brevity
}

So, the hashCode yields the same value across all entity state transitions, and the equals method is going to use the identifier check only for non-transient entities.
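
The following sketch exercises that behavior without a database. PostSketch is a hypothetical standalone copy of the equality logic above (annotations and the Identifiable interface are left out), and setId simulates the identifier being assigned at flush time.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical standalone copy of the Post equality logic.
class PostSketch {

    private Long id;

    void setId(Long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PostSketch)) return false;
        return id != null && id.equals(((PostSketch) o).id);
    }

    @Override
    public int hashCode() {
        return 31;
    }
}

class StateTransitionDemo {

    // Transient to persisted: the object is still found after the id is set.
    static boolean foundAfterIdAssignment() {
        Set<PostSketch> posts = new HashSet<>();
        PostSketch post = new PostSketch();
        posts.add(post);
        post.setId(1L); // simulates the id being assigned at flush time
        return posts.contains(post);
    }

    // Two instances with the same id (think managed and detached) are equal.
    static boolean sameIdInstancesMatch() {
        PostSketch managed = new PostSketch();
        managed.setId(1L);
        PostSketch detached = new PostSketch();
        detached.setId(1L);
        Set<PostSketch> posts = new HashSet<>();
        posts.add(managed);
        return posts.contains(detached);
    }

    // Two different transient instances are never equal.
    static boolean transientInstancesDiffer() {
        return !new PostSketch().equals(new PostSketch());
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssignment());   // true
        System.out.println(sameIdInstancesMatch());     // true
        System.out.println(transientInstancesDiffer()); // true
    }
}
```

Since the hash code is constant, assigning the identifier cannot move the object to a different bucket, and equals alone decides whether two instances represent the same row.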

That’s it!

The only time when you’ll see a performance bottleneck due to a single hash bucket is if you have a large collection of tens of thousands of entries.

But then, it implies that you fetched that large collection from the database. The performance penalty of fetching such a collection from the database is multiple orders of magnitude higher than the single bucket overhead.

That’s why you should never map large collections with Hibernate. Use queries for those instead. And for small collections, the single-bucket overhead is negligible.

Also, most of the time you don’t even need to use a Set or a Map. For bidirectional associations, List(s) perform better anyway.

More misconceptions

Mark has written a blog post to justify his beliefs.

In his article, Mark says that the database-generated identifier equality implementation does not work for merge or getReference():

Even Vlad’s advanced version does have holes. E.g. if you use em.getReference() or em.merge().

The article “How to implement equals and hashCode using the JPA entity identifier (primary key)” demonstrates that this equals implementation works for detached objects. That was the whole point of coming up with such an implementation. We want it to work across all entity state transitions.

As for getReference(), there’s a check for that as well. It’s all on GitHub.

There’s one argument which I agree with, and that’s about making sure that the equality check is using only entity attributes that are immutable. That’s why the entity identifier sequence number is very appealing. And with the equality implementation method that I offer you, you can use it safely.

Unfortunately, Mark continues with more misconceptions, like:

Why do you need equals() and hashCode() at all?

This is a good question. And my answer is: “you don’t !”

Well, you DO!

If you don’t implement equals and hashCode then the merge test will fail, therefore breaking the consistency guarantee. It’s all explained in my How to implement equals and hashCode using the entity identifier (primary key) article, by the way.

And another misconception, from a Hibernate point of view:

Why you shouldn’t store managed and detached entities in the same Collection

Not only should you NOT avoid mixing detached and managed entities, but doing so is actually a great feature that allows you to hold on to detached objects, and therefore prevent lost updates in long conversations.

And yet another misconception, from a Hibernate implementation perspective:

So, having a cache is really a great idea, but *please* do not store JPA entities in the cache. At least not as long as they are managed.

Hibernate strives to deliver strong consistency. That’s why the READ_WRITE and TRANSACTIONAL cache concurrency strategies allow you to not worry about such inconsistencies. It’s the second-level cache provider that guarantees this isolation level. Just like a relational database system.

Only NONSTRICT_READ_WRITE offers a weaker isolation level, but the “non-strict” naming choice is self-descriptive after all.

The best advice I can give you is that you should always question every statement that you read on the Internet. You should always check every piece of advice against your current JPA provider implementation because details make a very big difference.

10 thoughts on “The best way to implement equals, hashCode, and toString with JPA and Hibernate”

  1. Do you have a complete runnable version of the code in this article? Also, where is ‘Identifiable’ defined?

  2. Why do you cast ‘o’ to a ‘Post’ in the following code? Also, it was interesting to read, in either this article or the other one you wrote on equals/hashCode, that you should just use Lists over Sets and HashMaps for JPA. Is there ever a need for Sets in JPA? You also said that Lists were more performant. Why is that?

    public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof Book)) return false;
    Book book = (Book) o;
    return id != null && id.equals(((Post) o).id);

      1. Hi @vladmihalcea. You had to replace “Post” to “Book”, not remove the cast. This line fails: “return id != null && id.equals(;” because Object does not have a property id.

        Anyway, this post was helpful.

  3. Hi!
    Thanks for this post.

    I have a couple of questions to you.

    In this post
    you said that:

    “Using a combination of fields that are unique among Entities is probably the best choice for implementing equals and hashCode methods.”

    And in this post I can see that you suggest to compare ID inside equals() for database-generated identifiers.

    It seems to me that “compare ID” approach deprives you of the opportunity to use transient entities inside Set ( because of duplicates inside Set). Correct me if I am wrong.

    Could you please clarify what approach is more preferable nowadays for database-generated identifiers?

    1. This article already answers your questions. In my book GitHub repository, you can find the tests that prove the answers to the question about Set too.

  4. Hi!
    Just wanted to say that getId() != null && Objects.equals(getId(), book.getId()); gets evaluated into getId() != null && (getId() == book.getId()) || (getId() != null && getId().equals(book.getId()));. So, you don’t need to check for null.
