The best way to use the @ManyToMany annotation with JPA and Hibernate

Introduction

As simple as JPA annotations might be, it’s not always obvious how efficient they are behind the scenes. In this article, I’m going to show you the best way to use the JPA @ManyToMany annotation when using Hibernate.

Domain Model

Assuming we have the following database tables:

[Figure: post and tag many-to-many table relationship]

A typical many-to-many database association includes two parent tables which are linked through a third one containing two Foreign Keys referencing the parent tables.

Using java.util.List

The first choice for many Java developers is to use a java.util.List for Collections that don’t entail any specific ordering.

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    public Post() {}

    public Post(String title) {
        this.title = title;
    }

    @ManyToMany(cascade = { 
        CascadeType.PERSIST, 
        CascadeType.MERGE
    })
    @JoinTable(name = "post_tag",
        joinColumns = @JoinColumn(name = "post_id"),
        inverseJoinColumns = @JoinColumn(name = "tag_id")
    )
    private List<Tag> tags = new ArrayList<>();

    //Getters and setters omitted for brevity

    public void addTag(Tag tag) {
        tags.add(tag);
        tag.getPosts().add(this);
    }

    public void removeTag(Tag tag) {
        tags.remove(tag);
        tag.getPosts().remove(this);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Post)) return false;
        return id != null && id.equals(((Post) o).getId());
    }

    @Override
    public int hashCode() {
        return 31;
    }
}

@Entity(name = "Tag")
@Table(name = "tag")
public class Tag {

    @Id
    @GeneratedValue
    private Long id;

    @NaturalId
    private String name;

    @ManyToMany(mappedBy = "tags")
    private List<Post> posts = new ArrayList<>();

    public Tag() {}

    public Tag(String name) {
        this.name = name;
    }

    //Getters and setters omitted for brevity

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Tag tag = (Tag) o;
        return Objects.equals(name, tag.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
} 

Several aspects of this mapping are worth explaining:

  1. The tags association in the Post entity only defines the PERSIST and MERGE cascade types. As explained in this article, the REMOVE entity state transition doesn’t make any sense for a @ManyToMany JPA association since it could trigger a chain deletion that would ultimately wipe both sides of the association.
  2. As explained in this article, the add/remove utility methods are mandatory if you use bidirectional associations so that you can make sure that both sides of the association are in sync.
  3. The Post entity uses the entity identifier for equality since it lacks any unique business key. As explained in this article, you can use the entity identifier for equality as long as you make sure that it stays consistent across all entity state transitions.
  4. The Tag entity has a unique business key which is marked with the Hibernate-specific @NaturalId annotation. When that’s the case, the unique business key is the best candidate for equality checks.
  5. The mappedBy attribute of the posts association in the Tag entity marks that, in this bidirectional relationship, the Post entity owns the association. This is needed since only one side can own a relationship, and changes are only propagated to the database from this particular side.
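
Points 2 to 4 can be checked without a database. The following is a minimal plain-Java sketch, not the actual entity code: AssociationSketch and its package-private fields are made up for illustration, there are no JPA annotations, and the identifier assignment that Hibernate would do at flush time is simulated by setting the id field by hand.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Plain-Java stand-ins for the entities; no JPA or Hibernate is involved.
final class AssociationSketch {

    static class Post {
        Long id; // stays null until the (simulated) INSERT assigns it
        final String title;
        final Set<Tag> tags = new HashSet<>();

        Post(String title) { this.title = title; }

        // Utility methods that keep both sides of the association in sync.
        void addTag(Tag tag) {
            tags.add(tag);
            tag.posts.add(this);
        }

        void removeTag(Tag tag) {
            tags.remove(tag);
            tag.posts.remove(this);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Post)) return false;
            return id != null && id.equals(((Post) o).id);
        }

        // Constant on purpose: the hash bucket does not change when the id is assigned.
        @Override
        public int hashCode() { return 31; }
    }

    static class Tag {
        final String name; // unique business key
        final Set<Post> posts = new HashSet<>();

        Tag(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Objects.equals(name, ((Tag) o).name);
        }

        @Override
        public int hashCode() { return Objects.hash(name); }
    }

    public static void main(String[] args) {
        Post post = new Post("JPA with Hibernate");
        Tag java = new Tag("Java");
        post.addTag(java);

        post.id = 1L; // simulate the identifier being assigned after the entity was added

        // The post is still found because its hashCode did not change.
        System.out.println(java.posts.contains(post));
        // A different Tag instance with the same name is considered equal.
        System.out.println(post.tags.contains(new Tag("Java")));

        post.removeTag(java);
        System.out.println(post.tags.isEmpty() && java.posts.isEmpty());
    }
}
```

If hashCode were derived from the id instead, the post would be stored in one bucket while the id is null and looked up in another one afterward, so contains and remove could fail even though equals returns true.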

For more details about the @NaturalId annotation, check out this article.

Although the mapping is correct from a JPA perspective, it’s not efficient at all from a database perspective. To understand why, you need to log and analyze the automatically generated SQL statements.

Considering we have the following entities:

final Long postId = doInJPA(entityManager -> {
    Post post1 = new Post("JPA with Hibernate");
    Post post2 = new Post("Native Hibernate");

    Tag tag1 = new Tag("Java");
    Tag tag2 = new Tag("Hibernate");

    post1.addTag(tag1);
    post1.addTag(tag2);

    post2.addTag(tag1);

    entityManager.persist(post1);
    entityManager.persist(post2);

    return post1.getId();
});

When removing a Tag entity from a Post:

doInJPA(entityManager -> {
    Tag tag1 = new Tag("Java");
    Post post1 = entityManager.find(Post.class, postId);
    post1.removeTag(tag1);
});

Hibernate generates the following SQL statements:

SELECT p.id AS id1_0_0_,
       t.id AS id1_2_1_,
       p.title AS title2_0_0_,
       t.name AS name2_2_1_,
       pt.post_id AS post_id1_1_0__,
       pt.tag_id AS tag_id2_1_0__
FROM   post p
INNER JOIN 
       post_tag pt 
ON     p.id = pt.post_id
INNER JOIN 
       tag t 
ON     pt.tag_id = t.id
WHERE  p.id = 1

DELETE FROM post_tag
WHERE  post_id = 1

INSERT INTO post_tag
       ( post_id, tag_id )
VALUES ( 1, 3 )

So, instead of deleting just one post_tag entry, Hibernate removes all post_tag rows associated with the given post_id and reinserts the remaining ones afterward. This is not efficient at all because it’s extra work for the database, especially for updating the indexes associated with the underlying Foreign Keys.

For this reason, it’s not a good idea to use the java.util.List for @ManyToMany JPA associations.

Using java.util.Set

Instead of a List, we can use a Set.

The Post entity tags association will be changed as follows:

@ManyToMany(cascade = { 
    CascadeType.PERSIST, 
    CascadeType.MERGE
})
@JoinTable(name = "post_tag",
    joinColumns = @JoinColumn(name = "post_id"),
    inverseJoinColumns = @JoinColumn(name = "tag_id")
)
private Set<Tag> tags = new HashSet<>();

And the Tag entity will undergo the same modification:

@ManyToMany(mappedBy = "tags")
private Set<Post> posts = new HashSet<>();

If you worry about the lack of a predefined entry order, then you can use the JPA @OrderBy annotation.

@OrderBy adds an ORDER BY clause to the SQL query that fetches the collection, so the sorting is done by the database. The @OrderColumn annotation, which materializes the element order in a dedicated column that is stored in the post_tag link table, is meant for a java.util.List, not for a Set.

Now, when rerunning the previous test case, Hibernate generates the following SQL statements:

SELECT p.id AS id1_0_0_,
       t.id AS id1_2_1_,
       p.title AS title2_0_0_,
       t.name AS name2_2_1_,
       pt.post_id AS post_id1_1_0__,
       pt.tag_id AS tag_id2_1_0__
FROM   post p
INNER JOIN 
       post_tag pt 
ON     p.id = pt.post_id
INNER JOIN 
       tag t 
ON     pt.tag_id = t.id
WHERE  p.id = 1

DELETE FROM post_tag
WHERE  post_id = 1 AND tag_id = 3

Much better! There is only one DELETE statement executed which removes the associated post_tag entry.
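
To compare the two outcomes side by side, here is a toy model of the link-table statements for the List and the Set mapping. It only mirrors the SQL shown in this article and is not how Hibernate is implemented: LinkTableStatements, removeFromList, and removeFromSet are hypothetical names, and the statements are built as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the post_tag statements shown in this article; this is not Hibernate code.
final class LinkTableStatements {

    // List mapping: every row of the post is deleted, then the remaining rows are re-inserted.
    static List<String> removeFromList(long postId, List<Long> tagIds, long removedTagId) {
        List<String> sql = new ArrayList<>();
        sql.add("DELETE FROM post_tag WHERE post_id = " + postId);
        for (long tagId : tagIds) {
            if (tagId != removedTagId) {
                sql.add("INSERT INTO post_tag (post_id, tag_id) VALUES ("
                        + postId + ", " + tagId + ")");
            }
        }
        return sql;
    }

    // Set mapping: only the row of the removed element is deleted.
    static List<String> removeFromSet(long postId, long removedTagId) {
        return List.of("DELETE FROM post_tag WHERE post_id = " + postId
                + " AND tag_id = " + removedTagId);
    }

    public static void main(String[] args) {
        // A post with two tags, as in the test case above.
        removeFromList(1L, List.of(2L, 3L), 2L).forEach(System.out::println);
        removeFromSet(1L, 3L).forEach(System.out::println);
    }
}
```

In this model, removing one element from a List of N tags costs N statements (one DELETE plus N - 1 INSERTs), while the Set always costs a single DELETE, so the gap grows with the size of the collection.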

Conclusion

Using JPA and Hibernate is very convenient since it can boost developer productivity. However, this does not mean that you have to sacrifice application performance.

By choosing the right mappings and data access pattern, you can make the difference between an application that barely crawls and one that runs at warp speed.

So, when using the @ManyToMany annotation, always use a java.util.Set and avoid the java.util.List.

22 Comments on “The best way to use the @ManyToMany annotation with JPA and Hibernate”

  1. Hey Vlad,
    Really appreciate the instructions. I’m struggling with the remove with the instructions you’ve provided however. I’ve implemented the @ManyToMany as a Set as you explained, with the removeTag(), equals(), and hashCode() copy-pasted from your example.

    The removal fails for me on one side. The Tag is removed from tags, but the Post is not removed from tag.getPosts.

    It seems that even though

    tag.getPosts().iterator().next().equals(post)

    evaluates to true, and I’ve verified that the hashCodes are equal, tag.getPosts().contains(post) evaluates to false. Therefore, the post is never removed. I’ve been staring at my debugger for hours but get stuck deep in the HashSet.remove() function, specifically HashMap.removeNode(…).

    Any ideas I can try?

    Best,

    Bas

    • It’s hard to tell what the problem is without seeing the code. This issue is much better handled via a consulting session. So, if your company is interested in working with me, then I’ll send you a contract.

  2. How might I deal with the situation where my bidirectional @ManyToMany relationship defines an extra attribute (String:alias) which I need reversed in the Map.Entry on one side of the relationship?

    I’m modeling a keystore which contains secretkeys aliased by a string ‘alias’.

    Doing this by modeling a KeyStoreEntity with Map<String, SecretKeyEntity> on one end and for SecretKeyEntity a Map<String, KeyStoreEntity> on other side works with one problem.

    In both of these cases the String is the ‘alias’ name used and represents the key in the Map.Entry I mention. The alias is created as a 3rd column in the join table when I use @MapKeyColumn(name="alias") on both sides.

    My problem is that on SecretKeyEntity side I will have cases where the same alias is used in different keystores. Thus I need Map<KeyStoreEntity, String> rather than Map<String, KeyStoreEntity> on one end. Without this the same alias/key will collapse/overwrite value in map.

    Unless my relationship is fundamentally flawed and I need to redesign the schema, it feels like I’m looking for an annotation like @MapValueColumn("alias") to produce the map type of <KeyStoreEntity, String> I’m trying for. This annotation, of course, doesn’t exist. I also will apparently require use of @MapJoinColumn? to specify an entity for key-side which I understand.

    My underlying schema “appears” correct and am seeing right tables/cols in database. Issue is only with signature of Map method on one end…wanting to reverse Map.Entry tuple.

    What might I be missing here?

    By the way, I purchased your book, and it’s excellent. Also your coverage of OSIV anti-pattern was invaluable to me in getting response times down ten times and “unclogging” my performance to a large degree. It really makes quite clear the need for the DTO/entity separation.

    Thanks Vlad!

      • You’re fast Vlad! I’ve seen this post before and have re-examined it for applicability.

        I don’t see where I have control over what is specified for Map.Entry value in the one side of relationship.

        JPA seems to give control over column used for key but I don’t have equivalent for value side ala @MapValueColumn(name="alias"). Unless I’m mistaken (and perhaps am) it seems like my problem would be solved with this annotation.

        JPA shows me how to specify entity or primitive as key…just not the ability to use ‘alias’ the sole primitive in join table produced (keystore, secretkey, alias) as value.

        I need the following…

        KeyStoreEntity
        -secretKeys : Map<String, SecretKeyEntity>
        SecretKeyEntity
        -keystores : Map<String, KeyStoreEntity>

        switched to be:

        KeyStoreEntity
        -secretKeys : Map<String, SecretKeyEntity>
        SecretKeyEntity
        -keystores : Map<KeyStoreEntity, String>

        I hope I’ve described my problem properly and really appreciate your assistance Vlad. I realize you’re a busy dude.

      • The JPA mapping has its limits. If you want absolute flexibility, then a query is always a much better option than a mapping. Try designing the mapping as a table relationship, use a simple JPA mapping, and address your complex mapping at query time.

  3. Hello vladmihalcea, I have an issue: when I use findby(attrib), the rows of the many-to-many join tables are deleted.

  4. I created your Many To Many which is Unidirectional (Post has many tags).

    The problem I’m having is that one post has 10000 tags (let’s say) – how to add a new tag to this collection without loading all 10000 tags from the DB into memory?

    Is there any tutorial (I cannot find it in your book), where you say – if there are too many entities that will be added to collection then do this, otherwise use other mapping or something?

    Also, what is the best practice when preserving entities from GUI -> REST Endpoint -> Hibernate? Should we just send differences from GUI layer, or track changes in some way, or always send full entity object graph? Are there any tutorials related to this? (eg, for post I could change only timestamp and send timestamp from GUI -> backend for update, or I could send full entity state with all attributes on each update (even if only timestamp changed))?

    • Having 10000 tags for a post does not sound like a very good idea. Don’t map such an association in JPA, as I mentioned in my book.

      Also, if you read my book carefully, you will see that I don’t advise fetching more data than necessary, and that should answer your second question. Also, for efficiency, using DTOs is more appropriate for reading data than having to send entire graphs of data.

      If you buy my Mach 2 video course, I can give you a 20-minute consulting time slot to address your questions.

  5. Hi Vlad,

    Thanks for great articles.

    I have an issue which I’m not able to resolve.

    I have a many-to-many relationship which is ordered by creation timestamp.

    @ManyToMany(fetch = FetchType.LAZY, cascade = {
            CascadeType.PERSIST,
            CascadeType.MERGE
    })
    @JoinTable(name = "influencer_follower",
            joinColumns = @JoinColumn(name = "influencer_id"),
            inverseJoinColumns = @JoinColumn(name = "user_id")
    )
    @OrderBy("created_date desc")
    private List followers = new ArrayList();
    

    So far, so good.
    The problem arises when I’m trying to add a follower to the list. Hibernate deletes and re-inserts followers, with all the followers having current timestamp assigned to created_date.
    One thing I see inconsistent here is that the followers’ ordering should not be managed by Hibernate, as the created_date column is being initialized by MySQL default CURRENT_TIMESTAMP value. It’s more for reading ordered data. For inserting I would rather use Set, and the list would be ordered chronologically.

    How can I handle this case?

    Thanks in advance!

  6. Hello Vlad! Great Article and thank you for your time.

    I am using this as a reference while implementing Spring Security login with Roles. So my ManyToMany relationship is between users and roles. When registering a user I set the roles set using setter for it. So my code looks like, user.setRoles(roles). I pass a set with the USER_ROLE in it. The user is saved but it does not populate the users_roles table. I am unsure why. When I try to use the addRole() function I get a null error when it tries to update the user on the role side of the relationship, at role.getUsers().add(this). Any advice?

    Thank you again for your time,

    Philip

  7. I was implementing your code (first one) using Hibernate alone, no JPA, but cascade = {CascadeType.PERSIST, CascadeType.MERGE} is not enough. I was getting a java.lang.IllegalStateException: org.hibernate.TransientObjectException: object references an unsaved transient instance exception. I had to use CascadeType.ALL to make it work.

    • Sounds like a bug in Hibernate. You should open a Jira issue for it.

  8. Hi Vlad,

    I’m curious if unidirectional @ManyToMany relationship with Set on the owner side would perform nice like the bidirectional with Sets on both sides? So, with just a single delete which would remove the entity.

  9. Hi Vlad,

    I encountered an infinite recursion issue by following your guide exactly. Adding post1 containing tag1 is okay, but when adding post2, which also contains tag1, infinite recursion appears: the post2 entity contains the tag1 entity, the tag1 entity contains the post2 entity, and so on, until it finally causes a StackOverflowError. Any thoughts on it?
