The best way to use the @ManyToMany annotation with JPA and Hibernate

Introduction

As simple as JPA annotations might be, it’s not always obvious how efficient they are behind the scenes. In this article, I’m going to show you the best way to use the JPA @ManyToMany annotation with Hibernate.

Domain Model

Assume we have three database tables: post, tag, and the post_tag link table.

A typical many-to-many database association includes two parent tables which are linked through a third one containing two Foreign Keys referencing the parent tables.

Using java.util.List

The first choice for many Java developers is to use a java.util.List for Collections that don’t entail any specific ordering.

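// Imports for the Post entity; on Jakarta Persistence 3.0+ the package is jakarta.persistence.
import java.util.ArrayList;
import java.util.List;
import javax.persistence.*;

@Entity(name = "Post")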
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    public Post() {}

    public Post(String title) {
        this.title = title;
    }

    @ManyToMany(cascade = { 
        CascadeType.PERSIST, 
        CascadeType.MERGE
    })
    @JoinTable(name = "post_tag",
        joinColumns = @JoinColumn(name = "post_id"),
        inverseJoinColumns = @JoinColumn(name = "tag_id")
    )
    private List<Tag> tags = new ArrayList<>();

    //Getters and setters omitted for brevity

    public void addTag(Tag tag) {
        tags.add(tag);
        tag.getPosts().add(this);
    }

    public void removeTag(Tag tag) {
        tags.remove(tag);
        tag.getPosts().remove(this);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Post)) return false;
        return id != null && id.equals(((Post) o).id);
    }

    @Override
    public int hashCode() {
        return 31;
    }
}

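// Imports for the Tag entity, which lives in its own source file.
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import javax.persistence.*;
import org.hibernate.annotations.NaturalId;

@Entity(name = "Tag")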
@Table(name = "tag")
public class Tag {

    @Id
    @GeneratedValue
    private Long id;

    @NaturalId
    private String name;

    @ManyToMany(mappedBy = "tags")
    private List<Post> posts = new ArrayList<>();

    public Tag() {}

    public Tag(String name) {
        this.name = name;
    }

    //Getters and setters omitted for brevity

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Tag tag = (Tag) o;
        return Objects.equals(name, tag.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
} 

Several aspects of this mapping are worth explaining:

  1. The tags association in the Post entity only defines the PERSIST and MERGE cascade types. The REMOVE entity state transition doesn’t make any sense for a @ManyToMany JPA association since it could trigger a chain deletion that would ultimately wipe both sides of the association.
  2. The add/remove utility methods are mandatory if you use bidirectional associations, so that you can make sure that both sides of the association stay in sync.
  3. The Post entity uses the entity identifier for equality since it lacks any unique business key. You can use the entity identifier for equality as long as you make sure that it stays consistent across all entity state transitions, which is why hashCode returns a constant value.
  4. The Tag entity has a unique business key which is marked with the Hibernate-specific @NaturalId annotation. When that’s the case, the unique business key is the best candidate for equality checks.
  5. The mappedBy attribute of the posts association in the Tag entity marks that, in this bidirectional relationship, the Post entity owns the association. This is needed since only one side can own a relationship, and changes are only propagated to the database from this particular side.
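To see points 2 to 4 in action without a database, here is a minimal plain-Java sketch. It strips the JPA annotations from the two entities, and the EqualitySketch class name and the manual id assignment are my own assumptions, used only to simulate what happens when the identifier is generated at persist time.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical, annotation-free stand-ins for the Post and Tag entities.
class EqualitySketch {

    static class Post {
        Long id; // null until "persisted"; assigned manually in this sketch
        final String title;
        final List<Tag> tags = new ArrayList<>();

        Post(String title) { this.title = title; }

        // Utility methods that update both sides of the association.
        void addTag(Tag tag) { tags.add(tag); tag.posts.add(this); }
        void removeTag(Tag tag) { tags.remove(tag); tag.posts.remove(this); }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Post)) return false;
            return id != null && id.equals(((Post) o).id);
        }

        // Constant, so the hash does not change when the id gets assigned.
        @Override
        public int hashCode() { return 31; }
    }

    static class Tag {
        final String name; // the unique business key
        final List<Post> posts = new ArrayList<>();

        Tag(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            return Objects.equals(name, ((Tag) o).name);
        }

        @Override
        public int hashCode() { return Objects.hash(name); }
    }

    public static void main(String[] args) {
        Post post = new Post("JPA with Hibernate");
        Tag javaTag = new Tag("Java");
        post.addTag(javaTag);
        post.addTag(new Tag("Hibernate"));

        // The Post is still found in a HashSet after its id is assigned.
        Set<Post> posts = new HashSet<>();
        posts.add(post);
        post.id = 1L; // simulates the identifier being generated
        System.out.println(posts.contains(post)); // true

        // A different Tag instance with the same name is equal to the stored one.
        System.out.println(post.tags.contains(new Tag("Hibernate"))); // true

        // removeTag updates both sides of the association.
        post.removeTag(javaTag);
        System.out.println(post.tags.size());        // 1
        System.out.println(javaTag.posts.isEmpty()); // true
    }
}
```

Had hashCode been derived from the id, the Post would have landed in a different hash bucket once the id was assigned, and the contains call could have returned false.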

Although the mapping is correct from a JPA perspective, from a database perspective, it’s not efficient at all. To understand why, you need to log and analyze the automatically generated SQL statements.

Assume we persist the following entities:

final Long postId = doInJPA(entityManager -> {
    Post post1 = new Post("JPA with Hibernate");
    Post post2 = new Post("Native Hibernate");

    Tag tag1 = new Tag("Java");
    Tag tag2 = new Tag("Hibernate");

    post1.addTag(tag1);
    post1.addTag(tag2);

    post2.addTag(tag1);

    entityManager.persist(post1);
    entityManager.persist(post2);

    return post1.getId();
});

When removing a Tag entity from a Post:

doInJPA(entityManager -> {
    Tag tag1 = new Tag("Java");
    Post post1 = entityManager.find(Post.class, postId);
    post1.removeTag(tag1);
});

Hibernate generates the following SQL statements:

SELECT p.id AS id1_0_0_,
       t.id AS id1_2_1_,
       p.title AS title2_0_0_,
       t.name AS name2_2_1_,
       pt.post_id AS post_id1_1_0__,
       pt.tag_id AS tag_id2_1_0__
FROM   post p
INNER JOIN 
       post_tag pt 
ON     p.id = pt.post_id
INNER JOIN 
       tag t 
ON     pt.tag_id = t.id
WHERE  p.id = 1

DELETE FROM post_tag
WHERE  post_id = 1

INSERT INTO post_tag
       ( post_id, tag_id )
VALUES ( 1, 3 )

So, instead of deleting just one post_tag entry, Hibernate removes all post_tag rows associated with the given post_id and reinserts the remaining ones afterward. It does so because a List without an @OrderColumn is treated as a bag, and the rows of a bag cannot be addressed individually. This is not efficient at all because it’s extra work for the database, especially for maintaining the indexes associated with the underlying Foreign Keys.

For this reason, it’s not a good idea to use a java.util.List for @ManyToMany JPA associations.
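To get a feel for how this cost grows with the collection size, here is a toy model in plain Java. It does not call Hibernate at all: the LinkTableSketch class, its two methods, and the tag identifiers are hypothetical, and the methods only build the statements we would expect for each collection type when a single tag is removed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model of the link-table statements; an assumption-based sketch, not Hibernate code.
class LinkTableSketch {

    // List mapping: the whole collection is deleted and the remaining rows are reinserted.
    static List<String> removeFromBag(long postId, List<Long> tagIds, long removedTagId) {
        List<Long> remaining = new ArrayList<>(tagIds);
        remaining.remove(Long.valueOf(removedTagId));

        List<String> sql = new ArrayList<>();
        sql.add("DELETE FROM post_tag WHERE post_id = " + postId);
        for (Long tagId : remaining) {
            sql.add("INSERT INTO post_tag (post_id, tag_id) VALUES ("
                + postId + ", " + tagId + ")");
        }
        return sql;
    }

    // Set mapping: the (post_id, tag_id) pair identifies the row, so one DELETE is enough.
    static List<String> removeFromSet(long postId, long removedTagId) {
        return Collections.singletonList(
            "DELETE FROM post_tag WHERE post_id = " + postId
                + " AND tag_id = " + removedTagId);
    }

    public static void main(String[] args) {
        List<Long> tagIds = Arrays.asList(10L, 20L, 30L);

        // One DELETE plus two INSERT statements for the List mapping.
        removeFromBag(1L, tagIds, 10L).forEach(System.out::println);

        // A single DELETE for the Set mapping.
        removeFromSet(1L, 10L).forEach(System.out::println);
    }
}
```

In this model, removing one of N tags costs N statements with a List, while the Set variant always needs exactly one.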

Using java.util.Set

Instead of a List, we can use a Set.

The Post entity tags association will be changed as follows:

@ManyToMany(cascade = { 
    CascadeType.PERSIST, 
    CascadeType.MERGE
})
@JoinTable(name = "post_tag",
    joinColumns = @JoinColumn(name = "post_id"),
    inverseJoinColumns = @JoinColumn(name = "tag_id")
)
private Set<Tag> tags = new HashSet<>();

And the Tag entity will undergo the same modification:

@ManyToMany(mappedBy = "tags")
private Set<Post> posts = new HashSet<>();

If you worry about the lack of a predefined entry order, you can use the JPA @OrderBy annotation, which sorts the entries when they are fetched by adding an ORDER BY clause to the SQL query.

The @OrderColumn annotation, which materializes the element order in a dedicated column stored in the post_tag link table, only applies to a List, so it is not an option for a Set.

Now, when rerunning the previous test case, Hibernate generates the following SQL statements:

SELECT p.id AS id1_0_0_,
       t.id AS id1_2_1_,
       p.title AS title2_0_0_,
       t.name AS name2_2_1_,
       pt.post_id AS post_id1_1_0__,
       pt.tag_id AS tag_id2_1_0__
FROM   post p
INNER JOIN 
       post_tag pt 
ON     p.id = pt.post_id
INNER JOIN 
       tag t 
ON     pt.tag_id = t.id
WHERE  p.id = 1

DELETE FROM post_tag
WHERE  post_id = 1 AND tag_id = 3

Much better! There is only one DELETE statement executed which removes the associated post_tag entry.


Conclusion

Using JPA and Hibernate is very convenient since it can boost developer productivity. However, this does not mean that you have to sacrifice application performance.

By choosing the right mappings and data access pattern, you can make the difference between an application that barely crawls and one that runs at warp speed.

So, when using the @ManyToMany annotation, always use a java.util.Set and avoid the java.util.List.


15 thoughts on “The best way to use the @ManyToMany annotation with JPA and Hibernate”

  1. Two things:
    1. mappedBy: I would like to see a good explanation of what it does; to me this is magic.
    2. I wonder why it makes a difference for Hibernate whether it is a List or a Set.

      1. Sorry to bug you, just trying to understand. The code snippet shows that it is a bidirectional relationship. So if it is a bi-directional association, can we simply use List and calling the method post1.removeTag(tag1) is enough ?

    1. As I explained in my book, in a @ManyToMany association, both sides are parents since the child side is the link table. That’s why you can use cascades on both parent sides. But just because you can, it does not mean it’s mandatory to have it.

      1. I see I got it wrong. So it is a different concept the owner side of relationship than the parent side of relationship? Thank you, I am already checking out the published chapters, then I’ll almost for sure buy the whole book as well 🙂

    1. Because many-to-many associations are suitable to link two independent entities, while one-to-many is for associating a parent-child association. If we did what you suggested, then imagine how many rows we’d have to change if we decided to rename one tag.

  2. Hello vladmihalcea, first of all thank you for your nice and clear article, I like the way you illustrate the scenarios.
    I have a question concerning additional columns in the link table. Is it possible with the @ManyToMany annotation, or do I need to replace it with an extra entity (e.g. an entity Post_Tag with @ManyToOne Post and @ManyToOne Tag) to be able to define additional custom fields in the link table? What comes to my mind would be to define a bidirectional relationship with my custom entity, so I would be able to do the following (I omit the getters and setters which are usually needed to define the bidirectional relationship):

    @Entity
    class Post {

        @OneToMany(mappedBy = "post", cascade = CascadeType.ALL)
        List<Post_Tag> linkTable;
    }

    @Entity
    class Tag {

        @OneToMany(mappedBy = "tag", cascade = CascadeType.ALL)
        List<Post_Tag> link;
    }

    @Entity
    class Post_Tag {

        Post_Tag(Post post, Tag tag) {
            // … setup relations …
        }

        @NotNull
        @ManyToOne
        Post post;

        @NotNull
        @ManyToOne(cascade = { CascadeType.PERSIST, CascadeType.MERGE })
        Tag tag;

        // AdditionalField1
        // AdditionalFieldN
    }

    Post post1 = new Post("some post");
    Tag tag1 = new Tag("some tag");

    Post_Tag link = new Post_Tag(post1, tag1);
    link.setAdditionalProp1(…);
    link.setAdditionalPropN(…);

    post1.addLink(link);

    entityManager.persist(post1);

    The question is whether this is a good approach to deal with the requirement of additional columns in a many-to-many relationship, or whether there are better approaches to deal with that kind of requirement, maybe even with the @ManyToMany annotation.

    I would appreciate it if you could find some time to answer this long post 🙂
    ty in advance

    Matthias
