The best way to map a many-to-many association with extra columns when using JPA and Hibernate

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

For a simple many-to-many database relationship, you can use the @ManyToMany JPA annotation and, therefore, hide the join table.

However, sometimes you need more than the two Foreign Key columns in the join table, and, for this purpose, you need to replace the @ManyToMany association with two bidirectional @OneToMany associations. Unlike unidirectional @OneToMany, the bidirectional relationship is the best way to map a one-to-many database relationship that requires a collection of Child elements on the parent side

In this article, we are going to see how you can map a many-to-many database relationship using an intermediary entity for the join table. This way, we can map additional columns that would be otherwise impossible to persist using the @ManyToMany JPA annotation.

Domain Model

Assuming we have the following database tables:

PostTag with extra columns

The first thing we need is to map the composite Primary Key which belongs to the intermediary join table. As explained in this article, we need an @Embeddable type to hold the composite entity identifier:

@Embeddable
public class PostTagId
    implements Serializable {

    @Column(name = "post_id")
    private Long postId;

    @Column(name = "tag_id")
    private Long tagId;

    private PostTagId() {}

    public PostTagId(
        Long postId, 
        Long tagId) {
        this.postId = postId;
        this.tagId = tagId;
    }

    //Getters omitted for brevity

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;

        if (o == null || getClass() != o.getClass()) 
            return false;

        PostTagId that = (PostTagId) o;
        return Objects.equals(postId, that.postId) && 
               Objects.equals(tagId, that.tagId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(postId, tagId);
    }
}

There are two very important aspects to take into consideration when mapping an @Embeddable composite identifier:

  1. You need the @Embeddable type to be Serializable
  2. The @Embeddable type must override the default equals and hashCode methods based on the two Primary Key identifier values.

Next, we need to map the join table using a dedicated entity:

@Entity(name = "PostTag")
@Table(name = "post_tag")
public class PostTag {

    @EmbeddedId
    private PostTagId id;

    @ManyToOne(fetch = FetchType.LAZY)
    @MapsId("postId")
    private Post post;

    @ManyToOne(fetch = FetchType.LAZY)
    @MapsId("tagId")
    private Tag tag;

    @Column(name = "created_on")
    private Date createdOn = new Date();

    private PostTag() {}

    public PostTag(Post post, Tag tag) {
        this.post = post;
        this.tag = tag;
        this.id = new PostTagId(post.getId(), tag.getId());
    }

    //Getters and setters omitted for brevity

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;

        if (o == null || getClass() != o.getClass())
            return false;

        PostTag that = (PostTag) o;
        return Objects.equals(post, that.post) &&
               Objects.equals(tag, that.tag);
    }

    @Override
    public int hashCode() {
        return Objects.hash(post, tag);
    }
}

The Tag entity is going to map the @OneToMany side for the tag attribute in the PostTag join entity:

@Entity(name = "Tag")
@Table(name = "tag")
@NaturalIdCache
@Cache(
    usage = CacheConcurrencyStrategy.READ_WRITE
)
public class Tag {

    @Id
    @GeneratedValue
    private Long id;

    @NaturalId
    private String name;

    @OneToMany(
        mappedBy = "tag",
        cascade = CascadeType.ALL,
        orphanRemoval = true
    )
    private List<PostTag> posts = new ArrayList<>();

    public Tag() {
    }

    public Tag(String name) {
        this.name = name;
    }

    //Getters and setters omitted for brevity

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Tag tag = (Tag) o;
        return Objects.equals(name, tag.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}

The Tag entity is marked with the following Hibernate-specific annotations:

  1. The @NaturalId annotation allows us to fetch the Tag entity by its business key.
  2. The @Cache annotation marks the cache concurrency strategy.
  3. The @NaturalIdCache tells Hibernate to cache the entity identifier associated with a given business key.

For more details about the @NaturalId and @NaturalIdCache annotations, check out this article.

With these annotations in place, we can fetch the Tag entity without needing to hit the database.

And the Post entity is going to map the @OneToMany side for the post attribute in the PostTag join entity:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @OneToMany(
        mappedBy = "post", 
        cascade = CascadeType.ALL, 
        orphanRemoval = true
    )
    private List<PostTag> tags = new ArrayList<>();

    public Post() {
    }

    public Post(String title) {
        this.title = title;
    }

    //Getters and setters omitted for brevity

    public void addTag(Tag tag) {
        PostTag postTag = new PostTag(this, tag);
        tags.add(postTag);
        tag.getPosts().add(postTag);
    }

    public void removeTag(Tag tag) {
        for (Iterator<PostTag> iterator = tags.iterator(); 
             iterator.hasNext(); ) {
            PostTag postTag = iterator.next();

            if (postTag.getPost().equals(this) &&
                    postTag.getTag().equals(tag)) {
                iterator.remove();
                postTag.getTag().getPosts().remove(postTag);
                postTag.setPost(null);
                postTag.setTag(null);
            }
        }
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;

        if (o == null || getClass() != o.getClass()) 
            return false;

        Post post = (Post) o;
        return Objects.equals(title, post.title);
    }

    @Override
    public int hashCode() {
        return Objects.hash(title);
    }
}

Notice that the Post entity features the addTag and removeTag utility methods which are needed by every bidirectional association so that all sides of the association stay in sync.

While we could have added the same add/remove methods to the Tag entity, it’s unlikely that these associations will be set from the Tag entity because the users operate with Post entities.

To better visualize the entity relationships, check out the following diagram:

Double side bidirectional PostTag

Testing time

First, let’s persist some Tag entities which we’ll later associate to a Post:

Tag misc = new Tag("Misc");
Tag jdbc = new Tag("JDBC");
Tag hibernate = new Tag("Hibernate");
Tag jooq = new Tag("jOOQ");

doInJPA(entityManager -> {
    entityManager.persist( misc );
    entityManager.persist( jdbc );
    entityManager.persist( hibernate );
    entityManager.persist( jooq );
});

Now, when we persist two Post entities:

Session session = entityManager
    .unwrap( Session.class );

Tag misc = session
    .bySimpleNaturalId(Tag.class)
    .load( "Misc" );

Tag jdbc = session
    .bySimpleNaturalId(Tag.class)
    .load( "JDBC" );

Tag hibernate = session
    .bySimpleNaturalId(Tag.class)
    .load( "Hibernate" );

Tag jooq = session
    .bySimpleNaturalId(Tag.class)
    .load( "jOOQ" );

Post hpjp1 = new Post(
    "High-Performance Java Persistence 1st edition"
);
hpjp1.setId(1L);

hpjp1.addTag(jdbc);
hpjp1.addTag(hibernate);
hpjp1.addTag(jooq);
hpjp1.addTag(misc);

entityManager.persist(hpjp1);

Post hpjp2 = new Post(
    "High-Performance Java Persistence 2nd edition"
);
hpjp2.setId(2L);

hpjp2.addTag(jdbc);
hpjp2.addTag(hibernate);
hpjp2.addTag(jooq);

entityManager.persist(hpjp2);

Hibernate generates the following SQL statements:

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence 1st edition', 1)

INSERT INTO post_tag (created_on, post_id, tag_id) 
VALUES ('2017-07-26 13:14:08.988', 1, 2)

INSERT INTO post_tag (created_on, post_id, tag_id) 
VALUES ('2017-07-26 13:14:08.989', 1, 3)

INSERT INTO post_tag (created_on, post_id, tag_id) 
VALUES ('2017-07-26 13:14:08.99', 1, 4)

INSERT INTO post_tag (created_on, post_id, tag_id) 
VALUES ('2017-07-26 13:14:08.99', 1, 1)

INSERT INTO post (title, id) 
VALUES ('High-Performance Java Persistence 2nd edition', 2)

INSERT INTO post_tag (created_on, post_id, tag_id) 
VALUES ('2017-07-26 13:14:08.992', 2, 3)

INSERT INTO post_tag (created_on, post_id, tag_id) 
VALUES ('2017-07-26 13:14:08.992', 2, 4)

INSERT INTO post_tag (created_on, post_id, tag_id) 
VALUES ('2017-07-26 13:14:08.992', 2, 2)

Now, since the Misc Tag entity was added by mistake, we can remove it as follows:

Tag misc = entityManager.unwrap( Session.class )
    .bySimpleNaturalId(Tag.class)
    .load( "Misc" );

Post post = entityManager.createQuery(
    "select p " +
    "from Post p " +
    "join fetch p.tags pt " +
    "join fetch pt.tag " +
    "where p.id = :postId", Post.class)
.setParameter( "postId", 1L )
.getSingleResult();

post.removeTag( misc );

Hibernate generating the following SQL statements:

SELECT p.id AS id1_0_0_,
       p_t.created_on AS created_1_1_1_,
       p_t.post_id AS post_id2_1_1_,
       p_t.tag_id AS tag_id3_1_1_,
       t.id AS id1_2_2_,
       p.title AS title2_0_0_,
       p_t.post_id AS post_id2_1_0__,
       p_t.created_on AS created_1_1_0__,
       p_t.tag_id AS tag_id3_1_0__,
       t.name AS name2_2_2_
FROM   post p
INNER JOIN 
       post_tag p_t ON p.id = p_t.post_id
INNER JOIN 
       tag t ON p_t.tag_id = t.id
WHERE  p.id = 1

SELECT p_t.tag_id AS tag_id3_1_0_,
       p_t.created_on AS created_1_1_0_,
       p_t.post_id AS post_id2_1_0_,
       p_t.created_on AS created_1_1_1_,
       p_t.post_id AS post_id2_1_1_,
       p_t.tag_id AS tag_id3_1_1_
FROM   post_tag p_t
WHERE  p_t.tag_id = 1

DELETE 
FROM   post_tag 
WHERE  post_id = 1 AND tag_id = 1

The second SELECT query is needed by this line in the removeTag utility method:

postTag.getTag().getPosts().remove(postTag);

However, if you don’t need to navigate all Post entities associated to a Tag, you can remove the posts collection from the Tag entity and this secondary SELECT statement will not be executed anymore.

Using a single-side bidirectional association

The Tag entity will not map the PostTag @OneToMany bidirectional association anymore.

@Entity(name = "Tag")
@Table(name = "tag")
@NaturalIdCache
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Tag {

    @Id
    @GeneratedValue
    private Long id;

    @NaturalId
    private String name;

    public Tag() {
    }

    public Tag(String name) {
        this.name = name;
    }

    //Getters omitted for brevity

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;

        if (o == null || getClass() != o.getClass()) 
            return false;

        Tag tag = (Tag) o;
        return Objects.equals(name, tag.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}

The PostTag entity and its PostTagId @Embeddable are identical with the previous example.

However, the Post entity addTag and removeTag are simplified as follows:

public void addTag(Tag tag) {
    PostTag postTag = new PostTag(this, tag);
    tags.add(postTag);
}

public void removeTag(Tag tag) {
    for (Iterator<PostTag> iterator = tags.iterator(); 
         iterator.hasNext(); ) {
        PostTag postTag = iterator.next();

        if (postTag.getPost().equals(this) &&
                postTag.getTag().equals(tag)) {
            iterator.remove();
            postTag.setPost(null);
            postTag.setTag(null);
        }
    }
}

The rest of the Post entity is the same as with the previous example as seen in the following diagram:

Single side bidirectional PostTag

Inserting the PostTag entities is going to render the same SQL statements as seen before.

But when removing the PostTag entity, Hibernate is going to execute a single SELECT query as well as a single DELETE statement:

SELECT p.id AS id1_0_0_,
       p_t.created_on AS created_1_1_1_,
       p_t.post_id AS post_id2_1_1_,
       p_t.tag_id AS tag_id3_1_1_,
       t.id AS id1_2_2_,
       p.title AS title2_0_0_,
       p_t.post_id AS post_id2_1_0__,
       p_t.created_on AS created_1_1_0__,
       p_t.tag_id AS tag_id3_1_0__,
       t.name AS name2_2_2_
FROM   post p
INNER JOIN 
       post_tag p_t ON p.id = p_t.post_id
INNER JOIN 
       tag t ON p_t.tag_id = t.id
WHERE  p.id = 1

DELETE 
FROM   post_tag 
WHERE  post_id = 1 AND tag_id = 1

Online Workshops

If you enjoyed this article, I bet you are going to love my upcoming Online Workshops!

Conclusion

While mapping the many-to-many database relationship using the @ManyToMany annotation is undoubtedly simpler, when you need to persist extra columns in the join table, you need to map the join table as a dedicated entity.

Although a little bit more work, the association works just as its @ManyToMany counterpart, and this time we can List collections without worrying about SQL statement performance issues.

When mapping the intermediary join table, it’s better to map only one side as a bidirectional @OneToMany association since otherwise a second SELECT statement will be issued while removing the intermediary join entity.

Transactions and Concurrency Control eBook

16 Comments on “The best way to map a many-to-many association with extra columns when using JPA and Hibernate

  1. Hi Vlad,

    In your equals methods, you have

    if (o == null || getClass() != o.getClass())

    shouldn’t this be

    if (o == null || !(o instanceof EntityClass))

    ?

    Because otherwise, if o is a hibernate proxy subclass of, e.g.:Post, then this will return false.

    I noticed the use of getClass() in the equals() methods of PostTagId, PostTag, Tag, and Post

  2. Hi Vlad, I’m trying to do the same example that you explained but I found the next error,

    org.hibernate.PersistentObjectException: detached entity passed to persist:

    I persisted the firs entity “tag” and then when I try to persit the second “post” I get that.
    Are you know something that can help me.

  3. HI Vlad, I have a single-side bidirectional association except no L2 cache and my join table has it’s own @Id (and sequence), id pair is simply a unique constraint. I am trying to do a one shot delete by doing post.getTags().clear() but it doesn’t work as one-shot (aka delete from tags where post_id = xxx)

    Could you please update your post with bulk/all delete information? Should be interesting for readers.

    • I didn’t understand the problem. It’s hard to cover every possible use case that involves a certain relationship in a single article. I think it’s worth asking the question on StackOverflow as it allows you to better capture the details related to the problem you are trying to solve.

  4. Good evening Vlad!
    How to remove multiple posts from tag? The working code:

            for (int i = 0; i < tag.getPosts().size(); i++) {
                if(tag.removePost(tag.getPosts().get(i).getPost())){
                    i--;
                }
            }
    

    How it should be:

            for (PostTag postTag : tag.getPosts()) {
                tag.removePost(postTag.getPost());
            }
    

    What am I doing wrong?

    • You can use a JPQL DELETE query. It’s the fastest way.

  5. Would this be a solution to this Issue https://hibernate.atlassian.net/browse/HHH-12239 ?
    I am Implementing a ManyToMany between Users and Roles, bidirectional and the User being an owner. So deleting the User it works fine, but deleting the Role it will go for the ConstraintViolationException.

    Now the only way to have this work is by getting for the Role each User and for each User in that list remove the role from their list of roles and then delete the role. The performance is not the great on this one. Could solve with some raw SQL, but wanted to have a Hibernate+JPA way on dealing with it. And I didn’t understand what you mean on “Try using a DDL ON_DELETE cascade on the FK.” as a solution to that thread, if you could explain more it would be great. Thanks in advance

    • I meant using the @OnDelete annotation. Check out my High-Performance Java Persistence book for more details.

  6. I’m using Spring Boot and the code never save on the relational table “post_tag”. I couldn’t find the code on GitHub as well, please anyone has the GitHub link?

    Thanks.

  7. Hi Vlad,

    I have followed your mapping strategy.
    However, I found that the @ManyToOne relations in the Join entity are Eager fetch despite being marked as Lazy.
    Is Lazy the expected behavior?
    I’m using Spring boot 2.1.9 and hibernate 5.3.12.Final.

  8. Thanks for sharing such detailed article, it helped me in my project but here i have one doubt that ->

    will PostTag entity have separate id column as you showed in diagram or it just virtual representation and have no physical existence in db because here in my project i have got table created with only two columns that each one is primary key of mapped two entities.

    • The PostTag entity needs a composite identifer based on the 2 FK. It does not need an extra id.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.