How to synchronize bidirectional entity associations with JPA and Hibernate

(Last Updated On: March 27, 2019)

Introduction

While answering this StackOverflow question, I realized that it’s a good idea to summarize how various bidirectional associations should be synchronized when using JPA and Hibernate.

Therefore, in this article, you are going to learn how and also why you should always synchronize both sides of an entity relationship, no matter if it’s @OneToMany, @OneToOne or @ManyToMany.

One-To-Many

Let’s assume we have a parent Post entity which has a bidirectional association with the PostComment child entity:

The PostComment entity looks as follows:

@Entity(name = "PostComment")
@Table(name = "post_comment")
public class PostComment {

    @Id
    @GeneratedValue
    private Long id;

    private String review;

    @ManyToOne(
        fetch = FetchType.LAZY
    )
    @JoinColumn(name = "post_id")
    private Post post;

    //Getters and setters omitted for brevity

    @Override
    public boolean equals(Object o) {
        if (this == o)
            return true;

        if (!(o instanceof PostComment))
            return false;

        return
            id != null &&
            id.equals(((PostComment) o).getId());
    }

    @Override
    public int hashCode() {
        return 31;
    }
}

There are several things to notice in the PostComment entity mapping above.

First, the @ManyToOne association uses the FetchType.LAZY strategy because by default @ManyToOne and @OneToOne associations use the FetchType.EAGER strategy which is bad for performance.

Second, the equals and hashCode methods are implemented so that we can safely use the entity identifier, as explained in this article.
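To see why a constant hashCode pairs well with identifier-based equality, here is a minimal plain-Java sketch with no JPA involved. The HashCodeSketch wrapper and the setId call are assumptions made for illustration: setId simply simulates the identifier being generated when the entity is persisted.

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java sketch; HashCodeSketch is a hypothetical wrapper class.
class HashCodeSketch {

    static class PostComment {
        private Long id;

        Long getId() { return id; }

        // Stand-in for the id being assigned at persist time
        void setId(Long id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof PostComment)) return false;
            return id != null && id.equals(((PostComment) o).getId());
        }

        @Override
        public int hashCode() {
            // Constant, so the hash bucket does not change once the id is set
            return 31;
        }
    }

    static boolean foundAfterIdAssignment() {
        Set<PostComment> comments = new HashSet<>();
        PostComment comment = new PostComment();
        comments.add(comment);   // added while the id is still null
        comment.setId(1L);       // simulates the id being generated
        return comments.contains(comment);
    }
}
```

Because the hash code never changes, the entity stays in the same bucket before and after the identifier is assigned, so the lookup still succeeds. A hash code derived from the identifier would change at that point, and the lookup would typically miss.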

The Post entity is mapped as follows:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @OneToMany(
        mappedBy = "post", 
        cascade = CascadeType.ALL, 
        orphanRemoval = true
    )
    private List<PostComment> comments = new ArrayList<>();

    //Getters and setters omitted for brevity

    public void addComment(PostComment comment) {
        comments.add(comment);
        comment.setPost(this);
    }

    public void removeComment(PostComment comment) {
        comments.remove(comment);
        comment.setPost(null);
    }
}

The comments @OneToMany association is marked with the mappedBy attribute which indicates that the @ManyToOne side is responsible for handling this bidirectional association.

However, we still need to have both sides in sync as otherwise, we break the Domain Model relationship consistency, and the entity state transitions are not guaranteed to work unless both sides are properly synchronized.

For this reason, the Post entity defines the addComment and removeComment entity state synchronization methods.
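The consistency problem can be reproduced without a persistence context. The following is a plain-Java sketch of the two classes with the JPA annotations left out so that it runs standalone; the OneToManySketch wrapper and the childSeesParent helper are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch; OneToManySketch is a hypothetical wrapper class.
class OneToManySketch {

    static class Post {
        private final List<PostComment> comments = new ArrayList<>();

        List<PostComment> getComments() { return comments; }

        // Updates both sides of the association
        void addComment(PostComment comment) {
            comments.add(comment);
            comment.setPost(this);
        }

        // Clears both sides of the association
        void removeComment(PostComment comment) {
            comments.remove(comment);
            comment.setPost(null);
        }
    }

    static class PostComment {
        private Post post;

        Post getPost() { return post; }
        void setPost(Post post) { this.post = post; }
    }

    // Hypothetical helper: true if the child points back to the parent
    static boolean childSeesParent(boolean useSyncMethod) {
        Post post = new Post();
        PostComment comment = new PostComment();
        if (useSyncMethod) {
            post.addComment(comment);
        } else {
            post.getComments().add(comment); // only one side is updated
        }
        return comment.getPost() == post;
    }
}
```

Calling childSeesParent(false) returns false: the parent lists the comment, but the comment has no parent. Since the @ManyToOne side controls the foreign key, a mapped entity left in that state would typically be persisted without its post_id value.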

So, when you add a PostComment, you need to use the addComment method:

Post post = new Post();
post.setTitle("High-Performance Java Persistence");

PostComment comment = new PostComment();
comment.setReview("JPA and Hibernate");
post.addComment(comment);

entityManager.persist(post);

And, when you remove a PostComment, you should use the removeComment method as well:

Post post = entityManager.find(Post.class, 1L);
PostComment comment = post.getComments().get(0);

post.removeComment(comment);

For more details about the best way to map a @OneToMany association, check out this article.

One-To-One

For the one-to-one association, let’s assume the parent Post entity has a PostDetails child entity as illustrated in the following diagram:

The child PostDetails entity looks like this:

@Entity(name = "PostDetails")
@Table(name = "post_details")
public class PostDetails {

    @Id
    private Long id;

    @Column(name = "created_on")
    private Date createdOn;

    @Column(name = "created_by")
    private String createdBy;

    @OneToOne(fetch = FetchType.LAZY)
    @MapsId
    private Post post;
    
    //Getters and setters omitted for brevity
}

Notice that we have set the @OneToOne fetch attribute to FetchType.LAZY, for the very same reason we explained before. We are also using @MapsId because we want the child table row to share the Primary Key with its parent table row, meaning that the Primary Key is also a Foreign Key back to the parent table record.

The parent Post entity looks as follows:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @OneToOne(
        mappedBy = "post", 
        cascade = CascadeType.ALL, 
        orphanRemoval = true, 
        fetch = FetchType.LAZY
    )
    private PostDetails details;

    //Getters and setters omitted for brevity

    public void setDetails(PostDetails details) {
        if (details == null) {
            if (this.details != null) {
                this.details.setPost(null);
            }
        }
        else {
            details.setPost(this);
        }
        this.details = details;
    }
}

The details @OneToOne association is marked with the mappedBy attribute which indicates that the PostDetails side is responsible for handling this bidirectional association.

The setDetails method is used for synchronizing both sides of this bidirectional association and is used both for adding and removing the associated child entity.

So, when we want to associate a Post parent entity with a PostDetails, we use the setDetails method:

Post post = new Post();
post.setTitle("High-Performance Java Persistence");

PostDetails details = new PostDetails();
details.setCreatedBy("Vlad Mihalcea");

post.setDetails(details);

entityManager.persist(post);

The same is true when we want to dissociate the Post and the PostDetails entity:

Post post = entityManager.find(Post.class, 1L);

post.setDetails(null);
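
The behavior of setDetails can be checked in isolation. This is a plain-Java sketch that copies the method logic and leaves out the JPA annotations; the OneToOneSketch wrapper is an assumption added so the example runs standalone.

```java
// Plain-Java sketch; OneToOneSketch is a hypothetical wrapper class.
class OneToOneSketch {

    static class Post {
        private PostDetails details;

        PostDetails getDetails() { return details; }

        // Same logic as the setDetails method in the Post mapping
        void setDetails(PostDetails details) {
            if (details == null) {
                if (this.details != null) {
                    // Dissociating: clear the child's back-reference
                    this.details.setPost(null);
                }
            } else {
                // Associating: point the child back to this parent
                details.setPost(this);
            }
            this.details = details;
        }
    }

    static class PostDetails {
        private Post post;

        Post getPost() { return post; }
        void setPost(Post post) { this.post = post; }
    }
}
```

One thing to keep in mind with this logic: replacing an existing non-null PostDetails directly with another one leaves the old instance still pointing at the Post. Calling setDetails(null) first avoids that.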

For more details about the best way to map a @OneToOne association, check out this article.

Many-To-Many

Let’s assume the Post entity forms a many-to-many association with Tag as illustrated in the following diagram:

The Tag is mapped as follows:

@Entity(name = "Tag")
@Table(name = "tag")
public class Tag {

    @Id
    @GeneratedValue
    private Long id;

    @NaturalId
    private String name;

    @ManyToMany(mappedBy = "tags")
    private Set<Post> posts = new HashSet<>();

    //Getters and setters omitted for brevity

    @Override
    public boolean equals(Object o) {
        if (this == o) 
            return true;
            
        if (!(o instanceof Tag))
            return false;
        
        Tag tag = (Tag) o;
        return Objects.equals(name, tag.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}

Notice the use of the @NaturalId Hibernate-specific annotation which is very useful for mapping business keys.

Because the Tag entity has a business key, we can use that for implementing equals and hashCode as explained in this article.

The Post entity is then mapped as follows:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    public Post() {}

    public Post(String title) {
        this.title = title;
    }

    @ManyToMany(
        cascade = { 
            CascadeType.PERSIST, 
            CascadeType.MERGE
        }
    )
    @JoinTable(name = "post_tag",
        joinColumns = @JoinColumn(name = "post_id"),
        inverseJoinColumns = @JoinColumn(name = "tag_id")
    )
    private Set<Tag> tags = new LinkedHashSet<>();

    //Getters and setters omitted for brevity   

    public void addTag(Tag tag) {
        tags.add(tag);
        tag.getPosts().add(this);
    }

    public void removeTag(Tag tag) {
        tags.remove(tag);
        tag.getPosts().remove(this);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) 
            return true;
        
        if (!(o instanceof Post)) return false;
        
        return id != null && id.equals(((Post) o).getId());
    }

    @Override
    public int hashCode() {
        return 31;
    }
}

The posts @ManyToMany association in the Tag entity is marked with the mappedBy attribute, which indicates that the Post side, through its tags collection and the @JoinTable mapping, is responsible for handling this bidirectional association.

The addTag and removeTag methods are used for synchronizing the bidirectional association. Because we rely on the remove method from the Set interface, both the Tag and Post must implement equals and hashCode properly. While Tag can use a natural identifier, the Post entity does not have such a business key. For this reason, we used the entity identifier to implement these two methods, as explained in this article.

To associate the Post and Tag entities, we can use the addTag method like this:

Post post1 = new Post("JPA with Hibernate");
Post post2 = new Post("Native Hibernate");

Tag tag1 = new Tag("Java");
Tag tag2 = new Tag("Hibernate");

post1.addTag(tag1);
post1.addTag(tag2);

post2.addTag(tag1);

entityManager.persist(post1);
entityManager.persist(post2);

To dissociate the Post and Tag entities, we can use the removeTag method:

Post post1 = entityManager
    .createQuery(
        "select p " +
        "from Post p " +
        "join fetch p.tags " +
        "where p.id = :id", Post.class)
    .setParameter("id", postId)
    .getSingleResult();

Tag javaTag = entityManager.unwrap(Session.class)
    .bySimpleNaturalId(Tag.class)
    .getReference("Java");

post1.removeTag(javaTag);
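
The role of equals and hashCode in addTag and removeTag can be sketched in plain Java, without the JPA annotations. The ManyToManySketch wrapper is an assumption for illustration, and so is the Tag(String) constructor, which the Tag listing does not show.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Plain-Java sketch; ManyToManySketch is a hypothetical wrapper class.
class ManyToManySketch {

    static class Tag {
        private final String name;
        private final Set<Post> posts = new HashSet<>();

        // Assumed constructor; not shown in the Tag mapping
        Tag(String name) { this.name = name; }

        Set<Post> getPosts() { return posts; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Tag)) return false;
            return Objects.equals(name, ((Tag) o).name);
        }

        @Override
        public int hashCode() { return Objects.hash(name); }
    }

    static class Post {
        private Long id; // stays null here, as for a transient entity
        private final Set<Tag> tags = new LinkedHashSet<>();

        Set<Tag> getTags() { return tags; }

        void addTag(Tag tag) {
            tags.add(tag);
            tag.getPosts().add(this);
        }

        void removeTag(Tag tag) {
            tags.remove(tag);
            tag.getPosts().remove(this);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Post)) return false;
            return id != null && id.equals(((Post) o).id);
        }

        @Override
        public int hashCode() { return 31; }
    }
}
```

Two transient Post instances are never equal to each other, so both can be stored in the same Tag, while Set.remove still finds each one by reference. Note that removeTag updates the posts collection of the Tag instance it receives, which is why it should be given the same Tag instance that holds the association rather than a fresh object with the same name.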

For more details about the best way to map a @ManyToMany association, check out this article.

That’s it!


Conclusion

Whenever you are using a bidirectional JPA association, it is mandatory to synchronize both ends of the entity relationship.

Not only is a Domain Model that does not enforce relationship consistency difficult and error-prone to work with, but without synchronizing both ends of a bidirectional association, the entity state transitions are not guaranteed to work.

So, save yourself some trouble and do the right thing.

22 Comments on “How to synchronize bidirectional entity associations with JPA and Hibernate”

  1. what’s more, in your article: A beginner’s guide to JPA and Hibernate Cascade Types, For Author to Book, you put the add/remove utility methods in mappedBy side, this really confused me, could you please explain the difference? many thanks

    public void addBook(Book book) {
        books.add(book);
        book.authors.add(this);
    }
    
    public void removeBook(Book book) {
        books.remove(book);
        book.getAuthors().remove(this);
    }
    
  2. Excellent article! For @OneToOne and @OneToMany relationships, the add/remove utility methods are put on the non-owning (mappedBy) side, which is the parent side, but for @ManyToMany the add/remove utility methods are put on the owning side. What’s wrong if the add/remove utility methods are put on the non-owning side? Thanks.

    • It’s not owning vs. non-owning that’s important, but having them on the parent side, and the many-to-many association has two parents. Therefore, for many-to-many, you can put them on either side.

      • Thanks for your quick response. Based on your test case: BidirectionalManyToManyTest.java, I ran testRemove() test case, and got the following output( important message were filtered out):

        [“create table post (id bigint not null, title varchar(255), primary key (id))”], Params:[]
        [“create table post_tag (post_id bigint not null, tag_id bigint not null)”], Params:[]
        [“create table tag (id bigint not null, name varchar(255), primary key (id))”], Params:[]

        [“insert into post (title, id) values (?, ?)”], Params:[(JPA with Hibernate, 1)]
        [“insert into tag (name, id) values (?, ?)”], Params:[(Java, 2)]
        [“insert into tag (name, id) values (?, ?)”], Params:[(Hibernate, 3)]
        [“insert into post (title, id) values (?, ?)”], Params:[(Native Hibernate, 4)]
        [“insert into post_tag (post_id, tag_id) values (?, ?)”], Params:[(1, 2)]
        [“insert into post_tag (post_id, tag_id) values (?, ?)”], Params:[(1, 3)]
        [“insert into post_tag (post_id, tag_id) values (?, ?)”], Params:[(4, 2)]

        [“select bidirectio0_.id as id1_0_0_, bidirectio0_.title as title2_0_0_ from post bidirectio0_ where bidirectio0_.id=?”], Params:[(1)]
        [“delete from post_tag where post_id=?”], Params:[(1)]

        [“delete from post where id=?”], Params:[(1)]

        then I changed entity Post as the mappedBy side, and entity Tag as owning side.
        All other codes are same as yours except for the following:

        public static class Post {

        @ManyToMany(mappedBy = "posts")
        private List<Tag> tags = new ArrayList<>();

        }

        public static class Tag {

        @ManyToMany(cascade = {CascadeType.PERSIST, CascadeType.MERGE})
        @JoinTable(name = "tag_post",
        joinColumns = @JoinColumn(name = "tag_id"),
        inverseJoinColumns = @JoinColumn(name = "post_id")
        )
        private List<Post> posts = new ArrayList<>();

        }

        while I ran testRemove() test case again, I got very strange output. I can see Add/Remove only happened at the current mappedBy side: Post

        [“create sequence hibernate_sequence start with 1 increment by 1”], Params:[]
        [“create table post (id bigint not null, title varchar(255), primary key (id))”], Params:[]
        [“create table tag (id bigint not null, name varchar(255), primary key (id))”], Params:[]
        [“create table tag_post (tag_id bigint not null, post_id bigint not null)”], Params:[]
        [“alter table tag_post add constraint FKceeso860swdv8btriq9cwrbcv foreign key (post_id) references post”], Params:[]
        [“alter table tag_post add constraint FK36t4u7evt09pdln4evfoo4v5n foreign key (tag_id) references tag”], Params:[]

        [“insert into post (title, id) values (?, ?)”], Params:[(JPA with Hibernate, 1)]
        [“insert into post (title, id) values (?, ?)”], Params:[(Native Hibernate, 2)]

        [“select bidirectio0_.id as id1_0_0_, bidirectio0_.title as title2_0_0_ from post bidirectio0_ where bidirectio0_.id=?”], Params:[(1)]
        [“delete from post where id=?”], Params:[(1)]

        so what’s wrong with my codes? many thanks.

      • Cascading is the problem or the lack thereof.

  3. As long as you are using Java and a relational database, you will have to deal with this issue.

    To an extent. Some tools do a better job, others worse. Been there. Done ORM. There should be no “magic”. Yes, it needs to be understood. The question is not that one of understanding but (a better) division of responsibility.

    • Yes, but the ORM cannot write the application code as well. Synchronizing both ends of a bidirectional association is the right thing to do for business logic as well, otherwise subtle bugs can be introduced.

      • Actually, synchronizing relationships may, in some cases and implementations, have to be, as you say, business logic, but in most cases it should not be. Why? Because associations are mapped and persisted by the ORM, which has to understand and maintain their implications.

        Yes, there are certain requirements that have to be met for this to work, such as sufficient mapping/annotation information, ability to intercept what the code does and assurance that there is at most one Java instance mapped to any one entity at a time (in a single transaction). However, once these are met, the risk of subtle (and very much not subtle) bugs is far greater if we leave it to business logic to synchronize associations.

        Why? Well, for one, this is mostly about large products, not small. In large products we rely on separation of concern, encapsulation and simplicity to reduce the risk of developers making mistakes. They are meant to think about their domain of responsibility mostly. They WILL forget to synchronize associations if it is up to them and this will not be discovered immediately. It will often “bite” someone else and their code when they, unbeknownst to them, receive an out-of-sync object.

        Who/what knows what needs to end up where? ORM. There are cases when this may sound like not true. Say that there is an association that is an (ordered) list on one end and a simple 0..1 reference on the other. Setting that reference does not supply sufficient information as to where should it be positioned in that list on the other side. W.r.t. we could say (not a comprehensive list of options):

        This is an error and should never be done on its own. There shouldn’t even be such a method. If anything attempts it, it should immediately fail so it gets noticed quickly… but the framework must still be able to update this by reflecting the change on the “owner” side, in this case the list.
        There could be a rule that does this. This is easily implemented as a setter that really modifies the other side as opposed to directly. Or it could be annotated some way. Or something else. But it can be done.

        These cases are not only solvable as per above but also rather rare compared to simple cases that can be trivially dealt with.

        Also consider what it means for business logic to do synchronization properly, both ways:

        Ensure that bi-directional sync does not become an infinite call loop “ping-pong”.
        Potentially understand how the relationship is mapped so that it can perform all required structure updates.
        Check if any/all “ends” are initialized/loaded. If they are update them. If not loaded but are derived from other sides, ensure that session.flush() occurs regardless of the flush mode – potential performance hit and an excuse to “temporarily” forget about the sync… until it “bites” someone else. This flush, in turn, may fail if the “setup” isn’t complete yet, leaving the business logic developer with no real options.
        Deal with a possible inconsistency due to the fact that some code was developed in the past without or with defective sync logic (as opposed to well-tested centralized version).

        … this quickly turns into rather nasty code that must be repeated for every end of every association.

      • Hibernate is an open-source software, so if you think you can implement this feature, you should definitely give it a try and see how it works.

  4. Hi Vlad, can you please elaborate on “we break the Domain Model relationship consistency”? What kind of problems can arise if, let’s say, I just add a PostComment to a Post without setting the Post on the PostComment?

    • Any business logic that depends on the presence or lack of that association will break. And not just the business logic, but the UI will also render wrong results which can further lead to wrong decisions.

  5. In the @OneToMany example, shouldn’t the addComment be responsible to remove any previous association from comment?

      • I tend to agree with @Sérgio. Association management needs to work all (both) ways and should always be consistently encapsulated. Just like Post.addComment() can be used to make the change, so can/should PostComment.setPost() – it needs to remove self from any previous association. The trick is to do this without causing an infinite call loop. In the above example there is no loop because setPost() doesn’t do what I’d say it should be expected to do in the first place to maintain consistency.

      • The way you described could trigger an association fetch when I only want to persist a child. In that case, I don’t want to ever fetch the parent since I only need a proxy for it.

      • @vladmihalcea You are absolutely right that it could/would. However, that is a distinct and different concern from what “should” happen in this context.

        One uses ORM to raise the abstraction level and reduce having to think about the underlying plumbing. Association management must be handled consistently and, for the reason you just stated in addition to others, should really be handled by Hibernate itself (ideally). Bytecode enhancement (https://docs.jboss.org/hibernate/orm/5.3/userguide/html_single/chapters/pc/BytecodeEnhancement.html) does this, but stops short in some cases, say https://hibernate.atlassian.net/browse/HHH-11196. What it could do is determine whether a fetch would be needed at all based on how the association is mapped in the first place and/or it can store the updates in alternate structures.

        Otherwise the consistency drops out very quickly: developers may be unaware they need to do anything, unintentionally forget it or intentionally skip it due to performance implications they cannot work around (but Hibernate itself could) and potentially ambushing unsuspecting developers/code that happens to run later…

        It would seem that the correct way to do this is to check if the inverse association is loaded first. If it is, it needs to be updated. If not, we need to see if there is a point in loading it (ie. only if the mapping is actually in it and would not be correctly updated otherwise) and if there is, reflect the change there. All needs to be shielded against infinite call loops. Lots of plumbing that Hibernate should be able to do most of the time. Some cases may be ambiguous such as managing associations whose inverse side requires special ordering, for example. In any case, encapsulating this complexity and shielding other developers is beneficial in multiple ways…

      • One uses ORM to raise the abstraction level and reduce having to think about the underlying plumbing.

        This is the root of many performance issues.

        No, you need to know exactly how everything works. Hibernate is not an abstraction for SQL, data modelling or query patterns. Hibernate is just an abstraction for plain JDBC.

        The Bytecode enhancement association management only works from parent to child and is a code smell. You shouldn’t use that. You should use the add/remove methods instead.

        All in all, you don’t need more hacks to deal with this issue. If you want performance, you need to know what happens under the hood.

        And, consistency is very well dealt with if you have integration tests that prove the expected behavior.

        For more details, check out my High-Performance Java Persistence book.

      • No, you need to know exactly how everything works.

        That is the core of our disagreement. I disagree with you on this one – as to what it should be, not what it is. The developers should be given time to think about optimizations at their level of functionality and not what can be more or less sufficiently optimized automatically. Yes, some performance may be lost, but much of it is gained by being able to focus on one’s own business logic. Without this, ORM in Hibernate form is all but useless (if you look at it, as you say, “Hibernate is just an abstraction for plain JDBC”); there are other solutions for this, lighter weight, that do not address the time/focus concern at all. That way we can always keep going a way/layer/abstraction lower to squeeze even more performance and lose other benefits in the end. I’ve been a witness to a project that shifted from direct-JDBC to a proprietary ORM with higher abstraction than Hibernate that yielded a 10x performance boost because of shifted focus alone.

      • With or without Hibernate, the ORM impedance is still there. As long as you are using Java and a relational database, you will have to deal with this issue. Hibernate is complex because the problem is challenging. You get to understand how difficult the problem is when you are trying to do the same with plain JDBC or with other lightweight frameworks. If you learn and understand it, you can do terrific things with Hibernate. If you don’t understand it, you will eventually start blaming the tool for not doing enough “magic”.
