The best way to soft delete with Hibernate

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

Each database application is unique. While most of the time, deleting a record is the best approach, there are times when the application requirements demand that database records should never be physically deleted.

So who uses this technique?

For instance, StackOverflow does it for all Posts (e.g. Questions and Answers). The StackOverflow Posts table has a ClosedDate column which acts as a soft delete mechanism since it hides an Answer for all users who have less than 10k reputation.

If you’re using Oracle, you can take advantage of its Flashback capabilities, so you don’t need to change your application code to offer such a functionality. Another option is to use the SQL Server Temporal Table feature.

However, not all relational database systems support Flashback queries, or they allow you to recover a certain record without having to restore from a database backup. In this case, Hibernate allows you to simplify the implementation of soft deletes, and this article is going to explain the best way to implement the logical deletion mechanism.

Domain Model

Considering we have the following tables in your database:

Soft Delete Domain Model

As you can see in the diagram above, the post, post_details, post_comment, and tag tables contain a deleted column which dictates the visibility of a given row. What’s interesting about this database table model is that it covers all three database relationship types:

  • one-to-one
  • one-to-many
  • many-to-many

Therefore, we will discuss the mapping of all these entities as well as their relationships, so stay tuned!

Tag entity

Let’s start with Tag entity mapping since it lacks any entity relationship:

@Entity(name = "Tag")
@Table(name = "tag")
@SQLDelete(sql =
    "UPDATE tag " +
    "SET deleted = true " +
    "WHERE id = ?")
@Loader(namedQuery = "findTagById")
@NamedQuery(name = "findTagById", query =
    "SELECT t " +
    "FROM Tag t " +
    "WHERE " +
    "	t.id = ?1 AND " +
    "	t.deleted = false")
@Where(clause = "deleted = false")
public class Tag 
    extends BaseEntity {

    @Id
    private String id;

    //Getters and setters omitted for brevity
}

The deleted column is defined the BaseEntity class which looks as follows:

@MappedSuperclass
public abstract class BaseEntity {

    private boolean deleted;
}

The @SqlDelete annotation allows you to override the default DELETE statement executed by Hibernate, so we substitute an UPDATE statement instead. Therefore, removing an entity will end up updating the deleted column to true.

The @Loader annotation allows us to customize the SELECT query used to load an entity by its identifier. Hence, we want to filter every record whose deleted column is set to true.

The @Where clause is used for entity queries, and we want to provide it so that Hibernate can append the deleted column filtering condition to hide deleted rows.

While prior to Hibernate 5.2, it was sufficient to provide the @Where clause annotation, in Hibernate 5.2, it’s important to provide a custom @Loader as well so that the direct fetching works as well.

So, considering we have four Tag entities:

doInJPA( entityManager -> {
    Tag javaTag = new Tag();
    javaTag.setId("Java");
    entityManager.persist(javaTag);

    Tag jpaTag = new Tag();
    jpaTag.setId("JPA");
    entityManager.persist(jpaTag);

    Tag hibernateTag = new Tag();
    hibernateTag.setId("Hibernate");
    entityManager.persist(hibernateTag);

    Tag miscTag = new Tag();
    miscTag.setId("Misc");
    entityManager.persist(miscTag);
} );

When removing the Misc Tag:

doInJPA( entityManager -> {
    Tag miscTag = entityManager.getReference(Tag.class, "Misc");
    entityManager.remove(miscTag);
} );

Hibernate will execute the following SQL statement:

UPDATE tag 
SET deleted = true 
WHERE id = 'Misc'

Brilliant!

So now, if we want to load the entity, we get null instead:

doInJPA( entityManager -> {
    assertNull(entityManager.find(Tag.class, "Misc"));
} );

This is because Hibernate executed the following SQL statement:

SELECT 
    t.id as id1_4_, 
    t.deleted as deleted2_4_ 
FROM 
    tag t 
WHERE 
    ( t.deleted = 0 ) AND 
    t.id = ? AND 
    t.deleted = 0

Although the deleted clause is appended twice because we declared both the @Where clause and the @Loader, most RDBMS will eliminate duplicate filters during query parsing. If we only provide the @Where clause, there will be no duplicate delete clause, but then the deleted rows will be visible when being fetched directly.

Also, when running an entity query against all Tag entities, we can see that we can see just three Tags now:

doInJPA( entityManager -> {
    List<Tag> tags = entityManager.createQuery(
        "select t from Tag t", Tag.class)
    .getResultList();

    assertEquals(3, tags.size());
} );

This is because Hibernate manages to append the deleted clause filter when executing the SQL query:

SELECT 
    t.id as id1_4_, 
    t.deleted as deleted2_4_ 
FROM tag t 
WHERE ( t.deleted = 0 )

PostDetails entity

Just like Tag, PostDetails follows the same mapping considerations:

@Entity(name = "PostDetails")
@Table(name = "post_details")
@SQLDelete(sql = 
    "UPDATE post_details " +
    "SET deleted = true " +
    "WHERE id = ?")
@Loader(namedQuery = "findPostDetailsById")
@NamedQuery(name = "findPostDetailsById", query = 
    "SELECT pd " +
    "FROM PostDetails pd " +
    "WHERE " +
    "  pd.id = ?1 AND " +
    "  pd.deleted = false")
@Where(clause = "deleted = false")
public class PostDetails 
    extends BaseEntity {

    @Id
    private Long id;

    @Column(name = "created_on")
    private Date createdOn;

    @Column(name = "created_by")
    private String createdBy;

    public PostDetails() {
        createdOn = new Date();
    }

    @OneToOne(fetch = FetchType.LAZY)
    @MapsId
    private Post post;

    //Getters and setters omitted for brevity
}

Even if it features a @OneToOne association to a Post, there is no need to filter this relationship since a child entity cannot exist without its parent.

PostComment entity

The same logic applies to PostComment:

@Entity(name = "PostComment")
@Table(name = "post_comment")
@SQLDelete(sql =
    "UPDATE post_comment " +
    "SET deleted = true " +
    "WHERE id = ?")
@Loader(namedQuery = "findPostCommentById")
@NamedQuery(name = "findPostCommentById", query =
    "SELECT pc " +
    "from PostComment pc " +
    "WHERE " +
    "  pc.id = ?1 AND " +
    "  pc.deleted = false")
@Where(clause = "deleted = false")
public class PostComment 
    extends BaseEntity {

    @Id
    private Long id;

    @ManyToOne(fetch = FetchType.LAZY)
    private Post post;

    private String review;

    //Getters and setters omitted for brevity
}

Even if it features a @ManyToOne association to a Post, there is no need to filter this relationship since a child entity cannot exist without its parent.

Post entity

The Post entity being the root of our entity aggergate, it has relationships to PostDetails, PostComment, and Tag:

@Entity(name = "Post")
@Table(name = "post")
@SQLDelete(sql = 
    "UPDATE post " +
    "SET deleted = true " +
    "WHERE id = ?")
@Loader(namedQuery = "findPostById")
@NamedQuery(name = "findPostById", query = 
    "SELECT p " +
    "FROM Post p " +
    "WHERE " +
    "  p.id = ?1 AND " +
    "  p.deleted = false")
@Where(clause = "deleted = false")
public class Post 
    extends BaseEntity {

    @Id
    private Long id;

    private String title;

    @OneToMany(
        mappedBy = "post",
        cascade = CascadeType.ALL,
        orphanRemoval = true
    )
    private List<PostComment> comments = new ArrayList<>();

    @OneToOne(
        mappedBy = "post",
        cascade = CascadeType.ALL,
        orphanRemoval = true,
        fetch = FetchType.LAZY
    )
    private PostDetails details;

    @ManyToMany
    @JoinTable(
        name = "post_tag",
        joinColumns = @JoinColumn(name = "post_id"),
        inverseJoinColumns = @JoinColumn(name = "tag_id")
    )
    private List<Tag> tags = new ArrayList<>();

    //Getters and setters omitted for brevity

    public void addComment(PostComment comment) {
        comments.add(comment);
        comment.setPost(this);
    }

    public void removeComment(PostComment comment) {
        comments.remove(comment);
        comment.setPost(null);
    }

    public void addDetails(PostDetails details) {
        this.details = details;
        details.setPost(this);
    }

    public void removeDetails() {
        this.details.setPost(null);
        this.details = null;
    }

    public void addTag(Tag tag) {
        tags.add(tag);
    }
}

The Post entity mapping is similar to the Tag entity, which we already discussed so we will focus on the @OneToMany and @ManyToMany associations.

The bidirectional @OneToMany association

While prior to Hibernate 5.2, it was necessary to provide the @Where clause annotation to collections (e.g. `@OneToMany` or `@ManyToMany`), in Hibernate 5.2 we don’t need these collection-level annotations since the `PostComment` is already annotated accordingly, and Hibernate knows that it needs to filter any _deleted_ `PostComment`.

Therefore, assuming we have one Post entity with two PostComment child entities:

doInJPA( entityManager -> {
    Post post = new Post();
    post.setId(1L);
    post.setTitle("High-Performance Java Persistence");
    entityManager.persist(post);

    PostComment comment1 = new PostComment();
    comment1.setId(1L);
    comment1.setReview("Great!");
    post.addComment(comment1);

    PostComment comment2 = new PostComment();
    comment2.setId(2L);
    comment2.setReview("Excellent!");
    post.addComment(comment2);
} );

When we delete a PostComment:

doInJPA( entityManager -> {
    Post post = entityManager.find(Post.class, 1L);
    post.removeComment(post.getComments().get(0));
} );

The cascade mechanism will trigger the child removal, and Hibernate is going to execute the following SQL statement:

UPDATE post_comment 
SET deleted = true 
WHERE id = 1

And now we can see that the collection has only one entry:

doInJPA( entityManager -> {
    Post post = entityManager.find(Post.class, 1L);
    assertEquals(1, post.getComments().size());
} );

When fetching the comments collection, Hibernate executes the following query:

SELECT 
    pc.id as id1_0_, 
    pc.deleted as deleted2_0_, 
    pc.title as title3_0_ 
FROM 
    post pc 
WHERE 
    ( pc.deleted = 0) AND 
    pc.id=1 AND 
    pc.deleted = 0

The reason why we need the @Where clause annotation on the @OneToMany and @ManyToMany associations is that collections act just like entity queries. A child entity might be deleted, so we need to hide it when we fetch a collection.

The bidirectional @ManyToMany association

Again, since we are using a bidirectional association, there is no need to apply the @Where annotation at the children relationship level. the @Where annotation on collection only makes sense for unidirectional associations, but those are not as efficient as bidirectional ones.

So, if we have one Post entity with three Tag child entities:

doInJPA( entityManager -> {
    Post post = new Post();
    post.setId(1L);
    post.setTitle("High-Performance Java Persistence");

    entityManager.persist(post);

    post.addTag(entityManager.getReference(
        Tag.class, "Java"
    ));
    post.addTag(entityManager.getReference(
        Tag.class, "Hibernate"
    ));
    post.addTag(entityManager.getReference(
        Tag.class, "Misc"
    ));
} );

doInJPA( entityManager -> {
    Post post = entityManager.find(Post.class, 1L);
    assertEquals(3, post.getTags().size());
} );

If we remove one Tag:

doInJPA( entityManager -> {
    Tag miscTag = entityManager.getReference(Tag.class, "Misc");
    entityManager.remove(miscTag);
} );

Then, we will no longer see it in the tags collection:

doInJPA( entityManager -> {
    Post post = entityManager.find(Post.class, 1L);
    assertEquals(2, post.getTags().size());
} );

That’s because Hibernate is filtering it out when loading the child entities:

SELECT 
    pt.post_id as post_id1_3_0_, 
    pt.tag_id as tag_id2_3_0_, 
    t.id as id1_4_1_, 
    t.deleted as deleted2_4_1_ 
FROM post_tag pt 
INNER JOIN 
    tag t ON pt.tag_id = t.id 
WHERE 
    ( t.deleted = 0 ) AND 
    pt.post_id = 1

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

Soft deleting is a very handy feature when your application requires keeping the deleted entries and just hiding them in the UI. While it is much more convenient to use a technology such as Flashback in Oracle, if your database lacks such a feature, then Hibernate can simplify this task for you.

FREE EBOOK

Newsletter logo
10 000 readers have found this blog worth following!

If you subscribe to my newsletter, you'll get:
  • A free sample of my Video Course about running Integration tests at warp-speed using Docker and tmpfs
  • 3 chapters from my book, High-Performance Java Persistence,
  • a 10% discount coupon for my book.

16 Comments on “The best way to soft delete with Hibernate

  1. How to use softdelete if i need something like ->
    Delete_ststutte = true, delete_user = current_user
    Thank you

    • Exactly as you wrote. Make sure you include that clause in the query you’re overwriting.

    • I think it was fixed. At least, that’s what the Jira issue says.

      • Hi,

        Well the JIRA does say fixed for 5.2.13 … I’m not sure if the changes were carried over to the 5.3 and 5.4 branches. Still seeing no schema in the soft delete update statement in 5.3.7 so I’m sensing that perhaps it wasn’t. Or was applied to more recent versions of 5.3/5.4 … that’s the part I’m having trouble determining.

        Haven’t come up with a workaround yet for reading in the the Spring Boot yml config of the default schema … and using it explicitly in the SQLDelete class level annotation.

        -Jim

      • If it was fixed in 5.2, it will be available in 5.3 and 5.4.

  2. Hi Vlad.
    Very nice this is a very nice way to handle soft deletes.
    My question is how can we load the soft deleted entities? With this approach if we make a query after a deleted Post we will get EntityNotFound exception when hibernate tries to load it’s post details, or am I mistaken?

    Thank you.

    • If the Post is deleted, Hibernate will return null, so I don’t see how you can get the EntityNotFoundException.

  3. Please note: This will not work if you have to use it along with @Version.

    • If you want to add optimistic locking, then you just have to incude the version as well in the custom CRUD queries. Go to my high-performance-java-persistence GitHub repository, and open the SoftDeleteVersionTest. If you run it, you’ll see that it works like a charm.

  4. Hello Sir,
    Can we do soft delete and save auditing information (for which in Spring JPA provides a way to implement AuditorAwareImpl) both in one query? mean to say changing the value of the flag to mark record as delete and who did this operation saving this info, in one shot , one query.

    • I’ve never used the AuditorAwareImpl in Spring Data, so the Spring team will provide you the answer you are looking for

  5. Hi Vlad,

    thanks for great posts you generously share with developers.

    Could you please explain what would be the most efficient way to delete Post entity with delete cascade of all of its child records ( post_tag, post_comment_post_details) in case of unidirectional @ManyToOne relations as it seems for me this is one of the limitations of @ManyToOne desing (if Hibernate is not generating DDL).

    Of course, my idea is not to define CASCADE DELETE rule on FK in database, but let Hibernate to take care of deletion.

    Is the only solution switching to bi-directional, although this can be time-consuming job if you have big database, or there is a faster and better way to delete parent and all of its childs?

    Thanks!
    Mark

  6. Wouldn’t the best way to soft delete be to implement such feature in the domain? If a Post should never get deleted but should rather be archived then why not modeling an explicit post.archive() operation. If the business has interest in “deleted” posts they probably never really meant to delete them in the first place. That way you can forget about entityManager.remove entirely and only rely on entity state mutations.

    • When using JPA and Hibernate, you need to use the EntityManager. DDD works only for in-memory storages. When a RDBMS comes into play, it’s hard to hold all your logic in a POJO. In the end, it’s a matter of perspective, so instead of thinking that the domain has to be a POJO, you switch your mindset to viewing the domain as the whole service layer, data layer, entities and DTOs. That’s the domain model and logic as well.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.