The best way to soft delete with Hibernate

Introduction

Each database application is unique. While most of the time, deleting a record is the best approach, there are times when the application requirements demand that database records should never be physically deleted.

So who uses this technique?

For instance, StackOverflow does it for all Posts (e.g. Questions and Answers). The StackOverflow Posts table has a ClosedDate column which acts as a soft delete mechanism since it hides an Answer for all users who have less than 10k reputation.

If you’re using Oracle, you can take advantage of its Flashback capabilities, so you don’t need to change your application code to offer such a functionality. Another option is to use the SQL Server Temporal Table feature.

However, not all relational database systems support Flashback queries, or they allow you to recover a certain record without having to restore from a database backup. In this case, Hibernate allows you to simplify the implementation of soft deletes, and this article is gong to explain the best way to implement the logical deletion mechanism.

Domain Model

Considering we have the following tables in your database:

As you can see in the diagram above, the post, post_details, post_comment, and tag tables contain a deleted column which dictates the visibility of a given row. What’s interesting about this database table model is that it covers all three database relationship types:

  • one-to-one
  • one-to-many
  • many-to-many

Therefore, we will discuss the mapping of all these entities as well as their relationships, so stay tuned!

Tag entity

Let’s start with Tag entity mapping since it lacks any entity relationship:

@Entity(name = "Tag")
@Table(name = "tag")
@SQLDelete(sql =
    "UPDATE tag " +
    "SET deleted = true " +
    "WHERE id = ?")
@Loader(namedQuery = "findTagById")
@NamedQuery(name = "findTagById", query =
    "SELECT t " +
    "FROM Tag t " +
    "WHERE " +
    "	t.id = ?1 AND " +
    "	t.deleted = false")
@Where(clause = "deleted = false")
public class Tag 
    extends BaseEntity {

    @Id
    private String id;

    //Getters and setters omitted for brevity
}

The deleted column is defined the BaseEntity class which looks as follows:

@MappedSuperclass
public abstract class BaseEntity {

    private boolean deleted;
}

The @SqlDelete annotation allows you to override the default DELETE statement executed by Hibernate, so we substitute an UPDATE statement instead. Therefore, removing an entity will end up updating the deleted column to true.

The @Loader annotation allows us to customize the SELECT query used to load an entity by its identifier. Hence, we want to filter every record whose deleted column is set to true.

The @Where clause is used for entity queries, and we want to provide it so that Hibernate can append the deleted column filtering condition to hide deleted rows.

While prior to Hibernate 5.2, it was sufficient to provide the @Where clause annotation, in Hibernate 5.2, it’s important to provide a custom @Loader as well so that the direct fetching works as well.

So, considering we have four Tag entities:

doInJPA( entityManager -> {
    Tag javaTag = new Tag();
    javaTag.setId("Java");
    entityManager.persist(javaTag);

    Tag jpaTag = new Tag();
    jpaTag.setId("JPA");
    entityManager.persist(jpaTag);

    Tag hibernateTag = new Tag();
    hibernateTag.setId("Hibernate");
    entityManager.persist(hibernateTag);

    Tag miscTag = new Tag();
    miscTag.setId("Misc");
    entityManager.persist(miscTag);
} );

When removing the Misc Tag:

doInJPA( entityManager -> {
    Tag miscTag = entityManager.getReference(Tag.class, "Misc");
    entityManager.remove(miscTag);
} );

Hibernate will execute the following SQL statement:

UPDATE tag 
SET deleted = true 
WHERE id = 'Misc'

Brilliant!

So now, if we want to load the entity, we get null instead:

doInJPA( entityManager -> {
    assertNull(entityManager.find(Tag.class, "Misc"));
} );

This is because Hibernate executed the following SQL statement:

SELECT 
    t.id as id1_4_, 
    t.deleted as deleted2_4_ 
FROM 
    tag t 
WHERE 
    ( t.deleted = 0 ) AND 
    t.id = ? AND 
    t.deleted = 0

Although the deleted clause is appended twice because we declared both the @Where clause and the @Loader, most RDBMS will eliminate duplicate filters during query parsing. If we only provide the @Where clause, there will be no duplicate delete clause, but then the deleted rows will be visible when being fetched directly.

Also, when running an entity query against all Tag entities, we can see that we can see just three Tags now:

doInJPA( entityManager -> {
    List<Tag> tags = entityManager.createQuery(
        "select t from Tag t", Tag.class)
    .getResultList();

    assertEquals(3, tags.size());
} );

This is because Hibernate manages to append the deleted clause filter when executing the SQL query:

SELECT 
    t.id as id1_4_, 
    t.deleted as deleted2_4_ 
FROM tag t 
WHERE ( t.deleted = 0 )

PostDetails entity

Just like Tag, PostDetails follows the same mapping considerations:

@Entity(name = "PostDetails")
@Table(name = "post_details")
@SQLDelete(sql = 
    "UPDATE post_details " +
    "SET deleted = true " +
    "WHERE id = ?")
@Loader(namedQuery = "findPostDetailsById")
@NamedQuery(name = "findPostDetailsById", query = 
    "SELECT pd " +
    "FROM PostDetails pd " +
    "WHERE " +
    "	pd.id = ?1 AND " +
    "	pd.deleted = false")
@Where(clause = "deleted = false")
public class PostDetails 
    extends BaseEntity {

    @Id
    private Long id;

    @Column(name = "created_on")
    private Date createdOn;

    @Column(name = "created_by")
    private String createdBy;

    public PostDetails() {
        createdOn = new Date();
    }

    @OneToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "id")
    @MapsId
    private Post post;

    //Getters and setters omitted for brevity
}

Even if it features a @OneToOne association to a Post, there is no need to filter this relationship since a child entity cannot exist without its parent.

PostComment entity

The same logic applies to PostComment:

@Entity(name = "PostComment")
@Table(name = "post_comment")
@SQLDelete(sql =
    "UPDATE post_comment " +
    "SET deleted = true " +
    "WHERE id = ?")
@Loader(namedQuery = "findPostCommentById")
@NamedQuery(name = "findPostCommentById", query =
    "SELECT pc " +
    "from PostComment pc " +
    "WHERE " +
    "	pc.id = ?1 AND " +
    "	pc.deleted = false")
@Where(clause = "deleted = false")
public class PostComment 
    extends BaseEntity {

    @Id
    private Long id;

    @ManyToOne(fetch = FetchType.LAZY)
    private Post post;

    private String review;

    //Getters and setters omitted for brevity
}

Even if it features a @ManyToOne association to a Post, there is no need to filter this relationship since a child entity cannot exist without its parent.

Post entity

The Post entity being the root of our entity aggergate, it has relationships to PostDetails, PostComment, and Tag:

@Entity(name = "Post")
@Table(name = "post")
@SQLDelete(sql = 
    "UPDATE post " +
    "SET deleted = true " +
    "WHERE id = ?")
@Loader(namedQuery = "findPostById")
@NamedQuery(name = "findPostById", query = 
    "SELECT p " +
    "FROM Post p " +
    "WHERE " +
    "	p.id = ?1 AND " +
    "	p.deleted = false")
@Where(clause = "deleted = false")
public class Post 
    extends BaseEntity {

    @Id
    private Long id;

    private String title;

    @OneToMany(
        mappedBy = "post",
        cascade = CascadeType.ALL,
        orphanRemoval = true
    )
    private List<PostComment> comments = new ArrayList<>();

    @OneToOne(
        mappedBy = "post",
        cascade = CascadeType.ALL,
        orphanRemoval = true,
        fetch = FetchType.LAZY
    )
    private PostDetails details;

    @ManyToMany
    @JoinTable(
        name = "post_tag",
        joinColumns = @JoinColumn(name = "post_id"),
        inverseJoinColumns = @JoinColumn(name = "tag_id")
    )
    private List<Tag> tags = new ArrayList<>();

    //Getters and setters omitted for brevity

    public void addComment(PostComment comment) {
        comments.add(comment);
        comment.setPost(this);
    }

    public void removeComment(PostComment comment) {
        comments.remove(comment);
        comment.setPost(null);
    }

    public void addDetails(PostDetails details) {
        this.details = details;
        details.setPost(this);
    }

    public void removeDetails() {
        this.details.setPost(null);
        this.details = null;
    }

    public void addTag(Tag tag) {
        tags.add(tag);
    }
}

The Post entity mapping is similar to the Tag entity, which we already discussed so we will focus on the @OneToMany and @ManyToMany associations.

The bidirectional @OneToMany association

While prior to Hibernate 5.2, it was necessary to provide the @Where clause annotation to collections (e.g. `@OneToMany` or `@ManyToMany`), in Hibernate 5.2 we don’t need these collection-level annotations since the `PostComment` is already annotated accordingly, and Hibernate knows that it needs to filter any _deleted_ `PostComment`.

Therefore, assuming we have one Post entity with two PostComment child entities:

doInJPA( entityManager -> {
    Post post = new Post();
    post.setId(1L);
    post.setTitle("High-Performance Java Persistence");
    entityManager.persist(post);

    PostComment comment1 = new PostComment();
    comment1.setId(1L);
    comment1.setReview("Great!");
    post.addComment(comment1);

    PostComment comment2 = new PostComment();
    comment2.setId(2L);
    comment2.setReview("Excellent!");
    post.addComment(comment2);
} );

When we delete a PostComment:

doInJPA( entityManager -> {
    Post post = entityManager.find(Post.class, 1L);
    post.removeComment(post.getComments().get(0));
} );

The cascade mechanism will trigger the child removal, and Hibernate is going to execute the following SQL statement:

UPDATE post_comment 
SET deleted = true 
WHERE id = 1

And now we can see that the collection has only one entry:

doInJPA( entityManager -> {
    Post post = entityManager.find(Post.class, 1L);
    assertEquals(1, post.getComments().size());
} );

When fetching the comments collection, Hibernate executes the following query:

SELECT 
    pc.id as id1_0_, 
    pc.deleted as deleted2_0_, 
    pc.title as title3_0_ 
FROM 
    post pc 
WHERE 
    ( pc.deleted = 0) AND 
    pc.id=1 AND 
    pc.deleted = 0

The reason why we need the @Where clause annotation on the @OneToMany and @ManyToMany associations is that collections act just like entity queries. A child entity might be deleted, so we need to hide it when we fetch a collection.

The bidirectional @ManyToMany association

Again, since we are using a bidirectional association, there is no need to apply the @Where annotation at the children relationship level. the @Where annotation on collection only makes sense for unidirectional associations, but those are not as efficient as bidirectional ones.

So, if we have one Post entity with three Tag child entities:

doInJPA( entityManager -> {
    Post post = new Post();
    post.setId(1L);
    post.setTitle("High-Performance Java Persistence");

    entityManager.persist(post);

    post.addTag(entityManager.getReference(
        Tag.class, "Java"
    ));
    post.addTag(entityManager.getReference(
        Tag.class, "Hibernate"
    ));
    post.addTag(entityManager.getReference(
        Tag.class, "Misc"
    ));
} );

doInJPA( entityManager -> {
    Post post = entityManager.find(Post.class, 1L);
    assertEquals(3, post.getTags().size());
} );

If we remove one Tag:

doInJPA( entityManager -> {
    Tag miscTag = entityManager.getReference(Tag.class, "Misc");
    entityManager.remove(miscTag);
} );

Then, we will no longer see it in the tags collection:

doInJPA( entityManager -> {
    Post post = entityManager.find(Post.class, 1L);
    assertEquals(2, post.getTags().size());
} );

That’s because Hibernate is filtering it out when loading the child entities:

SELECT 
    pt.post_id as post_id1_3_0_, 
    pt.tag_id as tag_id2_3_0_, 
    t.id as id1_4_1_, 
    t.deleted as deleted2_4_1_ 
FROM post_tag pt 
INNER JOIN 
    tag t ON pt.tag_id = t.id 
WHERE 
    ( t.deleted = 0 ) AND 
    pt.post_id = 1

If you enjoyed this article, I bet you are going to love my book as well.

Conclusion

Soft deleting is a very handy feature when your application requires keeping the deleted entries and just hiding them in the UI. While it is much more convenient to use a technology such as Flashback in Oracle, if your database lacks such a feature, then Hibernate can simplify this task for you.

If you liked this article, you might want to subscribe to my newsletter too.

Advertisements

33 thoughts on “The best way to soft delete with Hibernate

    1. That’s true. But, as I explained in my book, 2nd-level caching applicability is rather limited. It’s very valuable to off-load the Master node especially for data that’s hot and frequently modified.

  1. Is there pros/cons on this approach over adding the Filter to the domain class? We have done that in the past so when it’s enabled its added to all queries without needing to rewrite them.

    1. The @Filter mechanism allows you to change the outcome dynamically. The only downside is that the @Filter is not applied to direct fetching. Another downside is that you have to remember to enable the filter for every Session, so you need a way to ensure this is applied consistently, like when using AOP.

      From a performance perspective, there shouldn’t be any difference, so it’s just a matter of application requirements.

      1. Yes. We ran into the direct fetching while trying to load by id and had to turn these into query expressions.

  2. Can we control this feature dynamically without changing the entity annotations, like if there is a need to turn off the soft delete feature.

      1. We used filters for both feature enabling and also for RBAC i.e. enable filters based on the authenticated user roles.

  3. Hi Vlad,

    great post, but could you give an example where a deleted entity is returned without specifying the @Loader annotation?
    Couldn’t produce a false test for this before adding @Loader

    Kind regards,
    Michael

      1. I’m using hibernate 5.0.10 and the where clause is also added for EntityManager.find(Clazz, id) so it returns null

      2. I tested it on 5.2. On 5.1, it works as you suggested, so I added a note related to your suggestion as well as to @Where clause annotations on collections.

  4. Hello Vlad,

    I needed to add the VERSION to the Delete query;

    @SQLDelete(sql = “UPDATE MY_TABLE SET DELETED = 0 WHERE ID = ? AND VERSION = ?”)

    Regards,

    Roland

  5. Hello Vlad,

    Is it possible to use named parameters instead of positional parameters?

    [ERROR] WARN: [DEPRECATION] Encountered positional parameter near line 1, column 88 in HQL: [select lRelation from com.ludus.server.domain.Relation lRelation where lRelation.mId = ? and lRelation.mDeleted = false]. Positional parameter are considered deprecated; use named parameters or JPA-style positional parameters instead.
    [ERROR] Mar 15, 2017 12:03:54 PM org.hibernate.hql.internal.ast.HqlSqlWalker generatePositionalParameter

    Regards,

    Roland

      1. Like this?

        @SQLDelete(sql = “UPDATE MY_TABLE SET DELETED = 0 WHERE ID = :id AND VERSION = :version”)

        Otherwise, how do i know the parameter names?

    1. Thats the whole point here (the ? use is deprecated)

      [ERROR] WARN: [DEPRECATION] Encountered positional parameter near line 1, column 88 in HQL: [select lRelation from com.ludus.server.domain.Relation lRelation where lRelation.mId = ? and lRelation.mDeleted = false]. Positional parameter are considered deprecated; use named parameters or JPA-style positional parameters instead.
      [ERROR] Mar 15, 2017 12:03:54 PM org.hibernate.hql.internal.ast.HqlSqlWalker generatePositionalParameter

      1. Just write the query like this:

        @NamedQuery(name = "findTagById", query =
        		"SELECT t " +
        		"FROM Tag t " +
        		"WHERE " +
        		"	t.id = ?1 AND " +
        		"	t.deleted = false")
        
  6. In my experience a lot of times that I’ve needed soft-delete I also need the application to be aware of the soft-deleted entities, not simply have them transparently excluded. So most of the time I would just put the @Where or a @Filter on the associations that should only return live entities, but still retain the ability to retrieve deleted entities.

  7. Hi Vlad. Thanks for the great article!

    Let’s imagine that we have this imaginary situation:

    I’ve Employee table which referenced by many tables like Family, ID Card, etc (more than 5 tables) and implement soft delete (set mark_for_delete = true)

    I want add validation if employee data still referenced by other table, i want throw exception that said “can not delete employee”

    What is the best way to do this validation?

    Thanks

  8. Hi vladmihalcea, thanks for the post. Is it possible to pass parameters to update statement to update more than one column? for example

    update employee set emp_end_date=sysdate, emp_end_by=?, emp_active=0 where emp_id=?

    Thanks.

    1. I don’t think so. But you can use a database trigger for that which checks the state of the emp_active column and sets the emp_end_by column when emp_active becomes 0.

  9. Hi Vlad

    there seems to be another problem with soft delete. The OrphanRemoval=true on OneToMany associations.
    It doesn’t matter in your example because PostComment also has soft delete defined, but when only the Post entity has a soft delete, the comments are removed on entityManager.remove().
    So the SQLDelete does not prevent orphan removals of associations.

    Kind regards,
    Michael

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s