The best way to implement an audit log using Hibernate Envers
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
In this article, we are going to learn what is the best way to implement an audit log to track INSERT, UPDATE, and DELETE statements using Hibernate Envers.
As previously explained, CDC (Change Data Capture) is an essential step to extract change events from an OLTP application to make them available to other modules in an enterprise system (e.g. caches, data warehouse).
While Debezium is the most efficient way of doing CDC, it might be that you need a simpler solution in your project. Hibernate Envers is a Hibernate ORM extension that allows you to capture change events in no time.
In this article, we are going to see the most efficient way of using Hibernate Envers.
Hibernate Envers Maven dependency
Because Hibernate Envers is packaged as a separate dependency, if you want to use it, you need to declare the following Maven dependency:
<dependency> <groupId>org.hibernate</groupId> <artifactId>hibernate-envers</artifactId> <version>${hibernate.version}</version> </dependency>
Hibernate Envers @Audited annotation
Now, after adding the hibernate-envers
dependency, you need to instruct Hibernate which entities should be audited, and this can be done via the @Audited
entity-level annotation.
@Entity(name = "Post") @Table(name = "post") @Audited public class Post { @Id private Long id; private String title; //Getters and setters omitted for brevity @Override public String toString() { return "Post{" + "id=" + id + ", title='" + title + '\'' + '}'; } }
Hibernate Envers default audit logging strategy
Once you add the @Audited
annotation to your entity and generate the database schema using the hbm2ddl
tool, the following audit tables are going to be created:
CREATE TABLE post_AUD ( id BIGINT NOT NULL, REV INTEGER NOT NULL, REVTYPE TINYINT, title VARCHAR(255), PRIMARY KEY ( id, REV ) ) CREATE TABLE revinfo ( rev INTEGER GENERATED BY DEFAULT AS IDENTITY ( START WITH 1 ), revtstmp BIGINT, PRIMARY KEY ( rev ) ) ALTER TABLE post_aud ADD CONSTRAINT fkllaf9w93qaiooguo8mfvwtwbg FOREIGN KEY ( REV ) REFERENCES revinfo
Using hbm2ddl is not recommended for production environments so use it just for prototyping your database schema. Therefore, you should prefer using an automatic schema migration tool like Flyway.
Nevertheless, Hibernate expects to find the aforementioned audit tables in your database, otherwise, Envers will fail. So, make sure you add them to your migration scripts along with the actual entity tables.
The revinfo
table stores the revision number and its epoch timestamp while the post_AUD
table stores the entity snapshot at a particular revision.
In Envers terminology, a revision stands for a database transaction that has either inserted, updated, or deleted the audited entity. Each audited entity has a mirrored table that, by default, ends with the AUD
suffix and which stores the state of the entity at the end of a particular revision.
The REVTYPE
column stores the ordinal of the RevisionType
Enum which encapsulates one of the following entity state modifications:
ADD
– anINSERT
SQL statement has created the entity in questionMOD
– anUPDATE
SQL statement has modified our entityDEL
– aDELETE
SQL statement has removed the audited entity from the database table
Persisting an entity
When creating a Post
entity and persisting into the currently running Persistence Context:
Post post = new Post(); post.setId( 1L ); post.setTitle( "High-Performance Java Persistence 1st edition" ); entityManager.persist( post );
Hibernate generates the following three SQL INSERT
statements:
INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence 1st edition', 1) INSERT INTO REVINFO (REV, REVTSTMP) VALUES (default, 1503062974131) INSERT INTO post_AUD (REVTYPE, title, id, REV) VALUES (0, 'High-Performance Java Persistence 1st edition', 1, 1)
- The first statement creates the
Post
entity and is executed by Hibernate ORM. - The second statement is executed by Envers to create a new revision.
- The third statement is also executed by Envers to capture the state of the
Post
entity at this revision.
Notice the REVTYPE
column value is 0, matching the RevisionType.ADD
enumeration value.
Updating an entity
When updating the previously created Post
entity:
Post post = entityManager.find( Post.class, 1L ); post.setTitle( "High-Performance Java Persistence 2nd edition" );
Hibernate generates the following SQL statements:
SELECT p.id as id1_0_0_, p.title as title2_0_0_ FROM post p WHERE p.id = 1 UPDATE post SET title = 'High-Performance Java Persistence 2nd edition' WHERE id = 1 INSERT INTO REVINFO (REV, REVTSTMP) VALUES (default, 1503064508185) INSERT INTO post_AUD (REVTYPE, title, id, REV) VALUES (1, 'High-Performance Java Persistence 2nd edition', 1, 2)
- The first two statements are generated by Hibernate ORM to load and modify the entity.
- The second statements are generated by Envers to capture the new revision and the new state of the
Post
entity.
Notice the REVTYPE
column value is 1, matching the RevisionType.MOD
enumeration value.
Removing an entity
When removing the Post
entity:
entityManager.remove( entityManager.getReference( Post.class, 1L ) );
Hibernate generates the following SQL statements:
DELETE FROM post WHERE id = 1 INSERT INTO REVINFO (REV, REVTSTMP) VALUES (default, 1503065294147) INSERT INTO post_AUD (REVTYPE, title, id, REV) VALUES (2, NULL(VARCHAR), 1, 3)
After the entity is removed from the database table by Hibernate ORM, Envers inserts a new revision and all the properties, except for the entity identifier, are going to be set to null
.
Notice the REVTYPE
column value is 2, matching the RevisionType.DEL
enumeration value.
Querying the entity snapshots
Hibernate Envers offers the AuditReaderFactory
which takes a JPA EntityManager
or a Hibernate Session
instance and generates an AuditReader
object.
The AuditReader
can be used to query the audit log as demonstrated by the following example which extracts all the Post
entity state snapshots:
List<Post> posts = AuditReaderFactory.get( entityManager ) .createQuery() .forRevisionsOfEntity( Post.class, true, true ) .add( AuditEntity.id().eq( 1L ) ) .getResultList(); assertEquals( 3, posts.size() ); for ( int i = 0; i < posts.size(); i++ ) { LOGGER.info( "Revision {} of Post entity: {}", i + 1, posts.get( i ) ); }
When running the test case above, Hibernate generates the following output:
SELECT p.id as id1_1_, p.REV as REV2_1_, p.REVTYPE as REVTYPE3_1_, p.title as title4_1_ FROM post_AUD p WHERE p.id = 1 ORDER BY p.REV ASC -- Revision 1 of Post entity: Post{id=1, title='High-Performance Java Persistence 1st edition'} -- Revision 2 of Post entity: Post{id=1, title='High-Performance Java Persistence 2nd edition'} -- Revision 3 of Post entity: Post{id=1, title='null'}
So far, so good!
However, assuming that you have previously extracted the Post
entity revisions:
List<Number> revisions = doInJPA( entityManager -> { return AuditReaderFactory.get( entityManager ).getRevisions( Post.class, 1L ); } );
If you want to load the Post
entity snapshot as of a given revision:
Post post = (Post) AuditReaderFactory.get( entityManager ) .createQuery() .forEntitiesAtRevision( Post.class, revisions.get( 0 ) ) .getSingleResult(); assertEquals( "High-Performance Java Persistence 1st edition", post.getTitle() );
Hibernate Envers is going to generate an SQL query like this one:
SELECT p1.id AS id1_1_, p1.REV AS REV2_1_, p1.REVTYPE AS REVTYPE3_1_, p1.title AS title4_1_ FROM post_AUD p1 WHERE ( p1.REV IN ( SELECT MAX(p2.REV) FROM post_AUD p2 WHERE p2.REV <= 1 AND p1.id = p2.id ) ) AND p1.REVTYPE <> 2
Well, that’s definitely not very efficient!
Queries like the one above are generated by the DefaultAuditStrategy
and, for a very large audit log, they don’t perform that well.
Luckily, Envers offers the ValidityAuditStrategy
to overcome this issue.
Hibernate Envers ValidityAuditStrategy to the rescue
To switch from the DefaultAuditStrategy
to ValidityAuditStrategy
, you have the provide the following Hibernate configuration property:
<property name="org.hibernate.envers.audit_strategy" value="org.hibernate.envers.strategy.internal.ValidityAuditStrategy" />
Now, when generating the database schema using the hbm2ddl
tool, the following audit tables are going to be created instead:
CREATE TABLE post_AUD ( id BIGINT NOT NULL , REV INTEGER NOT NULL , REVTYPE TINYINT , REVEND INTEGER , title VARCHAR(255) , PRIMARY KEY ( id, REV ) ) CREATE TABLE revinfo ( rev INTEGER GENERATED BY DEFAULT AS IDENTITY ( START WITH 1 ), revtstmp BIGINT, PRIMARY KEY ( rev ) ) ALTER TABLE post_AUD ADD CONSTRAINT FKllaf9w93qaiooguo8mfvwtwbg FOREIGN KEY ( REV ) REFERENCES revinfo ALTER TABLE post_AUD ADD CONSTRAINT FKmo46u9kx2pmomhkxbmctlbwmg FOREIGN KEY ( REVEND ) REFERENCES revinfo
The revinfo
table is identical with the DefaultAuditStrategy
, but the post_AUD
features a new REVEND
column which references the revinfo
table and marks the last revision for which this entity snapshot was still valid.
Persisting an entity
When persisting the same Post
entity, Hibernate generates the following SQL statements:
INSERT INTO post (title, id) VALUES ('High-Performance Java Persistence 1st edition', 1) INSERT INTO REVINFO (REV, REVTSTMP) VALUES (default, 1503116813359) INSERT INTO post_AUD (REVTYPE, REVEND, title, id, REV) VALUES (0, NULL(INTEGER), 'High-Performance Java Persistence 1st edition', 1, 1)
The REVEND
column value is NULL
, so this audit log record is, therefore, valid.
Updating an entity
When updating the Post
entity, Hibernate executes the following SQL statements:
SELECT p.id as id1_0_0_, p.title as title2_0_0_ FROM post p WHERE p.id = 1 UPDATE post SET title = 'High-Performance Java Persistence 2nd edition' WHERE id = 1 INSERT INTO REVINFO (REV, REVTSTMP) VALUES (default, 1503117067335) INSERT INTO post_AUD (REVTYPE, REVEND, title, id, REV) VALUES (1, NULL(INTEGER), 'High-Performance Java Persistence 2nd edition', 1, 2) UPDATE post_AUD SET REVEND = 2 WHERE id = 1 AND REV <> 2 AND REVEND IS NULL
Not only that a new Post
entity audit log entry is added, but the previous one is updated to mark that it’s no longer valid.
Compared to the
DefaultAuditStrategy
, theValidityAuditStrategy
does more work while storing the audit logs because it needs to update the end of revision.However, just like with a database index, this extra write-time overhead is insignificant compared to the read-time response time improvement.
Removing an entity
When deleting the Post
entity, Hibernate generates the following SQL statements:
DELETE FROM post WHERE id = 1 INSERT INTO REVINFO (REV, REVTSTMP) VALUES (default, 1503117987724) INSERT INTO post_AUD (REVTYPE, REVEND, title, id, REV) VALUES (2, NULL(INTEGER), NULL(VARCHAR), 1, 3) UPDATE post_AUD SET REVEND = 3 WHERE id = 1 AND REV <> 3 AND REVEND IS NULL
Just like with the Post
entity update, the entity removal will also mark the end of revision for the previously stored audit log record.
Querying the entity snapshots
While fetching all Post
entity state snapshots generates the same SQL query like DefaultAuditStrategy
, it is when querying by a specific revision that the benefit of having the REVEND
extra column will provide a better query execution plan:
SELECT p.id as id1_1_, p.REV as REV2_1_, p.REVTYPE as REVTYPE3_1_, p.REVEND as REVEND4_1_, p.title as title5_1_ FROM post_AUD p WHERE p.REV <= 1 AND p.REVTYPE <> 2 AND ( p.REVEND > 1 OR p.REVEND IS NULL)
Much better than using a correlated subquery! To speed up this type of query, we could add an index for the REV
and REVEND
columns, therefore, avoiding a sequential scan.
I'm running an online workshop on the 11th of October about High-Performance SQL.If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
Hibernate Envers is extremely easy to set up, making it a very compelling choice for a Hibernate-based application. Although the DefaultAuditStrategy
does less work while executing the entity state modification transaction, the ValidityAuditStrategy
is a much better choice when trying to speed up the entity snapshot retrievals.
Depending on your application requirements, you should pick the revision strategy that better suits you audit log data access patterns.
