A beginner’s guide to Non-Repeatable Read anomaly
Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!
Database transactions are defined by the four properties known as ACID. The Isolation Level (I in ACID) allows you to trade off data integrity for performance.
The weaker the isolation level, the more anomalies can occur, and in this article, we are going to describe the Non-Repeatable Read phenomenon.
Observing data changed by a concurrent transaction
If one transaction reads a database row without applying a shared lock on the newly fetched record, then a concurrent transaction might change this row before the first transaction has ended.
In the diagram above, the flow of statements goes like this:
- Alice and Bob start two database transactions.
- Bob’s reads the
titlecolumn value is
- Alice modifies the
titleof a given
postrecord to the value of
- Alice commits her database transaction.
- If Bob’s re-reads the
postrecord, he will observe a different version of this table row.
This phenomenon is problematic when the current transaction makes a business decision based on the first value of the given database row (a client might order a product based on a stock quantity value that is no longer a positive integer).
How the database prevents it
If a database uses a 2PL (Two-Phase Locking) and shared locks are taken on every read, this phenomenon will be prevented since no concurrent transaction would be allowed to acquire an exclusive lock on the same database record.
Most database systems have moved to an MVCC (Multi-Version Concurrency Control) model, and shared locks are no longer mandatory for preventing non-repeatable reads.
By verifying the current row version, a transaction can be aborted if a previously fetched record has changed in the meanwhile.
Repeatable Read and Serializable prevent this anomaly by default. With Read Committed, it is possible to avoid non-repeatable (fuzzy) reads if the shared locks are acquired explicitly (e.g.
SELECT FOR SHARE).
Some ORM frameworks (e.g. JPA/Hibernate) offer application-level repeatable reads. The first snapshot of any retrieved entity is cached in the currently running Persistence Context.
Any successive query returning the same database row is going to use the very same object that was previously cached. This way, the fuzzy reads may be prevented even in Read Committed isolation level.
This phenomenon is typical for both Read Uncommitted and Read Committed isolation levels. The problem is that Read Committed is the default isolation level for many RDBMS like Oracle, SQL Server or PostgreSQL, so this phenomenon can occur if nothing is done to prevent it.
Nevertheless, preventing this anomaly is fairly simple. All you need to do is use a higher isolation level like Repeatable Read (which is the default in MySQL) or Serializable. Or, you can simply lock the database record using a share(read) lock or an exclusive lock if the underlying database does not support shared locks (e.g. Oracle).
Download free ebook sample
If you subscribe to my newsletter, you'll get:
- A free sample of my Video Course about running Integration tests at warp-speed using Docker and tmpfs
- 3 chapters from my book, High-Performance Java Persistence,
- a 10% discount coupon for my book.