Observing data changed by a concurrent transaction
If one transaction reads a database row without applying a shared lock on the newly fetched record, then a concurrent transaction might change this row before the first transaction has ended.
In the diagram above, the flow of statements goes like this:
Alice and Bob start two database transactions.
Bob’s reads the post record and title column value is Transactions.
Alice modifies the title of a given post record to the value of ACID.
Alice commits her database transaction.
If Bob’s re-reads the post record, he will observe a different version of this table row.
This phenomenon is problematic when the current transaction makes a business decision based on the first value of the given database row (a client might order a product based on a stock quantity value that is no longer a positive integer).
How the database prevents it
If a database uses a 2PL (Two-Phase Locking) and shared locks are taken on every read, this phenomenon will be prevented since no concurrent transaction would be allowed to acquire an exclusive lock on the same database record.
By verifying the current row version, a transaction can be aborted if a previously fetched record has changed in the meanwhile.
Repeatable Read and Serializable prevent this anomaly by default. With Read Committed, it is possible to avoid non-repeatable (fuzzy) reads if the shared locks are acquired explicitly (e.g. SELECT FOR SHARE).
Some ORM frameworks (e.g. JPA/Hibernate) offer application-level repeatable reads. The first snapshot of any retrieved entity is cached in the currently running Persistence Context.
Any successive query returning the same database row is going to use the very same object that was previously cached. This way, the fuzzy reads may be prevented even in Read Committed isolation level.
If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
This phenomenon is typical for both Read Uncommitted and Read Committed isolation levels. The problem is that Read Committed is the default isolation level for many RDBMS like Oracle, SQL Server or PostgreSQL, so this phenomenon can occur if nothing is done to prevent it.
Nevertheless, preventing this anomaly is fairly simple. All you need to do is use a higher isolation level like Repeatable Read (which is the default in MySQL) or Serializable. Or, you can simply lock the database record using a share(read) lock or an exclusive lock if the underlying database does not support shared locks (e.g. Oracle).
10 000readers have found this blog worth following!
If you subscribeto my newsletter, you'll get:
A free sampleof my Video Course about running Integration tests at warp-speed using Docker and tmpfs
3 chapters from mybook, High-Performance Java Persistence,
Based on my book, High-Performance Java Persistence, this workshop teaches you various data access performance optimizations from JDBC, to JPA, Hibernate and jOOQ for the major rational database systems (e.g. Oracle, SQL Server, MySQL and PostgreSQL).