Imagine having a tool that can automatically detect JPA and Hibernate performance issues.
Hypersistence Optimizer is that tool!
Introduction
In this article, I’m going to explain how the MVCC (Multi-Version Concurrency Control) mechanism works using PostgreSQL as a reference implementation.
In Concurrency Control theory, there are two ways you can deal with conflicts:
You can avoid them, by employing a pessimistic locking mechanism (e.g. Read/Write locks, Two-Phase Locking)
You can allow conflicts to occur, but you need to detect them using an optimistic locking mechanism (e.g. logical clock, MVCC)
Because MVCC (Multi-Version Concurrency Control) is such a prevalent Concurrency Control technique (not only in relational database systems, in this article, I’m going to explain how it works.
What’s the goal
When the ACID transaction properties were first defined, Serializability was assumed. And to provide a Strict Serializable transaction outcome, the 2PL (Two-Phase Locking) mechanism was employed. When using 2PL, every read requires a shared lock acquisition, while a write operation requires taking an exclusive lock.
a shared lock blocks Writers, but it allows other Readers to acquire the same shared lock
an exclusive lock blocks both Readers and Writers concurring for the same lock
However, locking incurs contention, and contention affects scalability. The Amdhal’s Law or the Universal Scalability Law demonstrate how contention can affect response Time speedup.
For this reason, database researchers have come up with a different Concurrency Control model which tries to reduce locking to a bare minimum so that:
Readers don’t block Writers
Writers don’t block Readers
The only use case that can still generate contention is when two concurrent transactions try to modify the same record since, once modified, a row is always locked until the transaction that modified this record either commits or rolls back.
In order to specify the aforementioned Reader/Writer non-locking behavior, the Concurrency Control mechanism must operate on multiple versions of the same record, hence this mechanism is called Multi-Version Concurrency Control (MVCC).
While 2PL is pretty much standard, there’s no standard MVCC implementation, each database taking a slightly different approach. In this article, we are going to use PostgreSQL since its MVCC implementation is the easiest one to visualize.
PostgreSQL
While Oracle and MySQL use the undo log to capture uncommitted changes so that rows can be reconstructed to their previously committed version, PostgreSQL stores all row versions in the table data structure.
What’s even more interesting is that every row has two additional columns:
– which defines the transaction id that inserted the record
– which defines the transaction id that deleted the row
In PostgreSQL, the Transaction Id is a 32-bit integer, and the VACUUM process is responsible (among other things like reclaiming old row versions that are no longer in use) for making sure that the id does not overflow.
MVCC (Multi-Version Concurrency Control) – Inserting a record
To understand how INSERT works in MVCC, consider the following diagram:
Both Alice and Bob start a new transaction, and we can see their transaction ids by calling the txid_current() PostgreSQL function
When Alice inserts a new post row, the column value is set to Alice’s transaction id
Under default Read Committed isolation level, Bob cannot see Alice’s newly inserted record until Alice committs her transaction
After Alice has committed, Bob can now see Alice’s newly inserted row
If the transaction id is higher than the value of a committed row, the transaction is allowed to read this record version.
If the transaction id is lower than the value, then it’s up to the isolation level to decide if a record should be visible or not. For READ COMMITTED, the currently executing statement timestamp becomes the lower boundary for row visibility. For REPEATABLE READ or SERIALIZABLE, all reads are relative to the start timestamp of the currently running transaction.
MVCC (Multi-Version Concurrency Control) – Deleting a record
To understand how DELETE works in MVCC, consider the following diagram:
Both Alice and Bob start a new transaction, and we can see their transaction ids by calling the txid_current() PostgreSQL function
When Bob deletes a post row, the column value is set to Bob’s transaction id
Under default Read Committed isolation level, until Bob manages to commit his transaction, Alice can still see the record that was deleted by ob
After Bob has committed, Alice can no longer see the deleted row
While in 2PL, Bob’s modification would block Alice read statement, in MVCC Alice is still allowed to see the previous version until Bob manages to commit his transaction.
The DELETE operation does not physically remove a record, it just marks it as ready for deletion, and the VACUUM process will collect it when this row is no longer in use by any current running transaction.
If the transaction id is greater than the value of a committed row, the transaction is not allowed to read this record version anymore.
If the transaction id is lower than the value, then it’s up to the isolation level to decide if a record should be visible or not. For READ COMMITTED, the currently executing statement timestamp becomes the lower boundary for row visibility. For REPEATABLE READ or SERIALIZABLE, all reads are relative to the start timestamp of the currently running transaction.
MVCC (Multi-Version Concurrency Control) – Updating a record
To understand how UPDATE works in MVCC, consider the following diagram:
Both Alice and Bob start a new transaction, and we can see their transaction ids by calling the txid_current() PostgreSQL function
When Bob updates a post record, we can see two operations happening: a DELETE and an INSERT.
The previous row version is marked as deleted by setting the column value to Bob’s transaction id, and a new row version is created which has the column value set to Bob’s transaction id
Under default Read Committed isolation level, until Bob manages to commit his transaction, Alice can still see the previous record version
After Bob has committed, Alice can now see the new row version that was updated by Bob
Online Workshops
If you enjoyed this article, I bet you are going to love my upcoming Online Workshops.
By allowing multiple versions of the same record, there is going to be less contention on reading/writing records since Readers will not block writers and Writers will not block Readers as well.
7 Comments on “How does MVCC (Multi-Version Concurrency Control) work”
Why wouldn’t Alice retrieve that record since it hasn’t been committed yet. Bob ca also roll back, which will undo the xMax field. For more details, check out my High-Performance Java Persistence book.
But update means inserting a row first. We should prevent updating xmax field, but also we should prevent inserting. How to lock a row, that doesn’t exist yet?
And I’ve just thought – may be a concurrent transaction is rollbacked when it tries to update (insert+delete) the same row, no?
Update means soft deleting the old version and inserting a new latest version.
We should prevent updating xmax field, but also we should prevent inserting.
I’m not sure where the latch is located internally in PostgreSQL. You should ask a PostgreSQL developer for more details about the actual implementation.
How to lock a row, that doesn’t exist yet?
A locked row always exists even if it’s hidden prior to commit.
And I’ve just thought – may be a concurrent transaction is rolled back when it tries to update (insert+delete) the same row, no?
No. A concurrent transaction will block until the first transaction either commits of rolls back.
So in MVCC the update operation blocks the row untill the transaction completes, but only for writers (whereas in 2PL both for writers and readers). Correct?
In MVCC, the write (e.g., UPDATE or DELETE) on a given row blocks other concurrent write statements affecting the same row until the transaction completes. For 2PL, reads will block a write, and a write will block both reads and writes.
“The only use case that can still generate contention is when two concurrent transactions try to modify the same record since, once modified, a row is always locked until the transaction that modified this record either commits or rolls back.”
But how is locking a row from concurrent writes implemented in MVCC? Is exclusive lock used as well (then for which row version), or what? What makes a concurrent transaction wait untill the end of the first one?
The lock is taken on the latest version. Waiting on a lock is the same as how the Java locks work. It’s probably a C construct.
Why wouldn’t Alice retrieve that record since it hasn’t been committed yet. Bob ca also roll back, which will undo the xMax field. For more details, check out my High-Performance Java Persistence book.
But update means inserting a row first. We should prevent updating xmax field, but also we should prevent inserting. How to lock a row, that doesn’t exist yet?
And I’ve just thought – may be a concurrent transaction is rollbacked when it tries to update (insert+delete) the same row, no?
Update means soft deleting the old version and inserting a new latest version.
I’m not sure where the latch is located internally in PostgreSQL. You should ask a PostgreSQL developer for more details about the actual implementation.
A locked row always exists even if it’s hidden prior to commit.
No. A concurrent transaction will block until the first transaction either commits of rolls back.
So in MVCC the update operation blocks the row untill the transaction completes, but only for writers (whereas in 2PL both for writers and readers). Correct?
In MVCC, the write (e.g., UPDATE or DELETE) on a given row blocks other concurrent write statements affecting the same row until the transaction completes. For 2PL, reads will block a write, and a write will block both reads and writes.
“The only use case that can still generate contention is when two concurrent transactions try to modify the same record since, once modified, a row is always locked until the transaction that modified this record either commits or rolls back.”
But how is locking a row from concurrent writes implemented in MVCC? Is exclusive lock used as well (then for which row version), or what? What makes a concurrent transaction wait untill the end of the first one?
The lock is taken on the latest version. Waiting on a lock is the same as how the Java locks work. It’s probably a C construct.