How does MVCC (Multi-Version Concurrency Control) work

Introduction

In Concurrency Control theory, there are two ways you can deal with conflicts:

  • You can avoid them, by employing a pessimistic locking mechanism (e.g. Read/Write locks, Two-Phase Locking)
  • You can allow conflicts to occur, but you need to detect them using an optimistic locking mechanism (e.g. logical clock, MVCC)

Because MVCC (Multi-Version Concurrency Control) is such a prevalent Concurrency Control technique (not only in relational database systems, in this article, I’m going to explain how it works.

Continue reading “How does MVCC (Multi-Version Concurrency Control) work”

Advertisements

A beginner’s guide to the Phantom Read anomaly, and how it differs between 2PL and MVCC

Introduction

Unlike SQL Server which, by default, relies on the 2PL (Two-Phase Locking) to implement the SQL standard isolation levels, Oracle, PostgreSQL, and MySQL InnoDB engine use MVCC (Multi-Version Concurrency Control).

However, providing a truly Serializable isolation level on top of MVCC is really difficult, and, in this post, I’ll demonstrate that it’s very difficult to prevent the Phantom Read anomaly without resorting to pessimistic locking.

Continue reading “A beginner’s guide to the Phantom Read anomaly, and how it differs between 2PL and MVCC”

A beginner’s guide to database locking and the lost update phenomena

Introduction

A database is highly concurrent system. There’s always a chance of update conflicts, like when two concurring transactions try to update the same record. If there would be only one database transaction at any time then all operations would be executed sequentially. The challenge comes when multiple transactions try to update the same database rows as we still have to ensure consistent data state transitions.

The SQL standard defines three consistency anomalies (phenomena):

  • Dirty reads, prevented by Read Committed, Repeatable Read and Serializable isolation levels
  • Non-repeatable reads, prevented by Repeatable Read and Serializable isolation levels
  • Phantom reads, prevented by the Serializable isolation level

A lesser-known phenomena is the lost updates anomaly and that’s what we are going to discuss in this current article.

Isolation levels

Most database systems use Read Committed as the default isolation level (MySQL using Repeatable Read instead). Choosing the isolation level is about finding the right balance of consistency and scalability for our current application requirements.

All the following examples are going to be run on PostgreSQL 9.3. Other database systems may behave differently according to their specific ACID implementation.

PostgreSQL uses both locks and MVCC (Multiversion Concurrency Control). In MVCC read and write locks are not conflicting, so reading doesn’t block writing and writing doesn’t block reading either.

Continue reading “A beginner’s guide to database locking and the lost update phenomena”