How does a relational database work

Introduction While doing my High-Performance Java Persistence training, I came to realize that it’s worth explaining how a relational database works, as otherwise, it is very difficult to grasp many transaction-related concepts like atomicity, durability, and checkpoints. In this post, I’m going to give a high-level explanation of how a relational database works internally while also hinting some database-specific implementation details.

How does database pessimistic locking interact with INSERT, UPDATE, and DELETE SQL statements

Introduction Relational database systems employ various Concurrency Control mechanisms to provide transactions with ACID property guarantees. While isolation levels are one way of choosing a given Concurrency Control mechanism, you can also use explicit locking whenever you want a finer-grained control to prevent data integrity issues. As previously explained, there are two types of explicit locking mechanisms: pessimistic (physical) and optimistic (logical). In this post, I’m going to explain how explicit pessimistic locking interacts with non-query DML statements (e.g. insert, update, and delete).

A beginner’s guide to the Write Skew anomaly, and how it differs between 2PL and MVCC

Introduction Unlike SQL Server which, by default, relies on the 2PL (Two-Phase Locking) to implement the SQL standard isolation levels, Oracle, PostgreSQL, and MySQL InnoDB engine use MVCC (Multi-Version Concurrency Control), so handling the Write Skew anomaly can differ from one database to the other. However, providing a truly Serializable isolation level on top of MVCC is really difficult, and, in this post, I’ll demonstrate that it’s very difficult to prevent the Write Skew anomaly without resorting to pessimistic locking.

How to map table rows to columns using SQL PIVOT or CASE expressions

Introduction While reading the wonderful SQL Antipatterns book by Bill Karwin, which is a great reference for any developer that needs to interact with a Relational Database System, I found an example where the SQL PIVOT clause would work like a charm. In this post, I’m going to explain how to transpose a ResultSet using PIVOT so that rows become columns.

A beginner’s guide to SQL injection and how you should prevent it

Introduction One of my readers asked me to answer the following StackOverflow question. Right from the start, I noticed that the entity query was constructed by concatenating strings, which can expose your application to SQL Injection attacks. Unfortunately, I’ve been seeing this problem over and over throughout my career, and not all developers are aware of the serious risk associated with SQL Injection. For this reason, this post is going to demonstrate what damage can SQL Injection do to your system.

The best way to log JDBC statements

Introduction Whenever you are using a data access framework that auto-generates statements on your behalf, it is mandatory to log all statements to ensure their effectiveness, as well as to assert the possible performance implications.

A beginner’s guide to read and write skew phenomena

Introduction In my article about ACID and database transactions, I introduced the three phenomena described by the SQL standard: dirty read non-repeatable read phantom read While these are good to differentiate the four isolation levels (Read Uncommitted, Read Committed, Repeatable Read and Serializable), in reality, there are more phenomena to take into consideration as well. The 1995 paper (A Critique of ANSI SQL Isolation Levels) introduces the other phenomena that are omitted from the standard specification. In my High-Performance Java Persistence book, I decided to insist on the Transaction chapter as it… Read More

How to prevent lost updates in long conversations

Introduction All database statements are executed within the context of a physical transaction, even when we don’t explicitly declare transaction boundaries (BEGIN/COMMIT/ROLLBACK). Data integrity is enforced by the ACID properties of database transactions. Logical vs Physical transactions A logical transaction is an application-level unit of work that may span over multiple physical (database) transactions. Holding the database connection open throughout several user requests, including user think time, is definitely an anti-pattern. A database server can accommodate a limited number of physical connections, and often those are reused by using connection pooling. Holding… Read More

A beginner’s guide to database locking and the lost update phenomena

Introduction A database is highly concurrent system. There’s always a chance of update conflicts, like when two concurring transactions try to update the same record. If there would be only one database transaction at any time then all operations would be executed sequentially. The challenge comes when multiple transactions try to update the same database rows as we still have to ensure consistent data state transitions. The SQL standard defines three consistency anomalies (phenomena): Dirty reads, prevented by Read Committed, Repeatable Read and Serializable isolation levels Non-repeatable reads, prevented by Repeatable Read… Read More

A beginner’s guide to natural and surrogate database keys

Types of primary keys All database tables must have one primary key column. The primary key uniquely identifies a row within a table therefore it’s bound by the following constraints: UNIQUE NOT NULL IMMUTABLE When choosing a primary key we must take into consideration the following aspects: the primary key may be used for joining other tables through a foreign key relationship the primary key usually has an associated default index, so the more compact the data type the less space the index will take the primary key assignment must ensure uniqueness… Read More