A beginner’s guide to Dirty Read anomaly

Introduction Database transactions are defined by the four properties known as ACID. The Isolation Level (I in ACID) allows you to trade off data integrity for performance. The weaker the isolation level, the more anomalies can occur, and in this article, we are going to describe the Dirty Read phenomenon.

How to extract change data events from MySQL to Kafka using Debezium

Introduction As previously explained, CDC (Change Data Capture) is one of the best ways to interconnect an OLTP database system with other systems like Data Warehouse, Caches, Spark or Hadoop. Debezium is an open-source project developed by Red Hat which aims to simplify this process by allowing you to extract changes from various database systems (e.g. MySQL, PostgreSQL, MongoDB) and push them to Apache Kafka. In this article, we are going to see how you can extract events from MySQL binary logs using Debezium.

Why you should always use hibernate.connection.provider_disables_autocommit for resource-local JPA transactions

Introduction One of my major goals for Hibernate is to make sure we offer all sorts of performance improvements to reduce transaction response time and increase throughput. In Hibernate 5.2.10, we addressed the HHH-11542 Jira issue which allows you now to delay the database connection acquisition for resource-local transactions as well. In this article, I’m going to explain how Hibernate acquires connections and why you want it to delay this process as long as possible.

How does a relational database work

Introduction While doing my High-Performance Java Persistence training, I came to realize that it’s worth explaining how a relational database works, as otherwise, it is very difficult to grasp many transaction-related concepts like atomicity, durability, and checkpoints. In this post, I’m going to give a high-level explanation of how a relational database works internally while also hinting some database-specific implementation details.

How does database pessimistic locking interact with INSERT, UPDATE, and DELETE SQL statements

Introduction Relational database systems employ various Concurrency Control mechanisms to provide transactions with ACID property guarantees. While isolation levels are one way of choosing a given Concurrency Control mechanism, you can also use explicit locking whenever you want a finer-grained control to prevent data integrity issues. As previously explained, there are two types of explicit locking mechanisms: pessimistic (physical) and optimistic (logical). In this post, I’m going to explain how explicit pessimistic locking interacts with non-query DML statements (e.g. insert, update, and delete).

A beginner’s guide to the Write Skew anomaly, and how it differs between 2PL and MVCC

Introduction Unlike SQL Server which, by default, relies on the 2PL (Two-Phase Locking) to implement the SQL standard isolation levels, Oracle, PostgreSQL, and MySQL InnoDB engine use MVCC (Multi-Version Concurrency Control), so handling the Write Skew anomaly can differ from one database to the other. However, providing a truly Serializable isolation level on top of MVCC is really difficult, and, in this post, I’ll demonstrate that it’s very difficult to prevent the Write Skew anomaly without resorting to pessimistic locking.

MySQL metadata locking and database transaction ending

Introduction As previously explained, every SQL statement must be executed in the context of a database transaction. For modifying statements (e.g. INSERT, UPDATE, DELETE), row-level locks must be taken to ensure recoverability and avoid the data anomalies. Next, I’ll demonstrate what can happen when a database transaction is not properly ended.

How does aggressive connection release work in Hibernate

Hibernate connection providers Hibernate needs to operate both in Java EE and stand-alone environments, and the database connectivity configuration can be done either declaratively or programmatically. To accommodate JDBC Driver connections as well as RESOURCE_LOCAL and JTA DataSource configurations, Hibernate defines its own connection factory abstraction, represented by the org.hibernate.engine.jdbc.connections.spi.ConnectionProvider interface.

A beginner’s guide to Read and Write Skew phenomena

Introduction In my article about ACID and database transactions, I introduced the three phenomena described by the SQL standard: dirty read non-repeatable read phantom read While these are good to differentiate the four isolation levels (Read Uncommitted, Read Committed, Repeatable Read and Serializable), in reality, there are more phenomena to take into consideration as well. The 1995 paper (A Critique of ANSI SQL Isolation Levels) introduces the other phenomena that are omitted from the standard specification. In my High-Performance Java Persistence book, I decided to insist on the Transaction chapter as it… Read More

How does Hibernate TRANSACTIONAL CacheConcurrencyStrategy work

Introduction In my previous post, I introduced the READ_WRITE second-level cache concurrency mechanism. In this article, I am going to continue this topic with the TRANSACTIONAL strategy.