How to log the database transaction id using MDC (Mapped Diagnostic Context)

Introduction In this article, I’m going to show you how you can log the database transaction id associated with a given SQL statement using the MDC (Mapped Diagnostic Context) feature offered by many logging frameworks. If you are writing data access code, you have to use logging. As I explained in this article, using a JDBC proxy tool like datasource-proxy or p6spy is the best approach to logging SQL statements. With datasource-proxy, you can easily build a JUnit extension to automatically detect N+1 query issues. For more details, check out the db-util… Read More

Spring read-only transaction Hibernate optimization

Introduction In this article, I’m going to explain how the Spring read-only transaction Hibernate optimization works. After taking a look at what the Spring framework does when enabling the readOnly attribute on the @Transactional annotation, I realized that only the Hibernate flush mode is set to FlushType.MANUAL without propagating the read-only flag further to the Hibernate Session. So, in the true spirit of open-source software developer, I decided it’s time to make a change.

A beginner’s guide to Phantom Read anomaly

Introduction Database transactions are defined by the four properties known as ACID. The Isolation Level (I in ACID) allows you to trade off data integrity for performance. The weaker the isolation level, the more anomalies can occur, and in this article, we are going to describe the Phantom Read phenomenon.

A beginner’s guide to Non-Repeatable Read anomaly

Introduction Database transactions are defined by the four properties known as ACID. The Isolation Level (I in ACID) allows you to trade off data integrity for performance. The weaker the isolation level, the more anomalies can occur, and in this article, we are going to describe the Non-Repeatable Read phenomenon.

A beginner’s guide to Dirty Read anomaly

Introduction Database transactions are defined by the four properties known as ACID. The Isolation Level (I in ACID) allows you to trade off data integrity for performance. The weaker the isolation level, the more anomalies can occur, and in this article, we are going to describe the Dirty Read phenomenon.

How to extract change data events from MySQL to Kafka using Debezium

Introduction As previously explained, CDC (Change Data Capture) is one of the best ways to interconnect an OLTP database system with other systems like Data Warehouse, Caches, Spark or Hadoop. Debezium is an open source project developed by Red Hat which aims to simplify this process by allowing you to extract changes from various database systems (e.g. MySQL, PostgreSQL, MongoDB) and push them to Apache Kafka. In this article, we are going to see how you can extract events from MySQL binary logs using Debezium.

Why you should always use hibernate.connection.provider_disables_autocommit for resource-local JPA transactions

Introduction One of my major goals for Hibernate is to make sure we offer all sorts of performance improvements to reduce transaction response time and increase throughput. In Hibernate 5.2.10, we addressed the HHH-11542 Jira issue which allows you now to delay the database connection acquisition for resource-local transactions as well. In this article, I’m going to explain how Hibernate acquires connections and why you want it to delay this process as long as possible.

How does a relational database work

Introduction While doing my High-Performance Java Persistence training, I came to realize that it’s worth explaining how a relational database works, as otherwise, it is very difficult to grasp many transaction-related concepts like atomicity, durability, and checkpoints. In this post, I’m going to give a high-level explanation of how a relational database works internally while also hinting some database-specific implementation details.

How does database pessimistic locking interact with INSERT, UPDATE, and DELETE SQL statements

Introduction Relational database systems employ various Concurrency Control mechanisms to provide transactions with ACID property guarantees. While isolation levels are one way of choosing a given Concurrency Control mechanism, you can also use explicit locking whenever you want a finer-grained control to prevent data integrity issues. As previously explained, there are two types of explicit locking mechanisms: pessimistic (physical) and optimistic (logical). In this post, I’m going to explain how explicit pessimistic locking interacts with non-query DML statements (e.g. insert, update, and delete).

A beginner’s guide to the Write Skew anomaly, and how it differs between 2PL and MVCC

Introduction Unlike SQL Server which, by default, relies on the 2PL (Two-Phase Locking) to implement the SQL standard isolation levels, Oracle, PostgreSQL, and MySQL InnoDB engine use MVCC (Multi-Version Concurrency Control), so handling the Write Skew anomaly can differ from one database to the other. However, providing a truly Serializable isolation level on top of MVCC is really difficult, and, in this post, I’ll demonstrate that it’s very difficult to prevent the Write Skew anomaly without resorting to pessimistic locking.