How to find which statement failed in a JDBC Batch Update

Introduction Yesterday, my Danish friend, Flemming Harms, asked my a very interesting question related to when a JDBC batch update fails. Basically, considering we are going to group several DML statements in a batch, we need a way to tell which statement is the cause of the failure. This post is going to answer this question in more detail.

How to store date, time, and timestamps in UTC time zone with JDBC and Hibernate

Introduction Dealing with time zones is always challenging. As a rule of thumb, it’s much easier if all date/time values are stored in the UTC format, and, if necessary, dealing with time zone conversions in the UI only. This article is going to demonstrate how you can accomplish this task with JDBC and the awesome hibernate.jdbc.time_zone configuration property.

The best way to soft delete with Hibernate

Introduction Each database application is unique. While most of the time, deleting a record is the best approach, there are times when the application requirements demand that database records should never be physically deleted. So who uses this technique? For instance, StackOverflow does it for all Posts (e.g. Questions and Answers). The StackOverflow Posts table has a ClosedDate column which acts as a soft delete mechanism since it hides an Answer for all users who have less than 10k reputation. If you’re using Oracle, you can take advantage of its Flashback capabilities,… Read More

How does MVCC (Multi-Version Concurrency Control) work

Introduction In this article, I’m going to explain how the MVCC (Multi-Version Concurrency Control) mechanism works using PostgreSQL as a reference implementation. In Concurrency Control theory, there are two ways you can deal with conflicts: You can avoid them, by employing a pessimistic locking mechanism (e.g. Read/Write locks, Two-Phase Locking) You can allow conflicts to occur, but you need to detect them using an optimistic locking mechanism (e.g. logical clock, MVCC) Because MVCC (Multi-Version Concurrency Control) is such a prevalent Concurrency Control technique (not only in relational database systems, in this article,… Read More

How to encrypt and decrypt data with Hibernate

Introduction Today, one of my Twitter followers sent me the following StackOverflow question, and, while answering it, I realized that it definitely deserves a post of its own. In this post, I will explain how you can encrypt and decrypt data with Hibernate.

How to map the latest child of a parent entity using Hibernate JoinFormula

Introduction In this article, I’m going to explain how the Hibernate JoinFormula annotation works and how you can use it to map the latest child of a parent entity. As previously explained, the @JoinFormula is a very awesome annotation which allows you to customize the way you join entities beyond JPA @JoinColumn capabilities.

How does a relational database work

Introduction While doing my High-Performance Java Persistence training, I came to realize that it’s worth explaining how a relational database works, as otherwise, it is very difficult to grasp many transaction-related concepts like atomicity, durability, and checkpoints. In this post, I’m going to give a high-level explanation of how a relational database works internally while also hinting some database-specific implementation details.

How to run integration tests at warp speed using Docker and tmpfs

Introduction In this article, I’m going to show you how to run integration tests on PostgreSQL, MySQL, MariaDB 20 times faster using Docker and mapping the data folder on tmpfs. As previously explained, you can run database integration tests 20 times faster! The trick is to map the data directory in memory, and my previous article showed you what changes you need to do when you have a PostgreSQL or MySQL instance on your machine. In this post, I’m going to expand the original idea, and show you how you can achieve… Read More

How does database pessimistic locking interact with INSERT, UPDATE, and DELETE SQL statements

Introduction Relational database systems employ various Concurrency Control mechanisms to provide transactions with ACID property guarantees. While isolation levels are one way of choosing a given Concurrency Control mechanism, you can also use explicit locking whenever you want a finer-grained control to prevent data integrity issues. As previously explained, there are two types of explicit locking mechanisms: pessimistic (physical) and optimistic (logical). In this post, I’m going to explain how explicit pessimistic locking interacts with non-query DML statements (e.g. insert, update, and delete).

A beginner’s guide to the Write Skew anomaly, and how it differs between 2PL and MVCC

Introduction Unlike SQL Server which, by default, relies on the 2PL (Two-Phase Locking) to implement the SQL standard isolation levels, Oracle, PostgreSQL, and MySQL InnoDB engine use MVCC (Multi-Version Concurrency Control), so handling the Write Skew anomaly can differ from one database to the other. However, providing a truly Serializable isolation level on top of MVCC is really difficult, and, in this post, I’ll demonstrate that it’s very difficult to prevent the Write Skew anomaly without resorting to pessimistic locking.