A beginner’s guide to CDC (Change Data Capture)

Introduction

In OLTP (Online Transaction Processing) systems, data is accessed and changed concurrently by multiple transactions, and the database moves from one consistent state to another. An OLTP system always shows the latest state of our data, which facilitates the development of front-end applications that require near-real-time data consistency guarantees.

However, an OLTP system is no island. It is just a small part of a larger system that covers all the data transformation needs of a given enterprise. When integrating an OLTP system with a cache, a data warehouse, or an in-memory data grid, we need an ETL process to collect the list of events that changed the OLTP system data over a given period of time.

In this article, we are going to see various methods used for capturing events and propagating them to other data processing systems.
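To make the idea of "collecting the list of events that changed the data over a given period of time" more concrete, here is a minimal sketch of an application-level change log. It is only one possible approach and not necessarily one of the methods the article goes on to describe. The class and method names (`ChangeLogSketch`, `record`, `eventsBetween`) are hypothetical, and the in-memory list stands in for whatever audit table or log a real system would use.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory change log; a real system would persist these events.
class ChangeLogSketch {

    // One captured change: which table and row changed, how, and when.
    static class ChangeEvent {
        final String table;
        final long rowId;
        final String operation; // "INSERT", "UPDATE" or "DELETE"
        final Instant changedAt;

        ChangeEvent(String table, long rowId, String operation, Instant changedAt) {
            this.table = table;
            this.rowId = rowId;
            this.operation = operation;
            this.changedAt = changedAt;
        }
    }

    private final List<ChangeEvent> log = new ArrayList<>();

    // Called alongside every write, so each data change leaves an event behind.
    void record(String table, long rowId, String operation, Instant changedAt) {
        log.add(new ChangeEvent(table, rowId, operation, changedAt));
    }

    // Returns the events with from <= changedAt < to, in the order they were recorded.
    // An ETL job could call this periodically and forward the result downstream.
    List<ChangeEvent> eventsBetween(Instant from, Instant to) {
        List<ChangeEvent> result = new ArrayList<>();
        for (ChangeEvent event : log) {
            if (!event.changedAt.isBefore(from) && event.changedAt.isBefore(to)) {
                result.add(event);
            }
        }
        return result;
    }
}
```

Using a half-open time window means consecutive polling intervals never return the same event twice, as long as each interval starts where the previous one ended.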


How to install DB2 Express-C on Docker and set up the JDBC connection properties

Introduction

While developing Hibernate, I need to test the code base against a plethora of relational database systems: Oracle, SQL Server, PostgreSQL, MySQL, MariaDB, Informix, and of course DB2.

However, having all these databases installed on my system is far from ideal, so I rely a lot on Docker for this task. In this article, I’m going to show how easily you can install DB2 on Docker and set up the JDBC connection so that you can run Hibernate tests on DB2.
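As a rough illustration of what "setting up the JDBC connection" can look like on the Hibernate side, the sketch below builds a `Properties` object using the standard `hibernate.connection.*` keys, the IBM JDBC driver class, and the `jdbc:db2://host:port/database` URL format. The helper class and method names are hypothetical, and the host, port, database name, and credentials are placeholders you would replace with whatever your Docker container actually exposes.

```java
import java.util.Properties;

// Hypothetical helper; it only assembles settings and does not open a connection.
class Db2ConnectionPropertiesSketch {

    static Properties hibernateProperties(String host, int port, String database,
                                          String user, String password) {
        Properties properties = new Properties();
        properties.setProperty("hibernate.dialect", "org.hibernate.dialect.DB2Dialect");
        properties.setProperty("hibernate.connection.driver_class", "com.ibm.db2.jcc.DB2Driver");
        // DB2 JDBC URLs follow the jdbc:db2://host:port/database pattern.
        properties.setProperty("hibernate.connection.url",
                "jdbc:db2://" + host + ":" + port + "/" + database);
        properties.setProperty("hibernate.connection.username", user);
        properties.setProperty("hibernate.connection.password", password);
        return properties;
    }
}
```

For example, `hibernateProperties("localhost", 50000, "sample", "db2inst1", "secret")` assumes the container maps DB2's usual port 50000 to the host; all five argument values here are assumptions made for illustration.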


The best way to soft delete with Hibernate

Introduction

Each database application is unique. While deleting a record is the best approach most of the time, there are cases when the application requirements demand that database records never be physically deleted.

So who uses this technique?

For instance, StackOverflow does it for all Posts (e.g. Questions and Answers). The StackOverflow Posts table has a ClosedDate column which acts as a soft delete mechanism since it hides an Answer for all users who have less than 10k reputation.

If you’re using Oracle, you can take advantage of its Flashback capabilities, so you don’t need to change your application code to offer such functionality. Another option is to use the SQL Server Temporal Table feature.

However, not all relational database systems support Flashback queries or allow you to recover a certain record without having to restore from a database backup. In this case, Hibernate allows you to simplify the implementation of soft deletes, and this article is going to explain the best way to implement the logical deletion mechanism.
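Before looking at any Hibernate-specific mapping, it may help to see the logical deletion idea in isolation. The following is a plain Java sketch with no Hibernate dependency, so it is not the mapping the article recommends. The `SoftDeleteSketch` and `Post` classes, the `deleted` flag, and the in-memory map standing in for a table are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory repository illustrating logical (soft) deletion.
class SoftDeleteSketch {

    static class Post {
        final long id;
        final String title;
        boolean deleted; // logical deletion flag; the row itself is kept

        Post(long id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Stands in for the database table, keyed by primary key.
    private final Map<Long, Post> table = new LinkedHashMap<>();

    void save(Post post) {
        table.put(post.id, post);
    }

    // "Deleting" only flags the row, which is roughly what an UPDATE
    // replacing the DELETE statement would do in a real database.
    void delete(long id) {
        Post post = table.get(id);
        if (post != null) {
            post.deleted = true;
        }
    }

    // Regular reads filter out rows flagged as deleted.
    List<Post> findAll() {
        List<Post> visible = new ArrayList<>();
        for (Post post : table.values()) {
            if (!post.deleted) {
                visible.add(post);
            }
        }
        return visible;
    }

    // Counts every stored row, including the soft-deleted ones.
    int physicalRowCount() {
        return table.size();
    }
}
```

The two halves always go together: the delete path sets the flag, and every read path has to filter on it, otherwise "deleted" records leak back into query results.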


How does MVCC (Multi-Version Concurrency Control) work

Introduction

In Concurrency Control theory, there are two ways you can deal with conflicts:

  • You can avoid them, by employing a pessimistic locking mechanism (e.g. Read/Write locks, Two-Phase Locking)
  • You can allow conflicts to occur, but you need to detect them using an optimistic locking mechanism (e.g. logical clock, MVCC)

Because MVCC (Multi-Version Concurrency Control) is such a prevalent Concurrency Control technique (not only in relational database systems), in this article, I’m going to explain how it works.
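The difference between avoiding conflicts and detecting them can be sketched in a few lines of Java. This example uses a simple version counter (a logical clock) for the optimistic path. It is not an MVCC implementation, and the `ConflictHandlingSketch` class and its method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical single-record store contrasting the two conflict strategies.
class ConflictHandlingSketch {

    private final ReentrantLock lock = new ReentrantLock();
    private int balance;
    private long version; // logical clock, incremented on every successful write

    long version() {
        lock.lock();
        try {
            return version;
        } finally {
            lock.unlock();
        }
    }

    int balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    // Pessimistic: the lock is held for the whole read-modify-write,
    // so a conflicting writer has to wait and no conflict can occur.
    void pessimisticAdd(int amount) {
        lock.lock();
        try {
            balance += amount;
            version++;
        } finally {
            lock.unlock();
        }
    }

    // Optimistic: the caller read the data earlier without holding the lock
    // and passes the version it saw. The write is rejected if someone else
    // changed the record in the meantime. The lock here only makes the
    // check-and-write step atomic.
    boolean optimisticWrite(int newBalance, long expectedVersion) {
        lock.lock();
        try {
            if (version != expectedVersion) {
                return false; // conflict detected, the caller must re-read and retry
            }
            balance = newBalance;
            version++;
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

With the optimistic path, the cost of a conflict is paid only when one actually happens (a rejected write and a retry), whereas the pessimistic path pays the cost of locking on every access.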


How does a relational database work

Introduction

While doing my High-Performance Java Persistence training, I came to realize that it’s worth explaining how a relational database works, as otherwise, it is very difficult to grasp many transaction-related concepts like atomicity, durability, and checkpoints.

In this post, I’m going to give a high-level explanation of how a relational database works internally while also hinting at some database-specific implementation details.
