How to extract change data events from MySQL to Kafka using Debezium

Introduction

As previously explained, CDC (Change Data Capture) is one of the best ways to connect an OLTP database system with other systems such as a Data Warehouse, caches, Spark, or Hadoop.

Debezium is an open-source project developed by Red Hat that aims to simplify this process by allowing you to extract changes from various database systems (e.g. MySQL, PostgreSQL, MongoDB) and push them to Apache Kafka.

In this article, we are going to see how you can extract events from MySQL binary logs using Debezium.



Book Review – High Performance MySQL (3rd edition)

Introduction

I have time for either reading or writing, but not both. Now that the first edition of High-Performance Java Persistence is done, I can catch up on the many books I planned to read but didn’t have time for.

In this post, I’m going to review High Performance MySQL by Baron Schwartz, Peter Zaitsev, and Vadim Tkachenko, which is a must-read book for anyone working with MySQL.


What’s new in JPA 2.2 – Stream the result of a Query execution

Introduction

Now that the JPA 2.2 Review Ballot has been approved, let’s start analyzing some of the new additions to the standard, which Hibernate has supported for quite some time already.

In this article, we are going to see how Hibernate supports streaming query results, as well as the caveats of using database cursors just to limit the amount of data that needs to be fetched.


A beginner’s guide to CDC (Change Data Capture)

Introduction

In OLTP (Online Transaction Processing) systems, data is accessed and changed concurrently by multiple transactions, and the database moves from one consistent state to another. An OLTP system always shows the latest state of our data, which facilitates the development of front-end applications that require near-real-time data consistency guarantees.

However, an OLTP system is no island. It is just a small part of a larger system that covers all the data transformation needs of a given enterprise. When integrating an OLTP system with a Cache, a Data Warehouse, or an In-Memory Data Grid, we need an ETL process to collect the list of events that changed the OLTP system data over a given period of time.

In this article, we are going to see various methods used for capturing events and propagating them to other data processing systems.


What’s new in JPA 2.2 – Java 8 Date and Time Types

Introduction

Now that the JPA 2.2 Review Ballot has been approved, let’s start analyzing some of the new additions to the standard, which Hibernate has supported for quite some time already.

In this article, we are going to see how the Java 8 Date/Time API is supported and which types you need to use depending on your business case requirements.
