How to extract change data events from MySQL to Kafka using Debezium

Introduction

As previously explained, CDC (Change Data Capture) is one of the best ways to interconnect an OLTP database system with other systems like Data Warehouse, Caches, Spark or Hadoop.

Debezium is an open source project developed by Red Hat which aims to simplify this process by allowing you to extract changes from various database systems (e.g. MySQL, PostgreSQL, MongoDB) and push them to Apache Kafka.

In this article, we are going to see how you can extract events from MySQL binary logs using Debezium.

Continue reading “How to extract change data events from MySQL to Kafka using Debezium”

Advertisements

A beginner’s guide to CDC (Change Data Capture)

Introduction

In OLTP (Online Transaction Processing) systems, data is accessed and changed concurrently by multiple transactions and the database changes from one consistent state to another. An OLTP system always shows the latest state of our data, therefore facilitating the development of front-end applications which require near real time data consistency guarantees.

However, an OLTP system is no island, being just a small part of a larger system that encapsulates all data transformation needs required by a given enterprise. When integrating an OLTP system with a Cache, a Data Warehouse or an In-Memory Data Grid, we need an ETL process to collect the list of events that changed the OLTP system data over a given period of time.

In this article, we are going to see various methods used for capturing events and propagating them to other data processing systems.

Continue reading “A beginner’s guide to CDC (Change Data Capture)”