How to extract change data events from MySQL to Kafka using Debezium

Introduction

As previously explained, CDC (Change Data Capture) is one of the best ways to interconnect an OLTP database system with other systems like Data Warehouse, Caches, Spark or Hadoop.

Debezium is an open source project developed by Red Hat which aims to simplify this process by allowing you to extract changes from various database systems (e.g. MySQL, PostgreSQL, MongoDB) and push them to Apache Kafka.

In this article, we are going to see how you can extract events from MySQL binary logs using Debezium.

Continue reading “How to extract change data events from MySQL to Kafka using Debezium”

Advertisements

Book Review – High Performance MySQL (3rd edition)

Introduction

I either have time for reading or writing, but not both. Now that the first edition of High-Performance Java Persistence is done, I can catch up on the many books I planned on reading but didn’t have time to do so.

In this post, I’m going to review High Performance MySQL by Baron Schwartz, Peter Zaitsev, and Vadim Tkachenkoa, which is a must-read book for anyone working with MySQL.

Continue reading “Book Review – High Performance MySQL (3rd edition)”

How does a relational database work

Introduction

While doing my High-Performance Java Persistence training, I came to realize that it’s worth explaining how a relational database works, as otherwise, it is very difficult to grasp many transaction-related concepts like atomicity, durability, and checkpoints.

In this post, I’m going to give a high-level explanation of how a relational database works internally while also hinting some database-specific implementation details.

Continue reading “How does a relational database work”

How does MySQL result set streaming perform vs fetching the whole JDBC ResultSet at once

Introduction

I read a very interesting article by Krešimir Nesek regarding MySQL result set streaming when it comes to reducing memory usage.

Mark Paluch, from Spring Data, asked if we could turn the MySQL result set streaming by default whenever we are using Query#stream or Query#scroll.

That being said, the HHH-11260 issue was created, and I started working on it. During Peer Review, Steve Ebersole (Hibernate ORM team leader) and Sanne Grinovero (Hibernate Search Team Leader) expressed their concerns regarding making such a change.

First of all, the MySQL result set streaming has the following caveats:

  • the ResultSet must be traversed fully before issuing any other SQL statement
  • the statement is not closed if there are still records to be read in the associated ResultSet
  • the locks associated with the underlying SQL statement that is being streamed are released when the transaction is ended (either commit or rollback).

Continue reading “How does MySQL result set streaming perform vs fetching the whole JDBC ResultSet at once”

How to call MySQL stored procedures and functions with JPA and Hibernate

Introduction

This article is part of a series of posts related to calling various relational database systems stored procedures and database functions from Hibernate. The reason for writing this down is because there are many peculiarities related to the underlying JDBC driver support and not every JPA or Hibernate feature is supported on every relational database.

Continue reading “How to call MySQL stored procedures and functions with JPA and Hibernate”