Vlad Mihalcea

Hibernate and UUID identifiers

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

In this article, we are going to see how the UUID entity attributes are persisted when using JPA and Hibernate, for both assigned and auto-generated identifiers.

In my previous post I talked about UUID surrogate keys and the use cases when there are more appropriate than the more common auto-incrementing identifiers.

A UUID database type

There are several ways to represent a 128-bit UUID, and whenever in doubt I like to resort to Stack Exchange for an expert advice.

Because table identifiers are usually indexed, the more compact the database type the less space will the index require. From the most efficient to the least, here are our options:

  1. Some databases (PostgreSQL, SQL Server) offer a dedicated UUID storage type
  2. Otherwise we can store the bits as a byte array (e.g. RAW(16) in Oracle or the standard BINARY(16) type)
  3. Alternatively we can use 2 bigint (64-bit) columns, but a composite identifier is less efficient than a single column one
  4. We can store the hex value in a CHAR(36) column (e.g 32 hex values and 4 dashes), but this will take the most amount of space, hence it’s the least efficient alternative

Hibernate offers many identifier strategies to choose from and for UUID identifiers we have three options:

  • the assigned generator accompanied by the application logic UUID generation
  • the hexadecimal “uuid” string generator
  • the more flexible “uuid2” generator, allowing us to use java.lang.UUID, a 16 byte array or a hexadecimal String value

Read More

The hi/lo algorithm

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

In my previous post I talked about various database identifier strategies, you need to be aware of when designing the database model. We concluded that database sequences are very convenient because they are both flexible and efficient for most use cases.

But even with cached sequences, the application requires a database round-trip for every new the sequence value. If your applications demand a high number of insert operations per transaction, the sequence allocation may be optimized with a hi/lo algorithm.

The hi/lo algorithm

The hi/lo algorithms split the sequences domain into “hi” groups. A “hi” value is assigned synchronously. Every “hi” group is given a maximum number of “lo” entries, that can by assigned off-line without worrying about concurrent duplicate entries.

  1. The “hi” token is assigned by the database, and two concurrent calls are guaranteed to see unique consecutive values
  2. Once a “hi” token is retrieved we only need the “incrementSize” (the number of “lo” entries)
  3. The identifiers range is given by the following formula: [(hi -1) * incrementSize) + 1, (hi * incrementSize))

    and the “lo” value will be taken from:

    [0, incrementSize)

    starting from

    [(hi -1) * incrementSize) + 1)

  4. When all “lo” values are used, a new “hi” value is fetched and the cycle continues

Here you can have an example of two concurrent transactions, each one inserting multiple entities:

Read More

A beginner’s guide to natural and surrogate database keys

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Types of primary keys

All database tables must have one primary key column. The primary key uniquely identifies a row within a table therefore it’s bound by the following constraints:

  • UNIQUE
  • NOT NULL
  • IMMUTABLE

When choosing a primary key we must take into consideration the following aspects:

  • the primary key may be used for joining other tables through a foreign key relationship
  • the primary key usually has an associated default index, so the more compact the data type the less space the index will take
  • the primary key assignment must ensure uniqueness even in highly concurrent environments

When choosing a primary key generator strategy the options are:

  • natural keys, using a column combination that guarantees individual rows uniqueness
  • surrogate keys, that are generated independently of the current row data

Read More

What I learned at Topconf Bucharest

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

I’ve got back from Topconf Romania 2014, a developer to developer conference that emerged in Tallinn and for the first time this year it was also held in Bucharest.

As an architect, I assumed I’d be after technical speeches but I got really impressed by some management related presentations as well.

Lessons learned

A conference is a great learning experience. New technologies are being advertised and software paradigms get dissected and questioned by both the speakers and the attendees. There were some great ideas I came back with and I’ll share with you as follows:

It’s all about feedback

Feed-back is the tool of wise people. Every action has an associated reaction and the feedback is a reinforcing factor you should never ignore.

Nothing is perfect but feed-back can help you get better. Feed-back is probably the only suitable learning technique in the ever-changing environment of software development.

We inherently use feedback to build better relationships, to shape our personalities or understand a problem space whose function depends on way too many variables to think of any formula that can always give you the right result.

Read More

A beginner’s guide to Hibernate Types

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

The basic mapping concepts

When learning Hibernate, many like to jump to ParentChild associations without mastering the object relation mapping basics. It’s very important to understand the basic mapping rules for individual Entities before starting modelling Entity associations.

Hibernate types

A Hibernate type is a bridge between an SQL type and a Java primitive/Object type.

Read More

The minimal configuration for testing Hibernate

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

In my previous post I announced my intention of creating a personal Hibernate course. The first thing to start with is a minimal testing configuration.

You only need Hibernate

In a real production environment you won’t use Hibernate alone, as you may integrate it in a Java EE or Spring container. For testing Hibernate features you don’t need a full-blown framework stack, you can simply rely on Hibernate flexible configuration options.

Read More

The data knowledge stack

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Concurrency is not for the faint-hearted

We all know concurrency programming is difficult to get it right. That’s why threading tasks are followed by extensive design and code review sessions.

You never assign concurrent issues to inexperienced developers. The problem space is carefully analyzed, a design emerges and the solution is both documented and reviewed.

That’s how threading related tasks are usually addressed. You will naturally choose a higher level abstraction since you don’t want to get tangled up in low-level details. That’s why the java.util.concurrent is usually better (unless you build a High Frequency Trading system) than hand-made producer/consumer Java 1.2 style thread-safe structures.

Is database programming any different?

In a database system, the data is spread across various structures (SQL tables or NoSQL collections) and multiple users may select/insert/update/delete whatever they choose to. From a concurrency point of view, this is a very challenging task and it’s not just the database system developer’s problem. It’s our problem as well.

A typical RDBMS data layer requires you to master various technologies and your solution is only as strong as your team’s weakest spot.

Read More

The simple scalability equation

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Queuing Theory

The queueing theory allows us to predict queue lengths and waiting times, which is of paramount importance for capacity planning. For an architect, this is a very handy tool since queues are not just the appanage of messaging systems.

To avoid system over loading we use throttling. Whenever the number of incoming requests surpasses the available resources, we basically have two options:

  • discarding all overflowing traffic, therefore decreasing availability
  • queuing requests and wait (for as long as a time out threshold) for busy resources to become available

This behavior applies to thread-per-request web servers, batch processors or connection pools.

What’s in it for us?

Agner Krarup Erlang is the father of queuing theory and traffic engineering, being the first to postulate the mathematical models required to provisioning telecommunication networks.

Erlang formulas are modeled for M/M/k queue models, meaning the system is characterized by:

The Erlang formulas give us the servicing probability for:

This is not strictly applicable to thread pools, as requests are not fairly serviced and servicing times not always follow an exponential distribution.

A general purpose formula, applicable to any stable system (a system where the arrival rate is not greater than the departure rate) is Little’s Law.

Read More

Time to break free from the SQL-92 mindset

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Are you stuck in the 90s?

If you are only using the SQL-92 language reference, then you are overlooking so many great features like:

Some test data

In my previous article I imported some CSV Dropwizard metrics into PostgreSQL for further analysis.

Read More

How to import CSV data into PostgreSQL

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

Many database servers support CSV data transfers and this post will show one way you can import CSV files to PostgreSQL.

SQL aggregation rocks!

My previous post demonstrated FlexyPool metrics capabilities and all connection related statistics were exported in CSV format.

When it comes to aggregation tabular data SQL is at its best. If your database engine supports SQL:2003 windows functions you should definitely make use of this great feature.

Read More