Vlad Mihalcea

A beginner’s guide to database locking and the lost update phenomena

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

A database is highly concurrent system. There’s always a chance of update conflicts, like when two concurring transactions try to update the same record. If there would be only one database transaction at any time then all operations would be executed sequentially. The challenge comes when multiple transactions try to update the same database rows as we still have to ensure consistent data state transitions.

The SQL standard defines three consistency anomalies (phenomena):

  • Dirty reads, prevented by Read Committed, Repeatable Read and Serializable isolation levels
  • Non-repeatable reads, prevented by Repeatable Read and Serializable isolation levels
  • Phantom reads, prevented by the Serializable isolation level

A lesser-known phenomenon is the lost updates anomaly and that’s what we are going to discuss in this current article.

Isolation levels

Most database systems use Read Committed as the default isolation level (MySQL using Repeatable Read instead). Choosing the isolation level is about finding the right balance of consistency and scalability for our current application requirements.

All the following examples are going to be run on PostgreSQL. Other database systems may behave differently according to their specific ACID implementation.

PostgreSQL uses both locks and MVCC (Multiversion Concurrency Control). In MVCC read and write locks are not conflicting, so readers don’t block writers and writers don’t block readers.

Read More

How does the bytecode enhancement dirty checking mechanism work in Hibernate 4.3

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

Now that you know the basics of Hibernate dirty checking, we can dig into enhanced dirty checking mechanisms. While the default graph-traversal algorithm might be sufficient for most use-cases, there might be times when you need an optimized dirty checking algorithm and instrumentation is much more convenient than building your own custom strategy.

Using Ant Hibernate Tools

Traditionally, The Hibernate Tools have been focused on Ant and Eclipse. Bytecode instrumentation has been possible since Hibernate 3, but it required an Ant task to run the CGLIB or Javassist bytecode enhancement routines.

Read More

From mostly interested to most interesting

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

No money can buy this feeling

Being appreciated for my work is what pushes me forward for contributing more. I am proud to be nominated as one of the most interesting developers.

Ever since I started this blog, helping others on Stack Overflow or contributing to Open Source Software many good things have happened.

Being mentioned on the same page with John Sonmez or Christoph Engelbert is more than flattering. With all optimism, I’ve never thought I’d get this nomination.

Thank you Jelastic, you made my day!

For more, you can read the full article.

Transactions and Concurrency Control eBook

How to customize Hibernate dirty checking mechanism

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

In my previous article I described the Hibernate automatic dirty checking mechanism. While you should always prefer it, there might be times when you want to add your own custom dirtiness detection strategy.

Custom dirty checking strategies

Hibernate offers the following customization mechanisms:

Read More

The anatomy of Hibernate dirty checking mechanism

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

The persistence context enqueues entity state transitions that get translated to database statements upon flushing. For managed entities, Hibernate can auto-detect incoming changes and schedule SQL UPDATES on our behalf. This mechanism is called automatic dirty checking.

The default dirty checking strategy

By default Hibernate checks all managed entity properties. Every time an entity is loaded, Hibernate makes an additional copy of all entity property values. At flush time, every managed entity property is matched against the loading-time snapshot value:

Default Flush Event Flow

So the number of individual dirty checks is given by the following formula:

N = \sum\limits_{k=1}^n p_{k}

where

n = The number of managed entities
p = The number of properties of a given entity

Even if only one property of a single entity has ever changed, Hibernate will still check all managed entities. For a large number of managed entities, the default dirty checking mechanism may have a significant CPU and memory footprint. Since the initial entity snapshot is held separately, the persistence context requires twice as much memory as all managed entities would normally occupy.

Read More

How does AUTO flush strategy work in JPA and Hibernate

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

The Hibernate AUTO flush mode behaves differently whether you are bootstrapping Hibernate via JPA or using the stand-alone mechanism.

When using JPA, the AUTO flush mode causes all queries (JPQ, Criteria API, and native SQL) to trigger a flush prior to the query execution. However, this is not the case when bootstrapping Hibernate using the native API.

Not all queries trigger a Session flush

Many would assume that Hibernate always flushes the Session before any executing query. While this might have been a more intuitive approach, and probably closer to the JPA’s AUTO FlushModeType, Hibernate tries to optimize that. If the currently executed query is not going to hit the pending SQL INSERT/UPDATE/DELETE statements then the flush is not strictly required.

As stated in the reference documentation, the AUTO flush strategy may sometimes synchronize the current persistence context prior to a query execution. It would have been more intuitive if the framework authors had chosen to name it FlushMode.SOMETIMES.

Read More

A beginner’s guide to flush strategies in JPA and Hibernate

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

In my previous post I introduced the entity state transitions Object-relational mapping paradigm.

All managed entity state transitions are translated to associated database statements when the current Persistence Context gets flushed. Hibernate’s flush behavior is not always as obvious as one might think.

Write-behind

Hibernate tries to defer the Persistence Context flushing up until the last possible moment. This strategy has been traditionally known as transactional write-behind.

The write-behind is more related to Hibernate flushing rather than any logical or physical transaction. During a transaction, the flush may occur multiple times.

The flushed changes are visible only for the current database transaction. Until the current transaction is committed, no change is visible by other concurrent transactions.

The persistence context, also known as the first level cache, acts as a buffer between the current entity state transitions and the database.

In caching theory, the write-behind synchronization requires that all changes happen against the cache, whose responsibility is to eventually synchronize with the backing store.

Read More

A beginner’s guide to entity state transitions with JPA and Hibernate

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

Hibernate shifts the developer mindset from SQL statements to entity state transitions. Once an entity is actively managed by Hibernate, all changes are going to be automatically propagated to the database.

Manipulating domain model entities (along with their associations) is much easier than writing and maintaining SQL statements. Without an ORM tool, adding a new column requires modifying all associated INSERT/UPDATE statements.

But Hibernate is no silver bullet either. Hibernate doesn’t free us from ever worrying about the actual executed SQL statements. Controlling Hibernate is not as straightforward as one might think and it’s mandatory to check all SQL statements Hibernate executes on our behalf.

Read More

Hibernate pooled and pooled-lo identifier generators

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

In this post, we’ll uncover a sequence identifier generator combining identifier assignment efficiency and interoperability with other external systems (concurrently accessing the underlying database system).

Traditionally there have been two sequence identifier strategies to choose from: sequence and seqhilo.

The sequence identifier, always hitting the database for every new value assignment. Even with database sequence preallocation, we have a significant database round-trip cost.

The seqhilo identifier, using the hilo algorithm. This generator calculates some identifier values in-memory, therefore reducing the database round-trip calls. The problem with this optimization technique is that the current database sequence value no longer reflects the current highest in-memory generated value.

The database sequence is used as a bucket number, making it difficult for other systems to interoperate with the database table in question. Other applications must know the inner-workings of the hilo identifier strategy to properly generate non-clashing identifiers.

Read More

A beginner’s guide to Hibernate enhanced identifier generators

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

JPA identifier generators

JPA defines the following identifier strategies:

Strategy Description
AUTO The persistence provider picks the most appropriate identifier strategy supported by the underlying database
IDENTITY Identifiers are assigned by a database IDENTITY column
SEQUENCE The persistence provider uses a database sequence for generating identifiers
TABLE The persistence provider uses a separate database table to emulate a sequence object

In my previous post I exampled the pros and cons of all these surrogate identifier strategies.

Read More