EAGER fetching is a code smell

Introduction

Hibernate fetching strategies can really make a difference between an application that barely crawls and a highly responsive one. In this post I’ll explain why you should prefer query based fetching instead of global fetch plans.

Fetching 101

Hibernate defines four association retrieving strategies:

Fetching Strategy Description
Join The association is OUTER JOINED in the original SELECT statement
Select An additional SELECT statement is used to retrieve the associated entity(entities)
Subselect An additional SELECT statement is used to retrieve the whole associated collection. This mode is meant for to-many associations
Batch An additional number of SELECT statements is used to retrieve the whole associated collection. Each additional SELECT will retrieve a fixed number of associated entities. This mode is meant for to-many associations

These fetching strategies might be applied in the following scenarios:

  • the association is always initialized along with its owner (e.g. EAGER FetchType)
  • the uninitialized association (e.g. LAZY FetchType) is navigated, therefore the association must be retrieved with a secondary SELECT

The Hibernate mappings fetching information forms the global fetch plan. At query time, we may override the global fetch plan, but only for LAZY associations. For this we can use the fetch HQL/JPQL/Criteria directive. EAGER associations cannot be overridden, therefore tying your application to the global fetch plan.

Hibernate 3 acknowledged that LAZY should be the default association fetching strategy:

By default, Hibernate3 uses lazy select fetching for collections and lazy proxy fetching for single-valued associations. These defaults make sense for most associations in the majority of applications.

This decision was taken after noticing many performance issues associated with Hibernate 2 default eager fetching. Unfortunately JPA has taken a different approach and decided that to-many associations be LAZY while to-one relationships be fetched eagerly.

Association type Default fetching policy
@OneTMany LAZY
@ManyToMany LAZY
@ManyToOne EAGER
@OneToOne EAGER

EAGER fetching inconsistencies

While it may be convenient to just mark associations as EAGER, delegating the fetching responsibility to Hibernate, it’s advisable to resort to query based fetch plans.

An EAGER association will always be fetched and the fetching strategy is not consistent across all querying techniques.

Next, I’m going to demonstrate how EAGER fetching behaves for all Hibernate querying variants. I will reuse the same entity model I’ve previously introduced in my fetching strategies article:

Product

The Product entity has the following associations:

@ManyToOne(fetch = FetchType.EAGER)
@JoinColumn(name = "company_id", nullable = false)
private Company company;

@OneToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL, mappedBy = "product", optional = false)
private WarehouseProductInfo warehouseProductInfo;

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "importer_id")
private Importer importer;

@OneToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL, mappedBy = "product", orphanRemoval = true)
@OrderBy("index")
private Set<Image> images = new LinkedHashSet<Image>();

The company association is marked as EAGER and Hibernate will always employ a fetching strategy to initialize it along with its owner entity.

Persistence Context loading

First we’ll load the entity using the Persistence Context API:

Product product = entityManager.find(Product.class, productId);

Which generates the following SQL SELECT statement:

Query:{[
select 
    product0_.id as id1_18_1_, 
    product0_.code as code2_18_1_, 
    product0_.company_id as company_6_18_1_, 
    product0_.importer_id as importer7_18_1_, 
    product0_.name as name3_18_1_, 
    product0_.quantity as quantity4_18_1_, 
    product0_.version as version5_18_1_, 
    company1_.id as id1_6_0_, 
    company1_.name as name2_6_0_ 
from Product product0_ 
inner join Company company1_ on product0_.company_id=company1_.id 
where product0_.id=?][1]

The EAGER company association was retrieved using an inner join. For M such associations the owner entity table is going to be joined M times.

Each extra join adds up to the overall query complexity and execution time. If we don’t even use all these associations, for every possible business scenario, then we’ve just paid the extra performance penalty for nothing in return.

Fetching using JPQL and Criteria

Product product = entityManager.createQuery(
	"select p " +
			"from Product p " +
			"where p.id = :productId", Product.class)
	.setParameter("productId", productId)
	.getSingleResult();

or with

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Product> cq = cb.createQuery(Product.class);
Root<Product> productRoot = cq.from(Product.class);
cq.where(cb.equal(productRoot.get("id"), productId));
Product product = entityManager.createQuery(cq).getSingleResult();

Generates the following SQL SELECT statements:

Query:{[
select 
    product0_.id as id1_18_, 
    product0_.code as code2_18_, 
    product0_.company_id as company_6_18_, 
    product0_.importer_id as importer7_18_, 
    product0_.name as name3_18_, 
    product0_.quantity as quantity4_18_, 
    product0_.version as version5_18_ 
from Product product0_ 
where product0_.id=?][1]} 

Query:{[
select 
    company0_.id as id1_6_0_, 
    company0_.name as name2_6_0_ 
from Company company0_ 
where company0_.id=?][1]}

Both JPQL and Criteria queries default to select fetching, therefore issuing a secondary select for each individual EAGER association. The larger the associations number, the more additional individual SELECTS, the more it will affect our application performance.

Hibernate Criteria API

While JPA 2.0 added support for Criteria queries, Hibernate has long been offering a specific dynamic query implementation.

If the EntityManager implementation delegates method calls the the legacy Session API, the JPA Criteria implementation was written from scratch. That’s the reason why Hibernate and JPA Criteria API behave differently for similar querying scenarios.

The previous example Hibernate Criteria equivalent looks like this:

Product product = (Product) session.createCriteria(Product.class)
	.add(Restrictions.eq("id", productId))
	.uniqueResult();

And the associated SQL SELECT is:

Query:{[
select 
    this_.id as id1_3_1_, 
    this_.code as code2_3_1_, 
    this_.company_id as company_6_3_1_, 
    this_.importer_id as importer7_3_1_, 
    this_.name as name3_3_1_, 
    this_.quantity as quantity4_3_1_, 
    this_.version as version5_3_1_, 
    hibernatea2_.id as id1_0_0_, 
    hibernatea2_.name as name2_0_0_ 
from Product this_ 
inner join Company hibernatea2_ on this_.company_id=hibernatea2_.id 
where this_.id=?][1]}

This query uses the join fetch strategy as opposed to select fetching, employed by JPQL/HQL and Criteria API.

Hibernate Criteria and to-many EAGER collections

Let’s see what happens when the image collection fetching strategy is set to EAGER:

@OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL, mappedBy = "product", orphanRemoval = true)
@OrderBy("index")
private Set<Image> images = new LinkedHashSet<Image>();

The following SQL is going to be generated:

Query:{[
select 
    this_.id as id1_3_2_, 
    this_.code as code2_3_2_, 
    this_.company_id as company_6_3_2_, 
    this_.importer_id as importer7_3_2_, 
    this_.name as name3_3_2_, 
    this_.quantity as quantity4_3_2_, 
    this_.version as version5_3_2_, 
    hibernatea2_.id as id1_0_0_, 
    hibernatea2_.name as name2_0_0_, 
    images3_.product_id as product_4_3_4_, 
    images3_.id as id1_1_4_, 
    images3_.id as id1_1_1_, 
    images3_.index as index2_1_1_, 
    images3_.name as name3_1_1_, 
    images3_.product_id as product_4_1_1_ 
from Product this_ 
inner join Company hibernatea2_ on this_.company_id=hibernatea2_.id 
left outer join Image images3_ on this_.id=images3_.product_id 
where this_.id=? 
order by images3_.index][1]}

Hibernate Criteria doesn’t automatically groups the parent entities list. Because of the one-to-many children table JOIN, for each child entity we are going to get a new parent entity object reference (all pointing to the same object in our current Persistence Context):

product.setName("TV");
product.setCompany(company);

Image frontImage = new Image();
frontImage.setName("front image");
frontImage.setIndex(0);

Image sideImage = new Image();
sideImage.setName("side image");
sideImage.setIndex(1);

product.addImage(frontImage);
product.addImage(sideImage);

List products = session.createCriteria(Product.class)
	.add(Restrictions.eq("id", productId))
	.list();
assertEquals(2, products.size());
assertSame(products.get(0), products.get(1));

Because we have two image entities, we will get two Product entity references, both pointing to the same first level cache entry.

To fix it we need to instruct Hibernate Criteria to use distinct root entities:

List products = session.createCriteria(Product.class)
	.add(Restrictions.eq("id", productId))
	.setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY)
	.list();
assertEquals(1, products.size());

Conclusion

The EAGER fetching strategy is a code smell. Most often it’s used for simplicity sake without considering the long-term performance penalties. The fetching strategy should never be the entity mapping responsibility. Each business use case has different entity load requirements and therefore the fetching strategy should be delegated to each individual query.

The global fetch plan should only define LAZY associations, which are fetched on a per query basis. Combined with the always check generated queries strategy, the query based fetch plans can improve application performance and reduce maintaining costs.

Code available for Hibernate and JPA.

If you have enjoyed reading my article and you’re looking forward to getting instant email notifications of my latest posts, you just need to follow my blog.

The downside of version-less optimistic locking

Introduction

In my previous post I demonstrated how you can scale optimistic locking through write-concerns splitting.

Version-less optimistic locking is one lesser-known Hibernate feature. In this post I’ll explain both the good and the bad parts of this approach.

Version-less optimistic locking

Optimistic locking is commonly associated with a logical or physical clocking sequence, for both performance and consistency reasons. The clocking sequence points to an absolute entity state version for all entity state transitions.

To support legacy database schema optimistic locking, Hibernate added a version-less concurrency control mechanism. To enable this feature you have to configure your entities with the @OptimisticLocking annotation that takes the following parameters:

Optimistic Locking Type Description
ALL All entity properties are going to be used to verify the entity version
DIRTY Only current dirty properties are going to be used to verify the entity version
NONE Disables optimistic locking
VERSION Surrogate version column optimistic locking

For version-less optimistic locking, you need to choose ALL or DIRTY.

Use case

We are going to rerun the Product update use case I covered in my previous optimistic locking scaling article.

The Product entity looks like this:

OptimisticLockingOneProductEntityNoVersion

First thing to notice is the absence of a surrogate version column. For concurrency control, we’ll use DIRTY properties optimistic locking:

@Entity(name = "product")
@Table(name = "product")
@OptimisticLocking(type = OptimisticLockType.DIRTY)
@DynamicUpdate
public class Product {
//code omitted for brevity
}

By default, Hibernate includes all table columns in every entity update, therefore reusing cached prepared statements. For dirty properties optimistic locking, the changed columns are included in the update WHERE clause and that’s the reason for using the @DynamicUpdate annotation.

This entity is going to be changed by three concurrent users (e.g. Alice, Bob and Vlad), each one updating a distinct entity properties subset, as you can see in the following The following sequence diagram:

OptimisticLockingOneRootEntityNoVersion

The SQL DML statement sequence goes like this:

#create tables
Query:{[create table product (id bigint not null, description varchar(255) not null, likes integer not null, name varchar(255) not null, price numeric(19,2) not null, quantity bigint not null, primary key (id))][]} 
Query:{[alter table product add constraint UK_jmivyxk9rmgysrmsqw15lqr5b  unique (name)][]} 

#insert product
Query:{[insert into product (description, likes, name, price, quantity, id) values (?, ?, ?, ?, ?, ?)][Plasma TV,0,TV,199.99,7,1]} 

#Alice selects the product
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.likes as likes3_0_0_, optimistic0_.name as name4_0_0_, optimistic0_.price as price5_0_0_, optimistic0_.quantity as quantity6_0_0_ from product optimistic0_ where optimistic0_.id=?][1]} 
#Bob selects the product
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.likes as likes3_0_0_, optimistic0_.name as name4_0_0_, optimistic0_.price as price5_0_0_, optimistic0_.quantity as quantity6_0_0_ from product optimistic0_ where optimistic0_.id=?][1]} 
#Vlad selects the product
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.likes as likes3_0_0_, optimistic0_.name as name4_0_0_, optimistic0_.price as price5_0_0_, optimistic0_.quantity as quantity6_0_0_ from product optimistic0_ where optimistic0_.id=?][1]} 

#Alice updates the product
Query:{[update product set quantity=? where id=? and quantity=?][6,1,7]} 

#Bob updates the product
Query:{[update product set likes=? where id=? and likes=?][1,1,0]} 

#Vlad updates the product
Query:{[update product set description=? where id=? and description=?][Plasma HDTV,1,Plasma TV]} 

Each UPDATE sets the latest changes and expects the current database snapshot to be exactly as it was at entity load time. As simple and straightforward as it may look, the version-less optimistic locking strategy suffers from a very inconvenient shortcoming.

The detached entities anomaly

The version-less optimistic locking is feasible as long as you don’t close the Persistence Context. All entity changes must happen inside an open Persistence Context, Hibernate translating entity state transitions into database DML statements.

Detached entities changes can be only persisted if the entities rebecome managed in a new Hibernate Session, and for this we have two options:

Both operations require a database SELECT to retrieve the latest database snapshot, so changes will be applied against the latest entity version. Unfortunately, this can also lead to lost updates, as we can see in the following sequence diagram:

OptimisticLockingOneRootEntityNoVersionLostUpdate

Once the original Session is gone, we have no way of including the original entity state in the UPDATE WHERE clause. So newer changes might be overwritten by older ones and this is exactly what we wanted to avoid in the very first place.

Let’s replicate this issue for both merging and reattaching.

Merging

The merge operation consists in loading and attaching a new entity object from the database and update it with the current given entity snapshot. Merging is supported by JPA too and it’s tolerant to already managed Persistence Context entity entries. If there’s an already managed entity then the select is not going to be issued, as Hibernate guarantees session-level repeatable reads.

#Alice inserts a Product and her Session is closed
Query:{[insert into Product (description, likes, name, price, quantity, id) values (?, ?, ?, ?, ?, ?)][Plasma TV,0,TV,199.99,7,1]} 

#Bob selects the Product and changes the price to 21.22
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.likes as likes3_0_0_, optimistic0_.name as name4_0_0_, optimistic0_.price as price5_0_0_, optimistic0_.quantity as quantity6_0_0_ from Product optimistic0_ where optimistic0_.id=?][1]}
OptimisticLockingVersionlessTest - Updating product price to 21.22
Query:{[update Product set price=? where id=? and price=?][21.22,1,199.99]} 

#Alice changes the Product price to 1 and tries to merge the detached Product entity
c.v.h.m.l.c.OptimisticLockingVersionlessTest - Merging product, price to be saved is 1
#A fresh copy is going to be fetched from the database
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.likes as likes3_0_0_, optimistic0_.name as name4_0_0_, optimistic0_.price as price5_0_0_, optimistic0_.quantity as quantity6_0_0_ from Product optimistic0_ where optimistic0_.id=?][1]} 
#Alice overwrites Bob therefore loosing an update
Query:{[update Product set price=? where id=? and price=?][1,1,21.22]} 

Reattaching

Reattaching is a Hibernate specific operation. As opposed to merging, the given detached entity must become managed in another Session. If there’s an already loaded entity, Hibernate will throw an exception. This operation also requires an SQL SELECT for loading the current database entity snapshot. The detached entity state will be copied on the freshly loaded entity snapshot and the dirty checking mechanism will trigger the actual DML update:

#Alice inserts a Product and her Session is closed
Query:{[insert into Product (description, likes, name, price, quantity, id) values (?, ?, ?, ?, ?, ?)][Plasma TV,0,TV,199.99,7,1]} 

#Bob selects the Product and changes the price to 21.22
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.likes as likes3_0_0_, optimistic0_.name as name4_0_0_, optimistic0_.price as price5_0_0_, optimistic0_.quantity as quantity6_0_0_ from Product optimistic0_ where optimistic0_.id=?][1]}
OptimisticLockingVersionlessTest - Updating product price to 21.22
Query:{[update Product set price=? where id=? and price=?][21.22,1,199.99]} 

#Alice changes the Product price to 1 and tries to merge the detached Product entity
c.v.h.m.l.c.OptimisticLockingVersionlessTest - Reattaching product, price to be saved is 10
#A fresh copy is going to be fetched from the database
Query:{[select optimistic_.id, optimistic_.description as descript2_0_, optimistic_.likes as likes3_0_, optimistic_.name as name4_0_, optimistic_.price as price5_0_, optimistic_.quantity as quantity6_0_ from Product optimistic_ where optimistic_.id=?][1]} 
#Alice overwrites Bob therefore loosing an update
Query:{[update Product set price=? where id=?][10,1]} 

Conclusion

The version-less optimistic locking is a viable alternative as long as you can stick to a non-detached entities policy. Combined with extended persistence contexts, this strategy can boost writing performance even for a legacy database schema.

Code available on GitHub.

If you have enjoyed reading my article and you’re looking forward to getting instant email notifications of my latest posts, you just need to follow my blog.

Spring request-level memoization

Introduction

Memoization is a method-level caching technique for speeding-up consecutive invocations.

This post will demonstrate how you can achieve request-level repeatable reads for any data source, using Spring AOP only.

Spring Caching

Spring offers a very useful caching abstracting, allowing you do decouple the application logic from the caching implementation details.

Spring Caching uses an application-level scope, so for a request-only memoization we need to take a DIY approach.

Request-level Caching

A request-level cache entry life-cycle is always bound to the current request scope. Such cache is very similar to Hibernate Persistence Context that offers session-level repeatable reads.

Repeatable reads are mandatory for preventing lost updates, even for NoSQL solutions.

Step-by-step implementation

First we are going to define a Memoizing marker annotation:

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface Memoize {
}

This annotation is going to explicitly mark all methods that need to be memoized.

To distinguish different method invocations we are going to encapsulate the method call info into the following object type:

public class InvocationContext {

    public static final String TEMPLATE = "%s.%s(%s)";

    private final Class targetClass;
    private final String targetMethod;
    private final Object[] args;

    public InvocationContext(Class targetClass, String targetMethod, Object[] args) {
        this.targetClass = targetClass;
        this.targetMethod = targetMethod;
        this.args = args;
    }

    public Class getTargetClass() {
        return targetClass;
    }

    public String getTargetMethod() {
        return targetMethod;
    }

    public Object[] getArgs() {
        return args;
    }

    @Override
    public boolean equals(Object that) {
        return EqualsBuilder.reflectionEquals(this, that);
    }

    @Override
    public int hashCode() {
        return HashCodeBuilder.reflectionHashCode(this);
    }

    @Override
    public String toString() {
        return String.format(TEMPLATE, targetClass.getName(), targetMethod, Arrays.toString(args));
    }
}

Few know about the awesomeness of Spring Request/Session bean scopes.

Because we require a request-level memoization scope, we can simplify our design with a Spring request scope that hides the actual HttpSession resolving logic:

@Component
@Scope(proxyMode = ScopedProxyMode.TARGET_CLASS, value = "request")
public class RequestScopeCache {

    public static final Object NONE = new Object();

    private final Map<InvocationContext, Object> cache = new HashMap<InvocationContext, Object>();

    public Object get(InvocationContext invocationContext) {
        return cache.containsKey(invocationContext) ? cache.get(invocationContext) : NONE;
    }

    public void put(InvocationContext methodInvocation, Object result) {
        cache.put(methodInvocation, result);
    }
}

Since a mere annotation means nothing without a runtime processing engine, we must therefore define a Spring Aspect implementing the actual memoization logic:

@Aspect
public class MemoizerAspect {

    @Autowired
    private RequestScopeCache requestScopeCache;

    @Around("@annotation(com.vladmihalcea.cache.Memoize)")
    public Object memoize(ProceedingJoinPoint pjp) throws Throwable {
        InvocationContext invocationContext = new InvocationContext(
                pjp.getSignature().getDeclaringType(),
                pjp.getSignature().getName(),
                pjp.getArgs()
        );
        Object result = requestScopeCache.get(invocationContext);
        if (RequestScopeCache.NONE == result) {
            result = pjp.proceed();
            LOGGER.info("Memoizing result {}, for method invocation: {}", result, invocationContext);
            requestScopeCache.put(invocationContext, result);
        } else {
            LOGGER.info("Using memoized result: {}, for method invocation: {}", result, invocationContext);
        }
        return result;
    }
}

Testing time

Let’s put all this to a test. For simplicity sake, we are going to emulate the request-level scope memoization requirements with a Fibonacci number calculator:

@Component
public class FibonacciServiceImpl implements FibonacciService {

    @Autowired
    private ApplicationContext applicationContext;

    private FibonacciService fibonacciService;

    @PostConstruct
    private void init() {
        fibonacciService = applicationContext.getBean(FibonacciService.class);
    }

    @Memoize
    public int compute(int i) {
        LOGGER.info("Calculate fibonacci for number {}", i);
        if (i == 0 || i == 1)
            return i;
        return fibonacciService.compute(i - 2) + fibonacciService.compute(i - 1);
    }
}

If we are to calculate the 10th Fibonnaci number, we’ll get the following result:

Calculate fibonacci for number 10
Calculate fibonacci for number 8
Calculate fibonacci for number 6
Calculate fibonacci for number 4
Calculate fibonacci for number 2
Calculate fibonacci for number 0
Memoizing result 0, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([0])
Calculate fibonacci for number 1
Memoizing result 1, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([1])
Memoizing result 1, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([2])
Calculate fibonacci for number 3
Using memoized result: 1, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([1])
Using memoized result: 1, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([2])
Memoizing result 2, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([3])
Memoizing result 3, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([4])
Calculate fibonacci for number 5
Using memoized result: 2, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([3])
Using memoized result: 3, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([4])
Memoizing result 5, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([5])
Memoizing result 8, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([6])
Calculate fibonacci for number 7
Using memoized result: 5, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([5])
Using memoized result: 8, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([6])
Memoizing result 13, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([7])
Memoizing result 21, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([8])
Calculate fibonacci for number 9
Using memoized result: 13, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([7])
Using memoized result: 21, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([8])
Memoizing result 34, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([9])
Memoizing result 55, for method invocation: com.vladmihalcea.cache.FibonacciService.compute([10])

Conclusion

Memoization is a cross-cutting concern and Spring AOP allows you to decouple the caching details from the actual application logic code.

Code available on GitHub.

If you have enjoyed reading my article and you’re looking forward to getting instant email notifications of my latest posts, you just need to follow my blog.

Java Performance Workshop with Peter Lawrey

Peter Lawrey at IT Days

I’ve just come back from a Java Performance Workshop held by Peter Lawrey at Cluj-Napoca IT Days.

Peter Lawrey is a well-known Java StackOverflow user and the creator of Java Chronicle open-source library.

Of Java and low latency

Little’s Law defines concurrency as:

Throughput = \dfrac{Served Requests}{Latency}

To increase throughput we can either:

  • increase server resources (scaling vertically or horizontally)
  • decrease latency (improve performance)

While enterprise applications are usually designed for scaling, trading systems focus on lowering latencies.

As surprising as it may sound, most trading systems, I’ve heard of, are actually written in Java. Java automatic memory management is a double-edged sword, since it trades simplicity for flexibility.

A low latency system worse nightmare is a stop-of-the-world process, like a garbage collector major collection. So in order to avoid such situations, you must go off-heap.

Off-heap processing

Java doesn’t directly offer off-heap memory management so you have to resort to unorthodox methods, like interacting with the sun.misc.Unsafe class.

As it’s name clearly suggests, you shouldn’t be using this class in the very first place. This class allows you to bypass many JVM safety mechanisms, so you can:

  • allocate off-heap memory
  • define classes without an actual ClassLoader
  • reassign memory data, even constants

This class is not meant to be used by most Java professionals, being mostly useful for spoon bending libraries such as:

Java Chronicle

For low latency trading systems, you cannot use any off-the-shelf messaging system. To reach 20 millions per seconds you need a high performance queuing solution.

Java Chronicle is a much better alternative to using memory-mapped files with the risky sun.misc.Unsafe API. In short, Java Chronicle allows a producer process to write data to a shared memory location, only to be consumed by some other process.

So, Java Chronicle is more like a memory-mapped file utility than an actual messaging system. So if you plan on connecting two separate applications running in the same box, then you should definitely give a try to this low latency library.

Enterprise messaging

While you should always strive for the least possible latency, an enterprise application requires a totally different queuing solution. An enterprise system is built of several applications that need to be inter-connected and these applications might reside on different nodes.

So an enterprise messaging system must be designed for networking in the very first place. Compared to trading systems latencies (microseconds), network latencies are much higher (milliseconds).

An enterprise queue must deal with both multiple producers as well as multiple consumers. Some systems use brokers, to acknowledge consumed messages (e.g. JMS), while other systems delegate message reading offsets to the consumer-side (Apache Kafka).

Apache Kafka is a publish-subscribe messaging solution based on a distributed commit log. So while Java Chronicle increases throughput by reducing latency, Kafka uses distributed scaling instead.

If you have enjoyed reading my article and you’re looking forward to getting instant email notifications of my latest posts, you just need to follow my blog.

A beginner’s guide to Java time zone handling

Basic time notions

Most web applications have to support different time-zones and properly handling time-zones is no way easy. To make matters worse, you have to make sure that timestamps are consistent across various programming languages (e.g. JavaScript on the front-end, Java in the middle-ware and MongoDB as the data repository). This post aims to explain the basic notions of absolute and relative time.

Epoch

An epoch is a an absolute time reference. Most programming languages (e.g Java, JavaScript, Python) use the Unix epoch (Midnight 1 January 1970) when expressing a given timestamp as the number of milliseconds elapsed since a fixed point-in-time reference.

Relative numerical timestamp

The relative numerical timestamp is expressed as the number of milliseconds elapsed since epoch.

Time zone

The coordinated universal time (UTC) is the most common time standard. The UTC time zone (equivalent to GMT) represents the time reference all other time zones relate to (through a positive/negative offset).

UTC time zone is commonly refereed as Zulu time (Z) or UTC+0. Japan time zone is UTC+9 and Honolulu time zone is UTC-10. At the time of Unix epoch (1 January 1970 00:00 UTC time zone) it was 1 January 1970 09:00 in Tokyo and 31 December 1969 14:00 in Honolulu.

ISO 8601

ISO 8601 is the most widespread date/time representation standard and it uses the following date/time formats:

Time zone Notation
UTC 1970-01-01T00:00:00.000+00:00
UTC Zulu time 1970-01-01T00:00:00.000+Z
Tokio 1970-01-01T09:00:00.000+09:00
Honolulu 1969-12-31T14:00:00.000-10:00

Java time basics

java.util.Date

java.util.Date is definitely the most common time-related class. It represents a fixed point in time, expressed as the relative number of milliseconds elapsed since epoch. java.util.Date is time zone independent, except for the toString method which uses a the local time zone for generating a String representation.

java.util.Calendar

The java.util.Calendar is both a Date/Time factory as well as a time zone aware timing instance. It’s one of the least user-friendly Java API class to work with and we can demonstrate this in the following example:

@Test
public void testTimeZonesWithCalendar() throws ParseException {
	assertEquals(0L, newCalendarInstanceMillis("GMT").getTimeInMillis());
	assertEquals(TimeUnit.HOURS.toMillis(-9), newCalendarInstanceMillis("Japan").getTimeInMillis());
	assertEquals(TimeUnit.HOURS.toMillis(10), newCalendarInstanceMillis("Pacific/Honolulu").getTimeInMillis());
	Calendar epoch = newCalendarInstanceMillis("GMT");
	epoch.setTimeZone(TimeZone.getTimeZone("Japan"));
	assertEquals(TimeUnit.HOURS.toMillis(-9), epoch.getTimeInMillis());
}

private Calendar newCalendarInstance(String timeZoneId) {
	Calendar calendar = new GregorianCalendar();
	calendar.set(Calendar.YEAR, 1970);
	calendar.set(Calendar.MONTH, 0);
	calendar.set(Calendar.DAY_OF_MONTH, 1);
	calendar.set(Calendar.HOUR_OF_DAY, 0);
	calendar.set(Calendar.MINUTE, 0);
	calendar.set(Calendar.SECOND, 0);
	calendar.set(Calendar.MILLISECOND, 0);
	calendar.setTimeZone(TimeZone.getTimeZone(timeZoneId));
	return calendar;
}

At the time of Unix epoch (the UTC time zone), Tokyo time was nine hours ahead, while Honolulu was ten hours behind.

Changing a Calendar time zone preserves the actual time while shifting the zone offset. The relative timestamp changes along with the Calendar time zone offset.

Joda-Time and Java 8 Date Time API simply make java.util.Calandar obsolete so you no longer have to employ this quirky API.

org.joda.time.DateTime

Joda-Time aims to fix the legacy Date/Time API by offering:

With Joda-Time this is how our previous test case looks like:


@Test
public void testTimeZonesWithDateTime() throws ParseException {
	assertEquals(0L, newDateTimeMillis("GMT").toDate().getTime());
	assertEquals(TimeUnit.HOURS.toMillis(-9), newDateTimeMillis("Japan").toDate().getTime());
	assertEquals(TimeUnit.HOURS.toMillis(10), newDateTimeMillis("Pacific/Honolulu").toDate().getTime());
	DateTime epoch = newDateTimeMillis("GMT");
	assertEquals("1970-01-01T00:00:00.000Z", epoch.toString());
	epoch = epoch.toDateTime(DateTimeZone.forID("Japan"));
	assertEquals(0, epoch.toDate().getTime());
	assertEquals("1970-01-01T09:00:00.000+09:00", epoch.toString());
	MutableDateTime mutableDateTime = epoch.toMutableDateTime();
	mutableDateTime.setChronology(ISOChronology.getInstance().withZone(DateTimeZone.forID("Japan")));
	assertEquals("1970-01-01T09:00:00.000+09:00", epoch.toString());
}


private DateTime newDateTimeMillis(String timeZoneId) {
	return new DateTime(DateTimeZone.forID(timeZoneId))
			.withYear(1970)
			.withMonthOfYear(1)
			.withDayOfMonth(1)
			.withTimeAtStartOfDay();
}

The DateTime fluent API is much easier to use than java.util.Calendar#set. DateTime is immutable but we can easily switch to a MutableDateTime if it’s appropriate for our current use case.

Compared to our Calendar test case, when changing the time zone the relative timestamp doesn’t change a bit, therefore remaining the same original point in time.

It’s only the human time perception that changes (1970-01-01T00:00:00.000Z and 1970-01-01T09:00:00.000+09:00 pointing to the very same absolute time).

Relative vs Absolute time instances

When supporting time zones, you basically have two main alternatives: a relative timestamp and an absolute time info.

Relative timestamp

The numeric timestamp representation (the numbers of milliseconds since epoch) is a relative info. This value is given against the UTC epoch but you still need a time zone to properly represent the actual time on a particular region.

Being a long value, it’s the most compact time representation and it’s ideal when exchanging huge amounts of data.

If you don’t know the original event time zone, you risk of displaying a timestamp against the current local time zone and this is not always desirable.

Absolute timestamp

The absolute timestamp contains both the relative time as well as the time zone info. It’s quite common to express timestamps in their ISO 8601 string representation.

Compared to the numerical form (a 64 bit long) the string representation is less compact and it might take up to 25 characters (200 bits in UTF-8 encoding).

The ISO 8601 is quite common in XML files because the XML schema uses a lexical format inspired by the ISO 8601 standard.

An absolute time representation is much more convenient when we want to reconstruct the time instance against the original time zone. An e-mail client might want to display the email creation date using the sender’s time zone, and this can only be achieved using absolute timestamps.

Puzzles

The following exercise aims to demonstrate how difficult is to properly handle an ISO 8601 compliant date/time structure using the ancient java.text.DateFormat utilities.

java.text.SimpleDateFormat

First we are going to test the java.text.SimpleDateFormat parsing capabilities using the following test logic:

/**
 * DateFormat parsing utility
 * @param pattern date/time pattern
 * @param dateTimeString date/time string value
 * @param expectedNumericTimestamp expected millis since epoch 
 */
private void dateFormatParse(String pattern, String dateTimeString, long expectedNumericTimestamp) {
	try {
		Date utcDate = new SimpleDateFormat(pattern).parse(dateTimeString);
		if(expectedNumericTimestamp != utcDate.getTime()) {
			LOGGER.warn("Pattern: {}, date: {} actual epoch {} while expected epoch: {}", new Object[]{pattern, dateTimeString, utcDate.getTime(), expectedNumericTimestamp});
		}
	} catch (ParseException e) {
		LOGGER.warn("Pattern: {}, date: {} threw {}", new Object[]{pattern, dateTimeString, e.getClass().getSimpleName()});
	}
}

Use case 1

Let’s see how various ISO 8601 patterns behave against this first parser:

dateFormatParse("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'", "1970-01-01T00:00:00.200Z", 200L);

Yielding the following outcome:

Pattern: yyyy-MM-dd'T'HH:mm:ss.SSS'Z', date: 1970-01-01T00:00:00.200Z actual epoch -7199800 while expected epoch: 200

This pattern is not ISO 8601 compliant. The single quote character is an escape sequence so the final ‘Z’ symbol is not treated as a time directive (e.g. Zulu time). After parsing, we’ll simply get a local time zone Date reference.

This test was run using my current system default Europe/Athens time zone, which as of writing this post, it’s two hours ahead of UTC.

Use case 2

According to java.util.SimpleDateFormat documentation the following pattern: yyyy-MM-dd’T’HH:mm:ss.SSSZ should match an ISO 8601 date/time string value:

dateFormatParse("yyyy-MM-dd'T'HH:mm:ss.SSSZ", "1970-01-01T00:00:00.200Z", 200L);

But instead we got the following exception:

Pattern: yyyy-MM-dd'T'HH:mm:ss.SSSZ, date: 1970-01-01T00:00:00.200Z threw ParseException

So this pattern doesn’t seem to parse the Zulu time UTC string values.

Use case 3

The following patterns works just fine for explicit offsets:

dateFormatParse("yyyy-MM-dd'T'HH:mm:ss.SSSZ", "1970-01-01T00:00:00.200+0000", 200L);

Use case 4

This pattern is also compatible with other time zone offsets:

dateFormatParse("yyyy-MM-dd'T'HH:mm:ss.SSSZ", "1970-01-01T00:00:00.200+0100", 200L - 1000 * 60 * 60);

Use case 5

To match the Zulu time notation we need to use the following pattern:

dateFormatParse("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", "1970-01-01T00:00:00.200Z", 200L);

Use case 6

Unfortunately, this last pattern is not compatible with explicit time zone offsets:

dateFormatParse("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", "1970-01-01T00:00:00.200+0000", 200L);

Ending-up with the following exception:

Pattern: yyyy-MM-dd'T'HH:mm:ss.SSSXXX, date: 1970-01-01T00:00:00.200+0000 threw ParseException

org.joda.time.DateTime

As opposed to java.text.SimpleDateFormat, Joda-Time is compatible with any ISO 8601 pattern. The following test case is going to be used for the upcoming test cases:

/**
 * Joda-Time parsing utility
 * @param dateTimeString date/time string value
 * @param expectedNumericTimestamp expected millis since epoch
 */
private void jodaTimeParse(String dateTimeString, long expectedNumericTimestamp) {
	Date utcDate = DateTime.parse(dateTimeString).toDate();
	if(expectedNumericTimestamp != utcDate.getTime()) {
		LOGGER.warn("date: {} actual epoch {} while expected epoch: {}", new Object[]{dateTimeString, utcDate.getTime(), expectedNumericTimestamp});
	}
}

Joda-Time is compatible with all standard ISO 8601 date/time formats:

jodaTimeParse("1970-01-01T00:00:00.200Z", 200L);
jodaTimeParse("1970-01-01T00:00:00.200+0000", 200L);
jodaTimeParse("1970-01-01T00:00:00.200+0100", 200L - 1000 * 60 * 60);

Conclusion

As you can see, the ancient Java Date/Time utilities are not easy to work with. Joda-Time is a much better alternative, offering better time handling features.

If you happen to work with Java 8, it’s worth switching to the Java 8 Date/Time API, being designed from scratch but very much inspired by Joda-Time.

Code available on GitHub.

If you have enjoyed reading my article and you’re looking forward to getting instant email notifications of my latest posts, you just need to follow my blog.

An entity modelling strategy for scaling optimistic locking

Introduction

Application-level repeatable reads are suitable for preventing lost updates in web conversations. Enabling entity-level optimistic locking is fairly easy. You just have to mark one logical-clock property (usually an integer counter) with the JPA @Version annotation and Hibernate takes care of the rest.

The catch

Optimistic locking discards all incoming changes that are relative to an older entity version. But everything has a cost and optimistic locking makes no difference.

The optimistic concurrency control mechanism takes an all-or-nothing approach even for non-overlapping changes. If two concurrent transactions are changing distinct entity property subsets then there’s no risk of losing updates.

Two concurrent updates, starting from the same entity version are always going to collide. It’s only the first update that’s going to succeed, the second one failing with an optimistic locking exception. This strict policy acts as if all changes are overlapping. For highly concurrent write scenarios, this single-version check strategy can lead to a large number of roll-backed updates.

Time for testing

Let’s say we have the following Product entity:

OptimisticLockingOneProductEntityOneVersion

This entity is updated by three users (e.g. Alice, Bob and Vlad), each one updating a distinct property subset. The following diagram depicts their actions:

OptimisticLockingOneRootEntityOneVersion

The SQL DML statement sequence goes like this:

#create tables
Query:{[create table product (id bigint not null, description varchar(255) not null, likes integer not null, name varchar(255) not null, price numeric(19,2) not null, quantity bigint not null, version integer not null, primary key (id))][]} 
Query:{[alter table product add constraint UK_jmivyxk9rmgysrmsqw15lqr5b  unique (name)][]} 

#insert product
Query:{[insert into product (description, likes, name, price, quantity, version, id) values (?, ?, ?, ?, ?, ?, ?)][Plasma TV,0,TV,199.99,7,0,1]} 

#Alice selects the product
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.likes as likes3_0_0_, optimistic0_.name as name4_0_0_, optimistic0_.price as price5_0_0_, optimistic0_.quantity as quantity6_0_0_, optimistic0_.version as version7_0_0_ from product optimistic0_ where optimistic0_.id=?][1]} 
#Bob selects the product
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.likes as likes3_0_0_, optimistic0_.name as name4_0_0_, optimistic0_.price as price5_0_0_, optimistic0_.quantity as quantity6_0_0_, optimistic0_.version as version7_0_0_ from product optimistic0_ where optimistic0_.id=?][1]} 
#Vlad selects the product
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.likes as likes3_0_0_, optimistic0_.name as name4_0_0_, optimistic0_.price as price5_0_0_, optimistic0_.quantity as quantity6_0_0_, optimistic0_.version as version7_0_0_ from product optimistic0_ where optimistic0_.id=?][1]} 

#Alice updates the product
Query:{[update product set description=?, likes=?, name=?, price=?, quantity=?, version=? where id=? and version=?][Plasma TV,0,TV,199.99,6,1,1,0]} 

#Bob updates the product
Query:{[update product set description=?, likes=?, name=?, price=?, quantity=?, version=? where id=? and version=?][Plasma TV,1,TV,199.99,7,1,1,0]} 
c.v.h.m.l.c.OptimisticLockingOneRootOneVersionTest - Bob: Optimistic locking failure
org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.vladmihalcea.hibernate.masterclass.laboratory.concurrency.OptimisticLockingOneRootOneVersionTest$Product#1]

#Vlad updates the product
Query:{[update product set description=?, likes=?, name=?, price=?, quantity=?, version=? where id=? and version=?][Plasma HDTV,0,TV,199.99,7,1,1,0]} 
c.v.h.m.l.c.OptimisticLockingOneRootOneVersionTest - Vlad: Optimistic locking failure
org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.vladmihalcea.hibernate.masterclass.laboratory.concurrency.OptimisticLockingOneRootOneVersionTest$Product#1]

Because there’s only one entity version, it’s just the first transaction that’s going to succeed. The second and the third updates are discarded since they reference an older entity version.

Divide et impera

If there are more than one writing patterns, we can divide the original entity into several sub-entities. Instead of only one optimistic locking counter, we now have one distinct counter per each sub-entity. In our example, the quantity can be moved to ProductStock and the likes to ProductLiking.

OptimisticLockingOneProductEntityMultipleVersions

Whenever we change the product quantity, it’s only the ProductStock version that’s going to be checked, so other competing quantity updates are prevented. But now, we can concurrently update both the main entity (e.g. Product) and each individual sub-entity (e.g. ProductStock and ProductLiking):

OptimisticLockingOneRootEntityMultipleVersions

Running the previous test case yields the following output:

#create tables
Query:{[create table product (id bigint not null, description varchar(255) not null, name varchar(255) not null, price numeric(19,2) not null, version integer not null, primary key (id))][]}
Query:{[create table product_liking (likes integer not null, version integer not null, product_id bigint not null, primary key (product_id))][]} 
Query:{[create table product_stock (quantity bigint not null, version integer not null, product_id bigint not null, primary key (product_id))][]} #insert product
Query:{[alter table product add constraint UK_jmivyxk9rmgysrmsqw15lqr5b  unique (name)][]} Query:{[insert into product (description, name, price, version, id) values (?, ?, ?, ?, ?)][Plasma TV,TV,199.99,0,1]} 
Query:{[alter table product_liking add constraint FK_4oiot8iambqw53dwcldltqkco foreign key (product_id) references product][]} Query:{[insert into product_liking (likes, product_id) values (?, ?)][0,1]} 
Query:{[alter table product_stock add constraint FK_hj4kvinsv4h5gi8xi09xbdl46 foreign key (product_id) references product][]} Query:{[insert into product_stock (quantity, product_id) values (?, ?)][7,1]} 

#insert product
Query:{[insert into product (description, name, price, version, id) values (?, ?, ?, ?, ?)][Plasma TV,TV,199.99,0,1]}
Query:{[insert into product_liking (likes, version, product_id) values (?, ?, ?)][0,0,1]} 
Query:{[insert into product_stock (quantity, version, product_id) values (?, ?, ?)][7,0,1]} #Alice selects the product

#Alice selects the product
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.name as name3_0_0_, optimistic0_.price as price4_0_0_, optimistic0_.version as version5_0_0_ from product optimistic0_ where optimistic0_.id=?][1]} 
Query:{[select optimistic0_.product_id as product_3_1_0_, optimistic0_.likes as likes1_1_0_, optimistic0_.version as version2_1_0_ from product_liking optimistic0_ where optimistic0_.product_id=?][1]}
Query:{[select optimistic0_.product_id as product_3_2_0_, optimistic0_.quantity as quantity1_2_0_, optimistic0_.version as version2_2_0_ from product_stock optimistic0_ where optimistic0_.product_id=?][1]}

#Bob selects the product
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.name as name3_0_0_, optimistic0_.price as price4_0_0_, optimistic0_.version as version5_0_0_ from product optimistic0_ where optimistic0_.id=?][1]} 
Query:{[select optimistic0_.product_id as product_3_1_0_, optimistic0_.likes as likes1_1_0_, optimistic0_.version as version2_1_0_ from product_liking optimistic0_ where optimistic0_.product_id=?][1]}
Query:{[select optimistic0_.product_id as product_3_2_0_, optimistic0_.quantity as quantity1_2_0_, optimistic0_.version as version2_2_0_ from product_stock optimistic0_ where optimistic0_.product_id=?][1]}

#Vlad selects the product
Query:{[select optimistic0_.id as id1_0_0_, optimistic0_.description as descript2_0_0_, optimistic0_.name as name3_0_0_, optimistic0_.price as price4_0_0_, optimistic0_.version as version5_0_0_ from product optimistic0_ where optimistic0_.id=?][1]} 
Query:{[select optimistic0_.product_id as product_3_1_0_, optimistic0_.likes as likes1_1_0_, optimistic0_.version as version2_1_0_ from product_liking optimistic0_ where optimistic0_.product_id=?][1]}
Query:{[select optimistic0_.product_id as product_3_2_0_, optimistic0_.quantity as quantity1_2_0_, optimistic0_.version as version2_2_0_ from product_stock optimistic0_ where optimistic0_.product_id=?][1]}

#Alice updates the product
Query:{[update product_stock set quantity=?, version=? where product_id=? and version=?][6,1,1,0]} 

#Bob updates the product
Query:{[update product_liking set likes=?, version=? where product_id=? and version=?][1,1,1,0]} 

#Vlad updates the product
Query:{[update product set description=?, name=?, price=?, version=? where id=? and version=?][Plasma HDTV,TV,199.99,1,1,0]}

All three concurrent transactions are successful because we no longer have only one logical-clock version but three of them, according to three distinct write responsibilities.

Conclusion

When designing the persistence domain model, you have to take into consideration both the querying and writing responsibility patterns.

Breaking a larger entity into several sub-entities can help you scale updates, while reducing the chance of optimistic locking failures. If you wary of possible performance issues (due to entity state fragmentation) you should then know that Hibernate offers several optimization techniques for overcoming the scattered entity info side-effect.

You can always join all sub-entities in a single SQL query, in case you need all entity related data.

The second-level caching is also a good solution for fetching sub-entities without hitting the database. Because we split the root entity in several entities, the cache can be better utilized. A stock update is only going to invalidate the associated ProductStock cache entry, without interfering with Product and ProductLiking cache regions.

Code available on GitHub.

If you have enjoyed reading my article and you’re looking forward to getting instant email notifications of my latest posts, you just need to follow my blog.

Hibernate collections optimistic locking

Introduction

Hibernate provides an optimistic locking mechanism to prevent lost updates even for long-conversations. In conjunction with an entity storage, spanning over multiple user requests (extended persistence context or detached entities) Hibernate can guarantee application-level repeatable-reads.

The dirty checking mechanism detects entity state changes and increments the entity version. While basic property changes are always taken into consideration, Hibernate collections are more subtle in this regard.

Owned vs Inverse collections

In relational databases, two records are associated through a foreign key reference. In this relationship, the referenced record is the parent while the referencing row (the foreign key side) is the child. A non-null foreign key may only reference an existing parent record.

In the Object-oriented space this association can be represented in both directions. We can have a many-to-one reference from a child to parent and the parent can also have a one-to-many children collection.

Because both sides could potentially control the database foreign key state, we must ensure that only one side is the owner of this association. Only the owning side state changes are propagated to the database. The non-owning side has been traditionally referred as the inverse side.

Next I’ll describe the most common ways of modelling this association.

The unidirectional parent-owning-side-child association mapping

Only the parent side has a @OneToMany non-inverse children collection. The child entity doesn’t reference the parent entity at all.

@Entity(name = "post")
public class Post {
	...
	@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
	private List<Comment> comments = new ArrayList<Comment>();
	...
}	

The unidirectional parent-owning-side-child component association mapping mapping

The child side doesn’t always have to be an entity and we might model it as a component type instead. An Embeddable object (component type) may contain both basic types and association mappings but it can never contain an @Id. The Embeddable object is persisted/removed along with its owning entity.

The parent has an @ElementCollection children association. The child entity may only reference the parent through the non-queryable Hibernate specific @Parent annotation.

@Entity(name = "post")
public class Post {
	...
	@ElementCollection
	@JoinTable(name = "post_comments", joinColumns = @JoinColumn(name = "post_id"))
	@OrderColumn(name = "comment_index")
	private List<Comment> comments = new ArrayList<Comment>();
	...

	public void addComment(Comment comment) {
		comment.setPost(this);
		comments.add(comment);
	}
}	

@Embeddable
public class Comment {
	...
	@Parent
	private Post post;
	...
}

The bidirectional parent-owning-side-child association mapping

The parent is the owning side so it has a @OneToMany non-inverse (without a mappedBy directive) children collection. The child entity references the parent entity through a @ManyToOne association that’s neither insertable nor updatable:

@Entity(name = "post")
public class Post {
	...
	@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
	private List<Comment> comments = new ArrayList<Comment>();
	...

	public void addComment(Comment comment) {
		comment.setPost(this);
		comments.add(comment);
	}
}	

@Entity(name = "comment")
public class Comment {
	...
	@ManyToOne
	@JoinColumn(name = "post_id", insertable = false, updatable = false)
	private Post post;
	...
}

The bidirectional child-owning-side-parent association mapping

The child entity references the parent entity through a @ManyToOne association, and the parent has a mappedBy @OneToMany children collection. The parent side is the inverse side so only the @ManyToOne state changes are propagated to the database.

Even if there’s only one owning side, it’s always a good practice to keep both sides in sync by using the add/removeChild() methods.

@Entity(name = "post")
public class Post {
	...
	@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "post")
	private List<Comment> comments = new ArrayList<Comment>();
	...

	public void addComment(Comment comment) {
		comment.setPost(this);
		comments.add(comment);
	}
}	

@Entity(name = "comment")
public class Comment {
	...
	@ManyToOne
	private Post post;	
	...
}

The unidirectional child-owning-side-parent association mapping

The child entity references the parent through a @ManyToOne association. The parent doesn’t have a @OneToMany children collection so the child entity becomes the owning side. This association mapping resembles the relational data foreign key linkage.

@Entity(name = "comment")
public class Comment {
    ...
    @ManyToOne
    private Post post;	
    ...
}

Collection versioning

The 3.4.2 section of the JPA 2.1 specification defines optimistic locking as:

The version attribute is updated by the persistence provider runtime when the object is written to the database. All non-relationship fields and proper ties and all relationships owned by the entity are included in version checks[35].

[35] This includes owned relationships maintained in join tables

N.B. Only owning-side children collection can update the parent version.

Testing time

Let’s test how the parent-child association type affects the parent versioning. Because we are interested in the children collection dirty checking, the unidirectional child-owning-side-parent association is going to be skipped, as in that case the parent doesn’t contain a children collection.

Test case

The following test case is going to be used for all collection type use cases:

protected void simulateConcurrentTransactions(final boolean shouldIncrementParentVersion) {
	final ExecutorService executorService = Executors.newSingleThreadExecutor();

	doInTransaction(new TransactionCallable<Void>() {
		@Override
		public Void execute(Session session) {
			try {
				P post = postClass.newInstance();
				post.setId(1L);
				post.setName("Hibernate training");
				session.persist(post);
				return null;
			} catch (Exception e) {
				throw new IllegalArgumentException(e);
			}
		}
	});

	doInTransaction(new TransactionCallable<Void>() {
		@Override
		public Void execute(final Session session) {
			final P post = (P) session.get(postClass, 1L);
			try {
				executorService.submit(new Callable<Void>() {
					@Override
					public Void call() throws Exception {
						return doInTransaction(new TransactionCallable<Void>() {
							@Override
							public Void execute(Session _session) {
								try {
									P otherThreadPost = (P) _session.get(postClass, 1L);
									int loadTimeVersion = otherThreadPost.getVersion();
									assertNotSame(post, otherThreadPost);
									assertEquals(0L, otherThreadPost.getVersion());
									C comment = commentClass.newInstance();
									comment.setReview("Good post!");
									otherThreadPost.addComment(comment);
									_session.flush();
									if (shouldIncrementParentVersion) {
										assertEquals(otherThreadPost.getVersion(), loadTimeVersion + 1);
									} else {
										assertEquals(otherThreadPost.getVersion(), loadTimeVersion);
									}
									return null;
								} catch (Exception e) {
									throw new IllegalArgumentException(e);
								}
							}
						});
					}
				}).get();
			} catch (Exception e) {
				throw new IllegalArgumentException(e);
			}
			post.setName("Hibernate Master Class");
			session.flush();
			return null;
		}
	});
}

The unidirectional parent-owning-side-child association testing

#create tables
Query:{[create table comment (id bigint generated by default as identity (start with 1), review varchar(255), primary key (id))][]} 
Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]} 
Query:{[create table post_comment (post_id bigint not null, comments_id bigint not null, comment_index integer not null, primary key (post_id, comment_index))][]} 
Query:{[alter table post_comment add constraint UK_se9l149iyyao6va95afioxsrl  unique (comments_id)][]} 
Query:{[alter table post_comment add constraint FK_se9l149iyyao6va95afioxsrl foreign key (comments_id) references comment][]} 
Query:{[alter table post_comment add constraint FK_6o1igdm04v78cwqre59or1yj1 foreign key (post_id) references post][]} 

#insert post in primary transaction
Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]} 

#select post in secondary transaction
Query:{[select entityopti0_.id as id1_1_0_, entityopti0_.name as name2_1_0_, entityopti0_.version as version3_1_0_ from post entityopti0_ where entityopti0_.id=?][1]} 

#insert comment in secondary transaction
#optimistic locking post version update in secondary transaction
Query:{[insert into comment (id, review) values (default, ?)][Good post!]} 
Query:{[update post set name=?, version=? where id=? and version=?][Hibernate training,1,1,0]} 
Query:{[insert into post_comment (post_id, comment_index, comments_id) values (?, ?, ?)][1,0,1]} 

#optimistic locking exception in primary transaction
Query:{[update post set name=?, version=? where id=? and version=?][Hibernate Master Class,1,1,0]}
org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.vladmihalcea.hibernate.masterclass.laboratory.concurrency.EntityOptimisticLockingOnUnidirectionalCollectionTest$Post#1] 

The unidirectional parent-owning-side-child component association testing

#create tables
Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]} 
Query:{[create table post_comments (post_id bigint not null, review varchar(255), comment_index integer not null, primary key (post_id, comment_index))][]} 
Query:{[alter table post_comments add constraint FK_gh9apqeduab8cs0ohcq1dgukp foreign key (post_id) references post][]} 

#insert post in primary transaction
Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]} 

#select post in secondary transaction
Query:{[select entityopti0_.id as id1_0_0_, entityopti0_.name as name2_0_0_, entityopti0_.version as version3_0_0_ from post entityopti0_ where entityopti0_.id=?][1]} 
Query:{[select comments0_.post_id as post_id1_0_0_, comments0_.review as review2_1_0_, comments0_.comment_index as comment_3_0_ from post_comments comments0_ where comments0_.post_id=?][1]} 

#insert comment in secondary transaction
#optimistic locking post version update in secondary transaction
Query:{[update post set name=?, version=? where id=? and version=?][Hibernate training,1,1,0]} 
Query:{[insert into post_comments (post_id, comment_index, review) values (?, ?, ?)][1,0,Good post!]} 

#optimistic locking exception in primary transaction
Query:{[update post set name=?, version=? where id=? and version=?][Hibernate Master Class,1,1,0]} 
org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.vladmihalcea.hibernate.masterclass.laboratory.concurrency.EntityOptimisticLockingOnComponentCollectionTest$Post#1]	

The bidirectional parent-owning-side-child association testing

#create tables
Query:{[create table comment (id bigint generated by default as identity (start with 1), review varchar(255), post_id bigint, primary key (id))][]} 
Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]} 
Query:{[create table post_comment (post_id bigint not null, comments_id bigint not null)][]} 
Query:{[alter table post_comment add constraint UK_se9l149iyyao6va95afioxsrl  unique (comments_id)][]} 
Query:{[alter table comment add constraint FK_f1sl0xkd2lucs7bve3ktt3tu5 foreign key (post_id) references post][]} 
Query:{[alter table post_comment add constraint FK_se9l149iyyao6va95afioxsrl foreign key (comments_id) references comment][]} 
Query:{[alter table post_comment add constraint FK_6o1igdm04v78cwqre59or1yj1 foreign key (post_id) references post][]} 

#insert post in primary transaction
Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]} 

#select post in secondary transaction
Query:{[select entityopti0_.id as id1_1_0_, entityopti0_.name as name2_1_0_, entityopti0_.version as version3_1_0_ from post entityopti0_ where entityopti0_.id=?][1]} 
Query:{[select comments0_.post_id as post_id1_1_0_, comments0_.comments_id as comments2_2_0_, entityopti1_.id as id1_0_1_, entityopti1_.post_id as post_id3_0_1_, entityopti1_.review as review2_0_1_, entityopti2_.id as id1_1_2_, entityopti2_.name as name2_1_2_, entityopti2_.version as version3_1_2_ from post_comment comments0_ inner join comment entityopti1_ on comments0_.comments_id=entityopti1_.id left outer join post entityopti2_ on entityopti1_.post_id=entityopti2_.id where comments0_.post_id=?][1]} 

#insert comment in secondary transaction
#optimistic locking post version update in secondary transaction
Query:{[insert into comment (id, review) values (default, ?)][Good post!]} 
Query:{[update post set name=?, version=? where id=? and version=?][Hibernate training,1,1,0]} 
Query:{[insert into post_comment (post_id, comments_id) values (?, ?)][1,1]} 

#optimistic locking exception in primary transaction
Query:{[update post set name=?, version=? where id=? and version=?][Hibernate Master Class,1,1,0]} 
org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.vladmihalcea.hibernate.masterclass.laboratory.concurrency.EntityOptimisticLockingOnBidirectionalParentOwningCollectionTest$Post#1]

The bidirectional child-owning-side-parent association testing

#create tables
Query:{[create table comment (id bigint generated by default as identity (start with 1), review varchar(255), post_id bigint, primary key (id))][]} 
Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]} 
Query:{[alter table comment add constraint FK_f1sl0xkd2lucs7bve3ktt3tu5 foreign key (post_id) references post][]} 

#insert post in primary transaction
Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]} 

#select post in secondary transaction
Query:{[select entityopti0_.id as id1_1_0_, entityopti0_.name as name2_1_0_, entityopti0_.version as version3_1_0_ from post entityopti0_ where entityopti0_.id=?][1]} 

#insert comment in secondary transaction
#post version is not incremented in secondary transaction
Query:{[insert into comment (id, post_id, review) values (default, ?, ?)][1,Good post!]} 
Query:{[select count(id) from comment where post_id =?][1]} 

#update works in primary transaction
Query:{[update post set name=?, version=? where id=? and version=?][Hibernate Master Class,1,1,0]} 

Overruling default collection versioning

If the default owning-side collection versioning is not suitable for your use case, you can always overrule it with Hibernate @OptimisticLock annotation.

Let’s overrule the default parent version update mechanism for bidirectional parent-owning-side-child association:

@Entity(name = "post")
public class Post {
	...
	@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
	@OptimisticLock(excluded = true)
	private List<Comment> comments = new ArrayList<Comment>();
	...

	public void addComment(Comment comment) {
		comment.setPost(this);
		comments.add(comment);
	}
}	

@Entity(name = "comment")
public class Comment {
	...
	@ManyToOne
	@JoinColumn(name = "post_id", insertable = false, updatable = false)
	private Post post;
	...
}

This time, the children collection changes won’t trigger a parent version update:

#create tables
Query:{[create table comment (id bigint generated by default as identity (start with 1), review varchar(255), post_id bigint, primary key (id))][]} 
Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]} 
Query:{[create table post_comment (post_id bigint not null, comments_id bigint not null)][]} 
Query:{[alter table post_comment add constraint UK_se9l149iyyao6va95afioxsrl  unique (comments_id)][]} 
Query:{[alter table comment add constraint FK_f1sl0xkd2lucs7bve3ktt3tu5 foreign key (post_id) references post][]} 
Query:{[alter table post_comment add constraint FK_se9l149iyyao6va95afioxsrl foreign key (comments_id) references comment][]} 
Query:{[alter table post_comment add constraint FK_6o1igdm04v78cwqre59or1yj1 foreign key (post_id) references post][]} 

#insert post in primary transaction
Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]} 

#select post in secondary transaction
Query:{[select entityopti0_.id as id1_1_0_, entityopti0_.name as name2_1_0_, entityopti0_.version as version3_1_0_ from post entityopti0_ where entityopti0_.id=?][1]} 
Query:{[select comments0_.post_id as post_id1_1_0_, comments0_.comments_id as comments2_2_0_, entityopti1_.id as id1_0_1_, entityopti1_.post_id as post_id3_0_1_, entityopti1_.review as review2_0_1_, entityopti2_.id as id1_1_2_, entityopti2_.name as name2_1_2_, entityopti2_.version as version3_1_2_ from post_comment comments0_ inner join comment entityopti1_ on comments0_.comments_id=entityopti1_.id left outer join post entityopti2_ on entityopti1_.post_id=entityopti2_.id where comments0_.post_id=?][1]} 

#insert comment in secondary transaction
Query:{[insert into comment (id, review) values (default, ?)][Good post!]} 
Query:{[insert into post_comment (post_id, comments_id) values (?, ?)][1,1]} 

#update works in primary transaction
Query:{[update post set name=?, version=? where id=? and version=?][Hibernate Master Class,1,1,0]} 

Conclusion

It’s very important to understand how various modelling structures impact concurrency patterns. The owning-side collections changes are taken into consideration when incrementing the parent version number, and you can always bypass it using the @OptimisticLock annotation.

Code available on GitHub.

If you have enjoyed reading my article and you’re looking forward to getting instant email notifications of my latest posts, you just need to follow my blog.