JPA and Hibernate FetchType EAGER is a code smell

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

Hibernate fetching strategies can really make a difference between an application that barely crawls and a highly responsive one. In this post, I’ll explain why you should prefer query-based fetching instead of global fetch plans.

Fetching 101

Hibernate defines four association retrieving strategies:

Fetching Strategy Description
Join The association is OUTER JOINED in the original SELECT statement
Select An additional SELECT statement is used to retrieve the associated entity(entities)
Subselect An additional SELECT statement is used to retrieve the whole associated collection. This mode is meant for to-many associations
Batch An additional number of SELECT statements is used to retrieve the whole associated collection. Each additional SELECT will retrieve a fixed number of associated entities. This mode is meant for to-many associations

These fetching strategies might be applied in the following scenarios:

  • the association is always initialized along with its owner (e.g. EAGER FetchType)
  • the uninitialized association (e.g. LAZY FetchType) is navigated, therefore the association must be retrieved with a secondary SELECT

The Hibernate mappings fetching information forms the global fetch plan. At query time, we may override the global fetch plan, but only for LAZY associations. For this we can use the fetch HQL/JPQL/Criteria directive. EAGER associations cannot be overridden, therefore tying your application to the global fetch plan.

Hibernate 3 acknowledged that LAZY should be the default association fetching strategy:

By default, Hibernate3 uses lazy select fetching for collections and lazy proxy fetching for single-valued associations. These defaults make sense for most associations in the majority of applications.

This decision was taken after noticing many performance issues associated with Hibernate 2 default eager fetching. Unfortunately, JPA has taken a different approach and decided that to-many associations be LAZY while to-one relationships are fetched eagerly.

Association type Default fetching policy
@OneToMany LAZY
@ManyToMany LAZY
@ManyToOne EAGER
@OneToOne EAGER

EAGER fetching inconsistencies

While it may be convenient to just mark associations as EAGER, delegating the fetching responsibility to Hibernate, it’s advisable to resort to query-based fetch plans.

An EAGER association will always be fetched and the fetching strategy is not consistent across all querying techniques.

Next, I’m going to demonstrate how EAGER fetching behaves for all Hibernate querying variants. I will reuse the same entity model I’ve previously introduced in my fetching strategies article:

Product entity

The Product entity has the following associations:

@ManyToOne(fetch = FetchType.EAGER)
@JoinColumn(
    name = "company_id", 
    nullable = false
)
private Company company;

@OneToOne(
    mappedBy = "product", 
    fetch = FetchType.LAZY, 
    cascade = CascadeType.ALL, 
    optional = false
)
private WarehouseProductInfo warehouseProductInfo;

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "importer_id")
private Importer importer;

@OneToMany(
    mappedBy = "product", 
    fetch = FetchType.LAZY, 
    cascade = CascadeType.ALL, 
    orphanRemoval = true
)
@OrderBy("index")
private Set<Image> images = new LinkedHashSet<>();

The company association is marked as EAGER and Hibernate will always employ a fetching strategy to initialize it along with its owner entity.

Persistence Context loading

First we’ll load the entity using the Persistence Context API:

Product product = entityManager.find(Product.class, productId);

Which generates the following SQL SELECT statement:

Query:{[
select 
    product0_.id as id1_18_1_, 
    product0_.code as code2_18_1_, 
    product0_.company_id as company_6_18_1_, 
    product0_.importer_id as importer7_18_1_, 
    product0_.name as name3_18_1_, 
    product0_.quantity as quantity4_18_1_, 
    product0_.version as version5_18_1_, 
    company1_.id as id1_6_0_, 
    company1_.name as name2_6_0_ 
from Product product0_ 
inner join Company company1_ on product0_.company_id=company1_.id 
where product0_.id=?][1]

The EAGER company association was retrieved using an inner join. For M such associations the owner entity table is going to be joined M times.

Each extra join adds up to the overall query complexity and execution time. If we don’t even use all these associations, for every possible business scenario, then we’ve just paid the extra performance penalty for nothing in return.

Fetching using JPQL and Criteria

Product product = entityManager.createQuery(
    "select p " +
    "from Product p " +
    "where p.id = :productId", Product.class)
.setParameter("productId", productId)
.getSingleResult();

or with

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Product> cq = cb.createQuery(Product.class);
Root<Product> productRoot = cq.from(Product.class);
cq.where(cb.equal(productRoot.get("id"), productId));
Product product = entityManager.createQuery(cq).getSingleResult();

Writing JPA Criteria API queries is not very easy. The Codota IDE plugin can guide you on how to write such queries, therefore increasing your productivity.

For more details about how you can use Codota to speed up the process of writing Criteria API queries, check out this article.

Generates the following SQL SELECT statements:

Query:{[
select 
    product0_.id as id1_18_, 
    product0_.code as code2_18_, 
    product0_.company_id as company_6_18_, 
    product0_.importer_id as importer7_18_, 
    product0_.name as name3_18_, 
    product0_.quantity as quantity4_18_, 
    product0_.version as version5_18_ 
from Product product0_ 
where product0_.id=?][1]} 

Query:{[
select 
    company0_.id as id1_6_0_, 
    company0_.name as name2_6_0_ 
from Company company0_ 
where company0_.id=?][1]}

Both JPQL and Criteria queries default to select fetching, therefore issuing a secondary select for each individual EAGER association. The larger the associations’ number, the more additional individual SELECTS, the more it will affect our application performance.

Hibernate Criteria API

While JPA 2.0 added support for Criteria queries, Hibernate has long been offering a specific dynamic query implementation.

If the EntityManager implementation delegates method calls the legacy Session API, the JPA Criteria implementation was written from scratch. That’s the reason why Hibernate and JPA Criteria API behave differently for similar querying scenarios.

The previous example Hibernate Criteria equivalent looks like this:

Product product = (Product) session
    .createCriteria(Product.class)
    .add(Restrictions.eq("id", productId))
    .uniqueResult();

And the associated SQL SELECT is:

Query:{[
select 
    this_.id as id1_3_1_, 
    this_.code as code2_3_1_, 
    this_.company_id as company_6_3_1_, 
    this_.importer_id as importer7_3_1_, 
    this_.name as name3_3_1_, 
    this_.quantity as quantity4_3_1_, 
    this_.version as version5_3_1_, 
    hibernatea2_.id as id1_0_0_, 
    hibernatea2_.name as name2_0_0_ 
from Product this_ 
inner join Company hibernatea2_ on this_.company_id=hibernatea2_.id 
where this_.id=?][1]}

This query uses the join fetch strategy as opposed to select fetching, employed by JPQL/HQL and Criteria API.

Hibernate Criteria and EAGER collections

Let’s see what happens when the image collection fetching strategy is set to EAGER:

@OneToMany(
    mappedBy = "product", 
    fetch = FetchType.EAGER, 
    cascade = CascadeType.ALL, 
    orphanRemoval = true
)
@OrderBy("index")
private Set<Image> images = new LinkedHashSet<>();

The following SQL is going to be generated:

Query:{[
select 
    this_.id as id1_3_2_, 
    this_.code as code2_3_2_, 
    this_.company_id as company_6_3_2_, 
    this_.importer_id as importer7_3_2_, 
    this_.name as name3_3_2_, 
    this_.quantity as quantity4_3_2_, 
    this_.version as version5_3_2_, 
    hibernatea2_.id as id1_0_0_, 
    hibernatea2_.name as name2_0_0_, 
    images3_.product_id as product_4_3_4_, 
    images3_.id as id1_1_4_, 
    images3_.id as id1_1_1_, 
    images3_.index as index2_1_1_, 
    images3_.name as name3_1_1_, 
    images3_.product_id as product_4_1_1_ 
from Product this_ 
inner join Company hibernatea2_ on this_.company_id=hibernatea2_.id 
left outer join Image images3_ on this_.id=images3_.product_id 
where this_.id=? 
order by images3_.index][1]}

Hibernate Criteria doesn’t automatically group the parent entities list. Because of the one-to-many children table JOIN, for each child entity we are going to get a new parent entity object reference (all pointing to the same object in our current Persistence Context):

product.setName("TV");
product.setCompany(company);

Image frontImage = new Image();
frontImage.setName("front image");
frontImage.setIndex(0);

Image sideImage = new Image();
sideImage.setName("side image");
sideImage.setIndex(1);

product.addImage(frontImage);
product.addImage(sideImage);

List products = session
    .createCriteria(Product.class)
    .add(Restrictions.eq("id", productId))
    .list();
    
assertEquals(2, products.size());
assertSame(products.get(0), products.get(1));

Because we have two image entities, we will get two Product entity references, both pointing to the same first-level cache entry.

To fix it we need to instruct Hibernate Criteria to use distinct root entities:

List products = session
    .createCriteria(Product.class)
    .add(Restrictions.eq("id", productId))
    .setResultTransformer(
        CriteriaSpecification.DISTINCT_ROOT_ENTITY
    )
    .list();

assertEquals(1, products.size());

I'm running an online workshop on the 20-21 and 23-24 of November about High-Performance Java Persistence.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

The EAGER fetching strategy is a code smell. Most often it’s used for simplicity sake without considering the long-term performance penalties. The fetching strategy should never be the entity mapping responsibility. Each business use case has different entity load requirements and therefore the fetching strategy should be delegated to each individual query.

The global fetch plan should only define LAZY associations, which are fetched on a per-query basis. Combined with the always check generated queries strategy, the query-based fetch plans can improve application performance and reduce maintaining costs.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.