The best way to lazy load entity attributes using JPA and Hibernate


Introduction

By default, fetching an entity loads all of its attributes as well. This is because every basic entity attribute is implicitly marked with the @Basic annotation, whose default fetch policy is FetchType.EAGER.

However, the attribute fetch strategy can be set to FetchType.LAZY, in which case the entity attribute is loaded with a secondary select statement upon being accessed for the first time.

@Basic(fetch = FetchType.LAZY)

This configuration alone is not sufficient because Hibernate requires bytecode instrumentation to intercept the attribute access request and issue the secondary select statement on demand.

Bytecode enhancement

When using the Maven bytecode enhancement plugin, the enableLazyInitialization configuration property must be set to true as illustrated in the following example:

<plugin>
    <groupId>org.hibernate.orm.tooling</groupId>
    <artifactId>hibernate-enhance-maven-plugin</artifactId>
    <version>${hibernate.version}</version>
    <executions>
        <execution>
            <configuration>
                <failOnError>true</failOnError>
                <enableLazyInitialization>true</enableLazyInitialization>
            </configuration>
            <goals>
                <goal>enhance</goal>
            </goals>
        </execution>
    </executions>
</plugin>

With this configuration in place, all JPA entity classes are going to be instrumented with lazy attribute fetching. This process takes place at build time, right after entity classes are compiled from their associated source files.
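Since enhancement happens at build time, it can be useful to verify at runtime that a class was actually instrumented. The following is a minimal sketch that relies on plain reflection only. It assumes the enhancer adds the org.hibernate.engine.spi.PersistentAttributeInterceptable interface to instrumented classes, which is worth double-checking against the Hibernate version in use. The EnhancementCheckSketch class and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper; it needs no Hibernate dependency on the classpath.
class EnhancementCheckSketch {

    // Assumption: the enhancer adds this interface to instrumented entity classes.
    static final String INTERCEPTABLE =
        "org.hibernate.engine.spi.PersistentAttributeInterceptable";

    /** Walks superclasses and interfaces, looking for a type with the given name. */
    static boolean implementsInterface(Class<?> type, String interfaceName) {
        Deque<Class<?>> pending = new ArrayDeque<>();
        pending.push(type);
        while (!pending.isEmpty()) {
            Class<?> current = pending.pop();
            if (current.getName().equals(interfaceName)) {
                return true;
            }
            if (current.getSuperclass() != null) {
                pending.push(current.getSuperclass());
            }
            for (Class<?> parent : current.getInterfaces()) {
                pending.push(parent);
            }
        }
        return false;
    }

    static boolean looksEnhanced(Class<?> entityClass) {
        return implementsInterface(entityClass, INTERCEPTABLE);
    }

    public static void main(String[] args) {
        // String is obviously not an enhanced entity, so this reports false.
        System.out.println("String enhanced? " + looksEnhanced(String.class));
    }
}
```

Calling looksEnhanced(Attachment.class) from an integration test could catch a build where the plugin did not run, in which case the lazy attribute would be silently fetched eagerly.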

The attribute lazy fetching mechanism is very useful when dealing with column types that store large amounts of data (e.g. BLOB, CLOB, VARBINARY). This way, the entity can be fetched without automatically loading data from the underlying large column types, therefore improving performance.

To demonstrate how attribute lazy fetching works, the following example is going to use an Attachment entity which can store any media type (e.g. PNG, PDF, MPEG).

@Entity @Table(name = "attachment")
public class Attachment {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    @Enumerated
    @Column(name = "media_type")
    private MediaType mediaType;

    @Lob
    @Basic(fetch = FetchType.LAZY)
    private byte[] content;

    //Getters and setters omitted for brevity
}

Properties such as the entity identifier, the name or the media type are to be fetched eagerly on every entity load. On the other hand, the media file content should be fetched lazily, only when being accessed by the application code.

After the Attachment entity is instrumented, the class bytecode is changed as follows:

@Transient
private transient PersistentAttributeInterceptor 
    $$_hibernate_attributeInterceptor;

public byte[] getContent() {
    return $$_hibernate_read_content();
}

public byte[] $$_hibernate_read_content() {
    if ($$_hibernate_attributeInterceptor != null) {
        this.content = ((byte[]) 
            $$_hibernate_attributeInterceptor.readObject(
                this, "content", this.content));
    }
    return this.content;
}

The content attribute fetching is done by the PersistentAttributeInterceptor object reference, therefore providing a way to load the underlying BLOB column only when the getter is called for the first time.
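To make the mechanism easier to follow, here is a simplified plain-Java model of the same idea. It is only a sketch: ContentInterceptor is a hypothetical stand-in and not Hibernate's PersistentAttributeInterceptor, the loader function simulates the secondary select, and only the lazy attribute goes through the interceptor in this model.

```java
import java.util.function.LongFunction;

// Simplified model of lazy attribute interception; none of these types are Hibernate classes.
class LazyAttributeSketch {

    /** Hypothetical stand-in for the attribute interceptor; it counts simulated SELECTs. */
    static class ContentInterceptor {
        private final LongFunction<byte[]> loader;
        private boolean initialized;
        int loadCount;

        ContentInterceptor(LongFunction<byte[]> loader) {
            this.loader = loader;
        }

        byte[] readContent(long id, byte[] currentValue) {
            if (initialized) {
                return currentValue; // already loaded, no further "query"
            }
            initialized = true;
            loadCount++; // simulates the secondary SELECT
            return loader.apply(id);
        }
    }

    static class Attachment {
        private final long id;
        private final String name;
        private final ContentInterceptor interceptor;
        private byte[] content;

        Attachment(long id, String name, ContentInterceptor interceptor) {
            this.id = id;
            this.name = name;
            this.interceptor = interceptor;
        }

        String getName() {
            return name; // read directly in this model
        }

        byte[] getContent() {
            if (interceptor != null) {
                content = interceptor.readContent(id, content);
            }
            return content;
        }
    }

    public static void main(String[] args) {
        ContentInterceptor interceptor =
            new ContentInterceptor(id -> new byte[] {1, 2, 3});
        Attachment book = new Attachment(
            1L, "High-Performance Java Persistence", interceptor);

        System.out.println(book.getName() + ", loads: " + interceptor.loadCount);
        book.getContent();
        book.getContent();
        System.out.println("Loads after two reads: " + interceptor.loadCount);
    }
}
```

Reading the name does not touch the loader, and reading the content twice triggers the loader only once.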


When executing the following test case:

Attachment book = entityManager.find(
    Attachment.class, bookId);

LOGGER.debug("Fetched book: {}", book.getName());

assertArrayEquals(
    Files.readAllBytes(bookFilePath), 
    book.getContent()
);

Hibernate generates the following SQL queries:

SELECT a.id AS id1_0_0_,
       a.media_type AS media_ty3_0_0_,
       a.name AS name4_0_0_
FROM   attachment a
WHERE  a.id = 1

-- Fetched book: High-Performance Java Persistence

SELECT a.content AS content2_0_
FROM   attachment a
WHERE  a.id = 1

Because the content attribute uses the FetchType.LAZY fetch strategy and lazy fetching bytecode enhancement is enabled, the content column is not fetched along with the other columns that initialize the Attachment entity. Only when the data access layer accesses the content property does Hibernate issue a secondary select to load this attribute as well.

Just like FetchType.LAZY associations, this technique is prone to N+1 query problems, so caution is advised. One slight disadvantage of the bytecode enhancement mechanism is that all entity properties, not just the ones using the FetchType.LAZY fetch strategy, are going to be transformed, as previously illustrated.
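The N+1 risk can be illustrated with a small plain-Java simulation that only counts statements. This is a sketch under stated assumptions: FakeDatabase and LazyAttachment are hypothetical types and no JPA or Hibernate API is involved. One statement loads the list of entities, and every first access to a lazy content attribute adds one more.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation; it models statement counts, not real SQL execution.
class LazyAttributeNPlusOneSketch {

    /** Stand-in for the database; it only counts executed statements. */
    static class FakeDatabase {
        int statementCount;

        List<LazyAttachment> findAll(int rows) {
            statementCount++; // one SELECT for id, name and media_type
            List<LazyAttachment> result = new ArrayList<>();
            for (long id = 1; id <= rows; id++) {
                result.add(new LazyAttachment(id, this));
            }
            return result;
        }

        byte[] loadContent(long id) {
            statementCount++; // one secondary SELECT per entity
            return new byte[] {(byte) id};
        }
    }

    static class LazyAttachment {
        private final long id;
        private final FakeDatabase database;
        private byte[] content;
        private boolean contentLoaded;

        LazyAttachment(long id, FakeDatabase database) {
            this.id = id;
            this.database = database;
        }

        byte[] getContent() {
            if (!contentLoaded) {
                content = database.loadContent(id);
                contentLoaded = true;
            }
            return content;
        }
    }

    /** Returns how many statements run when loading the given number of rows. */
    static int countStatements(int rows, boolean readContent) {
        FakeDatabase database = new FakeDatabase();
        for (LazyAttachment attachment : database.findAll(rows)) {
            if (readContent) {
                attachment.getContent();
            }
        }
        return database.statementCount;
    }

    public static void main(String[] args) {
        System.out.println("Without content: " + countStatements(10, false));
        System.out.println("With content: " + countStatements(10, true));
    }
}
```

With 10 rows, the simulation counts 1 statement when the content is never read and 11 when it is read for every row, which is the N+1 pattern to watch for when iterating over many entities.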

Fetching subentities

Another approach to avoid loading table columns that are rather large is to map multiple subentities to the same database table.


Both the Attachment entity and the AttachmentSummary subentity inherit all common attributes from a BaseAttachment superclass.

@MappedSuperclass
public class BaseAttachment {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    @Enumerated
    @Column(name = "media_type")
    private MediaType mediaType;

    //Getters and setters omitted for brevity
}

AttachmentSummary extends BaseAttachment without declaring any new attribute:

@Entity @Table(name = "attachment")
public class AttachmentSummary 
    extends BaseAttachment {}

The Attachment entity inherits all the base attributes from the BaseAttachment superclass and maps the content column as well.

@Entity @Table(name = "attachment")
public class Attachment 
    extends BaseAttachment {

    @Lob
    private byte[] content;

    //Getters and setters omitted for brevity
}

When fetching the AttachmentSummary subentity:

AttachmentSummary bookSummary = entityManager.find(
    AttachmentSummary.class, bookId);

The generated SQL statement is not going to fetch the content column:

SELECT a.id as id1_0_0_, 
       a.media_type as media_ty2_0_0_, 
       a.name as name3_0_0_ 
FROM attachment a 
WHERE  a.id = 1

However, when fetching the Attachment entity:

Attachment book = entityManager.find(
    Attachment.class, bookId);

Hibernate is going to fetch all columns from the underlying database table:

SELECT a.id as id1_0_0_, 
       a.media_type as media_ty2_0_0_, 
       a.name as name3_0_0_, 
       a.content as content4_0_0_ 
FROM attachment a 
WHERE  a.id = 1


Conclusion

To lazy fetch entity attributes, you can either use bytecode enhancement or subentities. Although bytecode instrumentation lets you keep a single entity per table, subentities are more flexible and can even deliver better performance since they don’t involve an interceptor call whenever an entity attribute is read.

When it comes to reading data, subentities are very similar to DTO projections. However, unlike DTO projections, subentities can track state changes and propagate them to the database.


9 Comments on “The best way to lazy load entity attributes using JPA and Hibernate”

  1. Hope this article is not too old to comment. Will Hibernate use its L1 cache if I switch from Attachment entity to AttachmentSummary entity? Thanks.

    • Each entity will have its own cache entry, but you should never fetch both in the same Hibernate Session.

      • I’m just struggling with this “Fetching subentities” approach in your book.

        Fetching AttachmentSummary and Attachment in the same EntityManager results in more than one entity with the same id. EntityManager.find(Attachment.class, id) delivers an instance of AttachmentSummary instead of an Attachment instance.

        Is that why you mentioned above “not in the same session”?

        The primary goal is to have a tree as TableOfContents and to load a branch when selecting a treenode.

      • Yes, you should not fetch two subentities that represent the same table record in the same Session or EntityManager. If you need to display a tree, then maybe a DTO projection is a much better approach. Entities are needed only if you want to propagate changes back to the DB, not for read-only views.

  2. I don’t see this problem as related to lazy-loading, but for the mapping during bootstrap it doesn’t matter, I think, whether you use an empty default-database for this or pick one of tenants at random.
    No actual data will be processed at this point.
    However, having an empty default-schema can come in handy at various points where you might not have a request from which to determine a tenant.
    Also, the empty default-database could at the same time serve as a template for creating new databases for new tenants, without having to run all the schema creation / migration scripts.

    • I don’t have any idea what you are talking about. Anyway, this discussion is not relevant for this article, as it should have been posted on one of my articles about multitenancy. And even there, discussing system architecture in blog comments doesn’t really work very well. If you want my expert advice, I’m available for consulting.

  3. Nice article, thanks for that!

    What I miss is a configuration option to limit this feature on particular entities only.

    Hopefully this will become an option.

    • You already have this feature. Unless you explicitly use @Basic(fetch = FetchType.LAZY), all entity attributes are fetched eagerly. And even if you set that annotation, you still need to activate bytecode enhancement.
