The best way to lazy load entity attributes using JPA and Hibernate

Introduction

When fetching an entity, all of its basic attributes are loaded as well. This is because every basic attribute is implicitly mapped with the @Basic annotation, whose default fetch policy is FetchType.EAGER.

However, the attribute fetch strategy can be set to FetchType.LAZY, in which case the entity attribute is loaded with a secondary select statement upon being accessed for the first time.

@Basic(fetch = FetchType.LAZY)

This configuration alone is not sufficient because Hibernate requires bytecode instrumentation to intercept the attribute access request and issue the secondary select statement on demand.

Bytecode enhancement

When using the Maven bytecode enhancement plugin, the enableLazyInitialization configuration property must be set to true as illustrated in the following example:

<plugin>
    <groupId>org.hibernate.orm.tooling</groupId>
    <artifactId>hibernate-enhance-maven-plugin</artifactId>
    <version>${hibernate.version}</version>
    <executions>
        <execution>
            <configuration>
                <failOnError>true</failOnError>
                <enableLazyInitialization>true</enableLazyInitialization>
            </configuration>
            <goals>
                <goal>enhance</goal>
            </goals>
        </execution>
    </executions>
</plugin>

With this configuration in place, all JPA entity classes are going to be instrumented with lazy attribute fetching. This process takes place at build time, right after entity classes are compiled from their associated source files.
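If lazy attributes still load eagerly, it is worth checking whether the compiled class was enhanced at all. The following is a minimal sketch, assuming the enhancer marks instrumented classes with Hibernate's org.hibernate.engine.spi.PersistentAttributeInterceptable interface. The EnhancementCheck class and its isEnhanced method are hypothetical helper names, and the lookup is done by name so that the snippet compiles without Hibernate on the classpath.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class EnhancementCheck {

    // Assumption: enhanced classes implement this Hibernate interface.
    static final String INTERCEPTABLE =
        "org.hibernate.engine.spi.PersistentAttributeInterceptable";

    // Walks the superclasses and interfaces of the given type and
    // compares their names against the marker interface name.
    static boolean isEnhanced(Class<?> type) {
        Deque<Class<?>> pending = new ArrayDeque<>();
        pending.push(type);
        while (!pending.isEmpty()) {
            Class<?> current = pending.pop();
            if (INTERCEPTABLE.equals(current.getName())) {
                return true;
            }
            if (current.getSuperclass() != null) {
                pending.push(current.getSuperclass());
            }
            for (Class<?> iface : current.getInterfaces()) {
                pending.push(iface);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // String is not an entity, so no marker interface is found.
        System.out.println(isEnhanced(String.class));
    }
}
```

Calling EnhancementCheck.isEnhanced(Attachment.class) from a test would then be expected to return true only when the Maven plugin has processed the class.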

The attribute lazy fetching mechanism is very useful when dealing with column types that store large amounts of data (e.g. BLOB, CLOB, VARBINARY). This way, the entity can be fetched without automatically loading data from the underlying large column types, therefore improving performance.

To demonstrate how attribute lazy fetching works, the following example is going to use an Attachment entity which can store any media type (e.g. PNG, PDF, MPEG).

@Entity @Table(name = "attachment")
public class Attachment {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    @Enumerated
    @Column(name = "media_type")
    private MediaType mediaType;

    @Lob
    @Basic(fetch = FetchType.LAZY)
    private byte[] content;

    //Getters and setters omitted for brevity
}

Properties such as the entity identifier, the name or the media type are to be fetched eagerly on every entity load. On the other hand, the media file content should be fetched lazily, only when being accessed by the application code.

After the Attachment entity is instrumented, the class bytecode is changed as follows:

@Transient
private transient PersistentAttributeInterceptor 
    $$_hibernate_attributeInterceptor;

public byte[] getContent() {
    return $$_hibernate_read_content();
}

public byte[] $$_hibernate_read_content() {
    if ($$_hibernate_attributeInterceptor != null) {
        this.content = ((byte[]) 
            $$_hibernate_attributeInterceptor.readObject(
                this, "content", this.content));
    }
    return this.content;
}

The content attribute fetching is done by the PersistentAttributeInterceptor object reference, therefore providing a way to load the underlying BLOB column only when the getter is called for the first time.
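The interception idea can be modeled in plain Java without any Hibernate classes. The sketch below is only an illustration of the pattern, not Hibernate's actual implementation: the nested Interceptor and Attachment classes, the loadCount field, and the Supplier-based loader are all hypothetical stand-ins for the real interceptor and the secondary select.

```java
import java.util.function.Supplier;

class LazyAttributeDemo {

    // Hypothetical stand-in for the attribute interceptor.
    static class Interceptor {
        private final Supplier<byte[]> loader;
        private boolean loaded;
        int loadCount; // how many times the loader was invoked

        Interceptor(Supplier<byte[]> loader) {
            this.loader = loader;
        }

        // Runs the loader on first access only; later calls
        // return the value that is already held by the entity.
        byte[] readObject(byte[] currentValue) {
            if (loaded) {
                return currentValue;
            }
            loaded = true;
            loadCount++;
            return loader.get();
        }
    }

    // Hypothetical entity whose getter delegates to the interceptor.
    static class Attachment {
        private byte[] content;
        private final Interceptor interceptor;

        Attachment(Interceptor interceptor) {
            this.interceptor = interceptor;
        }

        byte[] getContent() {
            if (interceptor != null) {
                content = interceptor.readObject(content);
            }
            return content;
        }
    }

    public static void main(String[] args) {
        Interceptor interceptor = new Interceptor(() -> new byte[] {1, 2, 3});
        Attachment attachment = new Attachment(interceptor);
        System.out.println(interceptor.loadCount); // 0, nothing loaded yet
        attachment.getContent();
        attachment.getContent();
        System.out.println(interceptor.loadCount); // 1, loaded on first access only
    }
}
```

The loader here plays the role of the secondary select: it runs when the getter is first called and never again for the same instance.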


When executing the following test case:

Attachment book = entityManager.find(
    Attachment.class, bookId);

LOGGER.debug("Fetched book: {}", book.getName());

assertArrayEquals(
    Files.readAllBytes(bookFilePath), 
    book.getContent()
);

Hibernate generates the following SQL queries:

SELECT a.id AS id1_0_0_,
       a.media_type AS media_ty3_0_0_,
       a.name AS name4_0_0_
FROM   attachment a
WHERE  a.id = 1

-- Fetched book: High-Performance Java Persistence

SELECT a.content AS content2_0_
FROM   attachment a
WHERE  a.id = 1

Because the content attribute uses the FetchType.LAZY fetch strategy and lazy fetching bytecode enhancement is enabled, the content column is not fetched along with the other columns that initialize the Attachment entity. Only when the data access layer accesses the content property does Hibernate issue a secondary select to load this attribute as well.

Just like FetchType.LAZY associations, this technique is prone to N+1 query problems, so caution is advised. One slight disadvantage of the bytecode enhancement mechanism is that all entity properties, not just the ones mapped with FetchType.LAZY, are going to be transformed, as previously illustrated.
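To make the N+1 risk concrete, here is a small plain-Java model that counts simulated queries. It is a sketch under stated assumptions rather than real Hibernate behavior: NPlusOneDemo, Row, findAll and the queryCount counter are hypothetical names, with one counter increment standing for the initial select and one for each secondary select triggered by a lazy attribute access.

```java
import java.util.ArrayList;
import java.util.List;

class NPlusOneDemo {

    // Counts the simulated SQL statements.
    static int queryCount;

    // Hypothetical row whose content is loaded on first access.
    static class Row {
        final long id;
        private byte[] content; // null until first access

        Row(long id) {
            this.id = id;
        }

        byte[] getContent() {
            if (content == null) {
                queryCount++; // models the secondary select
                content = new byte[] {(byte) id};
            }
            return content;
        }
    }

    // Models a single select that loads n rows without their content.
    static List<Row> findAll(int n) {
        queryCount++;
        List<Row> rows = new ArrayList<>();
        for (long id = 1; id <= n; id++) {
            rows.add(new Row(id));
        }
        return rows;
    }

    // Loads n rows and touches the lazy attribute of each one.
    static int countQueries(int n) {
        queryCount = 0;
        for (Row row : findAll(n)) {
            row.getContent();
        }
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println(countQueries(10)); // prints 11: 1 initial + 10 secondary
    }
}
```

If the content is needed for every row in a list, fetching it up front is usually cheaper than paying for one extra round trip per entity.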

Fetching subentities

Another approach to avoid loading table columns that are rather large is to map multiple subentities to the same database table.


Both the Attachment entity and the AttachmentSummary subentity inherit all common attributes from a BaseAttachment superclass.

@MappedSuperclass
public class BaseAttachment {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    @Enumerated
    @Column(name = "media_type")
    private MediaType mediaType;

    //Getters and setters omitted for brevity
}

AttachmentSummary extends BaseAttachment without declaring any new attribute:

@Entity @Table(name = "attachment")
public class AttachmentSummary 
    extends BaseAttachment {}

The Attachment entity inherits all the base attributes from the BaseAttachment superclass and maps the content column as well.

@Entity @Table(name = "attachment")
public class Attachment 
    extends BaseAttachment {

    @Lob
    private byte[] content;

    //Getters and setters omitted for brevity
}

When fetching the AttachmentSummary subentity:

AttachmentSummary bookSummary = entityManager.find(
    AttachmentSummary.class, bookId);

The generated SQL statement is not going to fetch the content column:

SELECT a.id as id1_0_0_, 
       a.media_type as media_ty2_0_0_, 
       a.name as name3_0_0_ 
FROM attachment a 
WHERE  a.id = 1

However, when fetching the Attachment entity:

Attachment book = entityManager.find(
    Attachment.class, bookId);

Hibernate is going to fetch all columns from the underlying database table:

SELECT a.id as id1_0_0_, 
       a.media_type as media_ty2_0_0_, 
       a.name as name3_0_0_, 
       a.content as content4_0_0_ 
FROM attachment a 
WHERE  a.id = 1


Conclusion

To lazy fetch entity attributes, you can either use bytecode enhancement or subentities. Although bytecode instrumentation allows you to use only one entity per table, subentities are more flexible and can even deliver better performance since they don’t involve an interceptor call whenever reading an entity attribute.

When it comes to reading data, subentities are very similar to DTO projections. However, unlike DTO projections, subentities can track state changes and propagate them to the database.


33 thoughts on “The best way to lazy load entity attributes using JPA and Hibernate”

  1. Hi Vlad. Mapping byte[] is not a good idea, as it can lead to OutOfMemoryError: Java heap space. For instance, when you create a new Attachment with a 200 MB byte[], a very large object is allocated on the heap. Can you imagine what happens if many users attach files simultaneously?
    You should use Blob instead:

    @Entity
    class Attachment {
        @Column(name = "data")
        @Lob
        private Blob data;
    }

    and

    File file = new ClassPathResource("200MB.zip").getFile();
    Attachment attachment = new Attachment();
    attachment.setData(BlobProxy.generateProxy(new FileInputStream(file), file.length()));
    em.persist(attachment);

    You can test it, just turn on the JVisualVM to figure out what’s going on with Java heap space in two different cases: with Blob and with byte[].

    1. Thanks for the tip. Indeed, for very large files, a byte[] would be overkill. However, if you have an imposed limit for file size (as many applications do), this should not be an issue. Blobs have their own quirks as well, especially when you retrieve them.

  2. I wonder whether a third way is possible: two entities mapped to the same table (but different columns), connected with a lazy @OneToOne and @MapsId. It seems better than your second solution because the relation between the entities is explicit.

    1. @OneToOne demands having two tables. In this example, we only have one table. Of course, you can always move the large column into a separate table, but that’s a totally different discussion. Not to mention that sometimes you cannot do that because you already inherit a legacy schema.

  3. I always thought that every @Lob attribute was LAZY by default, even without bytecode enhancement enabled [I’ve never used it before]. Normally I use DTOs and projections to avoid loading @Lob attributes.

    By the way, I liked your solution using subentities; it was clever and well designed. Indeed, I liked it more than bytecode enhancement.

  4. What about updating the data? What will happen if, by chance, you end up updating an Attachment and an AttachmentSummary with the same id in the same transaction?
    I suppose you have to be careful using this feature and always update using the subentity with all the data, not mixing loading two subentities with the same id in the same transaction.

      1. Oops, sorry for the delay, I didn’t notice you had already answered.

        Let me explain better with an example: first, you load an Attachment with id = 1. Its name has a value of “MyName”. Then you change it to “MyChangedName”. Then, in the same transaction, you load an AttachmentSummary with the same id. As it is another entity and the changes haven’t been persisted, I suppose its name property value will still be “MyName”. Am I wrong?

        Then, for example, you change the AttachmentSummary’s name and media_type properties to other values. Once the transaction finishes, will both entities be persisted, with one overwriting the values of the other? If you use a version property, I suppose you will get an exception.

        If I’m not wrong, those are problems you will have if you mix updates to subentities with the same id. Of course, you can avoid them with a bit of care, but you have to be aware of them.

      2. Mixing multiple modifying entities for the same table is not advisable when you use subentities. You have to be aware of these issues, of course. Although I haven’t tested it, I suppose that optimistic locking will catch these issues.

  5. Hi Vlad. What about @NamedEntityGraph or @FetchProfile for such lazy attributes? For example, I want to eagerly load lazy attributes for a getById operation and lazily load them for a getAll operation.
    It seems this doesn’t work the way it does for @ManyToOne associations.

  6. Hi, Vlad.
    How does Hibernate’s second-level cache work when two entities are mapped to one table in the same context?
    For example, I have entity1 and entity2 (let it be read-only, and it shares two fields with entity1).
    Would entity2’s cache region be updated after entity1 was updated or even deleted?

    1. Hibernate is not going to do any synchronization across regions because it does not know about any overlapping. In this case, it’s probably better not to use the second-level cache at all. Anyway, the second-level cache is only justified for reducing load on the Master node. If you’re using it to provide better read throughput, then you’re doing it all wrong because you can do a much better job if you tune the DB buffers correctly and redirect read traffic to Slave nodes.

  7. Thanks Vlad! Sorry, I am new to this. Do I need to raise a bug on Hibernate and then submit the test case, or is there a certain process for this?

    1. There’s already a bug created as indicated in the answer you got on StackOverflow. You only need to create a replicating test case and attach it to the JIRA issue. It’s really simple.

  8. I can’t get the bytecode enhancement solution to work on my project with Hibernate 4.3.5 and PostgreSQL 9.3. It still loads the Blob, so I guess I will use the subentities solution.
