Hypersistence Optimizer User Guide

Introduction

Hypersistence Optimizer is a dynamic analysis tool that scans your JPA EntityManagerFactory or Hibernate SessionFactory and gives you tips about the changes you need to make to the entity mappings and Hibernate configuration in order to improve application performance.

Configuring Hypersistence Optimizer

Now, depending on whether you are using JPA or the native Hibernate API, you can configure Hypersistence Optimizer as follows.

Hypersistence Optimizer JPA configuration

If you’re using Hibernate via JPA, then you can initialize Hypersistence Optimizer like this:

new HypersistenceOptimizer(
    new JpaConfig(
        entityManagerFactory
    )   
).init();

The entityManagerFactory is a reference to the current JPA EntityManagerFactory object.

The hypersistence-optimizer-spring-jpa-example GitHub project demonstrates how you can use Hypersistence Optimizer in a Spring JPA project.

Hypersistence Optimizer Hibernate configuration

If you’re using Hibernate via the native API, then you can initialize Hypersistence Optimizer like this:

new HypersistenceOptimizer(
    new HibernateConfig(
        sessionFactory
    )  
).init();

The sessionFactory is a reference to the current Hibernate-specific SessionFactory object.

The hypersistence-optimizer-spring-hibernate-example GitHub project demonstrates how you can use Hypersistence Optimizer in a Spring Hibernate project.

Hypersistence Optimizer Spring Boot configuration

If you’re using Spring Boot, then you can initialize Hypersistence Optimizer like this:

@RunWith(SpringRunner.class)
@DataJpaTest
public class ApplicationTest {

    @PersistenceUnit
    private EntityManagerFactory entityManagerFactory;

    @Test
    public void testOptimizer() {
        new HypersistenceOptimizer(
            new JpaConfig(entityManagerFactory)
        ).init();
    }
}

The hypersistence-optimizer-spring-boot-example GitHub project demonstrates how you can use Hypersistence Optimizer in a Spring Boot project.

Hypersistence Optimizer Play Framework configuration

If you’re using Play Framework, then you can initialize Hypersistence Optimizer like this:

@Inject
private JPAApi jpaApi;

@Test
public void testOptimizer() {
    EntityManagerFactory entityManagerFactory = jpaApi
        .withTransaction(entityManager -> {
            return entityManager.getEntityManagerFactory();
        });
    
    new HypersistenceOptimizer(
        new JpaConfig(entityManagerFactory)
    ).init();
}

Notice that we are fetching the EntityManagerFactory from the JPAApi utility. This way, we can pass the EntityManagerFactory reference to the HypersistenceOptimizer object.

EventHandler

The EventHandler allows you to customize the way events are processed by Hypersistence Optimizer. For instance, by default, all events are handled by the LogEventHandler which logs every incoming Event.

Collecting events using the ListEventHandler

If you want to collect all the events generated by Hypersistence Optimizer, you can use the ListEventHandler as illustrated by the following example:

ListEventHandler listEventHandler = new ListEventHandler();

new HypersistenceOptimizer(
    new JpaConfig(entityManagerFactory())
    .setEventHandler(listEventHandler)
).init();

List<Event> events = listEventHandler.getEvents();
assertEquals(1, events.size());
assertTrue(events.get(0) instanceof EagerFetchingEvent);

Notice that we created an instance of the ListEventHandler which was provided to the Hypersistence Optimizer configuration via the setEventHandler method.

This way, after calling init, we can fetch all events that were generated by calling the getEvents method of the ListEventHandler object instance.

If you want to test this functionality, check out the ListEventHandlerTest in the hypersistence-optimizer GitHub repository.

Execute multiple event handlers using the ChainEventHandler

If you want to execute multiple event handlers, then you need to use the ChainEventHandler. For instance, you might want to log events as well as collect them for further processing. The following example shows you how to do that.

ListEventHandler listEventHandler = new ListEventHandler();

new HypersistenceOptimizer(
    new JpaConfig(entityManagerFactory())
    .setEventHandler(new ChainEventHandler(
        Arrays.asList(
            LogEventHandler.INSTANCE,
            listEventHandler
        )
    ))
).init();

List<Event> events = listEventHandler.getEvents();
assertEquals(1, events.size());
assertTrue(events.get(0) instanceof EagerFetchingEvent);

If you want to test this functionality, check out the ChainEventHandlerTest in the hypersistence-optimizer GitHub repository.

EventFilter

The EventFilter allows you to accept or reject events, so you can discard the ones you’re not interested in capturing and propagating. For instance, if you’re using MySQL, using the IDENTITY identifier generator is the only reasonable option.

In that case, you might want to reject the PostInsertGeneratorEvent that is associated with the IDENTITY generator, therefore preventing it from propagating further.

To do so, you need to provide a custom EventFilter as in the following example:

new HypersistenceOptimizer(
    new JpaConfig(entityManagerFactory())
    .setEventFilter(new EventFilter() {
        @Override
        public boolean accept(Event event) {
            if(event instanceof PostInsertGeneratorEvent) {
                return false;
            }
            return true;
        }
    })
).init();

Notice the setEventFilter method call that passes a custom EventFilter to the current Hypersistence Optimizer configuration.

If you want to test this functionality, check out the EventFilterTest in the hypersistence-optimizer GitHub repository.

Events

Hypersistence Optimizer generates events while scanning the application metadata.

Mapping events

The mapping events are related to JPA and Hibernate mappings, and Hypersistence Optimizer can detect the following mapping issues.

Identifier mapping

PostInsertGeneratorEvent

The PostInsertGeneratorEvent is triggered when an entity identifier is mapped with a post-insert strategy, be it IDENTITY or SELECT, meaning that the only way to know the entity identifier is to insert the associated row right away.

The post-insert identifier strategy prevents Hibernate from batching inserts automatically, and, for this reason, it is much more efficient to use the SEQUENCE strategy if the underlying database supports database sequence objects.

For more details, check out this article.
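As a minimal sketch of the alternative recommended above, the mapping below uses the SEQUENCE strategy. The Post entity, the post_seq generator name, and the allocation size are hypothetical and only serve as an illustration. The annotations come from the standard javax.persistence package (jakarta.persistence in newer JPA versions), so the JPA API must be on the classpath.

```java
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;

// Hypothetical entity used only to illustrate the identifier mapping.
@Entity
public class Post {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "post_seq")
    @SequenceGenerator(name = "post_seq", sequenceName = "post_seq", allocationSize = 10)
    private Long id;

    private String title;

    public Long getId() { return id; }

    public String getTitle() { return title; }

    public void setTitle(String title) { this.title = title; }
}
```

With a sequence, the identifier can be assigned before the INSERT statement is executed, so the insert no longer has to run right away.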

PrimitiveIdentifierEvent

The PrimitiveIdentifierEvent informs you that the entity identifier is mapped as a Java primitive (e.g. int, long) instead of a Java wrapper type (e.g. Integer or Long). The wrapper type allows nullable values, hence it works better with assigned entity identifiers since Hibernate knows that an entity with a null identifier does not have an associated table record.
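The difference can be seen in plain Java, without any JPA annotations. The sketch below uses hypothetical class names and only shows the default field values: a primitive identifier starts at 0, which is also a legal identifier value, while a wrapper identifier starts at null.

```java
// Plain Java classes without JPA annotations; all names are hypothetical.
class PrimitiveIdSketch {

    static class PrimitiveIdPost {
        long id;   // defaults to 0, which cannot be told apart from a real identifier
    }

    static class WrapperIdPost {
        Long id;   // defaults to null until an identifier is assigned
    }

    /** A null identifier signals that no identifier has been assigned yet. */
    static boolean hasAssignedId(WrapperIdPost post) {
        return post.id != null;
    }

    public static void main(String[] args) {
        System.out.println(new PrimitiveIdPost().id); // prints 0
        System.out.println(new WrapperIdPost().id);   // prints null
    }
}
```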

HiLoOptimizerEvent

While the hi/lo algorithm may reduce the number of database roundtrips required to assign sequence-based identifiers, it does not interoperate well with other systems that are unaware of the underlying hi/lo strategy being used.

For this reason, you are better off using the pooled or pooled-lo optimizer generators.
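The interoperability problem can be illustrated with a simplified model. This is a sketch, not Hibernate's actual optimizer code, and it assumes an increment size of 3. In the hi/lo model the sequence increments by 1 and the application multiplies each value into a range of identifiers. In the pooled-lo model the sequence itself increments by the allocation size and its value is the first identifier of the range.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of identifier allocation; not Hibernate's actual optimizer code.
class OptimizerSketch {

    static final int INCREMENT_SIZE = 3; // assumed allocation size

    /** hi/lo model: the sequence increments by 1 and each value is multiplied into a range. */
    static List<Long> hiLoIds(long sequenceValue) {
        List<Long> ids = new ArrayList<>();
        for (int lo = 1; lo <= INCREMENT_SIZE; lo++) {
            ids.add((sequenceValue - 1) * INCREMENT_SIZE + lo);
        }
        return ids;
    }

    /** pooled-lo model: the sequence increments by INCREMENT_SIZE and its value starts the range. */
    static List<Long> pooledLoIds(long sequenceValue) {
        List<Long> ids = new ArrayList<>();
        for (int offset = 0; offset < INCREMENT_SIZE; offset++) {
            ids.add(sequenceValue + offset);
        }
        return ids;
    }

    public static void main(String[] args) {
        // hi/lo: the application reads 1, then an external client reads 2 and uses it as an id.
        System.out.println(hiLoIds(1));     // prints [1, 2, 3], which already contains 2
        // pooled-lo: the application reads 1, then an external client reads 4.
        System.out.println(pooledLoIds(1)); // prints [1, 2, 3], which does not contain 4
    }
}
```

In this model, an external client that inserts a row using the raw sequence value collides with the application under hi/lo, but not under pooled-lo.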

TableGeneratorEvent

The TABLE identifier generator is very inefficient. Not only does it require a separate database transaction as well as a separate database connection to make sure that the identifier generation process is not linked to the calling transaction, but it also employs row-level locks, which are heavy compared to the lightweight latches used by the IDENTITY or SEQUENCE database identifier generation strategies.

The TABLE generator does not even scale well when increasing the number of threads that generate entity identifiers concurrently since it puts pressure on the underlying database connection pool, and it introduces a bottleneck at the identifier table level. You should use a SEQUENCE strategy if the database supports this feature, or use an IDENTITY column instead.

For more details about why you should avoid the TABLE generator, check out this article.

Basic mapping

EnumTypeStringEvent

The Java Enum can be mapped in two ways with JPA and Hibernate. By default, the Enum ordinal is used to materialize the Enum value in the database. However, to make it more readable, some developers choose to store the Enum name instead, which has a higher memory and disk footprint.

For more details about this topic, check out this article.
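To make the footprint difference concrete, the sketch below compares the ordinal of each value with the number of bytes its name takes as UTF-8 text. The PostStatus enum is a hypothetical example, and the figures ignore any per-column overhead added by the database.

```java
import java.nio.charset.StandardCharsets;

class EnumFootprintSketch {

    // Hypothetical enum used only for illustration.
    enum PostStatus { PENDING, APPROVED, SPAM }

    /** Bytes needed to store the Enum name as UTF-8 text, excluding any column overhead. */
    static int nameBytes(PostStatus status) {
        return status.name().getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        for (PostStatus status : PostStatus.values()) {
            System.out.println(status.ordinal() + " vs " + status.name()
                + " (" + nameBytes(status) + " bytes)");
        }
    }
}
```

The ordinals of this enum (0 to 2) fit in a single byte, while the names take between 4 and 8 bytes each.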

LargeColumnEvent

Table columns should be as compact as possible. For this reason, you should always strive for using the most compact types which can accommodate all possible values stored in a given column.

If you cannot choose more compact types, you might consider moving the large columns to a separate table. You could use a one-to-one table relationship to store the large columns, especially if they are not used frequently. Also, you could use multiple entities mapped to the same database table so that you can choose which properties are to be loaded from the database based on the entity type that you are currently fetching.

In case you have annotated the large column with @Basic(fetch=LAZY), you need to also activate the bytecode enhancement lazy loading mechanism as, otherwise, the column is going to be fetched eagerly when the entity is loaded.

You should use the @DynamicUpdate annotation on the entity level so that the UPDATE statement contains only the columns that have been modified by the currently running Persistence Context. This will reduce the impact of large columns on the underlying transaction log. Otherwise, not only does the transaction log get bigger, but replication can also be affected since the log entries are propagated to follower nodes.
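The idea behind an UPDATE that contains only the modified columns can be sketched in plain Java. This is a conceptual model with hypothetical table and column names; Hibernate's own dirty checking is more involved. The sketch also assumes that at least one column has changed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Conceptual model only; it assumes at least one column has changed.
class DynamicUpdateSketch {

    /** Builds an UPDATE that lists only the columns whose values differ from the loaded state. */
    static String buildUpdate(String table, Map<String, Object> loaded, Map<String, Object> current) {
        List<String> assignments = new ArrayList<>();
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            if (!Objects.equals(loaded.get(entry.getKey()), entry.getValue())) {
                assignments.add(entry.getKey() + " = ?");
            }
        }
        return "UPDATE " + table + " SET " + String.join(", ", assignments) + " WHERE id = ?";
    }

    public static void main(String[] args) {
        Map<String, Object> loaded = new LinkedHashMap<>();
        loaded.put("title", "Draft");
        loaded.put("content", "a very large text value");

        Map<String, Object> current = new LinkedHashMap<>(loaded);
        current.put("title", "Final");

        // Only the modified column appears; the large content column is left out.
        System.out.println(buildUpdate("post", loaded, current));
        // prints UPDATE post SET title = ? WHERE id = ?
    }
}
```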

Version mapping

TimestampVersionEvent

When implementing a Concurrency Control algorithm, it is not a good idea to rely on physical or wall clock time because it is not reliable. Due to NTP synchronization, the clock time can jump backward, hence compromising the monotonic time reading property required by a concurrency control mechanism.

Therefore, you should favor using logical clocks instead, like the optimistic locking numerical-based version property.

For more details about this topic, check out this article.

IntegerVersionColumnSizeEvent

The version property used for optimistic locking should be as compact as possible since its goal is to make sure that the underlying table row version matches the current entity version property. Since an integer column type takes 4 bytes, this could be a waste of space if the entity is not changed frequently.

A version property mapped to a smallint is sufficient for the vast majority of use cases since it allows the version to change 65000 times in between the read and the write of a given database table record.

LongVersionColumnSizeEvent

The version property used for optimistic locking should be as compact as possible since its goal is to make sure that the underlying table row version matches the current entity version property. Since a long or bigint column takes 8 bytes, this could be a waste of space if the entity is not changed frequently.

A version property mapped to a smallint is sufficient for the vast majority of use cases since it allows the version to change 65000 times in between the read and the write of a given database table record.
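The size figures used by the two version column events can be checked with plain Java arithmetic, assuming that smallint corresponds to a signed 16-bit value like the Java short. The sketch below only shows Java behavior and makes no claim about how a particular database or Hibernate version handles the overflow.

```java
class VersionRangeSketch {

    /** Number of distinct values a signed 16-bit (smallint or Java short) version can hold. */
    static int distinctShortValues() {
        return Short.MAX_VALUE - Short.MIN_VALUE + 1;
    }

    /** Increments a short version, wrapping from Short.MAX_VALUE to Short.MIN_VALUE. */
    static short next(short version) {
        return (short) (version + 1);
    }

    public static void main(String[] args) {
        System.out.println(distinctShortValues());   // prints 65536
        System.out.println(Short.BYTES + " " + Integer.BYTES + " " + Long.BYTES); // prints 2 4 8
        System.out.println(next(Short.MAX_VALUE));   // prints -32768
    }
}
```

A stale version could only be mistaken for the current one if the row changed a full 65536 times between the read and the write.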

Relationship mapping

UnidirectionalOneToManyEvent

The unidirectional one-to-many association uses a link table and an additional Foreign Key column, which is not very efficient from a mapping perspective. Even worse, the generated SQL statements are not very efficient either especially when using a Java List. You should consider using a bidirectional one-to-many association instead.

For more details about this topic, check out this article.

UnidirectionalOneToManyJoinTableEvent

The unidirectional one-to-many association that uses an explicit join column does not render very efficient SQL statements since the child entity insert is executed first while the collection is processed later at the Persistence Context flush time. For this reason, instead of a single INSERT, you’d see an INSERT and an UPDATE statement being executed. You should consider using a bidirectional one-to-many association instead.

For more details about this topic, check out this article.

ElementCollectionEvent

The JPA element-collection requires all child operations to be executed from the parent side, hence, if you want to add or remove a child element, you always need to fetch the collection. For this reason, you should consider using a bidirectional one-to-many association instead or just use a @ManyToOne association and replace the collection with a query.

For more details about this topic, check out this article.

ElementCollectionArrayEvent

The JPA element-collection mapped as a Java array is treated like a bag, meaning that elements are removed and reinserted back whenever the array is changed. For this reason, you are better off using a java.util.Set which generates much more efficient SQL statements.

For more details about this topic, check out this article.

ElementCollectionListEvent

The JPA element-collection mapped as a java.util.List is treated like a bag, meaning that elements are removed and reinserted back whenever an element is removed from the List. For this reason, you are better off using a java.util.Set which generates much more efficient SQL statements.

For more details about this topic, check out this article.
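The cost of bag semantics can be sketched by listing the statements needed to remove a single element. This is a simplified model with a hypothetical post_tags table, not the SQL that Hibernate generates verbatim.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the statements issued when one element is removed; not Hibernate's code.
class BagVersusSetSketch {

    /** Bag semantics: delete every row of the collection, then reinsert the remaining elements. */
    static List<String> removeFromBag(int sizeBeforeRemoval) {
        List<String> statements = new ArrayList<>();
        statements.add("DELETE FROM post_tags WHERE post_id = ?");
        for (int i = 0; i < sizeBeforeRemoval - 1; i++) {
            statements.add("INSERT INTO post_tags (post_id, tag) VALUES (?, ?)");
        }
        return statements;
    }

    /** Set semantics: delete only the row of the removed element. */
    static List<String> removeFromSet() {
        List<String> statements = new ArrayList<>();
        statements.add("DELETE FROM post_tags WHERE post_id = ? AND tag = ?");
        return statements;
    }

    public static void main(String[] args) {
        System.out.println(removeFromBag(5).size()); // prints 5 (1 DELETE + 4 INSERTs)
        System.out.println(removeFromSet().size());  // prints 1
    }
}
```

In this model, the number of statements for the bag grows with the collection size, while the set always needs a single DELETE.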

OneToOneWithoutMapsIdEvent

The one-to-one table relationship should be mapped so that the child table Primary Key is also a Foreign Key to the parent table Primary Key. This means that the parent and the child table records share their Primary Key values.

Without using @MapsId, the one-to-one association looks more like a one-to-many table relationship where the separate Foreign Key column has a unique key constraint. For this reason, you should always favor using @MapsId when mapping a one-to-one relationship.

For more details about this topic, check out this article.
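A minimal sketch of a one-to-one mapping that shares the Primary Key via @MapsId could look as follows. The PostDetails entity is hypothetical, and Post is assumed to be an existing parent entity in the same package. The annotations come from the standard javax.persistence package (jakarta.persistence in newer JPA versions).

```java
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.MapsId;
import javax.persistence.OneToOne;

// Hypothetical child entity; Post stands for an existing parent entity in the same package.
@Entity
public class PostDetails {

    @Id
    private Long id; // no @GeneratedValue: the value is copied from the parent

    @OneToOne(fetch = FetchType.LAZY)
    @MapsId // the Primary Key column is also the Foreign Key to the parent
    private Post post;

    public Long getId() { return id; }

    public Post getPost() { return post; }

    public void setPost(Post post) { this.post = post; }
}
```

Because the child shares the parent identifier, the child can be fetched by the parent identifier without a separate Foreign Key column.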

OneToOneParentSideEvent

When mapping a one-to-one relationship, you can choose either a unidirectional or a bidirectional association. The unidirectional association will map the parent entity on the child side only while the bidirectional association will also map the child entity reference in the parent entity.

Because at load time, Hibernate has no idea whether to assign null or a Proxy for the child entity reference on the parent side, a secondary query will be executed when fetching the parent entity without join fetching the child entity association. This can lead to N+1 query issues which can, in turn, hurt application performance.

To fix this issue, you should either use unidirectional one-to-one association or use bytecode enhancement to load the entity attributes lazily.

For more details about this topic, check out this article.

ManyToManyListEvent

When mapping a many-to-many association, you should use a Java Set, not a List because the List is treated like a bag in Hibernate terminology, and a bag may generate inefficient SQL statements.

For more details about this topic, check out this article.

ManyToManyCascadeRemoveEvent

When mapping a many-to-many association, the CascadeType.REMOVE which can also be inherited from CascadeType.ALL will propagate the remove entity state transition to the other entity, which is a parent, not a child entity.

Therefore, for many-to-many associations, we are only interested in removing the associated link table records, which is done automatically when dereferencing a child from the many-to-many collection.

For more details about this topic, check out this article.

Fetch strategy mapping

EagerFetchingEvent

Using the FetchType.EAGER strategy is a very bad idea because, at mapping time, you have no indication of what might be needed when fetching an entity. The FetchType.EAGER strategy can also lead to N+1 query issues if the FetchType.EAGER association is not fetched in every JPQL or Criteria API query involving the entity defining this association.

For this reason, you should prefer using FetchType.LAZY instead, and override the lazy loading strategy at query time if that’s needed by the current business use case.

For more details about this topic, check out this article.
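The N+1 query issue mentioned above can be illustrated with an in-memory model that only counts queries. All names are hypothetical, and no database or Hibernate API is involved: one query returns N comments, and each comment then triggers a secondary query for its post.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model that only counts queries; all names are hypothetical.
class NPlusOneSketch {

    static int queryCount = 0;

    /** Models a query that returns n comments without their parent posts. */
    static List<Integer> findCommentIds(int n) {
        queryCount++;
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            ids.add(i);
        }
        return ids;
    }

    /** Models the secondary query issued to load the post of a single comment. */
    static String loadPostFor(int commentId) {
        queryCount++;
        return "post-of-" + commentId;
    }

    /** Returns the number of queries executed when every post is loaded separately. */
    static int queriesWithSecondarySelects(int n) {
        queryCount = 0;
        for (int id : findCommentIds(n)) {
            loadPostFor(id);
        }
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println(queriesWithSecondarySelects(10)); // prints 11 (1 + N)
    }
}
```

A query that fetches the association together with the root entity, as a join fetch does, would keep the count at one regardless of N.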

ExtraLazyCollectionEvent

The use of the LazyCollectionOption.EXTRA strategy indicates that the underlying collection is too large to be mapped as an entity association and that it should be replaced with an entity query instead.

For more details about this topic, check out this article.

BatchFetchingEvent

Using batch fetching to initialize an association so as to avoid the N+1 query issue is not very efficient and might indicate that the underlying collection is too large to be mapped as an entity association and that it should be replaced with an entity query instead.

SubSelectFetchingEvent

Using subselect fetching to initialize an association so as to avoid the N+1 query issue might indicate that the underlying collection is too large to be mapped as an entity association and that it should be replaced with an entity query instead.

Also, you should consider using DTO projections, which can also be fetched as graphs, if you don’t plan to modify the entities being fetched.

Cache mapping

NaturalIdCacheEvent

The natural id mapping allows you to fetch an entity by its associated business key. However, this is a two-stage process. First, the entity identifier is resolved by its associated natural id. Second, with the entity identifier resolved, the entity is fetched from the Persistence Context.

To save the first query, you can use the NaturalIdCache entity mapping.

For more details about this topic, check out this article.

Inheritance mapping

StringDiscriminatorTypeEvent

The entity discriminator column can be mapped as a numeric value, a char or a String. The String strategy, although more readable, is the least compact strategy. For this reason, you should favor the numeric-based strategy, which can also be designed to be readable via a separate description table.

For more details about this topic, check out this article.

TablePerClassInheritanceEvent

The TABLE_PER_CLASS inheritance strategy is not very efficient, especially for polymorphic queries. You should use either SINGLE_TABLE or JOINED instead if you need to materialize the entity inheritance tree in the database. Otherwise, use @MappedSuperclass if the entity inheritance is only needed on the domain model side.

Configuration events

The configuration events are related to Hibernate, and Hypersistence Optimizer can detect the following configuration issues.

Connection configuration

DriverManagerConnectionProviderEvent

The DriverManagerConnectionProvider is not suitable to be used in a production environment. Although it features a rudimentary connection pooling solution, you should consider using a professional pooling framework, like HikariCP.

DataSourceConnectionProviderEvent

The DataSourceConnectionProvider is the most flexible ConnectionProvider implementation as it allows you to chain as many DataSource proxies as you need, like connection pooling, logging, and monitoring.

Without the DataSourceConnectionProvider, it’s hard to configure a tool like datasource-proxy, which not only gives you advanced logging capabilities but can also be used to automatically detect N+1 query issues.

Moreover, when using the DataSourceConnectionProvider, you can easily integrate FlexyPool, which allows you to monitor the database connection usage and determine the right connection pool size.

ConnectionReleaseAfterStatementEvent

When using JTA, Hibernate releases the associated database connection after every executed statement. As explained in this article, this strategy can incur a response time overhead. For this reason, if you are using a stand-alone JTA transaction manager (e.g. Bitronix, Atomikos) or a Java EE application server that does not require this aggressive connection release mode, then you should switch to releasing the connection when the transaction has ended.

Since Hibernate 5.2, you can set the connection acquisition and release strategy via the hibernate.connection.handling_mode configuration property. So, if you’re using JTA, you might want to provide the following configuration property which instructs Hibernate to release the database connection only after the JPA transaction is ended.

<property 
    name="hibernate.connection.handling_mode" 
    value="delayed_acquisition_and_release_after_transaction"
/>

For Hibernate versions that are older than 5.2, you should use the hibernate.connection.release_mode configuration property instead. So, if you’re using JTA, you might want to provide the following configuration property which instructs Hibernate to release the database connection only after the JPA transaction is ended.

<property 
    name="hibernate.connection.release_mode" 
    value="after_transaction"
/>

SkipAutoCommitCheckEvent

As I explained in this article, when using RESOURCE_LOCAL transactions, which is the default for Spring and Spring Boot applications, the database connection is acquired eagerly at the beginning of the JPA transaction (when entering the @Transactional service method) because the auto-commit flag has to be checked and disabled when set to true.

If you already configured the underlying connection pool to disable the auto-commit flag when acquiring a JDBC Connection, then you should instruct Hibernate to skip the auto-commit check via the following configuration property:

<property
    name="hibernate.connection.provider_disables_autocommit"
    value="true"
/>
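If you build the configuration programmatically instead of using XML, the same properties can be collected in a Map. The sketch below is one possible way to do it: the class and method names are hypothetical, while the property names and values are the ones shown in this section. A map like this can be passed to the standard Persistence.createEntityManagerFactory(String, Map) method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; only the property names and values come from this guide.
class ConnectionSettingsSketch {

    /** JTA setting (Hibernate 5.2 or newer): release the connection after the transaction. */
    static Map<String, Object> jtaSettings() {
        Map<String, Object> settings = new HashMap<>();
        settings.put("hibernate.connection.handling_mode",
            "delayed_acquisition_and_release_after_transaction");
        return settings;
    }

    /** RESOURCE_LOCAL setting: the connection pool already disables auto-commit. */
    static Map<String, Object> resourceLocalSettings() {
        Map<String, Object> settings = new HashMap<>();
        settings.put("hibernate.connection.provider_disables_autocommit", "true");
        return settings;
    }

    public static void main(String[] args) {
        System.out.println(jtaSettings());
        System.out.println(resourceLocalSettings());
    }
}
```

Only set the auto-commit property if the connection pool really disables auto-commit, since Hibernate skips the check once the property is provided.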

Dialect configuration

DialectVersionEvent

This event tells you that your application uses an older Hibernate Dialect, and you should switch to the latest Hibernate Dialect associated with the database server version your application is using.
