How does AUTO flush strategy work in JPA and Hibernate
The Hibernate AUTO flush mode behaves differently depending on whether you bootstrap Hibernate via JPA or use the stand-alone (native) mechanism.
When using JPA, the AUTO flush mode causes all queries (JPQL, Criteria API, and native SQL) to trigger a flush prior to the query execution. However, this is not the case when bootstrapping Hibernate using the native API.
Not all queries trigger a Session flush
Many would assume that Hibernate always flushes the Session before executing any query. While this might have been a more intuitive approach, and probably closer to JPA’s AUTO FlushModeType, Hibernate tries to optimize that. If the currently executed query is not going to hit the tables affected by the pending SQL INSERT/UPDATE/DELETE statements, then the flush is not strictly required.
As stated in the reference documentation, the AUTO flush strategy may sometimes synchronize the current persistence context prior to a query execution. It would have been more intuitive if the framework authors had chosen to name it FlushMode.SOMETIMES.
JPQL/HQL and SQL
The Entity Query Language is translated to SQL by the current database dialect, so it must offer the same functionality across different database products. Since most database systems are SQL-92-compliant, the Entity Query Language is an abstraction of the most common database querying syntax.
While you can use the Entity Query Language in many use cases (selecting entities and even projections), there are times when its limited capabilities are no match for an advanced querying request. Whenever we want to make use of some specific querying techniques, we have no other option but to run native SQL queries.
Hibernate is a persistence framework. Hibernate was never meant to replace SQL. If some query is better expressed in a native query, then it’s not worth sacrificing application performance on the altar of database portability.
AUTO flush and HQL/JPQL
First, we are going to test how the AUTO flush mode behaves when an HQL query is about to be executed. For this, we define the following unrelated entities:
The test will execute the following actions:
- A Product entity is going to be persisted.
- Fetching a User entity should not trigger a Persistence Context flush.
- Querying for the Product entity, the AUTO flush should trigger the entity state transition synchronization (an INSERT statement for the product table row should be executed prior to executing the SELECT query).
    Product product = new Product();
    product.setColor("Blue");
    session.persist(product);

    assertEquals(
        0L,
        session.createQuery("select count(id) from User").getSingleResult()
    );

    assertEquals(
        product.getId(),
        session.createQuery("select p.id from Product p").getSingleResult()
    );
Giving the following SQL output:
    SELECT count(user0_.id) AS col_0_0_
    FROM USER user0_

    INSERT INTO product (color, id)
    VALUES ('Blue', 'f76f61e2-f3e3-4ea4-8f44-82e9804ceed0')

    SELECT product0_.id AS col_0_0_
    FROM product product0_
As you can see, the User select hasn’t triggered the Session flush. This is because Hibernate inspects the current query space against the pending table statements. If the currently executing query doesn’t overlap with the unflushed table statements, the flush can be safely ignored.
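The overlap check described above can be pictured with a small toy model. This is only a sketch of the idea, not Hibernate's actual implementation: the class `AutoFlushSketch` and the method `needsFlush` are hypothetical names, and tables are represented as plain strings.

```java
import java.util.Collections;
import java.util.Set;

// Toy model only: these names are hypothetical and are not part of the Hibernate API.
class AutoFlushSketch {

    // A flush is required only when the query reads at least one table
    // that still has pending (unflushed) DML statements.
    static boolean needsFlush(Set<String> pendingTables, Set<String> querySpaces) {
        return !Collections.disjoint(pendingTables, querySpaces);
    }

    public static void main(String[] args) {
        // One unflushed INSERT against the product table.
        Set<String> pending = Set.of("product");

        // Query on an unrelated table: no overlap, so no flush.
        System.out.println(needsFlush(pending, Set.of("user")));            // false

        // Query on the table with the pending INSERT: flush first.
        System.out.println(needsFlush(pending, Set.of("product")));         // true

        // Query touching both tables (subselect or join): flush first.
        System.out.println(needsFlush(pending, Set.of("user", "product"))); // true
    }
}
```

The third call mirrors a query that references both entities: as soon as one referenced table has pending changes, the whole query requires a flush.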
HQL can trigger the Product flush even for subselects:

    session.persist(product);

    assertEquals(
        0L,
        session.createQuery(
            "select count(*) " +
            "from User u " +
            "where u.favoriteColor in (" +
            "    select distinct(p.color) from Product p" +
            ")"
        ).getSingleResult()
    );
Resulting in a proper flush call:
    INSERT INTO product (color, id)
    VALUES ('Blue', '2d9d1b4f-eaee-45f1-a480-120eb66da9e8')

    SELECT count(*) AS col_0_0_
    FROM USER user0_
    WHERE user0_.favoriteColor IN (
        SELECT DISTINCT product1_.color
        FROM product product1_
    )
Hibernate can also trigger the Product flush for a theta-style join query:

    session.persist(product);

    assertEquals(
        0L,
        session.createQuery(
            "select count(*) " +
            "from User u, Product p " +
            "where u.favoriteColor = p.color"
        ).getSingleResult()
    );

Triggering the expected flush:

    INSERT INTO product (color, id)
    VALUES ('Blue', '4af0b843-da3f-4b38-aa42-1e590db186a9')

    SELECT count(*) AS col_0_0_
    FROM USER user0_
    CROSS JOIN product product1_
    WHERE user0_.favoriteColor=product1_.color
The reason why it works is that entity queries are parsed and translated to SQL queries. Hibernate cannot reference a non-existing table, therefore it always knows the database tables an HQL/JPQL query will hit.
So, Hibernate is only aware of the tables we explicitly reference in our HQL query. If the pending DML statements fire database triggers or rely on database-level cascading, Hibernate won’t be aware of the additional tables they affect. So even for HQL, the AUTO flush mode can cause consistency issues.
AUTO flush and native SQL queries
When it comes to native SQL queries, things get much more complicated. Hibernate cannot parse SQL queries because it only supports a limited database query syntax. Many database systems offer proprietary features that are beyond Hibernate Entity Query capabilities.
Querying the Product table with a native SQL query is not going to trigger the flush, causing an inconsistency issue:
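    Product product = new Product();
    product.setColor("Blue");
    session.persist(product);

    // COUNT(*) comes back as a driver-specific Number subtype,
    // so compare its int value rather than the boxed object.
    assertEquals(
        0,
        ((Number) session.createNativeQuery("SELECT COUNT(*) FROM product")
            .getSingleResult()).intValue()
    );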
    SELECT COUNT(*) FROM product

    INSERT INTO product (color, id)
    VALUES ('Blue', '718b84d8-9270-48f3-86ff-0b8da7f9af7c')
The newly persisted Product was only inserted during transaction commit because the native SQL query didn’t trigger the flush. This is a major consistency problem, one that’s hard to debug and that many developers don’t even foresee. That’s one more reason for always inspecting auto-generated SQL statements.
The same behavior is observed even for named native queries:
@NamedNativeQuery(name = "product_ids", query = "SELECT COUNT(*) FROM product")
In this case as well, the query doesn’t see the newly added Product.
So even if the SQL query is pre-loaded, Hibernate won’t extract the associated query space for matching it against the pending DML statements.
It’s worth noting that this behavior applies to the Hibernate-specific API, and not to the JPA AUTO flush mode.
Check out this article for more details.
Overruling the current flush strategy
Even if the current Session defines a default flush strategy, you can always override it on a per-query basis.
Query flush mode
The ALWAYS mode is going to flush the persistence context before any query execution (HQL or SQL). This time, Hibernate applies no optimization and all pending entity state transitions are going to be synchronized with the current database transaction.
    assertEquals(
        product.getId(),
        session.createNativeQuery("select id from product")
            .setFlushMode(FlushMode.ALWAYS)
            .getSingleResult()
    );
Instructing Hibernate which tables should be synchronized
You could also add a synchronization rule to the SQL query you are about to execute. Hibernate will then know which database tables need to be synchronized prior to executing the query. This is also useful for second-level caching.
    assertEquals(
        product.getId(),
        session.createNativeQuery("select id from product")
            .addSynchronizedEntityClass(Product.class)
            .getSingleResult()
    );
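To make the difference between the three native-query cases more tangible (plain AUTO, ALWAYS, and AUTO with a synchronization rule), here is a self-contained toy simulation. It is a sketch under simplifying assumptions and not how Hibernate is implemented: `NativeQueryFlushSketch`, `Mode`, `persist`, `flush`, and `nativeCount` are all hypothetical names, and the "database" is just an in-memory map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: every name below is hypothetical and none of it is Hibernate API.
class NativeQueryFlushSketch {

    enum Mode { AUTO, ALWAYS }

    // Rows already written to the "database", keyed by table name.
    private final Map<String, List<String>> tables = new HashMap<>();
    // Rows still queued in the "persistence context", keyed by table name.
    private final Map<String, List<String>> pending = new HashMap<>();

    void persist(String table, String row) {
        pending.computeIfAbsent(table, t -> new ArrayList<>()).add(row);
    }

    void flush() {
        pending.forEach((table, rows) ->
            tables.computeIfAbsent(table, t -> new ArrayList<>()).addAll(rows));
        pending.clear();
    }

    // Counts the rows of a table as a native query would see them.
    // declaredSpaces stands in for the tables the session was told the query reads;
    // it is empty for a native query without any synchronization rule.
    int nativeCount(String table, Mode mode, Set<String> declaredSpaces) {
        boolean overlaps = declaredSpaces.stream().anyMatch(pending::containsKey);
        if (mode == Mode.ALWAYS || overlaps) {
            flush();
        }
        return tables.getOrDefault(table, List.of()).size();
    }

    public static void main(String[] args) {
        NativeQueryFlushSketch auto = new NativeQueryFlushSketch();
        auto.persist("product", "Blue");
        // AUTO, no declared tables: the pending row is invisible (stale read).
        System.out.println(auto.nativeCount("product", Mode.AUTO, Set.of()));          // 0
        // AUTO with a declared table: the overlap triggers the flush.
        System.out.println(auto.nativeCount("product", Mode.AUTO, Set.of("product"))); // 1

        NativeQueryFlushSketch always = new NativeQueryFlushSketch();
        always.persist("product", "Blue");
        // ALWAYS: flushes regardless of what the query reads.
        System.out.println(always.nativeCount("product", Mode.ALWAYS, Set.of()));      // 1
    }
}
```

In this model, the declared table set plays the role of the synchronization rule: it gives the session the information it cannot extract from the SQL string itself.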
The AUTO flush mode is tricky, and fixing consistency issues on a per-query basis is a maintainer’s nightmare. If you decide to add a database trigger, you’ll have to check all Hibernate queries to make sure they won’t end up running against stale data.
My suggestion is to use the ALWAYS flush mode, since it’s closer to how JPA defines the AUTO FlushModeType.
Inconsistency is much more of an issue than some occasional premature flushes. While mixing DML operations and queries may cause unnecessary flushing, this situation is not that difficult to mitigate. During a transaction, it’s best to execute queries at the beginning (when no pending entity state transitions are to be synchronized) and towards the end of the transaction (when the current persistence context is going to be flushed anyway).
The entity state transition operations should be pushed towards the end of the transaction, trying to avoid interleaving them with query operations (therefore preventing a premature flush trigger).