How to customize an entity association JOIN ON clause with Hibernate @JoinFormula
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?
Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.
So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!
Introduction
As I explained in this previous article, you can map calculated properties using Hibernate @Formula
, and the value is generated at query time.
In this post, you’ll see how you can use a custom SQL fragment to customize the JOIN relationship between two entities, or an entity and a collection of embeddable types.
Domain Model
Assuming we have the following entities:
The Language
entity is used so that we know how to render the application UI. The Country
entity is used so that we can load various country-specific settings, like the associated vatRate
.
However, the Account
table does not contain any FOREIGN KEY to the Language
or Country
tables. Instead, the Account
entity only defines a Locale
attribute, and the Account
table looks as follows:
CREATE TABLE Account ( id int8 NOT NULL , credit float8 , locale VARCHAR(255) , rate float8 , PRIMARY KEY (id) )
While having two FOREIGN KEY columns: country_id
and locale_id
would be better, we’ll assume a legacy database schema that cannot be easily modified. For this reason, we need to use the
locale
column since it stores info about both language and country. What we need to do is to parse it and extract the country and language codes which can then be used to join the associated Country
and Language
tables.
While JPA doesn’t offer any support for such mapping, Hibernate has long been offering the @JoinFormula
annotation.
Therefore, the Account
mapping becomes:
@Entity(name = "Account") public class Account { @Id private Long id; private Double credit; private Double rate; private Locale locale; @ManyToOne(fetch = FetchType.LAZY) @JoinFormula( "REGEXP_REPLACE(locale, '\\w+_(\\w+)[_]?', '\\1')" ) private Country country; @ManyToOne(fetch = FetchType.LAZY) @JoinFormula( "REGEXP_REPLACE(locale, '(\\w+)_.*', '\\1')" ) private Language language; //Getters and setters omitted for brevity }
The @ManyToOne
associations are using FetchType.LAZY
because EAGER fetching is a code smell.
The @JoinFormula
uses the awesome REGEXP_REPLACE
function which is supported by PostgreSQL, Oracle or MariaDB.
For the following tests, we are going to use PostgreSQL.
Testing time
Assuming we have the following entities:
Country _US = new Country(); _US.setId( "US" ); _US.setName( "United States" ); _US.setVatRate(0.1); Country _UK = new Country(); _UK.setId( "UK" ); _UK.setName( "United Kingdom" ); _UK.setVatRate(0.2); Country _Spain = new Country(); _Spain.setId( "ES" ); _Spain.setName( "Spain" ); _Spain.setVatRate(0.21); Country _Mexico = new Country(); _Mexico.setId( "MX" ); _Mexico.setName( "Mexico" ); _Mexico.setVatRate(0.16); Language _English = new Language(); _English.setId( "en" ); _English.setName( "English" ); Language _Spanish = new Language(); _Spanish.setId( "es" ); _Spanish.setName( "Spanish" ); doInJPA( entityManager -> { entityManager.persist( _US ); entityManager.persist( _UK ); entityManager.persist( _Spain ); entityManager.persist( _Mexico ); entityManager.persist( _English ); entityManager.persist( _Spanish ); } ); doInJPA( entityManager -> { Account account1 = new Account( ); account1.setId( 1L ); account1.setCredit( 5000d ); account1.setRate( 1.25 / 100 ); account1.setLocale( Locale.US ); entityManager.persist( account1 ); Account account2 = new Account( ); account2.setId( 2L ); account2.setCredit( 200d ); account2.setRate( 1.25 / 100 ); account2.setLocale( new Locale( "es", "MX" ) ); entityManager.persist( account2 ); } );
Lazy fetching
When fetching the Account
entity, Hibernate managed to fetch the associated Country
and Language
entities:
doInJPA( entityManager -> { LOGGER.info("Fetch first Account"); Account account1 = entityManager.find( Account.class, 1L ); assertEquals( _English, account1.getLanguage()); assertEquals( _US, account1.getCountry()); LOGGER.info("Fetch second Account"); Account account2 = entityManager.find( Account.class, 2L ); assertEquals( _Spanish, account2.getLanguage()); assertEquals( _Mexico, account2.getCountry()); } );
Behind the scenes, Hibernate executes the following SQL queries:
-- Fetch first Account SELECT a.id AS id1_0_0_, a.credit AS credit2_0_0_, a.locale AS locale3_0_0_, a.rate AS rate4_0_0_, REGEXP_REPLACE( a.locale, 'w+_(w+)[_]?', '\1' ) AS formula2_0_, REGEXP_REPLACE( a.locale, '(w+)_.*', '\1' ) AS formula3_0_ FROM Account a WHERE a.id = 1 SELECT l.id AS id1_2_0_, l.name AS name2_2_0_ FROM Language l WHERE l.id = 'en' SELECT c.id AS id1_1_0_, c.name AS name2_1_0_, c.vatRate AS vatRate3_1_0_ FROM Country c WHERE c.id = 'US' -- Fetch second Account SELECT a.id AS id1_0_0_, a.credit AS credit2_0_0_, a.locale AS locale3_0_0_, a.rate AS rate4_0_0_, REGEXP_REPLACE( a.locale, 'w+_(w+)[_]?', '\1' ) AS formula2_0_, REGEXP_REPLACE( a.locale, '(w+)_.*', '\1' ) AS formula3_0_ FROM Account a WHERE a.id = 2 SELECT l.id AS id1_2_0_, l.name AS name2_2_0_ FROM Language l WHERE l.id = 'es' SELECT c.id AS id1_1_0_, c.name AS name2_1_0_, c.vatRate AS vatRate3_1_0_ FROM Country c WHERE c.id = 'MX'
Eager fetching
When JOIN FETCH
-ing the language
and the country
associations:
Account account1 = entityManager.createQuery( "select a " + "from Account a " + "join a.language l " + "join a.country c " + "where a.id = :accountId", Account.class ) .setParameter("accountId", 1L) .getSingleResult(); assertEquals( _English, account1.getLanguage()); assertEquals( _US, account1.getCountry());
Hibernate executes the following SQL query:
SELECT a.id AS id1_0_, a.credit AS credit2_0_, a.locale AS locale3_0_, a.rate AS rate4_0_, REGEXP_REPLACE(a.locale, '\w+_(\w+)[_]?', '\1') AS formula2_, REGEXP_REPLACE(a.locale, '(\w+)_.*', '\1') AS formula3_ FROM Account a INNER JOIN Language l ON REGEXP_REPLACE(a.locale, '(\w+)_.*', '\1') = l.id INNER JOIN Country c ON REGEXP_REPLACE(a.locale, '\w+_(\w+)[_]?', '\1') = c.id WHERE a.id = 1
Make sure that you use a function index for the
@JoinFormula
expressions to speed up the JOIN query execution.Otherwise, the JOIN ON clause will require a full-table scan, hence it will be slow.
Therefore, we need to add the following two indexes to our database migration scripts:
CREATE INDEX account_language_idx ON Account (REGEXP_REPLACE(locale, '(\w+)_.*', '\1')); CREATE INDEX account_country_idx ON Account (REGEXP_REPLACE(locale, '\w+_(\w+)[_]?', '\1'));
This way, when we EXPLAIN ANALYZE
the previous SQL query which joins the Language
and Country
tables with the Account
one,
PostgreSQL generates the following execution plan:
QUERY PLAN |
---|
Nested Loop (cost=0.43..24.51 rows=1 width=540) (actual time=0.094..0.095 rows=1 loops=1) |
-> Nested Loop (cost=0.29..16.34 rows=1 width=540) (actual time=0.064..0.065 rows=1 loops=1) |
-> Index Scan using account_pkey on account a (cost=0.14..8.16 rows=1 width=540) (actual time=0.015..0.015 rows=1 loops=1) |
Index Cond: (id = 1) |
-> Index Only Scan using language_pkey on language l (cost=0.14..8.16 rows=1 width=516) (actual time=0.012..0.012 rows=1 loops=1) |
Index Cond: (id = regexp_replace((a.locale)::text, ‘(\w+)_.*’::text, ‘\1’::text)) |
Heap Fetches: 1 |
-> Index Only Scan using country_pkey on country c (cost=0.14..8.16 rows=1 width=516) (actual time=0.005..0.005 rows=1 loops=1) |
-> Index Cond: (id = regexp_replace((a.locale)::text, ‘\w+_(\w+)[_]?’::text, ‘\1’::text)) |
-> Heap Fetches: 1 |
Planning time: 0.229 ms |
Execution time: 0.171 ms |
As illustrated by the above execution plan, PostgreSQL can benefit from these two function indexes as the execution plan uses an Index Only Scan
for each JOIN
relationship.
I'm running an online workshop on the 11th of October about High-Performance SQL.If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
While most of the time the @JoinColumn
annotation is the best way to express an entity association, if the database schema doesn’t have a FOREIGN KEY column you could use, then @JoinFormula
becomes very useful.
Code available on GitHub.
