Hibernate Facts: Always check Criteria API SQL queries

Criteria API is very useful for dynamically building queries, but that’s the only use case where I’d use it. Whenever you have an UI with N filters that may arrive in any M combinations, it makes sense to have an API to construct queries dynamically, since concatenating strings is always a path I’m running away from.

The question is, are you aware of the SQL queries your Criteria API generates behind the scenes? I’ve been reviewing many such queries lately, and I’ve been struck by how easy it is to get it wrong.

Let’s start from the following entities diagram:

Product

So we have a Product with a ToOne association to a WareHouseProductInfo and a ToMany association to an Image entity.

Now let’s start with this Criteria API query:

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Product> query = cb.createQuery(Product.class);
Root<Product> productRoot = query.from(Product.class);

query.select(productRoot)
	 .where(cb.and(cb.equal(productRoot.get(Product_.code), "tvCode"), 
				   cb.gt(productRoot.get(Product_.warehouseProductInfo).get(WarehouseProductInfo_.quantity), 50)));
Product product = entityManager.createQuery(query).getSingleResult();

Can you spot any issue with this previous query? Let’s check the generated SQL:

SELECT product0_.id          AS id1_14_,
       product0_.code        AS code2_14_,
       product0_.company_id  AS company_5_14_,
       product0_.importer_id AS importer6_14_,
       product0_.name        AS name3_14_,
       product0_.version     AS version4_14_
FROM   product product0_
       CROSS JOIN warehouseproductinfo warehousep1_
WHERE  product0_.id = warehousep1_.id
       AND product0_.code = ?
       AND warehousep1_.quantity > 50  

I was expecting an INNER JOIN and I got a CROSS JOIN instead. The Cartesian Product is very inefficient, and that’s what you get if you forget to properly join the associations you’re interested in filtering with on your where clauses. So, writing Criteria API is not a walk in the park after all.

Luckily, this example can be fixed as follows:

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Product> query = cb.createQuery(Product.class);
Root<Product> productRoot = query.from(Product.class);
Join<Product, WarehouseProductInfo> warehouseProductInfoJoin = productRoot.join(Product_.warehouseProductInfo);

query.select(productRoot)
	 .where(cb.and(cb.equal(productRoot.get(Product_.code), "tvCode"),
				   cb.gt(warehouseProductInfoJoin.get(WarehouseProductInfo_.quantity), 50)));
Product product = entityManager.createQuery(query).getSingleResult();

which yields the expected SQL query:

SELECT product0_.id          AS id1_14_,
       product0_.code        AS code2_14_,
       product0_.company_id  AS company_5_14_,
       product0_.importer_id AS importer6_14_,
       product0_.name        AS name3_14_,
       product0_.version     AS version4_14_
FROM   product product0_
       INNER JOIN warehouseproductinfo warehousep1_
               ON product0_.id = warehousep1_.id
WHERE  product0_.code = ?
       AND warehousep1_.quantity > 50  

So, be careful with how you define your joins in Criteria API. Now, let’s compare the previous Criteria API query to its JPAQL counterpart:

Product product = entityManager.createQuery(
	"select p " +
	"from Product p " +
	"inner join p.warehouseProductInfo w " +
	"where " +
	"   p.code = :code and " +
	"   w.quantity > :quantity ", Product.class)
.setParameter("code", "tvCode")
.setParameter("quantity", 50)
.getSingleResult();

I’ve always found the JPAQL more descriptive than Criteria API, but there are projects where the Criteria API is the default JPA querying mechanism, so it’s not only employed for dynamic filters queries, but even for the ones with fixed where clauses.

Well, you can eventually achieve the same results, but while I can predict the SQL query out of a JPAQL one, when it comes to Criteria API I am completely clueless. Whenever I review a Criteria query, I always have to run an Integration Test to check the outputted SQL, for small changes can really make big differences.

Even if Criteria API usage is imposed, you can still work around it, considering you are extra careful and you review all your queries.

Now let’s get back to one of the most exotic joining subterfuge (yet sub-optimal) Criteria query I happened to bumped into recently. If you work on a large project with many developers, you’ll inevitably run into this type of constructs. That’s one extra reason why I prefer JPAQL over Criteria API. With JPAQL you can’t get it so off-the-rails like in the following example:

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Product> query = cb.createQuery(Product.class);
Root<Product> product = query.from(Product.class);
query.select(product);
query.distinct(true);

List<Predicate> criteria = new ArrayList<Predicate>();
criteria.add(cb.like(cb.lower(product.get(Product_.name)), "%tv%"));

Subquery<Long> subQuery = query.subquery(Long.class);
Root<Image> infoRoot = subQuery.from(Image.class);
Join<Image, Product> productJoin = infoRoot.join("product");
subQuery.select(productJoin.<Long>get(Product_.id));

subQuery.where(cb.gt(infoRoot.get(Image_.index), 0));
criteria.add(cb.in(product.get(Product_.id)).value(subQuery));
query.where(cb.and(criteria.toArray(new Predicate[criteria.size()])));
return entityManager.createQuery(query).getResultList();

I find these types of queries too difficult to parse through viewing them only, but there is a sub-select smelling like trouble, so let’s see the generated SQL query:

 SELECT DISTINCT product0_.id          AS id1_14_,
                product0_.code        AS code2_14_,
                product0_.company_id  AS company_5_14_,
                product0_.importer_id AS importer6_14_,
                product0_.name        AS name3_14_,
                product0_.version     AS version4_14_
FROM   product product0_
WHERE  ( Lower(product0_.name) LIKE ? )
       AND ( product0_.id IN (SELECT product2_.id
                              FROM   image image1_
                                     INNER JOIN product product2_
                                             ON image1_.product_id =
                                                product2_.id
                              WHERE  image1_.index > 0) )  

While some use-cases call for a SQL sub-query, here it’s just completely unnecessary and it only slows your query down. But this time we were actually requiring a dynamic filtering query, so JPAQL was out of question. The only way to fix it was to write the proper Criteria query.

So here it is, after a refactoring:

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Product> query = cb.createQuery(Product.class);
Root<Image> imageRoot = query.from(Image.class);
Join<Image, Product> productJoin = imageRoot.join("product");
query.select(productJoin);
query.distinct(true);
List<Predicate> criteria = new ArrayList<Predicate>();
criteria.add(cb.like(cb.lower(productJoin.get(Product_.name)), "%tv%"));
criteria.add(cb.gt(imageRoot.get(Image_.index), 0));
query.where(cb.and(criteria.toArray(new Predicate[criteria.size()])));
return entityManager.createQuery(query).getResultList();

And now our SQL query looks much better:

 SELECT DISTINCT product1_.id          AS id1_14_,
                product1_.code        AS code2_14_,
                product1_.company_id  AS company_5_14_,
                product1_.importer_id AS importer6_14_,
                product1_.name        AS name3_14_,
                product1_.version     AS version4_14_
FROM   image image0_
       INNER JOIN product product1_
               ON image0_.product_id = product1_.id
WHERE  ( Lower(product1_.name) LIKE ? )
       AND image0_.index > 0  

I’d reasoned on why the developer would have chosen the sub-query in this particular context, and I believed it was because he hadn’t known that he could project a different entity than the Root one, in a similar fashion with a JPAQL query.

Now let’s do a DTO projection, since there are times when we don’t need fetching entire Entities, but just enough info to serve our business needs. This time we will create the following query:

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<ImageProductDTO> query = cb.createQuery(ImageProductDTO.class);
Root<Image> imageRoot = query.from(Image.class);
Join<Image, Product> productJoin = imageRoot.join(Image_.product);
query.distinct(true);
List<Predicate> criteria = new ArrayList<Predicate>();
criteria.add(cb.like(cb.lower(productJoin.get(Product_.name)), "%tv%"));
criteria.add(cb.gt(imageRoot.get(Image_.index), 0));
query.where(cb.and(criteria.toArray(new Predicate[criteria.size()])));
query.select(cb.construct(ImageProductDTO.class, imageRoot.get(Image_.name), productJoin.get(Product_.name)))
		.orderBy(cb.asc(imageRoot.get(Image_.name)));
return entityManager.createQuery(query).getResultList();

Generating a clean SQL:

SELECT DISTINCT image0_.name   AS col_0_0_,
                product1_.name AS col_1_0_
FROM   image image0_
       INNER JOIN product product1_
               ON image0_.product_id = product1_.id
WHERE  ( Lower(product1_.name) LIKE ? )
       AND image0_.index > 0
ORDER  BY image0_.name ASC  

But check the previous Criteria Query to how JOOQ builds such a query:

jooqContext
.select(IMAGE.NAME, PRODUCT.NAME)
.from(IMAGE)
.join(PRODUCT).on(IMAGE.PRODUCT_ID.equal(PRODUCT.ID))
.where(PRODUCT.NAME.likeIgnoreCase("%tv%"))
	.and(IMAGE.INDEX.greaterThan(0))
.orderBy(IMAGE.NAME.asc())
.fetch().into(ImageProductDTO.class);

This is way more readable, you don’t really have to guess what’s the output SQL query, and it even generates binding parameters, that I found extremely valuable:

SELECT "PUBLIC"."image"."name",
       "PUBLIC"."product"."name"
FROM   "PUBLIC"."image"
       JOIN "PUBLIC"."product"
         ON "PUBLIC"."image"."product_id" = "PUBLIC"."product"."id"
WHERE  ( Lower("PUBLIC"."product"."name") LIKE Lower('%tv%')
         AND "PUBLIC"."image"."index" > 0 )
ORDER  BY "PUBLIC"."image"."name" ASC  

Conclusion

The first case I showed you is one of the very first mistakes I did while trying to learn the Criteria API. I discovered that I have to be extra caution when writing such queries, since you can easily get unexpected SQL queries.

If you’ve chosen to use Criteria API for all your queries, then you might be interested in checking JOOQ too. Even if you choose JPAQL, whenever you want to build advanced dynamic filtered queries, JOOQ can better help you with that.

You will still be using a fluent API, you won’t write any String and you will get more SQL features than what Hibernate currently offers. So, whenever your use cases don’t require querying for managed Entities, you can use JOOQ instead. I like it because I can predict the generated SQL much better than with Criteria API, and when an API is easier to use, there are less “surprises” waiting to “wow” you.

Code available on GitHub.

If you have enjoyed reading my article and you’re looking forward to getting instant email notifications of my latest posts, you just need to follow my blog.

About these ads

8 thoughts on “Hibernate Facts: Always check Criteria API SQL queries

  1. It always surprises me how people can write or worse: read queries with the Criteria API. Considering how much logic is also offloaded into annotations (e.g. http://ctmmc.net/8), there could hardly be anything farther from SQL than JPA…

    • Criteria API is not that easy, indeed, and most of the time you don’t really need to return managed Entities for a dynamic query that’s needed for some specific UI use case. Usually you can get along with a simple data-snapshot, which is more efficient but then people don’t want to look outside of JPAQL ot Criteria API. We always have to be flexible, and never trade-off readability, especially in enterprise software.

      • We always have to be flexible, and never trade-off readability, especially in enterprise software.

        Interestingly, that’s precisely Gavin King’s point (and always has been). Hibernate only covers 50-70% of the problem domain:
        https://plus.google.com/+GavinKing/posts/LGJU1NorAvY

      • It’s so interesting when you discover you’ve been sharing the same view with other people, in relation to frameworks and their usage. After using Hibernate you either love it and see its strengths and weaknesses or you go bashing it for not solving all DB issues you’d have even without using JPA.

      • Yes, it’s crazy. People always rant. Given that Hibernate is completely free software, I sometimes feel ashamed of our industry. Ranting but not improving things (through contributions) is certainly not a reputable option.

        What’s most crazy is the fact that people forget Hibernate’s predecessor(s): EJB 2.0 / JDBC. No one wants to go back to that combo, right?

        I always refer to our marketing site to tell people to stop the hatred and choose the right tool for the right job: http://www.hibernate-alternative.com

      • The overall trend is faborable, since tools have been improving significantly in the past 10 years, but the core concepts haven’t changed much, that’s why I write about Transactions, Rdbms, etc.

  2. You mentioned “I was expecting an INNER JOIN and I got a CROSS JOIN instead. The Cartesian Product is very inefficient”. I think that CROSS JOIN with a WHERE clause is equivalent to INNER JOIN in almost all RDBMS and there is no performance difference too.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s