The best way to fix the Hibernate “HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!” warning message
Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Hypersistence Optimizer is that tool!
Introduction
If you’ve been using Hibernate long enough, then you surely must have seen this WARN log message when doing pagination while join-fetching multiple entities.
HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!
In this article, I’m going to show two ways you can to fix this issue.
Domain Model
Considering we have the following entities:
The Post
entity has a bidirectional @OneToMany
association with the PostComment
child entity.
Notice that both entities use the Fluent-style API. For more details about building entities using a Fluent-style API with JPA and Hibernate, check out this article.
Now, let’s assume we create 50 Post
entities, each one with several PostComment
child entities.
LocalDateTime timestamp = LocalDateTime .of( 2018, 10, 9, 12, 0, 0, 0 ); LongStream.rangeClosed(1, 50) .forEach(postId -> { Post post = new Post() .setId(postId) .setTitle( String.format("High-Performance Java Persistence - Chapter %d", postId) ) .setCreatedOn( Timestamp.valueOf(timestamp.plusMinutes(postId)) ); LongStream.rangeClosed(1, COMMENT_COUNT) .forEach(commentOffset -> { long commentId = ((postId - 1) * COMMENT_COUNT) + commentOffset; post.addComment( new PostComment() .setId(commentId) .setReview( String.format("Comment nr. %d - A must read!", commentId) ) .setCreatedOn( Timestamp.valueOf(timestamp.plusMinutes(commentId)) ) ); }); entityManager.persist(post); });
The problem
We want to fetch all Post
entities whose titles match a given pattern. However, we want to fetch the associated PostComment
entities as well.
As I explained in this article, you might be tempted to use a JPA pagination query to fetch the Post
entities while also join fetching the PostComment
entities as illustrated by the following JPQL query:
List<Post> posts = entityManager.createQuery(""" select p from Post p left join fetch p.comments where p.title like :titlePattern order by p.createdOn """, Post.class) .setParameter( "titlePattern", "High-Performance Java Persistence %" ) .setMaxResults(5) .getResultList();
We want to fetch the Post
entities along with their comments
and limit the result set to a maximum number of entries.
When running the JPQL query above, Hibernate limits the number of Post
entries, but it issues the aforementioned warning while executing the following SQL query:
-- HHH000104: firstResult/maxResults specified with collection fetch; -- applying in memory! SELECT p.id AS id1_0_0_, pc.id AS id1_1_1_, p.created_on AS created_2_0_0_, p.title AS title3_0_0_, pc.created_on AS created_2_1_1_, pc.post_id AS post_id4_1_1_, pc.review AS review3_1_1_, pc.post_id AS post_id4_1_0__, pc.id AS id1_1_0__ FROM post p LEFT OUTER JOIN post_comment pc ON p.id=pc.post_id WHERE p.title LIKE 'High-Performance Java Persistence %' ORDER BY p.created_on
Notice that the SQL query uses no pagination whatsoever. As mentioned by the HHH000104 warning message, the pagination is done in memory, which is bad.
The reason why Hibernate does the pagination in memory is that it cannot just truncate the result set using SQL-level pagination. If it did that, then the result set would be truncated in the middle of the PostComments
rows, therefore returning a Post
entity with just a subset of comments
.
Because Hibernate favors consistency, it fetches the entire result set and does the pagination in memory. However, that can be suboptimal, so what can we do about it?
Fixing the issue with two SQL queries that can fetch entities in read-write mode
The easiest way to fix this issue is to execute two queries:
. The first query will fetch the Post
entity identifiers matching the provided filtering criteria.
. The second query will use the previously extracted Post
entity identifiers to fetch the Post
and the PostComment
entities.
This approach is very easy to implement and looks as follows:
List<Long> postIds = entityManager.createQuery(""" select p.id from Post p where p.title like :titlePattern order by p.createdOn """, Long.class) .setParameter( "titlePattern", "High-Performance Java Persistence %" ) .setMaxResults(5) .getResultList(); List<Post> posts = entityManager.createQuery(""" select distinct p from Post p left join fetch p.comments where p.id in (:postIds) order by p.createdOn """, Post.class) .setParameter("postIds", postIds) .setHint( QueryHints.HINT_PASS_DISTINCT_THROUGH, false ) .getResultList(); assertEquals(5, posts.size()); Post post1 = posts.get(0); List<PostComment> comments = post1.getComments(); for (int i = 0; i < COMMENT_COUNT - 1; i++) { PostComment postComment1 = comments.get(i); assertEquals( String.format( "Comment nr. %d - A must read!", i + 1 ), postComment1.getReview() ); }
Notice the
hibernate.query.passDistinctThrough
JPA query hint we used to instruct Hibernate to prevent passing the JPQLDISTINCT
keyword to the underlying SQL query. For more details about this query hint, check out this article.
The second query also requires the ORDER BY clause, as, without it, the ordering of the Post
records will not be guaranteed.
When executing the two JPQL queries above, Hibernate generates the following SQL queries:
Query:[" SELECT p.id AS col_0_0_ FROM post p WHERE p.title LIKE ? ORDER BY p.created_on LIMIT ? "], Params:[( 'High-Performance Java Persistence %', 5 )] Query:[" SELECT p.id AS id1_0_0_, pc.id AS id1_1_1_, p.created_on AS created_2_0_0_, p.title AS title3_0_0_, pc.created_on AS created_2_1_1_, pc.post_id AS post_id4_1_1_, pc.review AS review3_1_1_, pc.post_id AS post_id4_1_0__, pc.id AS id1_1_0__ FROM post p LEFT OUTER JOIN post_comment pc ON p.id = pc.post_id WHERE p.id IN (?, ?, ?, ?, ?) ORDER BY p.created_on "], Params:[( 1, 2, 3, 4, 5 )]
That’s the easiest way to fix the issue causing the HHH000104
warning message.
Fixing the issue with one SQL query that can only fetch entities in read-only mode
As I already explained, Window Functions are the answer to many query-related problems.
So, we just need to calculate the DENSE_RANK over the result set of post
and post_comments
that match our filtering criteria and restrict the output for the first N post entries only.
For that, we need to define the following @NamedNativeQuery
along with its associated @SqlResultSetMapping
:
@NamedNativeQuery( name = "PostWithCommentByRank", query = """ SELECT * FROM ( SELECT *, DENSE_RANK() OVER ( ORDER BY "p.created_on", "p.id" ) rank FROM ( SELECT p.id AS "p.id", p.created_on AS "p.created_on", p.title AS "p.title", pc.post_id AS "pc.post_id", pc.id as "pc.id", pc.created_on AS "pc.created_on", pc.review AS "pc.review" FROM post p LEFT JOIN post_comment pc ON p.id = pc.post_id WHERE p.title LIKE :titlePattern ORDER BY p.created_on ) p_pc ) p_pc_r WHERE p_pc_r.rank <= :rank """, resultSetMapping = "PostWithCommentByRankMapping" ) @SqlResultSetMapping( name = "PostWithCommentByRankMapping", entities = { @EntityResult( entityClass = Post.class, fields = { @FieldResult(name = "id", column = "p.id"), @FieldResult(name = "createdOn", column = "p.created_on"), @FieldResult(name = "title", column = "p.title"), } ), @EntityResult( entityClass = PostComment.class, fields = { @FieldResult(name = "id", column = "pc.id"), @FieldResult(name = "createdOn", column = "pc.created_on"), @FieldResult(name = "review", column = "pc.review"), @FieldResult(name = "post", column = "pc.post_id"), } ) } )
The @NamedNativeQuery
fetches all Post
entities matching the provided title
along with their associated PostComment
child entities. The DENSE_RANK
Window Function is used to assign the rank
for each Post
and PostComment
joined record so that we can later filter just the amount of Post
records we are interested in fetching.
The SqlResultSetMapping
provides the mapping between the SQL-level column aliases and the JPA entity properties that need to be populated.
For more details about the best way to use the JPA
SqlResultSetMapping
annotation, you should read this article.
Now, we can execute the PostWithCommentByRank
@NamedNativeQuery
:
List<Post> posts = entityManager .createNamedQuery("PostWithCommentByRank") .setParameter( "titlePattern", "High-Performance Java Persistence %" ) .setParameter( "rank", 5 ) .setHint(QueryHints.HINT_READONLY, true) .unwrap(NativeQuery.class) .setResultTransformer( new DistinctPostResultTransformer(entityManager) ) .getResultList(); assertEquals(5, posts.size()); Post post1 = posts.get(0); List<PostComment> comments = post1.getComments(); for (int i = 0; i < COMMENT_COUNT - 1; i++) { PostComment postComment1 = comments.get(i); assertEquals( String.format( "Comment nr. %d - A must read!", i + 1 ), postComment1.getReview() ); }
We used the READONLY
JPA query hint to instruct Hibernate to discard the underlying entity detached state. For more details about this optimization, check out this article.
Now, by default, a native SQL query like the PostWithCommentByRank
one would fetch the Post
and the PostComment
in the same JDBC row, so we will end up with an Object[]
containing both entities.
However, we want to transform the tabular Object[]
array into a tree of parent-child entities, and for this reason, we need to use the Hibernate ResultTransformer
For more details about the ResultTransformer
, check out this article.
The DistinctPostResultTransformer
looks as follows:
public class DistinctPostResultTransformer extends BasicTransformerAdapter { private final EntityManager entityManager; public DistinctPostResultTransformer( EntityManager entityManager) { this.entityManager = entityManager; } @Override public List transformList( List list) { Map<Serializable, Identifiable> identifiableMap = new LinkedHashMap<>(list.size()); for (Object entityArray : list) { if (Object[].class.isAssignableFrom(entityArray.getClass())) { Post post = null; PostComment comment = null; Object[] tuples = (Object[]) entityArray; for (Object tuple : tuples) { if(tuple instanceof Identifiable) { entityManager.detach(tuple); if (tuple instanceof Post) { post = (Post) tuple; } else if (tuple instanceof PostComment) { comment = (PostComment) tuple; } else { throw new UnsupportedOperationException( "Tuple " + tuple.getClass() + " is not supported!" ); } } } if (post != null) { if (!identifiableMap.containsKey(post.getId())) { identifiableMap.put(post.getId(), post); post.setComments(new ArrayList<>()); } if (comment != null) { post.addComment(comment); } } } } return new ArrayList<>(identifiableMap.values()); } }
The DistinctPostResultTransformer
must detach the entities being fetched because we are overwriting the child collection and we don’t want that to be propagated as an entity state transition:
post.setComments(new ArrayList<>());
Now, not only that we can fetch both the Post
and its PostComments
with a single query, but we can even later modify these entities and merge them back in a subsequent read-write transaction:
List<Post> posts = doInJPA(entityManager -> { return entityManager .createNamedQuery("PostWithCommentByRank") .setParameter( "titlePattern", "High-Performance Java Persistence %" ) .setParameter( "rank", 2 ) .unwrap(NativeQuery.class) .setResultTransformer( new DistinctPostResultTransformer(entityManager) ) .getResultList(); }); assertEquals(2, posts.size()); Post post1 = posts.get(0); post1.addComment( new PostComment() .setId((post1.getId() - 1) * COMMENT_COUNT) .setReview("Awesome!") .setCreatedOn( Timestamp.valueOf(LocalDateTime.now()) ) ); Post post2 = posts.get(1); post2.removeComment(post2.getComments().get(0)); doInJPA(entityManager -> { entityManager.merge(post1); entityManager.merge(post2); });
And Hibernate will properly propagate the changes to the database:
INSERT INTO post_comment ( created_on, post_id, review, id ) VALUES ( '2019-01-09 10:47:32.134', 1, 'Awesome!', 0 ) DELETE FROM post_comment WHERE id = 6
Awesome, right?
If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.
Conclusion
So, to fix the HHH000104
issue, you have two options. Either you execute two queries and fetch the entities in either read-write or read-only mode, or you use a single query with Window Functions to fetch the entities in read-only mode.

Vlad, you sad “We used the READONLY JPA query hint to instruct Hibernate to discard the underlying entity detached state”, but I couldn’t find that hint being passed on the sample code.
Then, on the ResultTransformer, you called “entityManager.detach(tuple)” and I got confused. Is there any relation between the hint and this call to detach?
Thanks for spotting it. I added the hint. There is no correlation between the two. The
detach
is needed because I rewrite the collection and don’t want Hibernate to think that I want to update it.Is it possible to use the “single query” solution using HQL instead of native SQL? I am trying to use createQuery and HQL instead of NamedQuery but when this line is executed: “post.setComments (new ArrayList<>());” I get this error: “A collection with cascade = ‘all-delete-orphan’ was no longer referenced by the owning entity instance”; if I just add the “Comments” without executing that line I get an: “EntityExistsException” even though the list is empty; And if I use “clear()” in the list before adding the new “Comments” then Hibernate brings all the “Comments” from the database post again.
Use the 2 query option with JQPL. The 1 query solution uses Window Functions that are not supported by JPQL.
Which of the 2 options would have better performance?
Only the SQL execution plans can answer this question. Anything but the execution plan is just a hunch.
I thought the same, but actually you can not remove sorting from the first query since it would break sorting across multiple pages. By the way, Vlad thanks a lot for your articles, it is awesome as usual.
If you liked this article, you are going to love my High-Performance Java Persistence book.