The best way to fix the Hibernate “HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!” warning message

Introduction

If you’ve been using Hibernate long enough, then you surely must have seen this WARN log message when doing pagination while join fetching multiple entities.

HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!

In this article, I’m going to show a very elegant solution to fix this issue.

Domain Model

Considering we have the following entities:

The Post entity has a bidirectional @OneToMany association with PostComment.

@OneToMany(mappedBy = "post", cascade = CascadeType.ALL)
private List<PostComment> comments = new ArrayList<>();

The problem

Now, consider the following JPQL query:

List<Post> posts = entityManager.createQuery(
    "select p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title like :title " +
    "order by p.id", Post.class)
.setParameter("title", titlePattern)
.setMaxResults(maxResults)
.getResultList();

We want to fetch the Post entities along with their comments and limit the result set to a maximum number of entries.

When running the JPQL query above, Hibernate limits the number of Post entries, but it issues the aforementioned warning:

HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!

The reason why Hibernate does the pagination in memory as it cannot just truncate the result set using SQL-level pagination. If it did that, then the result set be truncated in the middle of the PostComments rows, therefore returning a Post entity with just a subset of comments.

Because Hibernate favors consistency, it fetches the entire result set and does the pagination in memory. However, that can be suboptimal, so what can we do about it?

The solution

As I already explained, Window Functions are the answer to many Query-related problems.

So, we just need to calculate the DENSE_RANK over the result set of post and post_comments that match our filtering criteria and restrict the output for the first N post entries only.

List<Post> posts = entityManager.createNativeQuery(
    "select p_pc_r.* " +
    "from (   " +
    "    select *, dense_rank() OVER (ORDER BY post_id) rank " +
    "    from (   " +
    "        select p.*, pc.* " +
    "        from post p  " +
    "        left join post_comment pc on p.id = pc.post_id  " +
    "        where p.title like :title " +
    "        order by p.id " +
    "    ) p_pc " +
    ") p_pc_r " +
    "where p_pc_r.rank <= :rank", Post.class)
.setParameter("title", titlePattern)
.setParameter("rank", maxResults)
.unwrap( NativeQuery.class )
.addEntity( "p", Post.class )
.addEntity( "pc", PostComment.class )
.setResultTransformer( DistinctPostResultTransformer.INSTANCE )
.getResultList();

Because this is a native SQL query, we need the DistinctPostResultTransformer to rebuild the entity graph from the tabular JDBC ResultSet:

public class DistinctPostResultTransformer 
        extends BasicTransformerAdapter {

    private static final DistinctPostResultTransformer INSTANCE  = 
            new DistinctPostResultTransformer();

    @Override
    public List transformList(List list) {
        Map<Serializable, Identifiable> identifiableMap = 
                new LinkedHashMap<>( list.size() );
                
        for ( Object entityArray : list ) {
            if ( Object[].class.isAssignableFrom( 
                    entityArray.getClass() ) ) {
                Post post = null;
                PostComment comment = null;

                Object[] tuples = (Object[]) entityArray;

                for ( Object tuple : tuples ) {
                    if(tuple instanceof Post) {
                        post = (Post) tuple;
                    }
                    else if(tuple instanceof PostComment) {
                        comment = (PostComment) tuple;
                    }
                    else {
                        throw new UnsupportedOperationException(
                            "Tuple " + tuple.getClass() + " is not supported!"
                        );
                    }
                }
                Objects.requireNonNull(post);
                Objects.requireNonNull(comment);

                if ( !identifiableMap.containsKey( post.getId() ) ) {
                    identifiableMap.put( post.getId(), post );
                    post.setComments( new ArrayList<>() );
                }
                post.addComment( comment );
            }
        }
        return new ArrayList<>( identifiableMap.values() );
    }
}

That’s it! This way, we can fetch both the Post and its PostComments with a single query.

If you enjoyed this article, I bet you are going to love my book as well.

Conclusion

Although some might fix this issue with two queries: one for selecting the Post entity identifiers for the current page and another one which fetches Post along with its comments by the previously fetched identifiers, it’s way more efficient to use a single query filtered by DENSE_RANK.

If you liked this article, you might want to subscribe to my newsletter too.

Advertisements

5 thoughts on “The best way to fix the Hibernate “HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!” warning message

  1. Another way for the DBs that have a “limit” or equivalent clause below.
    (can you markup code in comments?):

    select p_r.*, pc.*
    from (
    select p.* from post p
    where p.title like :title
    order by p.id
    fetch next :maxresults rows only
    ) p_r
    left join post_comment pc on p_r.id = pc.post_id;

    1. … but it’s actually pretty weak for Hibernate not to do that by itself. I mean, if we have to use native queries for the simplest things, why should we use Hibernate at all?

    2. When you say DB, you mean the ones that support the SQL:2008 query limit syntax. Oracle 12 is the first version to support this syntax. That being said, although Hibernate supports various limit clauses, it would have to also adapt to this depending on what joins are needed. Maybe in Hibernate 6, we will add support for this.

      1. Yeah sorry, I meant for any DB that supports the limit or fetch x rows or equivalent syntax in a subquery. A quick look through the internet showed me that it should be supported on Oracle 12, DB2, but also MySQL (some subquery types), Postgres (both with the keyword LIMIT).

        It is just an alternative way to achieve what you showed in your post. I don’t actually know which is “better”.

        And yes, I haven’t read your book indeed 🙂 I will put it on my to-read list.

        Cheers
        Janne

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s