Red Hat Bugzilla – Bug 620603
investigate root cause of "firstResult/maxResults specified with collection fetch; applying in memory!"
Last modified: 2014-01-02 15:39:47 EST
Lately, developers have started to notice the following message printed in the server log:
2010-08-02 09:32:47,478 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
According to several articles, this could be a performance bottleneck as it would indicate that **much** more data than we think is being fetched across the line, in order to apply the in-memory pagination (firstResult/maxResults) filter. My suspicions are that we're doing the exact same thing mentioned in . That is, we're trying to pagination a data set where we've also included 'FETCH JOIN' in the JPQL. From the EJB3 spec:
"The effect of applying setMaxResults or setFirstResult to a query involving fetch joins over collections is undefined" (section 3.6.1 - Query Interface)
So the error we're getting shouldn't come as a surprise if we're doing something that is explicitly undefined by spec. In this case, hibernate knows how to deal with the scenario (it fetches all the data, and applies the pagination controls in-memory), but that solution could be very expensive if the data set we're dealing with is large.
We should, thus, investigate where these potentially expensive queries are being used, so we can determine whether a fix is warranted, and what the fix should be depending on uses.
 - http://java.dzone.com/articles/hibernate-tuning-queries-using
(1:24:32 PM) ccrouch: joseph42: I wonder if this is related to the issue you were talking about this morning
(1:24:33 PM) ccrouch: https://bugzilla.redhat.com/show_bug.cgi?id=535592
(1:29:04 PM) joseph42: 535592 is definitely bad, but probably not related to 620603. keep in mind, 603 is a more recent issue, so likely due to moving parts of the back-end over to using the query generation stuff instead of hand-crafting all the JPQL like we used to.
Worth a try :-)
This warning comes out when we have paging set in the hibernate query, but the query also has join fetches. This requires hibernate to ask the DB for all records, and it applies the paging in memory. It sounds like there is an area here where we can eek out some more performance.
[master 2e5c0fe] temporarily turns the log level down from WARN to ERROR for org.hibernate.hql.ast.QueryTranslatorImpl to suppress the warning messages, since the Server log was getting flooded with them.
QE, please verify that the warnings are no longer in the Server log and that there are no other obvious regressions, then flip this back to ON_DEV so I can cherry-pick the commit to the JON3 release branch.
Tested on latest master build#787
( https://hudson.qa.jboss.com/hudson/view/Jon-2.x/job/rhq-master/787 )
Still the warning message is observed in server log:
2011-11-24 15:29:25,214 WARN [org.hibernate.hql.ast.QueryTranslatorImpl] firstResult/maxResults specified with collection fetch; applying in memory!
It seems the fix is not in the master branch.
Sorry about that. I committed it but forgot to push. It's in now:
> [master 2e5c0fe] temporarily turns the log level down from WARN to ERROR for
that git hash is not valid. this must have been the hash from the local branch??
ah.. from comment #10, the hash is f1475db
committed to the release_jon3.x branch : 3f33130
Verified in latest master build#796
The warning messages are not observed in the Server log.
Will verify in next release_jon3.x branch build.
Verified in build#111 in release_jon3.x branch
The warning messages are not observed in the Server log.
Marking as verified.
Suppressing the log messages was just a temporary fix. I'm flipping this back to NEW, so we can look into fixing the underlying issue post JON 3.0.
We use join fetch in quite a lot of places, but I would expect that Criteria fetches are typically the number one offender. When requesting optional data via someCriteria.fetchXXX(true) we add a join fetch. Also, we page by default (200 entries).
We don't always use paging and even when we do it's not always needed.
Do we know is this message is generated whenever there is a query executed with paging and join fetch? Or only when it's actually performing the extra work (i.e. fetching page 2..N)?
The obvious workaround is to, only in places identified as a real problem, avoid the join fetch by potentially making an additional round trip to gather the optional data.
We need to be cautious here.
I've been studying the impact of our use of JOIN FETCH in queries to which we
simulatenously apply limits (captured by
There are two of areas where such queries get executed:
1) Any remote API method with a PageControl can potentially cause such query
to get executed (if the query to be run by the API method includes a JOIN
FETCH - this is something I don't have a clear idea about yet - there are 122
places in the codebase that apply ordering and paging using the PageControl
and 70 occurances of "JOIN FETCH" in our domain classes, but obviously not all
of these will result in a possible JOIN FETCH with limit).
2) Any criteria query for criterias that support a fetch of additional
The first type is quite hard to determine the impact of, because the vast number of uses of the code that applies the limits to the query that are spread all over the codebase.
The second type is very easy to analyze, because it is concentrated into 2 classes: CriteriaQueryGenerator and CriteriaQueryRunner.
To try and determine the impact of the first type, I implemented a custom QueryGenerator for Hibernate that enhances the default one to log some more detailed data about the query than just the fact that it is filtered in memory (which is what the default impl in Hibernate does). Namely I logged the JPA, SQL and the stacktrace where the DB query is generated from.
With this tool, I went ahead and generated a larg(ish) inventory of 30 agents with some 20000 resources in total. 30 of those resources had alert definitions on them with several thousands fired alerts for each resource. While this dataset is a bit artificial I think it can be used to simulate the problems of some larger-scale RHQ inventories, at least when it comes to query performance.
By "casually clicking" in the UI, trying to do basic tasks like paging through alert history, firing operations, etc, I've found no evidence of occurrences of the first type of potential limited-JOIN-FETCH queries (the reason for that is that the application of limits is co-located with the application of ordering to the queries and it seems that ordering is why that method is called in the majority of cases). On the other hand, even while browsing the UI there were a number criteria queries invoked that featured a limited-JOIN-FETCH query. (For example querying for the 5 most recent alerts on a resource summary page took approx. 2.5s).
This brought me to the conclusion that the criteria queries are the major contributor to our problem.
This is a quite fortunate coincidence because fixing the issue for criteria queries alone is much simpler than trying to fix it globally.
The fix is now in a bug branch https://git.fedorahosted.org/cgit/rhq/rhq.git/log/?h=bug/620603 and I would very much like you to comment review it before merging it into master.
The merit of the fix is the following:
1) Replacing the JOIN FETCH with a manual initialization of the lazily fetched fields.
2) Optimizing the loading of such lazy fields and lazy collections by configuring the "hibernate.default_batch_fetch_size" property in persistence.xml to load the data in batches (if this setting is set to a number N then Hibernate will load the data for a lazy field for N "owning" entities at once instead of 1 by 1. I.e. setting this number to 2 will cut the number of DB roundtrips required load "children" of N "parent" entities in half compared to the trivial case where access to a "children" of a single "parent" entity would require a DB call).
To test the affects of that change, I created a very simple benchmark that basically compared the times required to load collections with differently set limits in the server with or without the patch.
The basis for the benchmark was a ResourceCriteria that tried to fetch all the resources together with their alert definitions, e.g.:
var crit = new ResourceCriteria
var results = ResourceManager.findResourcesByCriteria(crit)
Without further ado, let me present the results (based on running each test 10 times):
1) Current codebase (without the fix):
No paging: 47417.7
Paging(0, 10): 26225
Paging(0, 50): 25219.8
Paging(0, 200): 24192.1
Paging(0, 500): 24419.2
Paging(0, 2000): 27813.2
2) Fix with batch size 16:
No paging: 52745.2
Paging(0, 10): 83.2
Paging(0, 50): 192.4
Paging(0, 200): 545
Paging(0, 500): 1201
Paging(0, 2000): 4451.7
3) Fix with batch size 32:
No paging: 53401.5
Paging(0, 10): 76
Paging(0, 50): 166.5
Paging(0, 200): 503.8
Paging(0, 500): 1206.4
Paging(0, 2000): 4459
The results show some interesting "features" of the old implementation as well as some interesting facts about the fix:
1) The results of the old implementation are consistent with the behavior of JOIN FETCH with limits as described by Hibernate documentation. The collection is loaded as whole into the memory and only filtered in-memory after that fact. The difference between the "No Paging" and the rest of the results can most probably be atributted to the time required to serialize the huge result set in the "No Paging" case as opposed to the rest of the cases.
2) The times of the results with the increasing page size are much more intuitive here. The speed up over the current codebase is quite impressive. The slowdown in the no-paging scenario is actually not that bad considering the fact that it performs many more DB roundtrips than the current codebase.
3) The improvement over the 2) is within the error margin.
Author: Lukas Krejci <email@example.com>
Date: Thu May 9 15:15:02 2013 +0200
[BZ 620603] - Re-enabling the limited JOIN FETCH reporting
with a more informative message, asking the users to co-operate in
identifying the individual uses of the problematic JOIN FETCH queries that
we potentially missed.
Author: Lukas Krejci <firstname.lastname@example.org>
Date: Thu May 9 13:11:50 2013 +0200
[BZ 620603] - Don't use JOIN FETCH in criteria queries with limits.
Use the "manual" initialization of lazy fields and collections in
that case instead.
Together with the "hibernate.default_batch_fetch_size" setting in the
hibernate configuration, this offers superior performance over the join
fetch and in memory filtering of results when the number of rows in the
"primary" tables times the number of rows in the "join-fetched" table
As this is MODIFIED or ON_QA, setting milestone to ER1.
warning messages no longer observed