Created attachment 787119 [details] PNG from VisualVM Description of problem: For agents with 10,000 resources, VisualVM reports about 30% CPU usage for this one method. See attached screenshot. It's obviously not an issue with much fewer resources, but it's something to consider fixing. There are two approaches: 1. Add Map<Integer, Resource> in addition. 2. Replace the existing Map<String, Resource> map with <Integer, Resource> instead, and then when querying by UUID, just iterate. Basically the reverse of the existing situation. How reproducible: Always. See the Bug 991153 for how I created this many resources.
Created attachment 787233 [details] Proposed patch Elias, good find. Attached is a proposed patch, that I am currently also testing on RHEL. On OS/X I saw getResourceContainer() for ~1600 resources, while afterwards visualvm did not show that method in its sampling anymore. On RHEL I am on ~50ms / invocation for ~2500 resources - with the patch that seems mostly gone. Could you test that on your large result set? A different implementation could also just hold a: Map<resourceId,uuid> and then the getComponent would be String uuid = resourceContainersById.get(resourceId); ResourceContainer rc = resourceContainers.get(uuid); I think it may also be beneticial to initialize the HashSet with a larger size in org.rhq.core.pc.measurement.MeasurementManager#getNextScheduledSet Set<ScheduledMeasurementInfo> nextScheduledSet = new HashSet<ScheduledMeasurementInfo>(this.scheduledRequests.size()); Not sure if the complete original size makes sense, but the default is too small and will end up in repeated enlarging of the set.
( Sampling may not be that scientific, but at the end I also see some 30% of the time spent in getResourceContainer() ). One could try to validate this with a profiler.
I wanted to mention I have roughly about 20,000 metrics per minute per agent, so it's not a typical configuration at all. So to see this as an issue you might have to increase the number of metrics. As for the OS, it is RHEL 6 on HP ProLiant DL360 G7 with 32GB of memory.
Well, the number of calls to getResourceContainer() scales with the number of getValues() calls, but not directly with the number of metrics, as a huge number of them may be gathered in one go and on one resource. I was running that with only 2.5k resources, but 60k metrics per minute. As you mentioned before that you have 10k resources, I was thinking if you could test that in said env.
Created attachment 788404 [details] Proposed patch An updated proposed patch, that uses less memory by keeping a mapping resource id -> uuid instead of resource id -> container. getResourceContainer(resourceId) now first tries to look up by resource id. If that fails, we search through the containers and then add the found mapping resource id -> uuid to above map. Some other adjustments have been made especially in the calls to check if debug logging is enabled. Also hash set allocation size is now much higher to prevent enlarging the backing map during set.add() calls.
This has been pushed to master as 81b37ad -- instead of a classical map, we now use a Trove Int->Object map and key on int instead of Integer. That saves around 1MB heap relative to a classical HashMap for 40k resources. With 40k resources the memory usage of this map is ~1.2MB
Bulk closing of 4.10 issues. If an issue is not solved for you, please open a new BZ (or clone the existing one) with a version designator of 4.10.