Bug 788638
Summary: | as-5 plugin consumes too much memory for managed components | |||
---|---|---|---|---|
Product: | [Other] RHQ Project | Reporter: | John Sanda <jsanda> | |
Component: | Plugins | Assignee: | John Sanda <jsanda> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> | |
Severity: | high | Docs Contact: | ||
Priority: | urgent | |||
Version: | 4.2 | CC: | ccrouch, hrupp, jlivings, loleary | |
Target Milestone: | --- | |||
Target Release: | RHQ 4.3.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 790908 | Environment: | ||
Last Closed: | 2013-09-01 10:07:35 UTC | Type: | --- | |
Bug Blocks: | 790908 |
Description
John Sanda
2012-02-08 16:54:04 UTC
Another big property, relatively speaking, for EJB2Component is one named URL. This is another one we do not use. In my local environment it is about 50 KB. To put it in perspective, if we can filter out the URL and BeanMetaData properties, we would reduce the footprint of the EJB2Component by about 205 KB (in my environment). That would be a reduction in heap usage of over 75%.

I need to make a slight correction to my previous comment. I got the property name wrong. The correct property name is EJBModule. URL is a nested property inside of EJBModule. The size of 50 KB is correct though.

We have seen this issue with EJB 3 resources as well. In one heap dump I saw EJBComponents consuming over 9 MB of heap. The deployment property of the ManagedComponent corresponding to the EJB3Component consumes the bulk of that 9 MB.

This issue is not specific to EJB2 or EJB3 resources; however, it is most likely to manifest itself with them because an application will typically contain many more EJB resources than WAR, EAR, data source, and similar resources. The problem is not specific to any one of these types because EJB2Component, EJB3Component, and a number of other types inherit from a common base class, ManagedComponentComponent. It is in ManagedComponentComponent that we cache the managed objects we get back from the profile service.

The fix for this has been pushed to master.

master commit hashes:
088d622f5f695a84f0f60d7ecfe087182d52b7df
16e393eea505f38dac2e29c72d3ebaec6a477fdd
4ad3388b4081b9689699c2d705f9162ab33fc37e
b973af82febbaba656e06209af7922fcb1891df8

There are two changes with these commits. First, ManagedComponentComponent no longer caches the ManagedComponent it gets back from the profile service. ManagedComponentComponent is the base class for most of the service types in the plugin, including all EJB 2/3 resource types, so this change effectively addresses the high memory usage issues described in previous comments.
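The idea behind the first change (stop holding on to the object returned by the profile service) can be illustrated with a minimal sketch. This is not the plugin's code: HeavyComponent, OnDemandComponent, and the Supplier-based lookup are hypothetical stand-ins, and the byte array only models bulky properties such as EJBModule or BeanMetaData.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the large object returned by the profile service.
final class HeavyComponent {
    final String runState;
    final byte[] payload; // models bulky properties such as EJBModule or BeanMetaData

    HeavyComponent(String runState, int payloadBytes) {
        this.runState = runState;
        this.payload = new byte[payloadBytes];
    }
}

// Hypothetical resource component that keeps no reference to HeavyComponent
// between calls; only the small run state value is retained.
final class OnDemandComponent {
    private final Supplier<HeavyComponent> profileServiceLookup; // assumed lookup hook
    private String lastRunState;

    OnDemandComponent(Supplier<HeavyComponent> profileServiceLookup) {
        this.profileServiceLookup = profileServiceLookup;
    }

    // Example operation that needs the full component: load it, use it, drop it.
    int readPayloadSize() {
        HeavyComponent fresh = load();
        return fresh.payload.length;
    }

    String lastRunState() {
        return lastRunState;
    }

    // Every load refreshes the retained run state; the heavy object itself
    // becomes garbage once the caller is done with it.
    private HeavyComponent load() {
        HeavyComponent fresh = profileServiceLookup.get();
        lastRunState = fresh.runState;
        return fresh;
    }
}
```

The trade-off in a design like this is more round trips to the profile service in exchange for a much smaller retained heap per resource.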
Secondly, ManagedComponentComponent.getAvailability has been refactored to avoid the high CPU utilization issues that were addressed in bug 647571. There is now a configurable interval that determines when the data used for availability checks needs to be refreshed, i.e., when the managed component is reloaded from the profile service. That data, which is the RunState property of the ManagedComponent, is updated any time the ManagedComponent is retrieved from the profile service, for example during metric collections. The interval is reset as well. The interval is configurable from the top-level application server resource, which has a new plugin configuration property named serviceAvailabilityRefreshInterval that defaults to 15 minutes.

For QE testing, we first want to make sure these changes have not introduced any regressions. We want to make sure metric collections, availability checks, etc. still behave as expected. If QE wants to do any performance testing involving large numbers of EJBs, please see https://fedorahosted.org/pipermail/rhq-devel/2012-February/001481.html and feel free to contact me directly via email or IRC for additional help with getting set up.

QE might also want to test the new plugin configuration property, serviceAvailabilityRefreshInterval, which belongs to the JBossAS Server resource type. The following could be done to test it:

1. For some child types of JBossAS Server (e.g., EJB types, data sources, etc.), disable metric collections.
2. Set serviceAvailabilityRefreshInterval to a really low value, like 1 minute.
3. On your agent, enable debug logging on the class org.rhq.plugins.jbossas5.ManagedComponentComponent.

When an availability check is done and the refresh interval has been exceeded, as it should be with an interval of 1 minute, you should see log messages of the form, "The availability refresh interval for [resourceKey: <resource_key>, type: <resource_type>, name: <component_name>] has been exceeded by....Reloading managed component."
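The refresh-interval behavior described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the actual ManagedComponentComponent code: AvailabilityTracker, its method names, the injected clock, and the use of the string "RUNNING" for an available component are all hypothetical. Only the idea of a configurable interval that is reset whenever the component is retrieved comes from the comment.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper: answers availability checks from the last known run
// state and reloads only when the refresh interval has been exceeded.
final class AvailabilityTracker {
    private final Supplier<String> runStateLoader; // assumed: reloads the component, returns its run state
    private final LongSupplier clock;              // current time in milliseconds (injectable for tests)
    private final long refreshIntervalMillis;      // e.g. 15 minutes by default in the real plugin

    private String lastRunState;
    private long lastRefreshMillis;
    private boolean loaded;

    AvailabilityTracker(Supplier<String> runStateLoader, LongSupplier clock, long refreshIntervalMillis) {
        this.runStateLoader = runStateLoader;
        this.clock = clock;
        this.refreshIntervalMillis = refreshIntervalMillis;
    }

    // Called whenever the component was retrieved for any reason (for example
    // a metric collection): store the fresh run state and reset the interval.
    void recordRunState(String runState) {
        lastRunState = runState;
        lastRefreshMillis = clock.getAsLong();
        loaded = true;
    }

    // Availability check: reload only if nothing is known yet or the data is
    // older than the refresh interval; otherwise reuse the last run state.
    boolean isUp() {
        long elapsed = clock.getAsLong() - lastRefreshMillis;
        if (!loaded || elapsed > refreshIntervalMillis) {
            recordRunState(runStateLoader.get());
        }
        return "RUNNING".equals(lastRunState);
    }
}
```

With a 1 minute interval and metric collections disabled, every availability check more than a minute after the previous reload would trigger a reload in this sketch, which corresponds to the situation in which the debug log message is expected.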
This bug is still marked as ON_QA, but fixes appear to be included in RHQ 4.3 and 4.4. Did the status need to be flipped over, or am I mis-reading the git log and the fixes aren't included?

(In reply to comment #6)
> This bug is still marked as ON_QA, but fixes appear to be included in RHQ
> 4.3 and 4.4. Did the status need to be flipped over, or am I mis-reading the
> git log and the fixes aren't included?

The status is correct. This bug has been fixed in RHQ 4.3. It remains ON_QA until it is VERIFIED as fixed. The product version of this bug is Bug 790908 which has been VERIFIED in JON 3.0.1.

I think that the status just did not get updated. Moving to VERIFIED.

(In reply to comment #8)
> I think that the status just did not get updated. Moving to VERIFIED.

QE hasn't VERIFIED this yet. It should remain ON_QA until QE verifies.

I changed to VERIFIED on the basis that bug 790908 is a clone of this bug and has already been verified by QE. Since this fix has already been included in a community release and has been verified by QE on a product release branch, should I expect QE to go back and verify this? For now I will move back to ON_QA until this gets sorted out.

Bulk closing of items that are on_qa and in old RHQ releases, which have been out for a long time and where the issue has not been re-opened since.