Bug 811696 - High Agent CPU utilization after enabling certain Metric Collection Templates
Status: CLOSED CURRENTRELEASE
Product: JBoss Operations Network
Classification: JBoss
Component: Agent
Version: JON 3.0.0
Hardware: x86_64 Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ER01
Target Release: JON 3.2.0
Assigned To: Jay Shaughnessy
QA Contact: Mike Foley
Depends On:
Blocks: 812968 813917
Reported: 2012-04-11 14:39 EDT by David van Balen
Modified: 2014-01-02 15:38 EST

Doc Type: Bug Fix
Clones: 812968 813917
Type: Bug
Attachments
inventory.xml from lenovo (332.57 KB, application/xml)
2012-04-16 15:45 EDT, David van Balen
RHEL VM inventory file (283.33 KB, application/xml)
2012-04-16 16:02 EDT, David van Balen
Description David van Balen 2012-04-11 14:39:31 EDT
Description of problem: If certain Metric Collection Templates are enabled for Tomcat Web Application (WAR) -- e.g. "Currently Active Sessions", "Processing Errors" or "Requests served" -- the RHQ agent will start displaying very high CPU utilization soon after. Running top on the agent's VM shows "Cpu(s)" at 51.7%us and the RHQ Agent process has a cpu value of 102-103% (even higher with more than one metric template enabled).

This appears to apply only to an agent that is monitoring a Tomcat/EWS server. An agent on a different server with no Tomcat/EWS instances didn't seem to have the same problem. It also doesn't seem to apply to the "Processing Errors per Minute" and "Requests served per Minute" templates: I have both enabled with a collection interval of 20 minutes and they don't seem to be causing problems.


Version-Release number of selected component (if applicable): JON 3.0.0.GA, JON Agent 4.2.0.JON300.GA, RHEL 5.5


How reproducible: Always


Steps to Reproduce:
1. Install/start JON server and an agent.
2. Install/start Tomcat/EWS (Tomcat 6)
3. Import Agent and Tomcat/EWS into JON server inventory
4. In the JON server UI, navigate to Administration->Metric Collection Templates->Tomcat Server->Tomcat Virtual Host->Tomcat Web Application (WAR)
5. Enable one of the problem metric templates (e.g. "Currently Active Sessions", "Processing Errors" or "Requests served"). The collection interval can be 10, 20 or 40 minutes; the results should be the same.
6. Navigate to the agent in the JON server UI and restart it.
7. Run top on the agent's server/VM
8. Within a minute or two, the agent's CPU utilization should increase substantially.
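
For steps 7 and 8, the agent's CPU usage can also be sampled without top. The following is only a minimal sketch, assuming a Linux host with /proc mounted and that the agent's process id is already known; the function names are hypothetical and not part of RHQ or JON. It reads utime and stime (fields 14 and 15 of /proc/<pid>/stat) twice and converts the difference into a percentage where 100% means one fully busy CPU.

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may contain spaces,
    # so only split what follows the last ')'.
    fields = stat_line[stat_line.rindex(")") + 2:].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU usage over the interval; 100.0 means one CPU was fully busy."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def sample(pid, interval_s=5.0):
    """Sample the CPU usage of process `pid` over `interval_s` seconds."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")

    def read():
        with open(f"/proc/{pid}/stat") as f:
            return cpu_ticks(f.read())

    before = read()
    time.sleep(interval_s)
    return cpu_percent(before, read(), interval_s, ticks_per_s)
```

Calling sample(<agent pid>) a few times after the restart should show whether usage stays near or above 100%.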
  
Actual results: RHQ Agent creates very high CPU load


Expected results: RHQ Agent should continue to create reasonable CPU load


Additional info: /proc/cpuinfo on the agent's VM reports two CPUs of type Intel Xeon X5690 @ 3.47GHz.
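
The two figures from top in the description are consistent with each other: by default top reports per-process CPU relative to a single CPU, so a value above 100% is possible on a multi-CPU machine, while the "Cpu(s)" line is averaged over all CPUs. A small worked check, assuming the two CPUs reported above (the function name is made up for illustration):

```python
def system_wide_share(process_pct, cpu_count):
    """Convert a per-process figure (100% = one full CPU) into a share of the whole machine."""
    return process_pct / cpu_count


# 103% of one CPU spread over the VM's two CPUs
print(system_wide_share(103.0, 2))
```

The result, 51.5%, is close to the reported 51.7%us, which suggests the agent process accounts for nearly all of the user-space CPU time on the VM.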
Comment 1 David van Balen 2012-04-13 19:24:53 EDT
I see the same results on a Lenovo laptop with a quad core i7 (/proc/cpuinfo shows four of the following: Intel(R) Core(TM) i7 CPU       M 620  @ 2.67GHz).
Comment 2 Charles Crouch 2012-04-16 12:39:27 EDT
Hi David
Thanks for the bug report. Can you supply some more info:

-Full version of EWS being monitored
-Java version running JON agent and Java version running EWS
-Can you attach a copy of the inventory.xml from underneath the JON agent install
-How long does the high CPU load last?
Comment 3 David van Balen 2012-04-16 15:05:40 EDT
Where in the agent's directory structure should the inventory.xml file be located? I ran a find in both locations and nothing came up.

As for the other questions:

On RHEL6 VM:

EWS 1.0.2

java version "1.6.0_24"
Java (TM) SE Runtime Environment (build 1.6.0_4-b07)
Java HotSpot (TM) 64-Bit Server VM (build 19.1-b02, mixed mode)

On lenovo laptop running Fedora 15:

EWS 1.0.2-RHEL6-i386

java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.6) (fedora-63.1.10.6.fc15-i386)
OpenJDK Server VM (build 20.0-b11, mixed mode)


CPU load seems to continue indefinitely, as long as those EWS/Tomcat metrics are enabled. Even after disabling them, I had to restart the agent service for CPU usage to subside. Note that restarting the service meant invoking rhq-agent-wrapper.sh on the server/laptop; simply restarting the agent through the JON server UI didn't work.
Comment 4 David van Balen 2012-04-16 15:45:53 EDT
Created attachment 577812
inventory.xml from lenovo
Comment 5 David van Balen 2012-04-16 15:46:40 EDT
Sorry, forgot you have to generate the inventory.xml file. I uploaded the file for the lenovo. I'll upload it for the VM in a bit.
Comment 6 David van Balen 2012-04-16 16:02:38 EDT
Created attachment 577819
RHEL VM inventory file
Comment 9 Jay Shaughnessy 2012-04-18 23:29:02 EDT

This has been fixed upstream. See bug 812968.
Comment 10 Larry O'Leary 2013-09-06 10:31:42 EDT
As this is MODIFIED or ON_QA, setting milestone to ER1.
Comment 11 Armine Hovsepyan 2013-09-19 12:00:23 EDT
Bug was verified upstream and re-tested with JON 3.2 ER1: no regression. CPU usage is high only during start/restart and then drops back to ~1.2%.
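
One way to tell the startup spike seen in verification apart from the original sustained load is to check whether a series of CPU samples settles below a threshold. This is only an illustrative sketch: the function name, the sample values and the 5% threshold are assumptions, not measurements from this bug.

```python
def settle_time(samples, threshold_pct):
    """Return the timestamp from which CPU usage stays at or below the threshold.

    `samples` is a list of (seconds_since_start, cpu_percent) pairs in time order.
    Returns None if the last sample is still above the threshold.
    """
    settled = None
    # Walk backwards until the first sample that exceeds the threshold.
    for seconds, pct in reversed(samples):
        if pct > threshold_pct:
            break
        settled = seconds
    return settled


# Made-up samples: a startup spike that settles, and a load that never does.
fixed = [(0, 95.0), (10, 80.0), (20, 3.0), (30, 1.2), (40, 1.3)]
broken = [(0, 95.0), (60, 102.0), (120, 103.0)]
print(settle_time(fixed, 5.0), settle_time(broken, 5.0))
```

With the fix, a series like the first one settles shortly after startup; the behavior originally reported corresponds to the second series, which never settles.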
