Bug 630165

Summary: JBoss AS 5 plug-in uses very aggressive metric collection schedule for Queues
Product: [Other] RHQ Project Reporter: Larry O'Leary <loleary>
Component: Plugins Assignee: RHQ Project Maintainer <rhq-maint>
Status: ON_QA --- QA Contact: Mike Foley <mfoley>
Severity: medium Docs Contact:
Priority: low    
Version: 3.0.0   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment: JON 2.4.0.GA, SOA-P 5.0.0.GA. Single host (JON Server, JON Agent, and SOA-P instance). Inventory contains 1 Platform, 1 JON Server, 1 JON Agent, 1 SOA-P Server.
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---

Description Larry O'Leary 2010-09-03 17:05:53 EDT
Customer has experienced a major performance issue with metric collection. An agent whose inventory contained only 1 RHQ Agent, 1 RHQ Server, and 1 SOA-P Server was spending more than 50 seconds collecting metrics for the 60-second metric collection schedule. As a result, metric collection on a 30-second schedule was delayed by about 80 seconds. After investigation, the conclusion is that too many metrics were being collected on the 60-second schedule. There was no CPU or network bottleneck; there were simply too many individual metrics to collect. The cause was directly related to JBoss AS 5 Queues, of which there were approximately 80. The default metric schedule configuration for Queues appears to collect a large number of metrics at very short intervals:

	Name	Description	Type	Enabled	Collection Interval
	Consumer Count	The number of consumers on the queue	MEASUREMENT	Yes	00:01:00
	Count	The total message count since startup or last counter reset	MEASUREMENT	Yes	00:01:00
	Count Delta	The message count delta since last method call	MEASUREMENT	Yes	00:01:00
	Created Programmatically	Was this queue created programmatically? If Yes, the queue will not survive a restart of the application server. If No, the queue was created via a deployment XML file.	TRAIT	Yes	00:10:00
	Delivering Count	The number of messages currently being delivered	MEASUREMENT	Yes	00:01:00
	Depth	The current message count of pending messages within the queue waiting for dispatch	MEASUREMENT	Yes	00:01:00
	Depth Delta	The message count delta of pending messages since last method call	MEASUREMENT	Yes	00:01:00
	Message Count	The number of messages in the queue	MEASUREMENT	Yes	00:01:00
	Message Counter History Day Limit	This queue's message counter history day limit - <0: unlimited, =0: history disabled, >0: maximum day count	TRAIT	Yes	00:10:00
	Run State	Run State	TRAIT	Yes	00:00:30
	Scheduled Message Count	The number of scheduled messages in the queue	MEASUREMENT	Yes	00:01:00
	Time Last Update	The timestamp of the last message add	MEASUREMENT	Yes	00:01:00
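
To put a number on "too many individual metrics", the schedules listed above can be tallied for the reported queue count. This is only a back-of-the-envelope sketch: the intervals are copied from the table, the queue count of 80 is the approximate figure from this report, and the function names are hypothetical helpers, not part of RHQ.

```python
# Default Queue schedules from the table above: (metric name, interval in seconds).
QUEUE_SCHEDULES = [
    ("Consumer Count", 60),
    ("Count", 60),
    ("Count Delta", 60),
    ("Created Programmatically", 600),
    ("Delivering Count", 60),
    ("Depth", 60),
    ("Depth Delta", 60),
    ("Message Count", 60),
    ("Message Counter History Day Limit", 600),
    ("Run State", 30),
    ("Scheduled Message Count", 60),
    ("Time Last Update", 60),
]

QUEUE_COUNT = 80  # "approximately 80" queues, per the report


def batch_size(schedules, resource_count, interval):
    """Number of collections that share one interval across all resources."""
    per_resource = sum(1 for _, i in schedules if i == interval)
    return resource_count * per_resource


def collections_per_minute(schedules, resource_count):
    """Average number of collections per minute across all resources."""
    per_resource = sum(60 / interval for _, interval in schedules)
    return resource_count * per_resource


if __name__ == "__main__":
    print("60-second batch size:", batch_size(QUEUE_SCHEDULES, QUEUE_COUNT, 60))
    print("average per minute:", round(collections_per_minute(QUEUE_SCHEDULES, QUEUE_COUNT), 1))
```

With these inputs, the queues alone account for 720 collections on the 60-second schedule (9 measurements x 80 queues) and roughly 896 collections per minute once the 30-second and 10-minute traits are included.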

The recommendation is to increase these schedules to 5 minutes or even longer, or at least to turn some of them off.  Additionally, much of this issue could be resolved by simply making metric collection schedules run in a non-synchronous fashion.
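
The effect of both suggestions can be illustrated with a small simulation. This is a sketch under stated assumptions, not a description of how the RHQ agent actually schedules collections: "non-synchronous" is read here as staggering the start offsets of the schedules, the figure of 720 schedules is 9 one-minute measurements times approximately 80 queues, and `peak_per_window` is a hypothetical helper.

```python
def peak_per_window(offsets, interval, window=60, horizon=3600):
    """Largest number of collections falling due in any aligned window.

    offsets  -- first due time (seconds) of each schedule
    interval -- collection interval (seconds) shared by all schedules
    """
    buckets = [0] * (horizon // window)
    for offset in offsets:
        t = offset
        while t < horizon:
            buckets[t // window] += 1
            t += interval
    return max(buckets)


SCHEDULE_COUNT = 80 * 9  # assumed: ~80 queues x 9 measurements

# All schedules fire together at t = 0 and every interval thereafter.
synchronized = [0] * SCHEDULE_COUNT


def staggered(interval):
    """Spread the start offsets evenly across one interval."""
    return [(i * interval) // SCHEDULE_COUNT for i in range(SCHEDULE_COUNT)]


if __name__ == "__main__":
    print("1 min, synchronized:", peak_per_window(synchronized, 60))
    print("5 min, synchronized:", peak_per_window(synchronized, 300))
    print("5 min, staggered:   ", peak_per_window(staggered(300), 300))
```

In this model, lengthening the interval to 5 minutes while keeping the schedules synchronized leaves the burst at 720 collections, just less often. Staggering the same schedules across the 5-minute interval lowers the peak to 144 collections in any one-minute window.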
Comment 2 Charles Crouch 2012-03-08 16:44:56 EST
Metric schedules were reviewed as part of JON 3.0, and no metrics should be collected at a 1-minute interval any longer.