Bug 923210 - High CPU usage in RHQ 4.6 with Ignored resources
Summary: High CPU usage in RHQ 4.6 with Ignored resources
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Agent, Communications Subsystem, Inventory, Operations, Performance, Monitoring
Version: 4.6
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHQ 4.7
Assignee: John Mazzitelli
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: RHQ-1
Reported: 2013-03-19 12:39 UTC by Stian Lund
Modified: 2014-02-25 13:07 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-02-25 13:07:37 UTC
Embargoed:


Attachments:
Command trace on the RHQ Agent. (127.00 KB, application/x-zip-compressed)
2013-03-19 12:39 UTC, Stian Lund

Description Stian Lund 2013-03-19 12:39:16 UTC
Created attachment 712631
Command trace on the RHQ Agent.

Description of problem:

RHQ 4.6 shows high CPU usage when resources are Ignored and the plugin that defines their resource type is also disabled.

Tested on RHEL 6.2 and on Fedora but should be OS independent.

The high CPU usage is caused by the RHQ Server constantly polling the RHQ Agents about these resources, even though they are set to Ignored.

The symptoms are increased CPU load and network traffic, as well as constant requests to DiscoveryServerService.getResourcesAsList() on the agent for the resources that are in the Ignored list.
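
One rough way to confirm the symptom from a command trace is to count how often getResourcesAsList appears in it. The sketch below is only an illustration: the helper name count_method_calls and the sample lines are hypothetical, and the real format of the attached trace may differ, so it matches on the method name alone.

```python
from typing import Iterable


def count_method_calls(lines: Iterable[str], method: str = "getResourcesAsList") -> int:
    """Count the trace lines that mention the given method name."""
    return sum(1 for line in lines if method in line)


# Hypothetical trace lines, made up for illustration only.
# The attached command trace may use a different layout.
sample = [
    "command: DiscoveryServerService.getResourcesAsList(...)",
    "command: OtherService.someOtherCall(...)",
    "command: DiscoveryServerService.getResourcesAsList(...)",
]

print(count_method_calls(sample))
```

For a real trace, pass an open file object instead of the sample list. A count that keeps growing while the resources stay Ignored would match the behavior described above.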

See the discussion on the community forums, including images of CPU (User load) on the RHQ server:

https://community.jboss.org/message/803572?tstart=0


Version-Release number of selected component (if applicable):

RHQ 4.6

How reproducible:

Reproducible.

Steps to Reproduce:

1. Uninventory some OS services like Cron, Cobbler, GRUB, OpenSSH, Samba, Postfix and wait until they show up in discovery queue.

2. Do an Ignore on the discovered resources.

3. Disable the Agent Plugins for these services.

4. Do an Update Plugins on the RHQ Agent(s) so the agents have only the enabled plugins.

5. Possibly do a restart of the RHQ server (not sure if needed).

Simpler reproduction (quoting John Mazzitelli):
"I replicated this and I see some odd behavior stepping through the code using your replication steps. Things don't have to be committed into inventory first - you can just start a clean server and agent, then, ignore a resource (can be anything) and then disable that plugin that defined the type of resource you just ignored. Once you do that, this odd behavior happens."

  
Actual results:

Constant polling of Ignored resources, leading to high network traffic and high CPU usage on the RHQ server. The load depends on the number of platforms: it increases gradually as more agents with Ignored resources are connected.

Expected results:

One would expect that simply having a resource in the Ignore list and disabling the agent plugin would not cause excessive CPU usage or constant polling of these resources.






Additional info:

Comment 1 John Mazzitelli 2013-03-29 00:44:38 UTC
the latest code in branch bug/rhq-1 has this fixed.

Comment 7 John Mazzitelli 2013-04-09 21:47:49 UTC
big commit to master: a75b8c371d6f8ce096adc0ddd9bf00d2d753b500

this addresses BZ 535289 but also should fix this issue, too.

Comment 8 Heiko W. Rupp 2014-02-25 13:07:37 UTC
Was implemented in RHQ 4.7 already - closing

