Bug 923210 - High CPU usage in RHQ 4.6 with Ignored resources
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Agent, Communications Subsystem, Inventory, Operations, Performance, Monitoring
Version: 4.6
Hardware: All Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHQ 4.7
Assigned To: John Mazzitelli
QA Contact: Mike Foley
Depends On:
Blocks: RHQ-1
Reported: 2013-03-19 08:39 EDT by Stian Lund
Modified: 2014-02-25 08:07 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-02-25 08:07:37 EST
Type: Bug


Attachments
Command trace on the RHQ Agent. (127.00 KB, application/x-zip-compressed)
2013-03-19 08:39 EDT, Stian Lund

Description Stian Lund 2013-03-19 08:39:16 EDT
Created attachment 712631
Command trace on the RHQ Agent.

Description of problem:

RHQ 4.6 shows high CPU usage when resources are Ignored and the plugin that defines their resource type is also disabled.

Tested on RHEL 6.2 and on Fedora but should be OS independent.

The high CPU usage is caused by the RHQ Server constantly polling the RHQ Agents about these resources, even though they are set to Ignored.

The symptoms are increased CPU load and network traffic, together with constant requests to DiscoveryServerService.getResourcesAsList() on the agent for the resources that are in the Ignored list.
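
One way to confirm the symptom is to count how often the call shows up in an agent command trace. The following is only a minimal sketch: the layout of the trace file is not described in this report, so the function just counts lines that contain the method name, and the sample lines are made up for illustration.

```python
from typing import Iterable

METHOD = "DiscoveryServerService.getResourcesAsList"


def count_calls(lines: Iterable[str], method: str = METHOD) -> int:
    """Count trace lines that mention the given method name."""
    return sum(1 for line in lines if method in line)


# Hypothetical sample lines; a real command trace will look different.
sample = [
    "12:00:01 DiscoveryServerService.getResourcesAsList(...)",
    "12:00:01 some other command",
    "12:00:02 DiscoveryServerService.getResourcesAsList(...)",
]
print(count_calls(sample))
```

An open file object can be passed in place of the list, since it also yields lines. Comparing the count before and after ignoring resources and disabling the plugin should show whether the polling has started.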

See the discussion on the community forums, including graphs of CPU (user load) on the RHQ server:

https://community.jboss.org/message/803572?tstart=0


Version-Release number of selected component (if applicable):

RHQ 4.6

How reproducible:

Reproducible.

Steps to Reproduce:

1. Uninventory some OS services like Cron, Cobbler, GRUB, OpenSSH, Samba, Postfix and wait until they show up in the discovery queue.

2. Do an Ignore on the discovered resources.

3. Disable the Agent Plugins for these services.

4. Do an Update Plugins on the RHQ Agent(s) so the agents have only the enabled plugins.

5. Possibly do a restart of the RHQ server (not sure if needed).

Simpler reproduction (quoting John Mazzitelli):
"I replicated this and I see some odd behavior stepping through the code using your replication steps. Things don't have to be committed into inventory first - you can just start a clean server and agent, then, ignore a resource (can be anything) and then disable that plugin that defined the type of resource you just ignored. Once you do that, this odd behavior happens."

  
Actual results:

Ignored resources are polled constantly, leading to high network traffic and high CPU usage on the RHQ server. The load depends on the number of platforms: it increases gradually as more agents with Ignored resources are connected.

Expected results:

One would expect that simply having a resource in the Ignore list and disabling the agent plugin would not cause excessive CPU usage and constant polling of these resources.
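
The expected behaviour can be sketched as a filter applied before the server asks an agent about resources: anything that is Ignored, or whose plugin is disabled, is left out. This is an illustration only and not RHQ code; the Resource class, its fields, and resources_to_poll are hypothetical names, and nothing here describes how the actual fix was implemented.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Resource:
    # Hypothetical model, not the RHQ domain class.
    resource_id: int
    plugin: str
    ignored: bool = False


def resources_to_poll(
    resources: Iterable[Resource], disabled_plugins: Set[str]
) -> List[Resource]:
    """Return only the resources the server should ask an agent about."""
    return [
        r
        for r in resources
        if not r.ignored and r.plugin not in disabled_plugins
    ]


# Two ignored services with disabled plugins, and one ordinary resource.
inventory = [
    Resource(1, "Postfix", ignored=True),
    Resource(2, "Samba", ignored=True),
    Resource(3, "Platforms"),
]
print([r.resource_id for r in resources_to_poll(inventory, {"Postfix", "Samba"})])
```

With a filter like this, the number of requests per agent would not grow with the size of the Ignore list.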

Additional info:
Comment 1 John Mazzitelli 2013-03-28 20:44:38 EDT
the latest code in branch bug/rhq-1 has this fixed.
Comment 7 John Mazzitelli 2013-04-09 17:47:49 EDT
big commit to master: a75b8c371d6f8ce096adc0ddd9bf00d2d753b500

this addresses BZ 535289 but also should fix this issue, too.
Comment 8 Heiko W. Rupp 2014-02-25 08:07:37 EST
Was implemented in RHQ 4.7 already - closing
