Red Hat Bugzilla – Bug 923210
High CPU usage in RHQ 4.6 with Ignored resources
Last modified: 2014-02-25 08:07:37 EST
Created attachment 712631 [details]
Command trace on the RHQ Agent.
Description of problem:
High CPU usage occurs in RHQ 4.6 when resources are set to Ignored and the agent plugin that defines their resource type is also disabled.
Tested on RHEL 6.2 and on Fedora but should be OS independent.
High CPU is caused by RHQ Server constantly polling the RHQ Agents about these resources even when they are set to Ignored.
Symptoms: CPU load and network traffic increase, and there are constant requests to DiscoveryServerService.getResourcesAsList() on the agent for the resources that are in the Ignored list.
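One rough way to confirm the symptom is to count how often getResourcesAsList appears in a saved command trace. The sketch below is only an illustration: the class name TraceRequestCounter, the helper countRequests, and the sample lines are hypothetical, and the real layout of the agent's command trace is not assumed here beyond the method name appearing somewhere on the line.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of RHQ): counts trace lines that mention a method name.
final class TraceRequestCounter {

    // Returns the number of lines containing methodName as a plain substring.
    static long countRequests(List<String> traceLines, String methodName) {
        return traceLines.stream()
                .filter(line -> line.contains(methodName))
                .count();
    }

    public static void main(String[] args) {
        // Placeholder lines for illustration only; real trace output will look different.
        List<String> sample = Arrays.asList(
                "placeholder entry 1: DiscoveryServerService.getResourcesAsList",
                "placeholder entry 2: some other command",
                "placeholder entry 3: DiscoveryServerService.getResourcesAsList");
        System.out.println(countRequests(sample, "getResourcesAsList"));
    }
}
```

Comparing the count over equal time windows before and after ignoring the resources should show whether the requests keep arriving.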
See discussion on the community forums, including images of CPU (User load) on RHQ server:
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Uninventory some OS services like Cron, Cobbler, GRUB, OpenSSH, Samba, Postfix and wait until they show up in the discovery queue.
2. Do an Ignore on the discovered resources.
3. Disable the Agent Plugins for these services.
4. Do an Update Plugins on the RHQ Agent(s) so the agents have only the enabled plugins.
5. Possibly do a restart of the RHQ server (not sure if needed).
Simple recreation (quote John Mazzitelli):
"I replicated this and I see some odd behavior stepping through the code using your replication steps. Things don't have to be committed into inventory first - you can just start a clean server and agent, then, ignore a resource (can be anything) and then disable that plugin that defined the type of resource you just ignored. Once you do that, this odd behavior happens."
Actual results:
Constant polling of Ignored resources, leading to high network traffic and high CPU usage on the RHQ server. The load depends on the number of platforms: it increases gradually as more agents with Ignored resources are connected.
Expected results:
One would expect that simply having a resource in the Ignore list and disabling the corresponding agent plugin would not cause excessive CPU usage and constant polling of these resources.
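The expected behavior can be sketched as a filter applied before the server asks an agent about resources. This is a minimal illustration under stated assumptions, not the RHQ implementation and not the fix referenced in this bug: the class IgnoredResourceFilter, its Status enum, the Resource type, and the pluginEnabled flag are all hypothetical names invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch (not RHQ code): decide which resources are worth polling.
final class IgnoredResourceFilter {

    // Hypothetical status values used only in this sketch.
    enum Status { NEW, IGNORED, COMMITTED }

    // Minimal stand-in for a resource record.
    static final class Resource {
        final String name;
        final Status status;
        final boolean pluginEnabled; // assumption: whether the defining plugin is enabled

        Resource(String name, Status status, boolean pluginEnabled) {
            this.name = name;
            this.status = status;
            this.pluginEnabled = pluginEnabled;
        }
    }

    // Keeps only resources that are not Ignored and whose plugin is enabled.
    static List<Resource> resourcesToPoll(List<Resource> all) {
        List<Resource> result = new ArrayList<>();
        for (Resource r : all) {
            if (r.status == Status.IGNORED) {
                continue; // Ignored resources should not trigger agent requests
            }
            if (!r.pluginEnabled) {
                continue; // the agent cannot report on a type whose plugin is disabled
            }
            result.add(r);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Resource> all = Arrays.asList(
                new Resource("Cron", Status.IGNORED, false),
                new Resource("Platform", Status.COMMITTED, true));
        for (Resource r : resourcesToPoll(all)) {
            System.out.println(r.name);
        }
    }
}
```

With such a filter in place, an Ignored resource whose plugin is disabled would never be queued for a request to the agent, so the polling loop described above would have nothing to repeat.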
the latest code in branch bug/rhq-1 has this fixed.
big commit to master: a75b8c371d6f8ce096adc0ddd9bf00d2d753b500
this addresses BZ 535289 but should also fix this issue.
Was implemented in RHQ 4.7 already - closing