Red Hat Bugzilla – Bug 743432
UI indicates failure to connect to resource even though resource reports available and is collecting metrics
Last modified: 2011-10-04 18:23:35 EDT
Description of problem:
The resource page shows a yellow box indicating that the agent failed to connect to or communicate with the resource:
The agent reported the following error on its last attempt (9/23/11, 10:25:03 AM, CDT) to connect to this resource:
Failed to start component for resource Resource[id=10060, type=JBossAS Server, key=/opt/jboss/jboss-eap-5.0/jboss-as/server/default, name=jboss-host default, parent=jboss-host, version=5.0.1].
For more details, see the stack trace.
Please make sure that the managed resource is running and that its connection properties are set correctly.
However, this message is stale and no longer accurate. Since it was reported, the resource has started working normally: it is reported as UP and its metrics are being collected.
Version-Release number of selected component (if applicable):
The primary cause of this issue is that the RHQ errors table in the database contains an entry for this resource that is never cleared, even after the original problem that caused the connection failure is resolved. The only time it seems that we clear the entry is when the user edits the resource's connection properties and saves them.
In this case, it is suspected that the user added the resource to inventory while its JNP URL or JMX credentials were invalid. The resource was then shut down and the connection properties were updated. The error remained in the UI because the agent still could not start the resource component, even with the updated connection properties, while the resource was down. Once the resource was started back up, the agent was able to connect with no error and successfully began to collect availability and metrics. However, nothing on the server side triggered an update of the errors table to clear the original error condition.
As a result, the user continues to see that the agent reported it was unable to communicate with the resource, and the only way to clear the message is to pretend to update the connection properties and save them.
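To make the suspected sequence easier to follow, here is a minimal Python sketch that replays it against an in-memory stand-in for the errors table. Nothing here is RHQ code: `ErrorStore`, its methods, and the `clear_on_up` flag are hypothetical names invented for illustration. The flag models one possible trigger (clearing the entry when the resource is reported UP), which is an assumption and not a confirmed fix.

```python
from dataclasses import dataclass, field
from enum import Enum


class Availability(Enum):
    UP = "UP"
    DOWN = "DOWN"


@dataclass
class ErrorStore:
    """Hypothetical in-memory stand-in for the server-side errors table."""

    # Maps resource id -> last connection error message reported by the agent.
    errors: dict = field(default_factory=dict)

    def record_connection_error(self, resource_id: int, message: str) -> None:
        self.errors[resource_id] = message

    def on_connection_properties_saved(self, resource_id: int) -> None:
        # Per the report, saving connection properties is the only path
        # that seems to clear the stored error today.
        self.errors.pop(resource_id, None)

    def on_availability_report(self, resource_id: int,
                               availability: Availability,
                               clear_on_up: bool) -> None:
        # Assumed trigger: drop the stale error once the resource is UP.
        if clear_on_up and availability is Availability.UP:
            self.errors.pop(resource_id, None)


def replay(clear_on_up: bool) -> bool:
    """Replay the reported sequence; return True if a stale error remains."""
    store = ErrorStore()
    rid = 10060  # resource id taken from the error message in this report
    message = "Failed to start component for resource"

    # 1. Resource is added to inventory with invalid connection properties.
    store.record_connection_error(rid, message)
    # 2. Resource is shut down and the user saves corrected properties.
    store.on_connection_properties_saved(rid)
    # 3. Agent retries while the resource is still down and fails again.
    store.record_connection_error(rid, message)
    # 4. Resource is started; the agent connects and reports it as UP.
    store.on_availability_report(rid, Availability.UP, clear_on_up)

    return rid in store.errors
```

With `clear_on_up=False` the replay ends with the stale entry still present, which matches what the UI shows. With `clear_on_up=True` the entry is removed as soon as the resource is reported UP.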