Bug 797331

Summary: RHQ discovers network adapters that don't exist.
Product: [Other] RHQ Project
Reporter: Elias Ross <genman>
Component: Plugins
Assignee: Heiko W. Rupp <hrupp>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: unspecified
Docs Contact:
Priority: medium
Version: 4.2
CC: hrupp, jshaughn
Target Milestone: ---   
Target Release: RHQ 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-09-01 06:03:26 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachment: rebased on 3d6614872f0bca31baf6cf76e83bba5fe02f70ed (Flags: none)

Description Elias Ross 2012-02-24 19:19:48 EST
Description of problem:

Hosts I manage ("Red Hat Enterprise Linux Server release 5.6 (Tikanga)") somehow 'discover' network adapters that don't exist.

RHQ then shows them as unavailable.

Even after I remove them, they are added back.

Version-Release number of selected component (if applicable):

4.2 (seen in 4.1 also)

How reproducible:


Additional info:

Likely hardware configuration related. Discovery should probably skip those that are unavailable at discovery time.
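
A minimal sketch of the "skip unavailable adapters at discovery time" idea. The AdapterInfo and AdapterDiscoveryFilter types are hypothetical stand-ins made up for illustration; they are not the RHQ platform plugin's classes, and the real discovery code may gather adapter state differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the adapter details collected during discovery.
final class AdapterInfo {
    final String name;
    final boolean up;

    AdapterInfo(String name, boolean up) {
        this.name = name;
        this.up = up;
    }
}

final class AdapterDiscoveryFilter {
    // Returns only the names of adapters that report UP at discovery time,
    // so adapters that are down are never added to inventory.
    static List<String> discoverable(List<AdapterInfo> found) {
        List<String> names = new ArrayList<>();
        for (AdapterInfo adapter : found) {
            if (adapter.up) {
                names.add(adapter.name);
            }
        }
        return names;
    }
}
```

With a filter like this, an adapter that is down when discovery runs would only show up in inventory after a later discovery scan finds it up.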
Comment 1 Mike Foley 2012-02-27 12:12:47 EST
triage 2/27/2012 mfoley, asantos, crouch, loleary
Comment 2 Elias Ross 2012-03-06 17:01:30 EST
Created attachment 568075 [details]
rebased on 3d6614872f0bca31baf6cf76e83bba5fe02f70ed
Comment 3 Heiko W. Rupp 2012-06-06 11:40:08 EDT
do you happen to have a patch against the current state of the plugin?
If so could you attach it?
If not, I will manually apply the change

Comment 4 Heiko W. Rupp 2012-07-04 03:58:29 EDT
Implemented in master 0054844  in a different way:

If the adapter is down during org.rhq.plugins.platform.NetworkAdapterComponent#start, we request the availability context to mark it as disabled. This way it is still discovered, but shown as disabled rather than down. If the admin plugs in a cable, the interface comes up and the admin can then set the adapter in inventory to enabled.
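
The start-time behaviour described above could be sketched roughly as follows. AvailabilityControl and AdapterStartCheck are hypothetical names used only for illustration; they are not the RHQ plugin API, and the actual commit may be structured differently.

```java
// Hypothetical stand-in for the part of the availability context used here.
interface AvailabilityControl {
    void disable(); // asks for the resource to be marked disabled instead of DOWN
}

final class AdapterStartCheck {
    enum State { UP, DOWN }

    // Meant to be called once when the component starts.
    // Returns true if a disable was requested because the adapter was not UP.
    static boolean disableIfDownOnStart(State operationalState, AvailabilityControl control) {
        if (operationalState != State.UP) {
            control.disable();
            return true;
        }
        return false;
    }
}
```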

We could in theory check whether the resource is disabled and, when the availability check returns UP, automatically enable it. Unfortunately, the availabilityContext does not yet support this. The change would look like:

    public AvailabilityType getAvailability() {
        if (getInfo().getOperationalStatus() == NetworkAdapterInfo.OperationState.UP) {
+            if (context.getAvailabilityContext().isDisabled())
+                context.getAvailabilityContext().enable();
            return AvailabilityType.UP;
        }

        return AvailabilityType.DOWN;
    }
Comment 5 Jay Shaughnessy 2013-01-18 15:56:39 EST
There was an issue with this change: I saw it "hang" my Windows agent for 15 minutes at agent/PC start. I've made a supplementary commit and am setting this to ON_QA.

master commit 9dc14ed91d0e3fa4260644d649503c522eea0c2e
Author: Jay Shaughnessy <jshaughn@redhat.com>
Date:   Fri Jan 18 15:01:23 2013 -0500

    It is OK to DISABLE some NetworkAdapters at component start.  But do
    this once-per-start check in a thread because for DOWN adapters this can
    be a slow call. Also the disable() method requires a server round trip.
    Together this can seemingly hang agent startup (actually plugin container
    startup) as component starts are done sequentially.
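
A rough sketch of the approach in the commit message, assuming a plain java.util.concurrent executor; the actual commit may differ. AdapterStartSketch and its isUp and disable parameters are hypothetical: isUp stands in for the potentially slow status probe and disable for the call that needs a server round trip.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BooleanSupplier;

final class AdapterStartSketch {
    // Daemon thread so a slow probe never keeps the JVM alive on shutdown.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "adapter-start-check");
        t.setDaemon(true);
        return t;
    });

    // Returns immediately; the once-per-start check runs in the background
    // so sequential component starts are not held up by a DOWN adapter.
    Future<?> start(BooleanSupplier isUp, Runnable disable) {
        return executor.submit(() -> {
            if (!isUp.getAsBoolean()) {
                disable.run();
            }
        });
    }

    void stop() {
        executor.shutdownNow();
    }
}
```

The returned Future is only there so a caller or a test can wait for the check; a component's start would normally ignore it and return right away.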
Comment 6 Elias Ross 2013-02-14 20:06:13 EST
The change still hangs the agent, which is really annoying when restarting it.

On some hosts I have 4-5 disabled adapters, and each one takes about a minute to check at RHQ agent startup.

2013-02-15 01:05:00,759 INFO  [ResourceContainer.invoker.daemon-1] (NetworkAdapterComponent)- Disabled eth5 as it was down on start
2013-02-15 01:05:00,759 ERROR [main] (InventoryManager)- Exception thrown while activating [Resource[id=166018, uuid=c2643c56-79e3-4dbb-b816-e4b8de01e20b, type={Platforms}Network Adapter, key=eth5, name=eth5, parent=st11p01ad-syslog001.apple.com]].
org.rhq.core.clientapi.agent.PluginContainerException: Failed to start component for resource Resource[id=166018, uuid=c2643c56-79e3-4dbb-b816-e4b8de01e20b, type={Platforms}Network Adapter, key=eth5, name=eth5, parent=st11p01ad-syslog001.apple.com].
        at org.rhq.core.pc.inventory.InventoryManager.activateResource(InventoryManager.java:1777)
        at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResourceRecursively(InventoryManager.java:3106)
        at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResourceRecursively(InventoryManager.java:3108)
        at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResources(InventoryManager.java:3067)
        at org.rhq.core.pc.inventory.InventoryManager.initialize(InventoryManager.java:237)
        at org.rhq.core.pc.PluginContainer.startContainerService(PluginContainer.java:488)
        at org.rhq.core.pc.PluginContainer.initialize(PluginContainer.java:308)
        at org.rhq.enterprise.agent.AgentMain.startPluginContainer(AgentMain.java:1926)
        at org.rhq.enterprise.agent.AgentMain.start(AgentMain.java:660)
        at org.rhq.enterprise.agent.AgentMain.main(AgentMain.java:429)
Caused by: org.rhq.core.pc.inventory.TimeoutException: [Warning] Call to [org.rhq.plugins.platform.NetworkAdapterComponent.start()] with args [[org.rhq.core.pluginapi.inventory.ResourceContext@59ec59df]] timed out after 60000 milliseconds - invocation thread will be interrupted.
        at org.rhq.core.clientapi.agent.PluginContainerException.wrapIfNecessary(PluginContainerException.java:69)
        at org.rhq.core.clientapi.agent.PluginContainerException.<init>(PluginContainerException.java:96)
        ... 10 more
Comment 7 Jay Shaughnessy 2013-02-18 10:00:17 EST
I am not seeing this hanging behavior any longer. Are you certain you have the fix applied?
Comment 8 Elias Ross 2013-02-19 20:02:54 EST
I updated to 4.5.1, which doesn't have this fix. Somehow I thought the fix was in that version. My mistake.

I still don't think discovery should bother adding network adapters that are not enabled.

I don't really care about the issue of setting them disabled at start either, as they were never enabled in the first place. These seem like two separate issues.
Comment 9 Heiko W. Rupp 2013-09-01 06:03:26 EDT
Bulk closing of items that are ON_QA in old RHQ releases that have been out for a long time, where the issue has not been re-opened since.