Bug 797331
Summary: | RHQ discovers network adapters that don't exist. | |
---|---|---|---
Product: | [Other] RHQ Project | Reporter: | Elias Ross <genman>
Component: | Plugins | Assignee: | Heiko W. Rupp <hrupp>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley>
Severity: | unspecified | Priority: | medium
Version: | 4.2 | CC: | hrupp, jshaughn
Target Release: | RHQ 4.5.0 | Doc Type: | Bug Fix
Last Closed: | 2013-09-01 10:03:26 UTC | |
Description
Elias Ross
2012-02-25 00:19:48 UTC
triage 2/27/2012 mfoley, asantos, crouch, loleary

Created attachment 568075
rebased on 3d6614872f0bca31baf6cf76e83bba5fe02f70ed
Elias, do you happen to have a patch against the current state of the plugin? If so, could you attach it? If not, I will manually apply the change.
Heiko

Implemented in master 0054844 in a different way: if the adapter is down during org.rhq.plugins.platform.NetworkAdapterComponent#start, we request the availability context to mark it as disabled. This way it is still discovered, but shown as disabled rather than down. If the admin plugs in a cable, the interface comes up and the admin can then set the adapter in inventory to enabled.

We could in theory check whether the resource is disabled and, when the availability check returns UP, also automatically enable it. Unfortunately the availabilityContext does not yet support this.

     public AvailabilityType getAvailability() {
         if (getInfo().getOperationalStatus() == NetworkAdapterInfo.OperationState.UP) {
    +        if (context.getAvailabilityContext().isDisabled())
    +            context.getAvailabilityContext().enable();
             return AvailabilityType.UP;
         }
         return AvailabilityType.DOWN;
     }

There was an issue with this change: I saw it "hang" my Windows agent for 15 minutes at Agent/PC start. I've made a supplementary commit and am setting to ON_QA.

master commit 9dc14ed91d0e3fa4260644d649503c522eea0c2e
Author: Jay Shaughnessy <jshaughn>
Date: Fri Jan 18 15:01:23 2013 -0500

    It is OK to DISABLE some NetworkAdapters at component start. But do this
    once-per-start check in a thread because for DOWN adapters this can be
    a slow call. Also the disable() method requires a server round trip.
    Together this can seemingly hang agent startup (actually plugin container
    startup) as component starts are done sequentially.

The change still hangs up the agent, which is really annoying when restarting the agent. On some hosts I have 4-5 disabled adapters, and each one takes about a minute to check on RHQ agent startup.
2013-02-15 01:05:00,759 INFO [ResourceContainer.invoker.daemon-1] (NetworkAdapterComponent)- Disabled eth5 as it was down on start
2013-02-15 01:05:00,759 ERROR [main] (InventoryManager)- Exception thrown while activating [Resource[id=166018, uuid=c2643c56-79e3-4dbb-b816-e4b8de01e20b, type={Platforms}Network Adapter, key=eth5, name=eth5, parent=st11p01ad-syslog001.apple.com]].
org.rhq.core.clientapi.agent.PluginContainerException: Failed to start component for resource Resource[id=166018, uuid=c2643c56-79e3-4dbb-b816-e4b8de01e20b, type={Platforms}Network Adapter, key=eth5, name=eth5, parent=st11p01ad-syslog001.apple.com].
    at org.rhq.core.pc.inventory.InventoryManager.activateResource(InventoryManager.java:1777)
    at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResourceRecursively(InventoryManager.java:3106)
    at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResourceRecursively(InventoryManager.java:3108)
    at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResources(InventoryManager.java:3067)
    at org.rhq.core.pc.inventory.InventoryManager.initialize(InventoryManager.java:237)
    at org.rhq.core.pc.PluginContainer.startContainerService(PluginContainer.java:488)
    at org.rhq.core.pc.PluginContainer.initialize(PluginContainer.java:308)
    at org.rhq.enterprise.agent.AgentMain.startPluginContainer(AgentMain.java:1926)
    at org.rhq.enterprise.agent.AgentMain.start(AgentMain.java:660)
    at org.rhq.enterprise.agent.AgentMain.main(AgentMain.java:429)
Caused by: org.rhq.core.pc.inventory.TimeoutException: [Warning] Call to [org.rhq.plugins.platform.NetworkAdapterComponent.start()] with args [[org.rhq.core.pluginapi.inventory.ResourceContext@59ec59df]] timed out after 60000 milliseconds - invocation thread will be interrupted.
    at org.rhq.core.clientapi.agent.PluginContainerException.wrapIfNecessary(PluginContainerException.java:69)
    at org.rhq.core.clientapi.agent.PluginContainerException.<init>(PluginContainerException.java:96)
    ... 10 more

I am not seeing this hanging behavior any longer. Are you certain you have the fix applied?

I updated to 4.5.1 which doesn't have this fix. Somehow I thought the fix was in that version. My mistake.

I still don't think discovery should bother adding network adapters that are not enabled. The issue of setting them disabled at start I don't really care about either as they were never enabled in the first place. It seems like two separate issues.

Bulk closing of items that are on_qa and in old RHQ releases, which are out for a long time and where the issue has not been re-opened since.
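
For readers following the thread, here is a minimal sketch of the idea in the commit message for 9dc14ed: run the once-per-start "is the adapter down?" check on a background thread so that a slow probe or a server round trip cannot block sequential component starts. This is not the RHQ code. `DisableHook`, `AdapterStartSketch`, and `awaitCheck` are hypothetical names made up for illustration, and the status probe is passed in as a plain `BooleanSupplier` rather than going through the real plugin API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the availability context; not the RHQ API. */
interface DisableHook {
    void disable(); // in RHQ the real call needs a server round trip
}

/** Sketch of a component whose start() never waits on the slow status probe. */
class AdapterStartSketch {
    // Single daemon thread so the check cannot keep the agent JVM alive.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "adapter-start-check");
        t.setDaemon(true);
        return t;
    });
    private final CountDownLatch checkDone = new CountDownLatch(1);

    /** Returns immediately; the once-per-start check runs on the worker thread. */
    void start(BooleanSupplier isUp, DisableHook hook) {
        worker.submit(() -> {
            try {
                if (!isUp.getAsBoolean()) { // may be slow for DOWN adapters
                    hook.disable();
                }
            } finally {
                checkDone.countDown();
            }
        });
        worker.shutdown(); // accept no new tasks; the submitted one still runs
    }

    /** Waits for the background check; returns false on timeout or interrupt. */
    boolean awaitCheck(long timeout, TimeUnit unit) {
        try {
            return checkDone.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Each instance in this sketch supports a single start() call, matching the "once-per-start" wording of the commit message; a restarted component would create a fresh instance.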