Bug 1025848

Summary: "Resource classLoader not found" error seen on upgraded agents, race condition with plugin download
Product: [Other] RHQ Project Reporter: Elias Ross <genman>
Component: AgentAssignee: Nobody <nobody>
Status: NEW --- QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.9CC: gerhard.dreschler, hrupp, jshaughn
Target Milestone: ---   
Target Release: RHQ 4.13   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Elias Ross 2013-11-01 18:00:27 UTC
Description of problem:

Example log:

2013-10-31 06:35:37,795 ERROR [InventoryManager.discovery-1] (InventoryManager)- Failed to start component for Resource[id=172928, uuid=2cfa801a-b2a6-41a2-9e1
b-5ad4afacbeb8, type={RHQAgent}Launcher Script, key=launcherscript, name=RHQ Agent Launcher Script, parent=RHQ Agent, version=4.5.1] from synchronized merge.
org.rhq.core.clientapi.agent.PluginContainerException: Could not build component for Resource [Resource[id=172928, uuid=2cfa801a-b2a6-41a2-9e1b-5ad4afacbeb8, type={RHQAgent}Launcher Script, key=launcherscript, name=RHQ Agent Launcher Script, parent=RHQ Agent, version=4.5.1]]
        at org.rhq.core.pc.inventory.InventoryManager.prepareResourceForActivation(InventoryManager.java:1815)
        at org.rhq.core.pc.inventory.InventoryManager.activateResource(InventoryManager.java:1892)
        at org.rhq.core.pc.inventory.InventoryManager.refreshResourceComponentState(InventoryManager.java:3360)
        at org.rhq.core.pc.inventory.InventoryManager.processSyncInfo(InventoryManager.java:2928)
        at org.rhq.core.pc.inventory.InventoryManager.processSyncInfo(InventoryManager.java:2935)
        at org.rhq.core.pc.inventory.InventoryManager.processSyncInfo(InventoryManager.java:2935)
        at org.rhq.core.pc.inventory.InventoryManager.synchInventory(InventoryManager.java:1231)
        at org.rhq.core.pc.inventory.InventoryManager.synchInventory(InventoryManager.java:1200)
        at org.rhq.core.pc.inventory.InventoryManager.handleReport(InventoryManager.java:1182)
        at org.rhq.core.pc.inventory.AutoDiscoveryExecutor.call(AutoDiscoveryExecutor.java:130)
        at org.rhq.core.pc.inventory.AutoDiscoveryExecutor.run(AutoDiscoveryExecutor.java:92)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.rhq.core.clientapi.agent.PluginContainerException: [Warning] Resource classLoader not found for Resource[id=172928, uuid=2cfa801a-b2a6-41a2-9e1b-5ad4afacbeb8, type={RHQAgent}Launcher Script, key=launcherscript, name=RHQ Agent Launcher Script, parent=RHQ Agent, version=4.5.1] - cannot create ResourceComponent.
        at org.rhq.core.clientapi.agent.PluginContainerException.wrapIfNecessary(PluginContainerException.java:69)
        at org.rhq.core.clientapi.agent.PluginContainerException.<init>(PluginContainerException.java:96)
        ... 20 more

Seems like the container is starting resources before the plugins are updated:

2013-10-31 00:02:01,384 INFO  [RHQ Server Polling Thread] (PluginUpdate)- {PluginUpdate.downloading}Downloading the plugin [rhq-snmp-plugin-5.0.2-SNAPSHOT.jar
]...
2013-10-31 00:02:01,774 ERROR [main] (InventoryManager)- Exception thrown while upgrading [Resource[id=172193, uuid=90aa3130-9cfe-453b-a24c-aab315af5d29, type
={RHQAgent}RHQ Agent, key=... RHQ Agent, name=RHQ Agent, parent=...., version=4.5.1]].
org.rhq.core.clientapi.agent.PluginContainerException: Could not build component for Resource [Resource[id=172193, uuid=90aa3130-9cfe-453b-a24c-aab315af5d29, 
type={RHQAgent}RHQ Agent, key=-... RHQ Agent, name=RHQ Agent, parent=..., version=4.5.1]]
        at org.rhq.core.pc.inventory.InventoryManager.prepareResourceForActivation(InventoryManager.java:1815)
        at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResourceRecursively(InventoryManager.java:3420)
        at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResourceRecursively(InventoryManager.java:3432)
        at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResources(InventoryManager.java:3391)
        at org.rhq.core.pc.inventory.InventoryManager.initialize(InventoryManager.java:252)
        at org.rhq.core.pc.PluginContainer.startContainerService(PluginContainer.java:515)
        at org.rhq.core.pc.PluginContainer.initialize(PluginContainer.java:328)
        at org.rhq.enterprise.agent.AgentMain.startPluginContainer(AgentMain.java:1913)
        at org.rhq.enterprise.agent.AgentMain.start(AgentMain.java:665)
        at org.rhq.enterprise.agent.AgentMain.main(AgentMain.java:434)
Caused by: org.rhq.core.clientapi.agent.PluginContainerException: [Warning] Resource classLoader not found for Resource[id=172193, uuid=90aa3130-9cfe-453b-a24c-aab315af5d29, type={RHQAgent}RHQ Agent, key=... RHQ Agent, name=RHQ Agent, parent=..., version=4.5.1] - cannot create ResourceComponent.
        at org.rhq.core.clientapi.agent.PluginContainerException.wrapIfNecessary(PluginContainerException.java:69)
        at org.rhq.core.clientapi.agent.PluginContainerException.<init>(PluginContainerException.java:96)
        ... 10 more
2013-10-31 00:02:01,883 INFO  [RHQ Server Polling Thread] (PluginUpdate)- {PluginUpdate.downloading-complete}The plugin [snmp] has been updated at [rhq-snmp-plugin-5.0.2-SNAPSHOT.jar].

To me it seems like a race condition going on here.



Version-Release number of selected component (if applicable): 4.9 (upgraded from 4.5.1)


How reproducible: Unclear. Has happened on a number of hosts.


Steps to Reproduce:
1. Start agent in older version
2. Upgrade plugins
3. Restart agent talking to new server.

Actual results:

Exception above.


Expected results:

No exception.


Additional info: May be simply a race condition for some hosts. Not clear why plugins are not updated first, then the container started.

Comment 1 Elias Ross 2013-11-01 18:02:46 UTC
Note: this is with patch, Bug 991149

Comment 2 Elias Ross 2013-11-04 16:51:13 UTC
I'm guessing the issue is the agent waited a bit for the download to complete, saw that it started, but timed out, then the plugin container started with the plugins in the process of being downloaded. With a very busy server (maxed out on threads, for example), this can happen I suppose.

AgentMain.java:

                    boolean updating = PluginUpdate.waitForUpdateToComplete(pc_config, 30000L); // false
                    int after = plugin_dir.list().length;

                    if ((after == 0) && !updating) { // true && !false -> true
                        // still nothing and it doesn't look like we are downloading - try to update them again right now
                        // (doing this because I saw a case where the startup update somehow happened just prior to the
                        // registration finishing, so the original update was rejected by the server as "unauthorized")
                        updatePluginsNow(m_clientSender);
                        after = plugin_dir.list().length;
                    }

                    keep_waiting = ((after == 0) || (updating)); // evals false

I think it's pretty fragile to check for the size of a directory. It should be checking if the download process completed or not, not the state of the filesystem. Why can't the PluginUpdate object return the number of plugins to download and the number of plugins downloaded?

Comment 3 Elias Ross 2013-11-12 18:23:10 UTC
I'm actually thinking that it's not necessarily a plugin download race condition, but the initialization mess that I've addressed in Bug 991149  . 

        ClassLoader resourceClassloader = resourceContainer.getResourceClassLoader();
        if (resourceClassloader == null) {
            throw new PluginContainerException("Resource classLoader not found for " + resource + " - cannot create ResourceComponent.");
        }

I need to trace this better, obviously.

Comment 4 Elias Ross 2013-11-12 18:28:01 UTC
Seems like it has to do with the inventory manager having a sync issue with the server. I could reproduce the problem every time, except after uninv. the agent, the problem went away.

Comment 5 Heiko W. Rupp 2014-03-11 14:52:21 UTC
Pushing to 4.11

Comment 6 Heiko W. Rupp 2014-05-08 14:42:55 UTC
Bump the target version now that 4.11 is out.

Comment 7 Jay Shaughnessy 2014-07-07 16:02:12 UTC
Bumping again.  Is this still an issue?