Bug 1025848 - "Resource classLoader not found" error seen on upgraded agents, race condition with plugin download
"Resource classLoader not found" error seen on upgraded agents, race conditio...
Status: NEW
Product: RHQ Project
Classification: Other
Component: Agent (Show other bugs)
4.9
Unspecified Unspecified
unspecified Severity unspecified (vote)
: ---
: RHQ 4.13
Assigned To: RHQ Project Maintainer
Mike Foley
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-11-01 14:00 EDT by Elias Ross
Modified: 2014-10-13 08:01 EDT (History)
3 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Elias Ross 2013-11-01 14:00:27 EDT
Description of problem:

Example log:

2013-10-31 06:35:37,795 ERROR [InventoryManager.discovery-1] (InventoryManager)- Failed to start component for Resource[id=172928, uuid=2cfa801a-b2a6-41a2-9e1
b-5ad4afacbeb8, type={RHQAgent}Launcher Script, key=launcherscript, name=RHQ Agent Launcher Script, parent=RHQ Agent, version=4.5.1] from synchronized merge.
org.rhq.core.clientapi.agent.PluginContainerException: Could not build component for Resource [Resource[id=172928, uuid=2cfa801a-b2a6-41a2-9e1b-5ad4afacbeb8, type={RHQAgent}Launcher Script, key=launcherscript, name=RHQ Agent Launcher Script, parent=RHQ Agent, version=4.5.1]]
        at org.rhq.core.pc.inventory.InventoryManager.prepareResourceForActivation(InventoryManager.java:1815)
        at org.rhq.core.pc.inventory.InventoryManager.activateResource(InventoryManager.java:1892)
        at org.rhq.core.pc.inventory.InventoryManager.refreshResourceComponentState(InventoryManager.java:3360)
        at org.rhq.core.pc.inventory.InventoryManager.processSyncInfo(InventoryManager.java:2928)
        at org.rhq.core.pc.inventory.InventoryManager.processSyncInfo(InventoryManager.java:2935)
        at org.rhq.core.pc.inventory.InventoryManager.processSyncInfo(InventoryManager.java:2935)
        at org.rhq.core.pc.inventory.InventoryManager.synchInventory(InventoryManager.java:1231)
        at org.rhq.core.pc.inventory.InventoryManager.synchInventory(InventoryManager.java:1200)
        at org.rhq.core.pc.inventory.InventoryManager.handleReport(InventoryManager.java:1182)
        at org.rhq.core.pc.inventory.AutoDiscoveryExecutor.call(AutoDiscoveryExecutor.java:130)
        at org.rhq.core.pc.inventory.AutoDiscoveryExecutor.run(AutoDiscoveryExecutor.java:92)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.rhq.core.clientapi.agent.PluginContainerException: [Warning] Resource classLoader not found for Resource[id=172928, uuid=2cfa801a-b2a6-41a2-9e1b-5ad4afacbeb8, type={RHQAgent}Launcher Script, key=launcherscript, name=RHQ Agent Launcher Script, parent=RHQ Agent, version=4.5.1] - cannot create ResourceComponent.
        at org.rhq.core.clientapi.agent.PluginContainerException.wrapIfNecessary(PluginContainerException.java:69)
        at org.rhq.core.clientapi.agent.PluginContainerException.<init>(PluginContainerException.java:96)
        ... 20 more

Seems like the container is starting resources before the plugins are updated:

2013-10-31 00:02:01,384 INFO  [RHQ Server Polling Thread] (PluginUpdate)- {PluginUpdate.downloading}Downloading the plugin [rhq-snmp-plugin-5.0.2-SNAPSHOT.jar
]...
2013-10-31 00:02:01,774 ERROR [main] (InventoryManager)- Exception thrown while upgrading [Resource[id=172193, uuid=90aa3130-9cfe-453b-a24c-aab315af5d29, type
={RHQAgent}RHQ Agent, key=... RHQ Agent, name=RHQ Agent, parent=...., version=4.5.1]].
org.rhq.core.clientapi.agent.PluginContainerException: Could not build component for Resource [Resource[id=172193, uuid=90aa3130-9cfe-453b-a24c-aab315af5d29, 
type={RHQAgent}RHQ Agent, key=-... RHQ Agent, name=RHQ Agent, parent=..., version=4.5.1]]
        at org.rhq.core.pc.inventory.InventoryManager.prepareResourceForActivation(InventoryManager.java:1815)
        at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResourceRecursively(InventoryManager.java:3420)
        at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResourceRecursively(InventoryManager.java:3432)
        at org.rhq.core.pc.inventory.InventoryManager.activateAndUpgradeResources(InventoryManager.java:3391)
        at org.rhq.core.pc.inventory.InventoryManager.initialize(InventoryManager.java:252)
        at org.rhq.core.pc.PluginContainer.startContainerService(PluginContainer.java:515)
        at org.rhq.core.pc.PluginContainer.initialize(PluginContainer.java:328)
        at org.rhq.enterprise.agent.AgentMain.startPluginContainer(AgentMain.java:1913)
        at org.rhq.enterprise.agent.AgentMain.start(AgentMain.java:665)
        at org.rhq.enterprise.agent.AgentMain.main(AgentMain.java:434)
Caused by: org.rhq.core.clientapi.agent.PluginContainerException: [Warning] Resource classLoader not found for Resource[id=172193, uuid=90aa3130-9cfe-453b-a24c-aab315af5d29, type={RHQAgent}RHQ Agent, key=... RHQ Agent, name=RHQ Agent, parent=..., version=4.5.1] - cannot create ResourceComponent.
        at org.rhq.core.clientapi.agent.PluginContainerException.wrapIfNecessary(PluginContainerException.java:69)
        at org.rhq.core.clientapi.agent.PluginContainerException.<init>(PluginContainerException.java:96)
        ... 10 more
2013-10-31 00:02:01,883 INFO  [RHQ Server Polling Thread] (PluginUpdate)- {PluginUpdate.downloading-complete}The plugin [snmp] has been updated at [rhq-snmp-plugin-5.0.2-SNAPSHOT.jar].

To me it seems like a race condition going on here.



Version-Release number of selected component (if applicable): 4.9 (upgraded from 4.5.1)


How reproducible: Unclear. Has happened on a number of hosts.


Steps to Reproduce:
1. Start agent in older version
2. Upgrade plugins
3. Restart agent talking to new server.

Actual results:

Exception above.


Expected results:

No exception.


Additional info: May be simply a race condition for some hosts. Not clear why plugins are not updated first, then the container started.
Comment 1 Elias Ross 2013-11-01 14:02:46 EDT
Note: this is with patch, Bug 991149
Comment 2 Elias Ross 2013-11-04 11:51:13 EST
I'm guessing the issue is the agent waited a bit for the download to complete, saw that it started, but timed out, then the plugin container started with the plugins in the process of being downloaded. With a very busy server (maxed out on threads, for example), this can happen I suppose.

AgentMain.java:

                    boolean updating = PluginUpdate.waitForUpdateToComplete(pc_config, 30000L); // false
                    int after = plugin_dir.list().length;

                    if ((after == 0) && !updating) { // true && !false -> true
                        // still nothing and it doesn't look like we are downloading - try to update them again right now
                        // (doing this because I saw a case where the startup update somehow happened just prior to the
                        // registration finishing, so the original update was rejected by the server as "unauthorized")
                        updatePluginsNow(m_clientSender);
                        after = plugin_dir.list().length;
                    }

                    keep_waiting = ((after == 0) || (updating)); // evals false

I think it's pretty fragile to check for the size of a directory. It should be checking if the download process completed or not, not the state of the filesystem. Why can't the PluginUpdate object return the number of plugins to download and the number of plugins downloaded?
Comment 3 Elias Ross 2013-11-12 13:23:10 EST
I'm actually thinking that it's not necessarily a plugin download race condition, but the initialization mess that I've addressed in Bug 991149  . 

        ClassLoader resourceClassloader = resourceContainer.getResourceClassLoader();
        if (resourceClassloader == null) {
            throw new PluginContainerException("Resource classLoader not found for " + resource + " - cannot create ResourceComponent.");
        }

I need to trace this better, obviously.
Comment 4 Elias Ross 2013-11-12 13:28:01 EST
Seems like it has to do with the inventory manager having a sync issue with the server. I could reproduce the problem every time, except after uninv. the agent, the problem went away.
Comment 5 Heiko W. Rupp 2014-03-11 10:52:21 EDT
Pushing to 4.11
Comment 6 Heiko W. Rupp 2014-05-08 10:42:55 EDT
Bump the target version now that 4.11 is out.
Comment 7 Jay Shaughnessy 2014-07-07 12:02:12 EDT
Bumping again.  Is this still an issue?

Note You need to log in before you can comment on or make changes to this bug.