Bug 839662

Summary: Agent went down when trying to purge dependent plugins
Product: [Other] RHQ Project
Component: Agent
Status: CLOSED WORKSFORME
Severity: high
Priority: high
Version: 4.4
Target Milestone: ---
Target Release: JON 3.1.1
Hardware: All
OS: All
Reporter: Jay Shaughnessy <jshaughn>
Assignee: Jay Shaughnessy <jshaughn>
QA Contact: Mike Foley <mfoley>
CC: hrupp, lkrejci
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-08-16 20:02:25 UTC

Description Jay Shaughnessy 2012-07-12 14:05:15 UTC
I just saw this in my dev environment. I'm not sure if it's just me, but I thought I should write it up...

I accidentally built the linux and misc plugins (eval plugins) and deployed them.  Since this slows down my server startup I tried to get rid of them so I:

0) Agent running in Console mode with all deployed plugins and started (i.e. prompt visible)
1) In the GUI, went to admin->agent plugins
2) Selected the following plugins:

[Sudo Access, Samba, Postfix, OpenSSH, Hosts, GRUB Boot Loader, Cron, Cobbler, Aliases, Abstract Augeas Plugin, Apache HTTP Server]

3) Deleted them
4) Turned on "Show Deleted" so I could see them
5) Selected them all and Purged them all in one request
6) As the purge was taking place I noticed in the server log that my Agent had gone down:

09:13:07,110 INFO  [AgentManagerBean] Agent with name [jshaughn] just went down

This is probably normal, as the agent needs to remove the purged plugins.  But it never came back up. I see the following in the Agent log, which shows a dependency issue.  To get going again I had to quit the agent, manually delete the agent plugins, and restart, which then pulled the correct plugin set from the server.


2012-07-12 09:18:53,260 ERROR [RHQ Agent Prompt Input Thread] (rhq.core.pc.plugin.PluginManager)- Error initializing plugin container
java.lang.IllegalArgumentException: Plugin [Augeas] is required by plugins [[Aliases]] but it does not exist in the dependency graph yet
	at org.rhq.core.clientapi.agent.metadata.PluginDependencyGraph.getDeepDependencies(PluginDependencyGraph.java:329)
	at org.rhq.core.clientapi.agent.metadata.PluginDependencyGraph.getDeepDependencies(PluginDependencyGraph.java:342)
	at org.rhq.core.clientapi.agent.metadata.PluginDependencyGraph.getDeploymentOrder(PluginDependencyGraph.java:245)
	at org.rhq.core.pc.plugin.PluginManager.initialize(PluginManager.java:155)
	at org.rhq.core.pc.PluginContainer.startContainerService(PluginContainer.java:488)
	at org.rhq.core.pc.PluginContainer.initialize(PluginContainer.java:306)
	at org.rhq.enterprise.agent.AgentMain.startPluginContainer(AgentMain.java:1926)
	at org.rhq.enterprise.agent.AgentMain.start(AgentMain.java:660)
	at org.rhq.enterprise.agent.promptcmd.StartPromptCommand.execute(StartPromptCommand.java:57)
	at org.rhq.enterprise.agent.AgentMain.executePromptCommand(AgentMain.java:2931)
	at org.rhq.enterprise.agent.AgentMain$5.run(AgentMain.java:2839)
	at java.lang.Thread.run(Thread.java:619)
2012-07-12 09:18:53,260 FATAL [RHQ Agent Prompt Input Thread] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.startup-error}The agent encountered an error during startup and must abort
java.lang.RuntimeException: Cannot initialize the plugin container
	at org.rhq.core.pc.plugin.PluginManager.initialize(PluginManager.java:176)
	at org.rhq.core.pc.PluginContainer.startContainerService(PluginContainer.java:488)
	at org.rhq.core.pc.PluginContainer.initialize(PluginContainer.java:306)
	at org.rhq.enterprise.agent.AgentMain.startPluginContainer(AgentMain.java:1926)
	at org.rhq.enterprise.agent.AgentMain.start(AgentMain.java:660)
	at org.rhq.enterprise.agent.promptcmd.StartPromptCommand.execute(StartPromptCommand.java:57)
	at org.rhq.enterprise.agent.AgentMain.executePromptCommand(AgentMain.java:2931)
	at org.rhq.enterprise.agent.AgentMain$5.run(AgentMain.java:2839)
	at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.IllegalArgumentException: Plugin [Augeas] is required by plugins [[Aliases]] but it does not exist in the dependency graph yet
	at org.rhq.core.clientapi.agent.metadata.PluginDependencyGraph.getDeepDependencies(PluginDependencyGraph.java:329)
	at org.rhq.core.clientapi.agent.metadata.PluginDependencyGraph.getDeepDependencies(PluginDependencyGraph.java:342)
	at org.rhq.core.clientapi.agent.metadata.PluginDependencyGraph.getDeploymentOrder(PluginDependencyGraph.java:245)
	at org.rhq.core.pc.plugin.PluginManager.initialize(PluginManager.java:155)
	... 8 more
2012-07-12 09:18:53,261 INFO  [RHQ Agent Prompt Input Thread] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.shutting-down}Agent is being shut down...
2012-07-12 09:18:53,261 INFO  [RHQ Agent Prompt Input Thread] (org.rhq.core.pc.PluginContainer)- Plugin container is already shut down.
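
For readers unfamiliar with this kind of failure, here is a minimal sketch of how a dependency-ordered plugin load can abort when a required plugin is missing from the set on disk. It is not the RHQ PluginDependencyGraph source: the class PluginOrderSketch, its methods, and the message wording are hypothetical, and only the plugin names (Aliases, Augeas) come from the log above. The assumption is that the loader walks each plugin's requirements depth-first and refuses to continue if a requirement was never registered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is NOT RHQ's PluginDependencyGraph.
final class PluginOrderSketch {

    // Plugin name -> names of the plugins it requires.
    private final Map<String, List<String>> requires = new LinkedHashMap<>();

    void addPlugin(String name, String... requiredPlugins) {
        requires.put(name, Arrays.asList(requiredPlugins));
    }

    // Returns the plugins ordered so that each one follows everything it requires.
    List<String> deploymentOrder() {
        Set<String> ordered = new LinkedHashSet<>();
        for (String plugin : requires.keySet()) {
            visit(plugin, null, ordered, new LinkedHashSet<>());
        }
        return new ArrayList<>(ordered);
    }

    private void visit(String plugin, String requiredBy, Set<String> ordered, Set<String> inProgress) {
        // A requirement that was never registered makes the whole ordering fail.
        if (!requires.containsKey(plugin)) {
            throw new IllegalArgumentException("Plugin [" + plugin + "] is required by plugin ["
                + requiredBy + "] but it is not in the dependency graph");
        }
        if (ordered.contains(plugin)) {
            return; // already placed in the order
        }
        if (!inProgress.add(plugin)) {
            throw new IllegalStateException("Circular dependency involving plugin [" + plugin + "]");
        }
        for (String dependency : requires.get(plugin)) {
            visit(dependency, plugin, ordered, inProgress);
        }
        inProgress.remove(plugin);
        ordered.add(plugin);
    }

    public static void main(String[] args) {
        // Consistent plugin set: the required plugin is present.
        PluginOrderSketch consistent = new PluginOrderSketch();
        consistent.addPlugin("Aliases", "Augeas");
        consistent.addPlugin("Augeas");
        System.out.println(consistent.deploymentOrder());

        // Inconsistent set: the dependent plugin is present but its requirement is gone.
        PluginOrderSketch inconsistent = new PluginOrderSketch();
        inconsistent.addPlugin("Aliases", "Augeas");
        try {
            inconsistent.deploymentOrder();
        } catch (IllegalArgumentException e) {
            System.out.println("Startup would abort: " + e.getMessage());
        }
    }
}
```

Under these assumptions, a plugin directory that still holds a dependent plugin after its required plugin has been removed is enough to make startup fail, which would be consistent with the manual cleanup described above. Whether that is what actually happened here was not confirmed.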

Comment 1 Charles Crouch 2012-07-12 16:32:48 UTC
(9:28:46 AM) ccrouch: spinder: what was the BZ you were looking at for this?
(9:29:03 AM) jshaughn: I think https://bugzilla.redhat.com/show_bug.cgi?id=828938
(9:29:44 AM) jshaughn: I would start by having someone else try to recreate this.  As I stated in the BZ, this was my dev env
(9:30:38 AM) spinder: ccrouch: yep. it's 828938 as jshaughn mentioned. 
(9:31:44 AM) spinder: jshaughn: the error that you report I think is normal if you're trying to delete a plugin with dependencies. It shouldn't have taken down the agent though.
(9:36:51 AM) jshaughn: spinder: Taking down the plugin container would be normal as it updates the plugins, my problem was that it couldn't complete the update and/or get started again
(9:38:46 AM) spinder: jshaughn: right. I guess what I meant was that the error message from the log should occur before deletion at the UI level. Only after you'd selected all dependent plugins for deletion .. should the delete actually have proceeded ... followed by successful reboot.  afaik.
(9:39:30 AM) spinder: the actual delete, with correct dependency list, should not have generated that error.  If that's any clearer.
(9:39:44 AM) jshaughn: to be clear, the delete was successful.  I did include a valid set of plugins
(9:39:59 AM) jshaughn: this happened on purge as far as I can tell
(9:40:30 AM) jshaughn: but as I mentioned above, the purge was performed immediately after the plugins showed as deleted in the GUI
(9:41:26 AM) spinder: hmm. it looks like there are still a few problems here then.
(9:42:21 AM) jshaughn: Someone will need to try and recreate
(9:42:45 AM) jshaughn: but I'd say so, if the agent doesn't restart the PC, that's a definite issue
(9:42:56 AM) spinder: I'll try that a little later jshaughn. It's been something I wanted to spend some more time on.
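
The UI-level check spinder describes in the chat (refuse the delete until every dependent plugin is also selected) could look roughly like the sketch below. This is an illustration under stated assumptions, not RHQ server code: the class DeleteSelectionCheckSketch, the method missingDependents, and the map-based representation of plugin requirements are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; names and data shapes are assumptions.
final class DeleteSelectionCheckSketch {

    // Returns the plugins that depend, directly or transitively, on a selected
    // plugin but were not themselves selected. An empty result means the
    // selection is safe to delete as a unit.
    static Set<String> missingDependents(Map<String, List<String>> requires, Set<String> selected) {
        Set<String> missing = new TreeSet<>();
        Set<String> removed = new HashSet<>(selected);
        Deque<String> toCheck = new ArrayDeque<>(selected);
        while (!toCheck.isEmpty()) {
            String gone = toCheck.pop();
            for (Map.Entry<String, List<String>> entry : requires.entrySet()) {
                // Any plugin requiring a removed plugin must be removed as well.
                if (entry.getValue().contains(gone) && removed.add(entry.getKey())) {
                    missing.add(entry.getKey());
                    toCheck.push(entry.getKey());
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, List<String>> requires = new LinkedHashMap<>();
        requires.put("Augeas", Collections.<String>emptyList());
        requires.put("Aliases", Arrays.asList("Augeas"));

        // Selecting only the required plugin leaves a dependent behind.
        System.out.println(missingDependents(requires, new HashSet<>(Arrays.asList("Augeas"))));
        // Selecting both is a complete set.
        System.out.println(missingDependents(requires, new HashSet<>(Arrays.asList("Augeas", "Aliases"))));
    }
}
```

Note that jshaughn states in the chat that the selection in this report was a valid set, so a check like this would have passed; the failure was seen on purge, not on delete.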

Comment 2 Charles Crouch 2012-08-02 19:31:20 UTC
Assigning to Simeon to investigate since he was looking at 828938.

Comment 3 Charles Crouch 2012-08-15 13:35:18 UTC
Re-assigning to Jay, while spinder is out.

Comment 4 Lukas Krejci 2012-08-16 15:56:16 UTC
I couldn't reproduce this but while deploying/deleting/purging plugins I instead discovered bug 848868.

Comment 5 Jay Shaughnessy 2012-08-16 16:35:01 UTC
I also have not seen this again.  I suggest it be closed; it may have been an issue in my development environment. The newly discovered issue will need to be triaged separately.

Comment 6 Jay Shaughnessy 2012-08-16 20:02:25 UTC
Closing this as neither of us can reproduce.