Bug 960936

Summary: agent does not restart after plugin delete from the server
Product: [Other] RHQ Project
Reporter: vlad crc <vlad.craciunoiu>
Component: Agent
Assignee: Heiko W. Rupp <hrupp>
Status: CLOSED NEXTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: low
Docs Contact:
Priority: unspecified
Version: 4.4
CC: hrupp, vlad.craciunoiu
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-05 17:38:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                       Flags
agent log before shutdown         none
agent log after manual restart    none

Description vlad crc 2013-05-08 10:44:04 UTC
Description of problem:
The agent does not restart itself after an agent plugin is deleted from the server.

Version-Release number of selected component (if applicable):
4.4

How reproducible:
always

Steps to Reproduce:
1. Have the agent monitor a Postgres server.
2. Remove the Postgres agent plugin from the server GUI.

Actual results:
The agent is down and the platform is reported as unavailable. I need to manually restart the agent.

Expected results:
The agent should restart by itself and the Postgres server should no longer appear in the GUI.


Additional info:

The agent log in the first attachment shows what happens.

The agent tries to send an inventory report to the server and realizes that it holds a resource of a type that no longer exists on the server, so the agent needs to be restarted to purge the stale type.

Before shutting down, it logs an error because it cannot delete the tmp folder.

It then tries to start again but does not succeed. It remains in this state and has to be restarted manually.

After a manual restart, the log looks like the second attachment.

Comment 1 vlad crc 2013-05-08 10:45:36 UTC
Created attachment 745202 [details]
agent log before shutdown

Comment 2 vlad crc 2013-05-08 10:46:10 UTC
Created attachment 745203 [details]
agent log after manual restart

Comment 3 Heiko W. Rupp 2013-06-29 15:07:48 UTC
I cannot reproduce this issue on 4.8/current master. Could you retry with that?

Those messages about tmp folders are a known issue.

What I do see, though, is the following, which is harmless (the plugin jar in fact just vanishes):

2013-06-28 16:47:43,682 WARN  [RHQ Server Polling Thread] (org.rhq.enterprise.agent.PluginUpdate)- {PluginUpdate.plugin-not-on-server}The plugin [plugins/rhq-cron-plugin-4.9.0-SNAPSHOT.jar] does not exist on the Server - renaming it to [rhq-cron-plugin-4.9.0-SNAPSHOT.jar.REJECTED] so it will not get deployed by the Plugin Container.
2013-06-28 16:47:43,683 ERROR [RHQ Server Polling Thread] (org.rhq.enterprise.agent.PluginUpdate)- {PluginUpdate.plugin-rename-failed}Failed to rename illegitimate plugin [plugins/rhq-cron-plugin-4.9.0-SNAPSHOT.jar] to [rhq-cron-plugin-4.9.0-SNAPSHOT.jar.REJECTED].

This failure to rename is now fixed in master 5cd5fad9099.
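
To check whether another agent log shows the same rename failure, a small script along these lines could help. It is only a sketch: the line layout is inferred from the two log lines quoted above, the function names (parse_line, rename_failures) are made up for illustration, and log lines without a {message-key} prefix are simply skipped.

```python
import re

# Layout inferred from the two quoted log lines (an assumption, not a
# documented RHQ log format): timestamp, level, [thread], (logger)-,
# {message-key}, then the message text.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<thread>[^\]]+)\] "
    r"\((?P<logger>[^)]+)\)- "
    r"\{(?P<key>[^}]+)\}(?P<msg>.*)$"
)
FIRST_BRACKET_RE = re.compile(r"\[([^\]]+)\]")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def rename_failures(lines):
    """Collect the plugin paths named in plugin-rename-failed messages."""
    failed = []
    for line in lines:
        rec = parse_line(line)
        if rec and rec["key"] == "PluginUpdate.plugin-rename-failed":
            # The first bracketed value in the message is the plugin path.
            m = FIRST_BRACKET_RE.search(rec["msg"])
            if m:
                failed.append(m.group(1))
    return failed


# The two lines quoted in this comment, used as sample input.
SAMPLE = [
    "2013-06-28 16:47:43,682 WARN  [RHQ Server Polling Thread] "
    "(org.rhq.enterprise.agent.PluginUpdate)- {PluginUpdate.plugin-not-on-server}"
    "The plugin [plugins/rhq-cron-plugin-4.9.0-SNAPSHOT.jar] does not exist on the "
    "Server - renaming it to [rhq-cron-plugin-4.9.0-SNAPSHOT.jar.REJECTED] so it "
    "will not get deployed by the Plugin Container.",
    "2013-06-28 16:47:43,683 ERROR [RHQ Server Polling Thread] "
    "(org.rhq.enterprise.agent.PluginUpdate)- {PluginUpdate.plugin-rename-failed}"
    "Failed to rename illegitimate plugin [plugins/rhq-cron-plugin-4.9.0-SNAPSHOT.jar] "
    "to [rhq-cron-plugin-4.9.0-SNAPSHOT.jar.REJECTED].",
]

if __name__ == "__main__":
    for path in rename_failures(SAMPLE):
        print(path)
```

To use it on a real log, pass the lines of the agent log file to rename_failures() instead of SAMPLE.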

Comment 4 vlad crc 2013-07-05 17:37:26 UTC
I retried with 4.7 and the problem I reported is gone; the one you mention is present instead, so I am closing the bug.