Bug 976882 - Agent upgrade fails on slow server startup
Summary: Agent upgrade fails on slow server startup
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Agent
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: RHQ 4.8
Assignee: RHQ Project Maintainer
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-06-21 18:26 UTC by Stefan Negrea
Modified: 2013-09-11 09:53 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2013-09-11 09:53:43 UTC
Embargoed:


Attachments
Agent log file (4.72 MB, text/x-log)
2013-06-21 18:26 UTC, Stefan Negrea
storage_node_95 (168.66 KB, image/png)
2013-06-24 14:36 UTC, Armine Hovsepyan
storage_node_106 (156.33 KB, image/png)
2013-06-24 14:36 UTC, Armine Hovsepyan
rhqctl_upgrade-agent_logs (619.81 KB, image/png)
2013-06-24 14:37 UTC, Armine Hovsepyan

Description Stefan Negrea 2013-06-21 18:26:58 UTC
Created attachment 763933
Agent log file

Description of problem:
Upgrading from a pre-4.8 release to 4.8 fails when the server is slow to start up. The agent gives up waiting for the server to become available and installs the old plugins, which causes errors at agent startup (see the attached log file). The problem is not corrected once the server has started properly.


How reproducible:
Every time, in environments where the server is slow to start up (e.g. slower processor, disk, or network).

Steps to Reproduce:
1. Install a pre-4.8 RHQ environment
2. Upgrade to RHQ 4.8 
3. Check whether the RHQ Storage Node has been imported

Actual results:
The agent starts but with errors. Please see attached log file. The newly installed RHQ Storage Node is not discovered and inventoried by the agent.

Expected results:
The upgrade succeeds, the agent starts and the newly installed RHQ Storage Node is inventoried automatically.


Additional info:
This issue can be fixed by clearing all plugins from the agent during the upgrade process.

Comment 1 John Mazzitelli 2013-06-21 18:30:23 UTC
I think this stems from the fact that when you upgrade an agent, the new agent gets the old agent's plugins. We don't want this. We should leave the plugins directory empty in the new agent and make it download the new plugins from the new server.

So in rhq-agent-update-build.xml, we need to remove these lines:

-      <!-- if there are plugins, keep them -->
-      <echo>Copy existing plugins from the old agent to the new agent</echo>
-      <copy todir="${_update.tmp.dir}/rhq-agent/plugins">
-        <fileset dir="${rhq.agent.update.update-agent-dir}/plugins"/>
-      </copy>

Now, when the new agent starts, it can't start the plugin container (PC) until it has downloaded the new plugins from the new server.
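
For illustration only, the intended end state (a new agent whose plugins directory is empty before its first start) could also be enforced explicitly. The following is a minimal sketch and not the committed change, which simply removes the copy step quoted above. The project name, the target name, and the default value given to `_update.tmp.dir` are hypothetical; only the `rhq-agent/plugins` path under `_update.tmp.dir` comes from the snippet above.

```xml
<!-- Hypothetical sketch: the project and target names are illustrative
     and do not come from rhq-agent-update-build.xml. -->
<project name="agent-update-sketch" default="prepare-empty-plugins-dir">

  <!-- Assumed default so the sketch runs standalone; the real update
       script defines _update.tmp.dir itself. -->
  <property name="_update.tmp.dir" location="${java.io.tmpdir}/rhq-agent-update"/>

  <target name="prepare-empty-plugins-dir">
    <echo>Ensuring the new agent starts with an empty plugins directory</echo>
    <!-- Remove any plugins that were carried over from the old agent. -->
    <delete dir="${_update.tmp.dir}/rhq-agent/plugins" quiet="true"/>
    <!-- Recreate the directory so the new agent can download fresh
         plugins from the new server into it. -->
    <mkdir dir="${_update.tmp.dir}/rhq-agent/plugins"/>
  </target>

</project>
```

With an empty directory in place, the agent has no stale plugins to fall back on if the server is slow to start, so it has to wait for the download from the new server.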

Comment 2 John Mazzitelli 2013-06-21 18:33:28 UTC
pushed to master: a1ae22c

Comment 3 Armine Hovsepyan 2013-06-24 14:32:16 UTC
Verified.

Upgrade from 4.5.1 on 10.16.23.95 and 10.16.23.106 went well: the storage node was discovered and auto-inventoried, and there are no more exceptions in the log.

Please see the attached screenshots.

Comment 4 Armine Hovsepyan 2013-06-24 14:36:12 UTC
Created attachment 764657
storage_node_95

Comment 5 Armine Hovsepyan 2013-06-24 14:36:37 UTC
Created attachment 764658
storage_node_106

Comment 6 Armine Hovsepyan 2013-06-24 14:37:14 UTC
Created attachment 764659
rhqctl_upgrade-agent_logs

Comment 7 Heiko W. Rupp 2013-09-11 09:53:43 UTC
Bulk closing of old issues now that RHQ 4.9 is just around the corner.

If you think the issue has not been solved, then please open a new bug and mention this one in the description.

